Profile page of Oualid MISSAOUI
9 years
5 years
7 years
My information
New York, USA
Class of 2000
Quantitative Research Scientist
Pipeline Financial Group, Inc. New York
Machine Learning, Data Mining, Computational Finance, Econophysics
Generalized Multi-stream Hidden Markov Models
Hidden Markov Models, Multi-stream HMM, Information fusion, Model level fusion, Machine learning, Pattern recognition, supervised learning, unsupervised learning
In most real-world applications, sequences need to be represented by several features to achieve reliable classification results. For complex classification systems, data is usually gathered from multiple sources of information that have varying reliability. Assuming that the different sources are equally relevant in describing all the data can lead to erroneous behavior.
This error accumulates and can be more severe for temporal data, where each point is represented by multiple sequences (possibly multidimensional). This multiplicity is mostly due to different interpretations or multiple views of the raw data. In this dissertation, it is assumed that the multi-stream temporal data is generated by independent and synchronous streams. For temporal data, Hidden Markov Models (HMMs) have emerged over the past decades as a powerful modeling paradigm. HMMs were originally applied to speech recognition, where they became the dominant technology. In recent years, they have attracted growing interest in diverse applications such as bioinformatics, landmine detection, handwritten character/word recognition, face recognition and other computer vision applications. To handle the modeling of multi-stream sequences, the standard HMM has been extended to more complex structures such as the Factorial HMM, Coupled HMM, Product HMM and multi-stream HMM. Since we assume independence and synchronicity of the sequences, the focus of this dissertation is on multi-stream HMMs.
In the context of hidden Markov models, and for most real-world applications, different modalities or streams can contribute to the generation of the sequence. In this case, the feature space generated by the different streams can be very sparse, and standard probability density function estimation may not be effective. Consequently, standard HMMs cannot model the data well, because sequences that are dense in a subspace can have low average density in the original space.

In this context, multi-stream continuous HMM structures have been introduced in the literature. In these structures, the feature space is partitioned into different subspaces, and a probability density function (pdf) is learned for each subspace. The relevance weights for each subspace can be fixed a priori by an expert, or learned separately via Minimum Classification Error/Generalized Probabilistic Descent (MCE/GPD), since the derivation of maximum likelihood equations is not possible unless the model is restricted to include only one Gaussian component per state.
Thus, the focus of this dissertation is twofold. First, the multi-stream approach is extended to the discrete case, and novel structures are introduced for the multi-stream continuous HMM that allow for maximum likelihood estimation. For the discrete case, we propose two new approaches to generalize the discrete HMM. The first one combines unsupervised learning, feature discrimination, standard discrete HMMs and weighted distances to learn the codebook with feature-dependent weights for each symbol. The second approach consists of extending the standard discrete Baum-Welch learning algorithm to include a feature discrimination component. For the continuous HMM, we introduce a new approach based on the linearization of the probability density function. We generalize the continuous Baum-Welch learning algorithm to accommodate these changes, and we derive the necessary conditions for updating the model parameters.
Second, the MCE/GPD discriminative training is generalized to handle the proposed multi-stream HMM structures. In this way, the proposed work attempts to overcome the limitation of standard HMMs on sparse feature spaces by adapting the HMM to different subspaces of the original feature space.
The proposed discrete and continuous HMMs are tested on synthetic data sets. They are also validated on benchmark data sets such as Australian Sign Language data, audio classification data and face vs. non-face data. However, they are mainly tested on and applied to the problem of landmine detection using ground-penetrating radar data. In all cases, we show that considerable improvement can be achieved compared to the standard HMM and the state-of-the-art multi-stream HMM.
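The abstract does not give the model equations, so the following is only a minimal sketch of how per-stream densities and relevance weights might be combined into a single state emission value. It shows two generic forms: an exponent-weighted product of the stream densities, and a linear combination of them. The one-dimensional Gaussian streams, the function names and all numeric values are assumptions made for illustration; this is not the dissertation's formulation.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def stream_densities(obs, means, variances):
    """Per-stream densities b_k(o_k) for one state; each stream is 1-D here."""
    return [gaussian_pdf(o, m, v) for o, m, v in zip(obs, means, variances)]

def geometric_emission(densities, weights):
    """Exponent-weighted product: prod_k b_k(o_k) ** w_k (densities must be > 0)."""
    return math.exp(sum(w * math.log(d) for w, d in zip(weights, densities)))

def linear_emission(densities, weights):
    """Linear combination: sum_k w_k * b_k(o_k)."""
    return sum(w * d for w, d in zip(weights, densities))

# Hypothetical state with two streams; all numbers are illustrative only.
obs = [0.1, 3.0]
means = [0.0, 0.0]
variances = [1.0, 1.0]
weights = [0.8, 0.2]  # assumed relevance weights, nonnegative and summing to 1

dens = stream_densities(obs, means, variances)
print(geometric_emission(dens, weights))
print(linear_emission(dens, weights))
```

In this toy setting the second stream fits the state poorly, and lowering its weight reduces its influence on the combined value in both forms. How such weights are actually estimated (for example through a generalized Baum-Welch update or MCE/GPD training) is outside the scope of this sketch.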
Master's degree in Mathematical Engineering
Laboratoire d'Informatique Algorithmique, Fondements et Applications (LIAFA), Paris.
Data structures for symbolic reachability analysis in automata based models for timed systems with parametric bounds
Formal verification, timed automata, data structures
A novel data structure, HCube, is proposed for the formal verification of extended automata. HCubes represent sets of system configurations of, e.g., timed automata. They are canonical representations of Boolean functions (interval formulas) and allow for their efficient manipulation. They also enable the extrapolation procedure, which is extremely difficult with other data structures. We present the methods necessary for our approach, compare its results with those of a similar verification technique, and analyze the extended PHcube for the verification of parametric timed automata for the purpose of extrapolation.
Engineering internship
Bio-Data Carthago
Worker internship
STEG Rades