Ko, Albert Hung-Ren (2007). Static and dynamic selection of ensemble of classifiers. Electronic doctoral thesis, Montréal, École de technologie supérieure.
Abstract
In this thesis we present several novel solutions to three fundamental problems in the design of ensembles of classifiers: classifier generation, selection, and fusion.
A new compound function (Compound Diversity Function, CDF) that combines the individual performance of the classifiers with the diversity between pairs of classifiers.
A new fusion function based on pairwise confusion matrices (PFM), better suited to fusing classifiers when the number of classes is large.
A new method for generating ensembles of hidden Markov models (EoHMM) for handwritten character recognition.
A novel solution based on the concept of oracles associated with the samples of the validation set (KNORA).
A new approach for selecting representation subspaces, based on a diversity measure evaluated between pairs of partitions.
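Translated title
Sélection statique et dynamique des ensembles de classificateurs pour la reconnaissance de chiffres manuscrits (Static and dynamic selection of ensembles of classifiers for handwritten digit recognition)
Translated abstract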
This thesis focuses on ensemble of classifiers (EoC) techniques that help improve pattern recognition results.
Pattern recognition can, in general, be regarded as a problem of classification, where different patterns are presented and we need to classify them into specified classes. We create classifiers to perform the classification task. One way to improve the recognition rates of pattern recognition tasks is to improve the accuracy of individual classifiers, and another is to apply ensemble of classifiers (EoC) methods. EoC methods use multiple classifiers and combine their outputs. In general, the combined results of these multiple classifiers can be significantly better than those of the single best classifier. In this thesis, we only look into the techniques that improve EoC accuracy and not those that improve the accuracy of a single classifier.
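As a concrete illustration of combining the outputs of multiple classifiers, the sketch below applies a simple majority vote to crisp class labels. The three threshold classifiers and the tie-breaking rule (first label seen wins) are assumptions made for this example only; they do not come from the thesis.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label

def ensemble_predict(classifiers, x):
    """Combine the crisp label outputs of several classifiers by majority vote."""
    return majority_vote([clf(x) for clf in classifiers])

# Three toy threshold classifiers on a scalar input (hypothetical).
classifiers = [
    lambda x: int(x > 0.3),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.7),
]

print(ensemble_predict(classifiers, 0.6))  # votes are [1, 1, 0]
```

Each classifier here is just a callable that returns a label, so the same vote works for any pool of classifiers that emit crisp decisions.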
Three major topics are associated with EoCs: ensemble creation, ensemble selection and classifier combination. In this thesis, we propose a new ensemble creation method for an ensemble of hidden Markov models (EoHMM), three methods for ensemble selection under different circumstances, and a classifier combination method.
First, we propose compound diversity functions (CDF), which combine diversities with the performance of each individual classifier, and show that there is a strong correlation between the proposed functions and ensemble accuracy. We demonstrate that most compound diversity functions are better than traditional diversity measures.
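The abstract does not give the CDF formulas, so the following is only a sketch of the general idea of combining a pairwise diversity with individual performance. It assumes the disagreement measure as the diversity and multiplies it by the accuracies of the two classifiers in each pair; both choices, and the name `compound_score`, are hypothetical and should not be read as the thesis's definitions.

```python
from itertools import combinations

def error_rate(preds, truth):
    """Fraction of samples a classifier labels incorrectly."""
    return sum(p != t for p, t in zip(preds, truth)) / len(truth)

def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers output different labels."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

def compound_score(all_preds, truth):
    """Illustrative compound score (an assumption, not the thesis's CDF):
    mean over classifier pairs of disagreement times both accuracies."""
    pairs = list(combinations(range(len(all_preds)), 2))
    if not pairs:
        return 0.0
    acc = [1.0 - error_rate(p, truth) for p in all_preds]
    total = sum(
        disagreement(all_preds[i], all_preds[j]) * acc[i] * acc[j]
        for i, j in pairs
    )
    return total / len(pairs)

truth = [0, 1, 1, 0, 1, 0]
clf_a = [0, 1, 1, 0, 1, 1]  # one error
clf_b = [0, 1, 0, 0, 1, 0]  # one error, on a different sample
clf_c = list(clf_a)         # identical to clf_a, so no diversity

print(compound_score([clf_a, clf_b], truth))
print(compound_score([clf_a, clf_c], truth))
```

Under this toy definition, two equally accurate classifiers score higher as a pair when their errors fall on different samples, which is the intuition behind weighting diversity by performance.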
We also propose a pairwise fusion matrix (PFM) transformation, which produces reliable probabilities for use in classifier combination and can be combined with most existing fusion functions. The PFM requires only crisp class label outputs from classifiers, and is suitable for problems with a large number of classes or with few training samples. Experimental results suggest that the performance of a PFM can be somewhat better than that of the simple majority voting rule (MAJ), and that a PFM can work on problems where a Behavior Knowledge Space (BKS) might not be applicable.
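The abstract does not spell out how the PFM is built. The sketch below shows one plausible reading, stated here as an assumption: for every pair of classifiers, count on validation data how often each true class occurs for each pair of crisp output labels, turn those counts into class probabilities, and add the probabilities across pairs (a sum rule). The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def fit_pairwise_tables(val_outputs, val_truth):
    """val_outputs[c][n] is the crisp label from classifier c on sample n.
    tables[(i, j)][(a, b)][k] counts samples of true class k for which
    classifier i output a and classifier j output b."""
    tables = {}
    for i, j in combinations(range(len(val_outputs)), 2):
        table = defaultdict(lambda: defaultdict(int))
        for a, b, k in zip(val_outputs[i], val_outputs[j], val_truth):
            table[(a, b)][k] += 1
        tables[(i, j)] = table
    return tables

def pairwise_predict(tables, outputs, classes):
    """Sum, over classifier pairs, the estimated P(class | label_i, label_j).
    Assumes every true validation label appears in `classes`."""
    scores = {k: 0.0 for k in classes}
    for (i, j), table in tables.items():
        cell = table.get((outputs[i], outputs[j]))
        if not cell:
            continue  # label pair never seen in validation: this pair abstains
        total = sum(cell.values())
        for k, n in cell.items():
            scores[k] += n / total
    return max(classes, key=lambda k: scores[k])

val_truth = [0, 0, 1, 1, 2, 2]
val_outputs = [
    [0, 0, 1, 1, 2, 2],  # classifier 0
    [0, 0, 1, 1, 1, 1],  # classifier 1 confuses class 2 with class 1
    [0, 1, 1, 1, 2, 1],  # classifier 2
]
tables = fit_pairwise_tables(val_outputs, val_truth)

# Two of three classifiers say 1, yet the pairwise tables favour class 2.
print(pairwise_predict(tables, [2, 1, 1], [0, 1, 2]))
```

Because each table is indexed by only two classifiers' labels, it needs far fewer validation samples per cell than a table over the joint outputs of all classifiers, which fits the abstract's remark about problems where a BKS might not be applicable.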
Also proposed here is a new scheme for the optimization of codebook sizes for HMMs and the generation of HMM ensembles. By using a pre-selected clustering validity index, we show that HMM codebook size can be optimized without training HMM classifiers. Moreover, the proposed scheme yields multiple optimized HMM classifiers, and each individual HMM is based on a different codebook size.
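To make the idea of choosing a codebook size without training any HMM more concrete, the sketch below clusters feature vectors at several candidate sizes and ranks the sizes by a clustering validity index. The abstract does not name the index or the quantizer; the Davies-Bouldin index, plain k-means with a deterministic initialisation, and all function names here are assumptions chosen for illustration.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic, evenly spaced initial centres."""
    step = len(points) / k
    centres = [points[int(i * step)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centres[c]))
            clusters[nearest].append(p)
        new_centres = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centres[c]
            for c, cl in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters

def davies_bouldin(centres, clusters):
    """Davies-Bouldin index; lower means tighter, better separated clusters.
    Assumes at least two non-empty clusters with distinct centres."""
    pairs = [(c, cl) for c, cl in zip(centres, clusters) if cl]
    scatter = [sum(math.dist(p, c) for p in cl) / len(cl) for c, cl in pairs]
    total = 0.0
    for i, (ci, _) in enumerate(pairs):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(ci, cj)
            for j, (cj, _) in enumerate(pairs) if j != i
        )
    return total / len(pairs)

def best_codebook_size(points, candidate_sizes):
    """Score each candidate size with the validity index; no HMM is trained."""
    scores = {k: davies_bouldin(*kmeans(points, k)) for k in candidate_sizes}
    return min(scores, key=scores.get), scores

# Toy 2-D feature vectors forming three well separated groups (hypothetical).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]
best, scores = best_codebook_size(points, [2, 3, 4])
print(best, scores)
```

Since the function returns the score of every candidate, several well-ranked sizes could each seed a separate HMM, in the spirit of the ensemble described above; how the thesis actually picks the sizes is not stated in this abstract.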
Two further ensemble selection methods are also proposed here: a dynamic ensemble selection method, and a classifier-free ensemble selection method. The former applies different ensembles to different test patterns, and the experimental results suggest that in some cases it performs better than both static ensemble selection and dynamic classifier selection. The latter explores the idea of "data diversity" for data subset selection. We try to select adequate feature subsets for Random Subspaces, and use only the selected data subsets to create classifiers.
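The dynamic selection idea can be sketched as follows: for each test pattern, look up its nearest validation samples and keep only the classifiers that label all of them correctly, so that each pattern gets its own ensemble. This is a minimal sketch in the spirit of the oracle concept mentioned in the abstract, not the thesis's exact procedure; the shrinking-neighbourhood fallback, the Euclidean distance, and the function name `knora_select` are assumptions.

```python
import math

def knora_select(x, val_X, val_y, val_preds, k=3):
    """Pick classifiers that are correct on all k nearest validation samples.

    val_preds[c][n] is classifier c's label for validation sample n.
    If no classifier qualifies, the neighbourhood shrinks by one (assumed
    fallback); if none qualifies even for one neighbour, keep the whole pool.
    """
    order = sorted(range(len(val_X)), key=lambda n: math.dist(x, val_X[n]))
    for size in range(min(k, len(order)), 0, -1):
        neighbours = order[:size]
        chosen = [
            c for c, preds in enumerate(val_preds)
            if all(preds[n] == val_y[n] for n in neighbours)
        ]
        if chosen:
            return chosen
    return list(range(len(val_preds)))

# Toy 1-D validation set: class 0 on the left, class 1 on the right.
val_X = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
val_y = [0, 0, 0, 1, 1, 1]
val_preds = [
    [0, 0, 0, 0, 0, 0],  # classifier 0: only right on the left region
    [1, 1, 1, 1, 1, 1],  # classifier 1: only right on the right region
    [0, 0, 1, 1, 1, 0],  # classifier 2: right in the middle of each side
]

print(knora_select((0.5,), val_X, val_y, val_preds))
print(knora_select((9.2,), val_X, val_y, val_preds))
```

The indices returned identify the ensemble for that one test pattern; its members' outputs would then be combined, for example by a vote.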
The main objective of the proposed methods is to offer applicable approaches that might advance the state of the art. However, EoC optimization is a very complex issue that involves a number of varied processes, and our contribution is intended merely to provide an improved understanding of the use of EoCs.
Document type: | Dissertation or thesis (Electronic doctoral thesis) |
---|---|
Additional information: | "A thesis presented to the École de technologie supérieure in partial fulfillment of the thesis requirement for the degree of the Ph.D. engineering". Bibliography: leaves [237]-246. |
Free keywords: | Digit, Classifier, Combination, Design, Diversity, Dynamic, Ensemble, EoC, Fusion, Generation, Handwritten, Recognition, Selection, Static |
Thesis advisor: | Sabourin, Robert |
Program: | Doctorate in engineering > Engineering |
Date deposited: | 11 Apr. 2011 14:45 |
Last modified: | 30 Nov. 2016 22:18 |
URI: | https://espace.etsmtl.ca/id/eprint/581 |