Pattern Recognition


Pattern recognition is a branch of machine learning. Applications include handwriting and speech recognition, medical diagnosis, fingerprint verification, and face recognition.

What is a pattern? Can machines recognize patterns?

The concept of a pattern emerged from sensory perception. A set of perceptual measurements from the visual or auditory system that is "easily" recognizable is traditionally referred to as a pattern. Images of random pixels would not be considered "patterns", while images of simple line shapes such as characters would. Some pattern recognition systems attempt to replace humans or to assist them in recognition tasks. For example, the recognition of zip codes on postal envelopes can be automated. Another example is the recognition of spoken digits in an automated phone answering system.
Pattern recognition applied to image or speech processing has reached the level of maturity necessary to automate the reading of printed text and, to some extent, to use handwriting and speech as a computer interface. Humans still outperform machines in elaborate tasks, but machines are useful substitutes. Simplifying the machine's task is sometimes necessary to reach the accuracy that a given application requires. For example, handheld computers like the Palm Pilot use a simplified alphabet of unistroke characters.
Pattern recognition is not limited, however, to sets of measurements that resemble perceptual patterns easily identifiable by humans. Physico-chemical separation techniques such as spectrometry, electrophoresis, and chromatography, as well as more recent assays like DNA microarrays, produce patterns that often leave the human eye clueless. Automatic pattern recognition provides ways of categorizing such patterns. When examples of known categories are given (e.g. examples of diseased patients and normal patients), it is possible to devise a system that will correctly categorize new examples, even when the task is non-trivial for humans.

What does the pattern recognition process consist of?

The so-called "raw data" is the set of measurements provided by a sensor (e.g. the pixels of an image provided by a digital camera). The first steps of the pattern recognition process are pre-processing and feature extraction. These steps may include signal processing, such as smoothing and noise filtering, and the extraction of higher-level features, for which human knowledge about the task is essential.
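As a rough illustration of these first steps, the following Python sketch smooths a one-dimensional raw signal and then summarizes it with a few features. The function names (moving_average, extract_features), the window size, and the choice of features are assumptions made for this example only; real pre-processing depends on the sensor and on the task.

```python
def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average.

    Near the edges the window shrinks to the samples that exist.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def extract_features(signal):
    """Summarize a signal with a few hand-picked (hypothetical) features."""
    peak = max(signal)
    return {
        "mean": sum(signal) / len(signal),
        "peak": peak,
        "peak_index": signal.index(peak),  # position of the first maximum
    }


# Made-up raw measurements with an isolated noise spike at index 2.
raw = [0.0, 0.1, 5.0, 0.2, 0.0, 1.0, 1.2, 0.9, 0.0]
smoothed = moving_average(raw, window=3)
features = extract_features(smoothed)
print(features)
```

The feature dictionary, rather than the raw samples, is what would be passed on to the recognizer.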
The next step in the process is to choose and "train" a recognizer. While it is possible to hand-craft a recognizer, benchmark results have shown that better recognition accuracy is obtained with systems whose tunable parameters are adjusted to correctly classify a set of given examples (called training examples). The category of each training example (e.g. diseased or normal patient) must be known. A recognizer is a particular kind of learning machine. An appropriate learning machine must be selected according to the nature and quantity of the data available. We have experience with a wide variety of learning machines, ranging from classical statistical methods like "nearest neighbors" to sophisticated structured neural networks and support vector machines.
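To make the idea of training examples and tunable parameters concrete, here is a minimal sketch of a "nearest neighbors" recognizer in plain Python. The data, the function names, and the leave-one-out procedure used to compare values of k are all assumptions chosen for illustration; they are not a description of any particular production system.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training examples."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(train_x, train_y, k):
    """Estimate accuracy for a given k by holding out each example in turn."""
    correct = 0
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        if knn_predict(rest_x, rest_y, train_x[i], k) == train_y[i]:
            correct += 1
    return correct / len(train_x)


# Made-up training examples: two features per patient, category known.
train_x = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["normal", "normal", "normal", "disease", "disease", "disease"]

# k is the tunable parameter: keep the value with the best held-out accuracy.
best_k = max([1, 3, 5], key=lambda k: leave_one_out_accuracy(train_x, train_y, k))
print(best_k, knn_predict(train_x, train_y, (1.0, 1.0), k=best_k))
```

On a data set this small, k = 5 fails because each held-out example is outvoted by the other category, which is one reason the choice of learning machine depends on the quantity of data.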
For some recognition tasks, the recognizer can be complemented by a system that takes contextual information into account. For example, in handwriting or speech recognition applications, language models or grammars are often incorporated in the postprocessing stage. We have experience with using statistical grammars to improve the recognition process. When possible, we will globally optimize all the parameters of the system, including the preprocessor, recognizer, and postprocessor, to improve the accuracy of your system.
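One common way to use context in postprocessing is to combine the recognizer's per-character scores with a statistical language model and search for the most likely sequence. The sketch below does this with a character bigram model and Viterbi decoding. All probabilities, the default value for unseen bigrams, and the function names are invented for the example.

```python
import math


def greedy_decode(char_scores):
    """Pick the best-scoring character at each position, ignoring context."""
    return "".join(max(scores, key=scores.get) for scores in char_scores)


def viterbi_decode(char_scores, bigram, default=1e-3):
    """Find the string maximizing recognizer scores times bigram probabilities.

    char_scores: list of dicts {char: recognizer probability at that position}
    bigram:      dict {(previous_char, char): probability of char after previous_char}
    default:     assumed probability for bigrams missing from the model
    """
    # best[c] = (log-score of the best path ending in c, that path)
    best = {c: (math.log(p), c) for c, p in char_scores[0].items()}
    for scores in char_scores[1:]:
        new_best = {}
        for c, p in scores.items():
            new_best[c] = max(
                (prev_score + math.log(bigram.get((prev, c), default)) + math.log(p), path + c)
                for prev, (prev_score, path) in best.items()
            )
        best = new_best
    return max(best.values())[1]


# Hypothetical recognizer output for a three-letter handwritten word:
# the middle character looks slightly more like "b" than "h".
char_scores = [
    {"t": 0.9, "f": 0.1},
    {"b": 0.55, "h": 0.45},
    {"e": 0.8, "c": 0.2},
]
# Hypothetical bigram probabilities; unlisted pairs fall back to `default`.
bigram = {("t", "h"): 0.3, ("h", "e"): 0.3, ("b", "e"): 0.1}

print(greedy_decode(char_scores))           # tbe
print(viterbi_decode(char_scores, bigram))  # the
```

The recognizer alone prefers "tbe", while the language model tips the decision toward "the" because the pair "th" is far more plausible than "tb".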
Pattern recognition also involves many other problem-specific subtasks, including data cleaning and automatic feature selection, which are important for system optimization as well as for data understanding.
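As one simple example of automatic feature selection, the sketch below ranks features by how well each one separates two categories on its own: the distance between the class means divided by the sum of the class standard deviations. This particular criterion, the data, and the function names are assumptions chosen for illustration; many other selection methods exist.

```python
from statistics import mean, pstdev


def feature_scores(samples, labels):
    """Score each feature by class-mean separation relative to class spread."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = []
    for j in range(len(samples[0])):
        a = [s[j] for s in pos]
        b = [s[j] for s in neg]
        spread = pstdev(a) + pstdev(b) + 1e-12  # small constant avoids division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def select_top_features(samples, labels, k):
    """Return the indices of the k highest-scoring features, best first."""
    scores = feature_scores(samples, labels)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Made-up data: feature 0 differs between categories, feature 1 is mostly noise.
samples = [(5.0, 0.3), (5.2, 0.9), (4.9, 0.1), (1.0, 0.8), (1.1, 0.2), (0.9, 0.5)]
labels = [1, 1, 1, 0, 0, 0]

print(select_top_features(samples, labels, k=1))  # [0]
```

Besides reducing the input to the recognizer, such a ranking helps data understanding, since it points to the measurements that carry the information.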


ClopiNet


955 Creston Road
Berkeley, CA 94708, USA
+1 (510) 524-6211
Email: info at clopinet.com