Pattern recognition is a branch of machine
learning. Applications include handwriting and speech recognition,
medical diagnosis, fingerprint verification and face recognition.
What is a pattern? Can machines recognize patterns?
The concept of pattern has emerged from sensorial perception. A set of
perceptual measurements of the visual or auditory system that is "easily"
recognizable is traditionally referred to as a pattern. Images of random
pixels would not be considered "patterns" while images of simple line shapes
like characters would. Some pattern recognition systems attempt to replace
humans or to assist them in recognition tasks. For example,
the recognition of zip codes on postal envelopes can be automated. Another
example is the recognition of spoken digits in an automated phone answering
system.
Pattern recognition applied to image or speech processing has reached
the level of maturity necessary to automate printed text reading and, to
some extent, to use handwriting and speech as a computer interface. Humans
still outperform machines in elaborate tasks, but machines are useful substitutes.
Simplifying the task of the machine is sometimes required to obtain levels
of accuracy required by given applications. For example, handheld computers
like the Palm Pilot use a simplified alphabet of unistroke characters.
However, pattern recognition is not limited to sets of measurements resembling
perceptual patterns easily identifiable by humans. Physico-chemical separation
techniques such as spectrometry, electrophoresis, chromatography, and
more recent assays like DNA microarrays provide patterns that often leave
the human eye clueless. Automatic pattern recognition provides ways of
categorizing such patterns. When examples of known categories are given
(e.g. examples of diseased patients and normal patients), it is possible
to devise a system that will correctly categorize new examples, even when
it is a non-trivial task for humans.
What does the pattern recognition process consist of?
The so-called "raw data" is the set of measurements provided by a sensor
(e.g. the pixels of an image provided by a digital camera). The first steps
of the pattern recognition process are preprocessing and feature extraction.
These may include signal processing such as smoothing and noise filtering,
as well as the extraction of higher-level features, for which human knowledge
about the task is essential.
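
As an illustration of these first steps, here is a minimal sketch in Python. It assumes a one-dimensional signal, a moving average as the smoothing filter, and three hand-picked features (mean level, peak-to-peak range, number of local maxima). The function names and the sample data are hypothetical and only show the shape of the pipeline, not a recommended feature set.

```python
from statistics import mean

def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average.
    Near the edges the window is truncated to the available samples."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(mean(signal[lo:hi]))
    return smoothed

def extract_features(signal):
    """Hypothetical hand-crafted features: mean level, peak-to-peak range,
    and the number of strict local maxima."""
    peaks = sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] > signal[i + 1]
    )
    return [mean(signal), max(signal) - min(signal), peaks]

# Made-up "raw data": a noisy signal with one large spike.
raw = [0.0, 1.0, 0.0, 1.0, 0.0, 5.0, 0.0, 1.0, 0.0]
smooth = moving_average(raw, window=3)
features = extract_features(smooth)
```

The feature vector, rather than the raw samples, is what would be passed on to the recognizer. Which features are useful depends on the task, which is where human knowledge comes in.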
The next step in the process is to choose and "train" a recognizer.
While it is possible to hand-craft a recognizer, benchmark results have
shown that better recognition accuracy is obtained with systems that have
tunable parameters that can be adjusted to correctly classify a set of
given examples (called training examples). The category of the training
examples (e.g. diseased or normal patients) must be known. A recognizer
is a particular learning machine. Depending on the
nature and quantity of data available, an appropriate learning machine must
be selected. We have experience with a wide variety of learning machines,
ranging from classical statistical methods like
"nearest neighbors" to sophisticated structured neural networks and support
vector machines.
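
To make the idea of training examples concrete, the following sketch implements a plain k-nearest-neighbors recognizer in Python. The two-dimensional feature vectors and the "normal"/"disease" labels are invented for illustration, and choosing the number of neighbors k by leave-one-out accuracy is one simple assumption about how a tunable parameter might be adjusted, not a description of any particular system.

```python
import math
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training examples."""
    by_distance = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(xs, ys, k):
    """Fraction of examples classified correctly when each is held out in turn."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_classify(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)

# Made-up training examples with known categories.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["normal", "normal", "normal", "disease", "disease", "disease"]

# Adjust the tunable parameter k using the training examples themselves.
best_k = max((1, 3), key=lambda k: leave_one_out_accuracy(train_x, train_y, k))
prediction = knn_classify(train_x, train_y, (1.1, 1.0), k=best_k)
```

With so few examples the choice of k hardly matters. On real data, the quantity and nature of the examples would drive both the choice of learning machine and its parameters.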
For some recognition tasks, the recognizer can be complemented by a
system that takes contextual information into account. For example, in
handwriting or speech recognition applications, language models or grammars
are often incorporated in the postprocessing stage. We have experience
with statistical grammars to improve the recognition process. When possible,
we will globally optimize all the parameters of the system, including preprocessor,
recognizer and postprocessor to improve the accuracy of your system.
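
The role of a language model in postprocessing can be sketched as follows. This Python example assumes the recognizer returns a few candidate strings with log-probabilities and that the language model is a simple word-probability table. The candidates, the probabilities, the `lm_weight` parameter and the `floor` value for unknown words are all hypothetical.

```python
import math

def rescore(candidates, language_model, lm_weight=1.0, floor=1e-6):
    """Pick the candidate maximizing recognizer log-probability plus
    a weighted language-model log-probability.
    Words missing from the model get the small probability `floor`."""
    best_text, best_score = None, -math.inf
    for text, recog_logp in candidates:
        lm_logp = math.log(language_model.get(text, floor))
        score = recog_logp + lm_weight * lm_logp
        if score > best_score:
            best_text, best_score = text, score
    return best_text

# Made-up recognizer output: the top candidate confuses "l" with "1".
candidates = [
    ("c1ock", math.log(0.5)),
    ("clock", math.log(0.4)),
    ("cluck", math.log(0.1)),
]
# Made-up word probabilities standing in for a statistical language model.
language_model = {"clock": 0.02, "cluck": 0.001}

with_context = rescore(candidates, language_model)
without_context = rescore(candidates, language_model, lm_weight=0.0)
```

Here the contextual information overrides the recognizer's first choice. The weight given to the language model is itself a parameter that could be tuned together with the rest of the system.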
Pattern recognition also involves many other problem-specific subtasks,
including data cleaning and automatic feature selection, that
are important in system optimization as well as data understanding.
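
As one possible illustration of automatic feature selection, the sketch below ranks features for a two-class problem by how well each one separates the class means relative to the within-class spread. This particular criterion, the function name and the data are assumptions chosen for simplicity. Many other selection methods exist.

```python
from statistics import mean, pstdev

def rank_features(samples, labels):
    """Return feature indices sorted from most to least discriminative,
    scoring each by |mean_a - mean_b| / (std_a + std_b)."""
    class_a, class_b = sorted(set(labels))  # this sketch handles exactly two classes
    scores = []
    for j in range(len(samples[0])):
        col_a = [row[j] for row, y in zip(samples, labels) if y == class_a]
        col_b = [row[j] for row, y in zip(samples, labels) if y == class_b]
        spread = pstdev(col_a) + pstdev(col_b)
        score = abs(mean(col_a) - mean(col_b)) / spread if spread > 0 else 0.0
        scores.append((score, j))
    return [j for _, j in sorted(scores, reverse=True)]

# Made-up data: feature 0 differs between classes, feature 1 does not.
samples = [(1.0, 5.0), (1.1, 3.0), (0.9, 4.0), (3.0, 4.0), (3.2, 5.0), (2.9, 3.0)]
labels = ["normal", "normal", "normal", "disease", "disease", "disease"]

ranking = rank_features(samples, labels)
```

Keeping only the top-ranked features can reduce the number of parameters the recognizer has to learn, and the ranking itself tells the analyst which measurements carry information about the categories.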