Tuesday, March 29, 2011

Neural Networks and Perceptron

Biological Definition:
          A neural network can be defined as a chain of interconnected neural cells that responds and reacts to the information it receives and then transmits it to the next cell. In the human brain, the neural network is highly complex and densely connected, which enables the storage and retrieval of information.


Mathematical Definition:
         An artificial neural network (ANN), usually called neural network (NN), is a mathematical model or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are non-linear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data.


[1] Perceptrons are the easiest data structures to learn for the study of Neural Networking. Think of a perceptron as a node of a vast, interconnected network, sort of like a binary tree, although the network does not necessarily have to have a top and bottom. The links between the nodes not only show the relationship between the nodes but also transmit data and information, called a signal or impulse. The perceptron is a simple model of a neuron (nerve cell).

Since linking perceptrons into a network is a bit complicated, let's take a perceptron by itself. A perceptron has a number of external input links, one internal input (called a bias), a threshold, and one output link. To the right, you can see a picture of a simple perceptron. It resembles a neuron.
Usually, the input values are boolean (that is, they can only have two possible values: on and off, 1 or 0, true or false), but they can be any real number. The output of the perceptron, however, is always boolean. When the output is on (has the value 1), the perceptron is said to be firing (the name comes from biology: when neurons send a signal in the brain, they are said to be firing).
All of the inputs (including the bias) have weights attached to the input line that modify the input value. The weight is just multiplied with the input, so if the input value was 4 and the weight was -2, the weighted input value would be -8.
The threshold is one of the key components of the perceptron. It determines, based on the inputs, whether the perceptron fires or not. Basically, the perceptron takes all of the weighted input values and adds them together. If the sum is above or equal to some value (called the threshold) then the perceptron fires. Otherwise, it does not. So, it fires whenever the following equation is true (where w represents the weight, and there are n inputs):
The perceptron fires when this equation is true: w1x1 + w2x2 + … + wnxn ≥ θ (the bias input and its weight are included in the sum).
The threshold is like a wall: if the "signal" has enough "energy" to jump over the wall, then it can keep going, but otherwise, it has to stop. Traditionally, the threshold value is represented either as the Greek letter theta (θ) or by a graphical symbol that looks like a square S.
The main feature of perceptrons is that they can be trained (or learn) to behave a certain way. One popular beginner's assignment is to have a perceptron model (that is, learn to be) a basic boolean function such as AND or OR. Perceptron learning is guided, that is, you have to have something that the perceptron can imitate. So, the perceptron learns like this: it produces an output, compares the output to what the output should be, and then adjusts itself a little bit. After repeating this cycle enough times, the perceptron will have converged (a technical name for learned) to the correct behavior.
This learning method is called the delta rule, because of the way the perceptron checks its accuracy. The difference between the perceptron's output and the correct output is assigned the Greek letter delta, and the Weight i for Input i is altered like this (the i shows that the change is separate for each Weight, and each weight has its corresponding input):
Change in Weight i = Current Value of Input i × (Desired Output - Current Output)
This can be elegantly summed up as: Change in Weight i = Input i × δ.
The delta rule works both if the perceptron's output is too large and if it is too small. The new Weight i is found simply by adding the change for Weight i to the current value of Weight i.
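To make the training cycle concrete, here is a minimal sketch in Python of a single perceptron learning the boolean OR function with the delta rule (the variable names are illustrative, not from the original article):

```python
import random

inputs  = [(0, 0), (0, 1), (1, 0), (1, 1)]   # boolean input pairs
targets = [0, 1, 1, 1]                       # OR truth table

weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
bias_w  = random.uniform(-1, 1)              # weight on the internal bias input (fixed at 1)
theta   = 0.0                                # the threshold

def fire(x):
    total = bias_w + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= theta else 0        # fires when the weighted sum reaches the threshold

for _ in range(100):                         # repeat the cycle enough times to converge
    for x, desired in zip(inputs, targets):
        delta = desired - fire(x)            # the delta of the delta rule
        for i in range(len(weights)):
            weights[i] += x[i] * delta       # change in weight i = input i * delta
        bias_w += 1 * delta                  # the bias input is always 1
```

After a handful of passes over the four input pairs, the weights converge and fire() reproduces OR for every input.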
Interestingly, if you graph the possible inputs on different axes of a mathematical graph, with pluses for where the perceptron fires and minuses where the perceptron doesn't, the weights for the perceptron make up the equation of a line that separates the pluses and the minuses.
For instance, in the picture above, the pluses and minuses represent the OR binary function. With a little bit of simple algebra, you can transform that equation to the standard line form in which the weights can be seen clearly: taking the firing equation and replacing the "greater than or equal to" symbol with an equals sign gives the line w1x1 + w2x2 + … + wnxn = θ.
This equation is significant, because a single perceptron can only model functions whose graphical models are linearly separable. So, if there is no line (or plane, or hyperplane, etc. depending on the number of dimensions) that divides the fires and the non-fires (the pluses and minuses), then it isn't possible for the perceptron to learn to behave with that pattern of firing. For instance, the boolean function XOR is not linearly separable, so you can't model this boolean function with only one perceptron. The weight values just keep on shifting, and the perceptron never actually converges to one value.
So, by themselves, perceptrons are a bit limited, but that is their appeal. Perceptrons enable a pattern to be broken up into simpler parts that can each be modeled by a separate perceptron in a network. So, even though perceptrons are limited, they can be combined into one powerful network that can model a wide variety of patterns, such as XOR and many complex boolean expressions of more than one variable. These algorithms, however, are more complex in arrangement, and thus the learning function is slightly more complicated. For many problems (specifically, the linearly separable ones), a single perceptron will do, and the learning function for it is quite simple and easy to implement. The perceptron is an elegantly simple way to model a human neuron's behavior. All you need is the first two equations shown above.


*[1] "Perceptrons and Basic Neural Networks" by Eric Suh.

Wednesday, May 26, 2010

Pattern Recognition Using PCA

Pattern Recognition:
     Pattern recognition is involved in a variety of daily-life activities which we routinely practice. It is a quality of the human brain that it can distinguish and differentiate any change in a pattern and even memorize new patterns. For example, if you go to a market you get an idea about each and every shop and the things present there, and if you continue to go again and again, your brain memorizes each and every pattern you observe. In case of any change, the brain notices it, and in no time you realize what the change is.
    Such a capability of the human brain is termed pattern recognition. The term was adopted in computer science when the need evolved for recognition through computers. The main reasons for choosing computers for this purpose are their fast processing and vast memory.
Computers really can't memorize anything like the human brain; it is algorithms and techniques that make them able to do this. Moreover, the brain is not simply dealing in data storage the way computers do.
   The most widely used technique for pattern recognition is PCA-based pattern recognition. Let me mention that PCA is a statistical technique used for the classification and dimensionality reduction of very large amounts of data.

Daily Life Application(s):
In the following fields, the PCA based technique is widely used:
  • Face recognition technology 
  • Cloud classification technology 
  • Palm based recognition  
  • Iris based recognition 
  • Population data statistics 
  • DNA based Pattern recognition
What is PCA?
If you read the last post, you will have come to know the uses and benefits of PCA in daily-life applications. Moving onward, it is time to describe PCA in detail.

Face Database:
The very first task in PCA is collecting faces of different persons. One can download a face collection from the database uploaded for free use by CASIA
(http://www.cbsr.ia.ac.cn/english/index.asp)
The database contains images of more than 50 persons in frontal and sideways positions, with 10 pictures per person covering every possible pose.

Image Space:
Image Space is created by following these steps (a sketch in Python follows the list):
  1. Read the binary image (e.g. a BMP) into a 2-dimensional array.
  2. Concatenate the rows of the image into a single row, i.e. the rows of the image are placed one beside another, producing what is termed a row vector.
  3. Repeat steps 1 and 2 until every image has been concatenated into its own row.
  4. Each row now represents a face or image, making a heavily populated 2-dimensional array.
  5. The newly created 2-dimensional array is denoted the Image Space or Face Space.
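Here is a minimal sketch of these steps in Python with NumPy and Pillow, assuming a folder of equally sized grayscale BMP images (the faces/ folder name is hypothetical):

```python
import glob
import numpy as np
from PIL import Image

rows = []
for path in sorted(glob.glob("faces/*.bmp")):        # hypothetical image folder
    img = np.asarray(Image.open(path).convert("L"))  # step 1: 2-D grayscale array
    rows.append(img.flatten().astype(np.float64))    # step 2: rows concatenated into one row vector

image_space = np.vstack(rows)  # steps 3-5: one face per row, shape (number of faces, width x height)
```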
Face Space:
The image space is not an optimal space for face description. We need to build a face space, which better describes the faces. The basis vectors of this space are called the principal components.
Compute the average (mean) face by adding the corresponding entries of each column and then dividing the total by the total number of face images. That is, let X be the training set of face images, Xn, where n = total number of faces. Each image Xi is converted into a vector Yi (one row of the image space), and the mean face is
Ψ = (Y1 + Y2 + … + Yn) / n
i.e. the mean value of each column.
 
Mean Face Reduction: 
Ψ represents the mean face extracted from all images. Now subtract this mean face from each row to get a new set of vectors:
Фi = Yi – Ψ
The Фi are the mean-subtracted difference vectors, in which the features common to every face are eliminated; the covariance matrix in the next step is built from them.
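Continuing the sketch above, the mean face Ψ and the difference vectors Фi take two lines of NumPy:

```python
mean_face = image_space.mean(axis=0)   # Psi: column-wise mean over all face rows
phi = image_space - mean_face          # Phi_i = Y_i - Psi, one difference vector per row
```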

Covariance Matrix:
       The covariance matrix is built from the difference vectors. With A = [Ф1, Ф2, …, ФM] (one difference vector per column), the covariance matrix is
C = (1/M) Σ Фn·Фn^T = A·A^T
Since each Фn has one entry per pixel, C is a huge matrix. In real-world mathematics, finding the eigenvalues of such a huge matrix through its characteristic equation, det(C − λI) = 0, is not simple.

Computing Eigenvectors:
        Computing the eigenvectors of C = A·A^T directly is not feasible computationally. We can determine them by first solving the much smaller M x M matrix problem and taking linear combinations of the resulting vectors: if vi is an eigenvector of the small matrix A^T·A, i.e.
A^T·A·vi = μi·vi
then multiplying both sides by A shows that ui = A·vi is an eigenvector of C = A·A^T with the same eigenvalue.
These vectors are subjected to principal component analysis, which finds a set of M orthogonal vectors and their eigenvalues to describe the distribution of the data. So, we get M eigenvectors and eigenvalues.
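A sketch of this trick in NumPy, continuing from phi above (here A corresponds to phi.T, so the small M x M matrix A^T·A is phi @ phi.T):

```python
L = phi @ phi.T                          # the small M x M matrix A^T A
eigvals, v = np.linalg.eigh(L)           # eigenpairs of the small problem
order = np.argsort(eigvals)[::-1]        # sort by decreasing eigenvalue
eigvals, v = eigvals[order], v[:, order]

u = phi.T @ v                            # u_i = A v_i: eigenvectors of A A^T
u /= np.linalg.norm(u, axis=0)           # normalize each eigenface to unit length
```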

Keeping Eigenvectors:
           Keep only those eigenvectors which describe the images best, as explained before. In normal practice, the first 20 eigenvectors (those with the largest eigenvalues) are kept; these are the eigenfaces which will be used later for the recognition of the faces used in the whole process.
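Since the eigenvectors in the sketch above are already sorted by decreasing eigenvalue, keeping the best M' = 20 of them is a single slice:

```python
m_prime = 20                   # M': number of eigenfaces kept, per the text
eigenfaces = u[:, :m_prime]    # columns are the best-describing eigenvectors
```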

Recognition Procedure Using Eigenfaces:
       Given an unknown face image Γ of the same size as the training images, it is projected into "face space" by the operation
wk = uk^T · (Γ − Ψ),  for k = 1, …, M'
where the uk are the M' eigenvectors selected. These weights form a vector Ωt = [w1, w2, …, wM'], which describes the contribution of each eigenface in representing the input face image, treating the eigenfaces as a basis set for face images. The vector is then used to find which of the predefined face classes, if any, best describes the face. The simplest method for this purpose is to find the Euclidean distance to each class and accept the closest one when the distance is below θє, some chosen threshold.
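A sketch of the projection and the Euclidean-distance match, continuing from the code above (the threshold value is a placeholder, not from the original text):

```python
def project(face_row):
    # Omega: the M' weights w_k = u_k^T (face - mean)
    return eigenfaces.T @ (face_row - mean_face)

train_weights = np.array([project(f) for f in image_space])

def recognize(new_face, threshold=2500.0):   # threshold value is illustrative only
    omega = project(new_face)
    dists = np.linalg.norm(train_weights - omega, axis=1)
    best = int(np.argmin(dists))
    return best if dists[best] < threshold else None   # None means "unknown face"
```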

FRR and FAR:
     False Rejection Rate (FRR) and False Acceptance Rate (FAR) are two standard performance evaluation metrics, and they move in opposite directions. Generally, biometric systems are based on the principle of a threshold, so a variable threshold is tuned for the customers to strike a balance between the two. The other metrics are learning time, execution time, and the number of samples required in training.
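As a rough sketch, both rates can be estimated from labelled trial distances at a given threshold (the names here are illustrative):

```python
def frr_far(genuine, impostor, threshold):
    # genuine: distances from same-person attempts; impostor: from different-person attempts
    frr = sum(d >= threshold for d in genuine) / len(genuine)   # true users wrongly rejected
    far = sum(d < threshold for d in impostor) / len(impostor)  # impostors wrongly accepted
    return frr, far
```

Raising the threshold lowers FRR but raises FAR, which is exactly the balance described above.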

Tuesday, May 25, 2010

Face Recognition Technology

Biometrics-based recognition has always been the most versatile and demanding method for implementing security. Since the 1950s, scientists have been working on many human factors which could be used for authentication. Mainly, the face photograph is used for the verification of a person, and the major areas of use are national identity cards, passports, driving licenses, etc.


In the late 90s, when the computer had evolved to such a great extent that it covered almost every field of life, authentication and accurate verification became a great need of each and every department and organization.


After the success of fingerprint recognition, the next step was face recognition technology.


Many approaches have been introduced for face detection and recognition, on which many research departments have spent huge amounts to get the best out of them.


In computer science, we are lucky enough to be able to use mathematical and statistical methods for face detection and recognition. One of the most widely used techniques is PCA-based face recognition, which claims to give 70-90% accurate results.


PCA stands for Principal Component Analysis, which is basically used for dimensionality reduction, i.e. the compression of data into meaningful parameters. Typically, when dealing with very large amounts of data, PCA is preferred over other techniques due to its simplicity and accuracy.


Another remarkable benefit of this technique is its low demand for processing power when computers are used for the calculation. Even when a new entity or person needs to be added to the training set, it provides a robust method which results in fast training and then recognition.


In other techniques, if a new person needs to be added, the whole network must recalculate its parameters from the first person to the last, which requires a lot of processing and time.

Friday, February 20, 2009

Face Recognition Technology

The security of information and physical property is very important and difficult in today's world. Credit card fraud, other miscellaneous activities by hackers, and security breaches in government and other organizations have been among the problems of modern days. In the year 1998, sophisticated cyber crooks caused well over US $100 million in losses (Reuters, 1999) [7]. A typical access control system will grant access through passwords, PIN codes, etc., but this strategy does not verify who we are, which of course does not guarantee our security.

Biometrics technology is a solution to such problems because it ensures authentication based on the true identity of an individual.

Biometric techniques are divided into two categories: physiological and behavioral. Physiological methods are more stable, e.g. face, DNA, and fingerprints. They usually last for a whole life, except in the case of injuries. Behavioral biometrics are based on functional aspects such as voice, signature, handwriting, etc. These functional aspects may vary due to stress or illness. However, people are more at ease with behavioral IDs.

So a biometric system is essentially a pattern recognition system which makes a personal identification by determining the authenticity of specific physiological and behavioral characteristics of an individual.

Face recognition is one of the methods of biometrics which ensures our security, and people are comfortable with it. Numerous algorithms have been proposed for face recognition. One of those algorithms, suited for fast recognition, is eigenfaces. Despite the intense research in this field, these problems still remain open for research.
