Thursday, September 3, 2009

Activity 16 - Neural Networks

Following Activities 14 and 15, this activity again deals with grouping objects that belong to different classes. Here, the artificial neural network (ANN) algorithm is implemented to classify these sets of objects.

Again, the features of each sample obtained in Activity 14 are used here: the pixel area and the normalized chromaticity values in red, green, and blue. This time, three classes from Activity 14 are considered (Snacku rice crackers, Superthin biscuits, and Nova multigrain snacks, with 10 samples each), unlike Activity 15 which handled only two classes. These are shown below.



An ANN recognizes patterns by learning from examples, similar to how the human brain classifies objects. Once it has learned after a number of training cycles, the ANN can process patterns quickly and accurately.
The model of an artificial neuron, the basic unit of an ANN, is illustrated as follows.



In the model above, the inputs x_1, x_2, ..., x_i of a neuron are multiplied by their corresponding synaptic strengths or weights w_1, w_2, ..., w_i, and the weighted inputs are summed to produce a, which is then passed through the activation function g. The resulting output z is transmitted to other neurons as part of the ANN's learning process.
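In symbols, the neuron therefore computes

a = w_1 x_1 + w_2 x_2 + ... + w_i x_i = \sum_i w_i x_i,    z = g(a),

where g is the activation function (a sigmoid such as g(a) = 1/(1 + e^{-a}) is a common choice; the toolbox used below is simply left to its default).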
An ANN has three layers, namely the input layer, the hidden layer, and the output layer. The input layer consists of the features, which are transmitted to every neuron of the hidden layer; every neuron of a preceding layer is connected to every neuron of the succeeding layer. The results are then passed on to the output layer, which determines the class to which the input features belong. The following is an illustration of a simple ANN.



Using ANN_Toolbox_0.4.2 loaded in Scilab, the built-in functions ann_FF_init, ann_FF_Std_online, and ann_FF_run are employed to simulate an ANN that classifies the 15 test samples. The first function, ann_FF_init, initializes the network, which has l neurons in the input layer, m neurons in the hidden layer, and n neurons in the output layer; its result becomes the initial weights fed to ann_FF_Std_online for the training stage. The other arguments of this second function are the features of the training set, their corresponding class matrix, the network description, the learning parameters, and the number of training cycles. The learning-parameter argument contains the learning rate together with the threshold of the error tolerated by the network, which is usually set to 0. The last function, ann_FF_run, takes the weights returned by the training function and is finally applied to the features of the 15 test samples, along with the same network description used in the first two functions. It returns the class to which each sample belongs.
Note that the 30 samples are again divided into two: the first half serves as the training set, with 5 samples per class, from which the ANN learns, and the second half is the test set to be classified, again with 5 samples per class. The training set is given the class matrix [0, 0, 0, 0, 0, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 1, 1, 1, 1] because it is observed that the built-in functions of the Scilab ANN toolbox operate only on values between 0 and 1. The outputs, however, are later converted to rounded integer values of 0 for Snacku, 1 for Superthin, and 2 for Nova. It is also found that these functions cannot handle feature values greater than 1; since the pixel area is obviously greater than 1, this feature is normalized for both the training and test samples. Meanwhile, the network used is composed of 4 neurons in the input layer, where each neuron represents a feature, 8 neurons in the hidden layer (which may be altered, e.g., to other multiples of 4), and 1 neuron in the output layer corresponding to the class assigned to a test sample. A condensed sketch of these steps is given below; the full script is in the Appendix.
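Condensed, the procedure above amounts to the following calls (a minimal sketch only: features_train and features_test are placeholder names for the 4 x 15 feature matrices, one column per sample, with the pixel area already normalized; the actual script used is in the Appendix).

network = [4, 8, 1];          // neurons in the input, hidden, and output layers
targets = [0 0 0 0 0 0.5 0.5 0.5 0.5 0.5 1 1 1 1 1];    // class matrix of the training set
lp = [5, 0];                  // [learning rate, error threshold]
T = 1000;                     // number of training cycles

W0 = ann_FF_init(network);                                            // initial weights
W = ann_FF_Std_online(features_train, targets, network, W0, lp, T);   // training stage
out = ann_FF_run(features_test, network, W);                          // classes of the test samples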
The tables below summarize the classification of the test samples as the training parameters are varied to observe the ANN's behavior (click the tables for a larger view).




The first table above uses 500 training cycles while the table below it uses 1000 training cycles. Both are tested with learning rates of 1.0 and 5.0. The 15 test samples are classified correctly for all the training parameters used. Moreover, as the number of training cycles and the learning rate are increased, the raw outputs approach 0, 0.5, and 1, and hence the converted values approach the integer classes 0 for Snacku, 1 for Superthin, and 2 for Nova.

The order of the test samples is also randomized to see whether they are still grouped correctly. Here, 1000 training cycles and a learning rate of 5.0 are employed. The results are summarized in the table below (click the table for a larger view).



All test samples are still assigned to their correct classes even when the inputs are presented in a different order. Thus, the ANN's class recognition does not require the inputs to be arranged in any particular order.

I grade myself 10/10 in this activity because I have successfully implemented an ANN for class recognition and all test samples are grouped correctly.

Thanks to Miguel, who sent the whole class a working version of the ANN toolbox for Scilab 4.1.2.

Appendix
The following Scilab code is used in this activity. It is based on the code of Mr. Cole Fabros in his blog (http://cole-ap186.blogspot.com), where it is mentioned that the code was originally written by Mr. Jeric Tugaff (both are 2008 AP 186 students).

// Load the features (pixel area and normalized r, g, b chromaticities)
// of the training and test samples obtained in Activity 14
snacku_training = fscanfMat('snacku_training.txt');
superthin_training = fscanfMat('superthin_training.txt');
nova_training = fscanfMat('nova_training.txt');
snacku_test = fscanfMat('snacku_test.txt');
superthin_test = fscanfMat('superthin_test.txt');
nova_test = fscanfMat('nova_test.txt');

// Stack the three classes, normalize the pixel area (column 1) so that no
// feature exceeds 1, and transpose so that each column is one sample
snacks_training = [snacku_training; superthin_training; nova_training];
snacks_training(:,1) = snacks_training(:,1)/max(snacks_training(:,1));
snacks_training = snacks_training';
snacks_test = [snacku_test; superthin_test; nova_test];
snacks_test(:,1) = snacks_test(:,1)/max(snacks_test(:,1));
snacks_test = snacks_test';

rand('seed', 0);    // fix the seed for reproducible initial weights

network = [4, 8, 1];    // neurons in the input, hidden, and output layers
groupings = [0 0 0 0 0 0.5 0.5 0.5 0.5 0.5 1 1 1 1 1];    // class matrix of the training set
learning_rate = [5, 0];    // [learning rate, error threshold]
training_cycle = 1000;     // number of training cycles

// Randomize the order of the 15 test samples
index = [1:15]';
random_index = grand(1, 'prm', index);    // random permutation of 1 to 15

random_snacks = [];
for i = 1:length(index)
    random_snacks(:,i) = snacks_test(:,random_index(i));
end

training_weight = ann_FF_init(network);    // initialize the network weights
weight = ann_FF_Std_online(snacks_training, groupings, network, training_weight, learning_rate, training_cycle);    // train the network
class = ann_FF_run(random_snacks, network, weight);    // classify the randomized test samples

// Map the raw outputs (near 0, 0.5, and 1) to the integer classes
// 0 (Snacku), 1 (Superthin), and 2 (Nova) after rounding
for j = 1:length(index)
    if class(j) < 0.2
        class(j) = class(j);          // stays near 0, rounds to 0 (Snacku)
    elseif 0.4 < class(j) & class(j) < 0.6
        class(j) = 0.5 + class(j);    // shifts to near 1, rounds to 1 (Superthin)
    else
        class(j) = 1.0 + class(j);    // shifts to near 2, rounds to 2 (Nova)
    end
end

rclass = round(class);
