Artificial Neural Network
This week we look at how to run an artificial neural network on an Arduino Uno, and along the way we cover the basics of machine learning and AI. The network we are working with is a feed-forward network trained with backpropagation: we are, in essence, training a neural network model on the board itself. Although the technique is not new and rests on some involved mathematics, you do not need to follow the maths to understand how the network functions.
What is a neural network?
A neural network is a method in artificial intelligence that teaches computers to process data in a way that is inspired by the human brain. It is a type of machine learning process, called deep learning, that uses interconnected nodes or neurons in a layered structure that resembles the human brain. It creates an adaptive system that computers use to learn from their mistakes and improve continuously. Thus, artificial neural networks attempt to solve complicated problems, like summarizing documents or recognizing faces, with greater accuracy.
How does it work?
An artificial neural network is made of artificial neurons that work together to solve a problem. Artificial neurons are software modules, called nodes, and artificial neural networks are software programs or algorithms that, at their core, carry out mathematical calculations.
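To make that concrete, here is a minimal sketch of a single artificial neuron: a weighted sum of its inputs plus a bias, squashed through a sigmoid activation. The weights and inputs are made-up illustration values, not taken from the network in this article.

// A single artificial neuron: weighted sum of inputs plus a bias,
// passed through a sigmoid activation so the output lies between 0 and 1.
// Weights, inputs, and bias here are arbitrary illustration values.
#include <math.h>

float sigmoid(float z) {
  return 1.0 / (1.0 + exp(-z));
}

float neuron(const float inputs[], const float weights[], int n, float bias) {
  float accum = bias;
  for (int i = 0; i < n; i++) {
    accum += inputs[i] * weights[i];  // weighted sum of the inputs
  }
  return sigmoid(accum);
}

void setup() {
  Serial.begin(9600);
  float inputs[3]  = { 1.0, 0.0, 1.0 };
  float weights[3] = { 0.4, -0.6, 0.2 };
  Serial.println(neuron(inputs, weights, 3, 0.1), 5);  // prints the activation
}

void loop() {}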
Neural Network Architecture
A basic neural network has interconnected artificial neurons in three layers:
Input Layer:
Information from the outside world enters the artificial neural network from the input layer. Input nodes process the data, analyse or categorise it, and pass it on to the next layer.
Hidden Layer:
Hidden layers take their input from the input layer or other hidden layers. Artificial neural networks can have a large number of hidden layers. Each hidden layer analyses the output from the previous layer, processes it further, and passes it on to the next layer.
Output Layer:
The output layer gives the final result of all the data processing by the artificial neural network. It can have single or multiple nodes. For instance, if we have a binary (yes/no) classification problem, the output layer will have one output node, which will give the result as 1 or 0. However, if we have a multi-class classification problem, the output layer might consist of more than one output node.
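The output nodes in the sketch below use a sigmoid activation, so in practice each one produces a value between 0 and 1 rather than a hard 1 or 0. A common way to read such an activation as a binary digit is to threshold it at 0.5. The helper below is a hypothetical addition for illustration; the sketch itself simply prints the raw activations.

// Hypothetical helper (not part of the sketch below): collapse a sigmoid
// activation into a hard 0/1 bit by thresholding at 0.5.
int toBit(float activation) {
  return (activation >= 0.5) ? 1 : 0;
}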
The sketch in this article features a set of training inputs and outputs that map the seven segments of an LED numerical display to the corresponding binary number. The network runs through the training data repeatedly, adjusting the weights until a specified level of accuracy is achieved; at that point the network is said to have been trained. The update rule it applies is shown below.
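Concretely, every node in the sketch computes a sigmoid activation over a weighted sum of its inputs plus a bias:

$$y = \sigma\Big(b + \sum_j w_j x_j\Big), \qquad \sigma(z) = \frac{1}{1+e^{-z}}$$

and each weight is then adjusted by gradient descent with momentum, which is exactly what the weight-update lines in the code implement:

$$\Delta w_{ij}(t) = \eta\,\delta_j\,x_i + \alpha\,\Delta w_{ij}(t-1), \qquad w_{ij} \leftarrow w_{ij} + \Delta w_{ij}(t)$$

Here $\eta$ is LearningRate (0.3), $\alpha$ is Momentum (0.9), and $\delta_j$ is the node's error term; for an output node $\delta_j = (t_j - y_j)\,y_j\,(1-y_j)$, the target-minus-output difference scaled by the derivative of the sigmoid.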
//Author: Ralph Heymsfeld
//28/06/2018

#include <math.h>

/******************************************************************
 * Network Configuration - customized per network
 ******************************************************************/

const int PatternCount = 10;
const int InputNodes = 7;
const int HiddenNodes = 8;
const int OutputNodes = 4;
const float LearningRate = 0.3;
const float Momentum = 0.9;
const float InitialWeightMax = 0.5;
const float Success = 0.0004;

const byte Input[PatternCount][InputNodes] = {
  { 1, 1, 1, 1, 1, 1, 0 },  // 0
  { 0, 1, 1, 0, 0, 0, 0 },  // 1
  { 1, 1, 0, 1, 1, 0, 1 },  // 2
  { 1, 1, 1, 1, 0, 0, 1 },  // 3
  { 0, 1, 1, 0, 0, 1, 1 },  // 4
  { 1, 0, 1, 1, 0, 1, 1 },  // 5
  { 0, 0, 1, 1, 1, 1, 1 },  // 6
  { 1, 1, 1, 0, 0, 0, 0 },  // 7
  { 1, 1, 1, 1, 1, 1, 1 },  // 8
  { 1, 1, 1, 0, 0, 1, 1 }   // 9
};

const byte Target[PatternCount][OutputNodes] = {
  { 0, 0, 0, 0 },
  { 0, 0, 0, 1 },
  { 0, 0, 1, 0 },
  { 0, 0, 1, 1 },
  { 0, 1, 0, 0 },
  { 0, 1, 0, 1 },
  { 0, 1, 1, 0 },
  { 0, 1, 1, 1 },
  { 1, 0, 0, 0 },
  { 1, 0, 0, 1 }
};

/******************************************************************
 * End Network Configuration
 ******************************************************************/

int i, j, p, q, r;
int ReportEvery1000;
int RandomizedIndex[PatternCount];
long TrainingCycle;
float Rando;
float Error;
float Accum;

float Hidden[HiddenNodes];
float Output[OutputNodes];
float HiddenWeights[InputNodes+1][HiddenNodes];
float OutputWeights[HiddenNodes+1][OutputNodes];
float HiddenDelta[HiddenNodes];
float OutputDelta[OutputNodes];
float ChangeHiddenWeights[InputNodes+1][HiddenNodes];
float ChangeOutputWeights[HiddenNodes+1][OutputNodes];

void setup(){
  Serial.begin(9600);
  randomSeed(analogRead(3));
  ReportEvery1000 = 1;
  for( p = 0 ; p < PatternCount ; p++ ) {
    RandomizedIndex[p] = p ;
  }
}

void loop (){

/******************************************************************
 * Initialize HiddenWeights and ChangeHiddenWeights
 ******************************************************************/

  for( i = 0 ; i < HiddenNodes ; i++ ) {
    for( j = 0 ; j <= InputNodes ; j++ ) {
      ChangeHiddenWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      HiddenWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
  }

/******************************************************************
 * Initialize OutputWeights and ChangeOutputWeights
 ******************************************************************/

  for( i = 0 ; i < OutputNodes ; i ++ ) {
    for( j = 0 ; j <= HiddenNodes ; j++ ) {
      ChangeOutputWeights[j][i] = 0.0 ;
      Rando = float(random(100))/100;
      OutputWeights[j][i] = 2.0 * ( Rando - 0.5 ) * InitialWeightMax ;
    }
  }
  Serial.println("Initial/Untrained Outputs: ");
  toTerminal();

/******************************************************************
 * Begin training
 ******************************************************************/

  for( TrainingCycle = 1 ; TrainingCycle < 2147483647 ; TrainingCycle++) {

/******************************************************************
 * Randomize order of training patterns
 ******************************************************************/

    for( p = 0 ; p < PatternCount ; p++) {
      q = random(PatternCount);
      r = RandomizedIndex[p] ;
      RandomizedIndex[p] = RandomizedIndex[q] ;
      RandomizedIndex[q] = r ;
    }
    Error = 0.0 ;

/******************************************************************
 * Cycle through each training pattern in the randomized order
 ******************************************************************/
    for( q = 0 ; q < PatternCount ; q++ ) {
      p = RandomizedIndex[q];

/******************************************************************
 * Compute hidden layer activations
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = HiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          Accum += Input[p][j] * HiddenWeights[j][i] ;
        }
        Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
      }

/******************************************************************
 * Compute output layer activations and calculate errors
 ******************************************************************/

      for( i = 0 ; i < OutputNodes ; i++ ) {
        Accum = OutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          Accum += Hidden[j] * OutputWeights[j][i] ;
        }
        Output[i] = 1.0/(1.0 + exp(-Accum)) ;
        OutputDelta[i] = (Target[p][i] - Output[i]) * Output[i] * (1.0 - Output[i]) ;
        Error += 0.5 * (Target[p][i] - Output[i]) * (Target[p][i] - Output[i]) ;
      }

/******************************************************************
 * Backpropagate errors to hidden layer
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        Accum = 0.0 ;
        for( j = 0 ; j < OutputNodes ; j++ ) {
          Accum += OutputWeights[i][j] * OutputDelta[j] ;
        }
        HiddenDelta[i] = Accum * Hidden[i] * (1.0 - Hidden[i]) ;
      }

/******************************************************************
 * Update Inner-->Hidden Weights
 ******************************************************************/

      for( i = 0 ; i < HiddenNodes ; i++ ) {
        ChangeHiddenWeights[InputNodes][i] = LearningRate * HiddenDelta[i] + Momentum * ChangeHiddenWeights[InputNodes][i] ;
        HiddenWeights[InputNodes][i] += ChangeHiddenWeights[InputNodes][i] ;
        for( j = 0 ; j < InputNodes ; j++ ) {
          ChangeHiddenWeights[j][i] = LearningRate * Input[p][j] * HiddenDelta[i] + Momentum * ChangeHiddenWeights[j][i];
          HiddenWeights[j][i] += ChangeHiddenWeights[j][i] ;
        }
      }

/******************************************************************
 * Update Hidden-->Output Weights
 ******************************************************************/

      for( i = 0 ; i < OutputNodes ; i ++ ) {
        ChangeOutputWeights[HiddenNodes][i] = LearningRate * OutputDelta[i] + Momentum * ChangeOutputWeights[HiddenNodes][i] ;
        OutputWeights[HiddenNodes][i] += ChangeOutputWeights[HiddenNodes][i] ;
        for( j = 0 ; j < HiddenNodes ; j++ ) {
          ChangeOutputWeights[j][i] = LearningRate * Hidden[j] * OutputDelta[i] + Momentum * ChangeOutputWeights[j][i] ;
          OutputWeights[j][i] += ChangeOutputWeights[j][i] ;
        }
      }
    }

/******************************************************************
 * Every 1000 cycles send data to terminal for display
 ******************************************************************/
    ReportEvery1000 = ReportEvery1000 - 1;
    if (ReportEvery1000 == 0) {
      Serial.println();
      Serial.println();
      Serial.print ("TrainingCycle: ");
      Serial.print (TrainingCycle);
      Serial.print (" Error = ");
      Serial.println (Error, 5);

      toTerminal();

      if (TrainingCycle==1) {
        ReportEvery1000 = 999;
      }
      else {
        ReportEvery1000 = 1000;
      }
    }

/******************************************************************
 * If error rate is less than pre-determined threshold then end
 ******************************************************************/

    if( Error < Success ) break ;
  }
  Serial.println ();
  Serial.println();
  Serial.print ("TrainingCycle: ");
  Serial.print (TrainingCycle);
  Serial.print (" Error = ");
  Serial.println (Error, 5);

  toTerminal();

  Serial.println ();
  Serial.println ();
  Serial.println ("Training Set Solved! ");
  Serial.println ("--------");
  Serial.println ();
  Serial.println ();
  ReportEvery1000 = 1;
}

void toTerminal() {
  for( p = 0 ; p < PatternCount ; p++ ) {
    Serial.println();
    Serial.print (" Training Pattern: ");
    Serial.println (p);
    Serial.print (" Input ");
    for( i = 0 ; i < InputNodes ; i++ ) {
      Serial.print (Input[p][i], DEC);
      Serial.print (" ");
    }
    Serial.print (" Target ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Target[p][i], DEC);
      Serial.print (" ");
    }

/******************************************************************
 * Compute hidden layer activations
 ******************************************************************/

    for( i = 0 ; i < HiddenNodes ; i++ ) {
      Accum = HiddenWeights[InputNodes][i] ;
      for( j = 0 ; j < InputNodes ; j++ ) {
        Accum += Input[p][j] * HiddenWeights[j][i] ;
      }
      Hidden[i] = 1.0/(1.0 + exp(-Accum)) ;
    }

/******************************************************************
 * Compute output layer activations and calculate errors
 ******************************************************************/

    for( i = 0 ; i < OutputNodes ; i++ ) {
      Accum = OutputWeights[HiddenNodes][i] ;
      for( j = 0 ; j < HiddenNodes ; j++ ) {
        Accum += Hidden[j] * OutputWeights[j][i] ;
      }
      Output[i] = 1.0/(1.0 + exp(-Accum)) ;
    }
    Serial.print (" Output ");
    for( i = 0 ; i < OutputNodes ; i++ ) {
      Serial.print (Output[i], 5);
      Serial.print (" ");
    }
  }
}
Once the code has been downloaded, plug the Arduino Uno into your computer, make sure the correct board and port are selected, and upload the sketch. Then open the serial monitor at 9600 baud (matching Serial.begin) to watch the training: the cycle count and error are printed every 1000 cycles, and the error should fall steadily until it drops below the Success threshold.
The first training run finished at cycle 5584, and a second run finished at cycle 5216. The exact count varies between runs because the initial weights are randomized and the training patterns are shuffled on every cycle.
XOR Function
The same sketch can be retrained on other problems by editing only the network-configuration block at the top. A classic test case is the XOR function, whose output is 1 exactly when its two inputs differ. Note that XOR is not linearly separable, so the network still needs hidden nodes to learn it; a HiddenNodes value of 0 cannot work with this sketch, so the configuration below uses 4.
const int PatternCount = 4;
const int InputNodes = 2;
const int HiddenNodes = 4; // XOR needs hidden nodes; 0 cannot learn it
const int OutputNodes = 1;

const byte Input[PatternCount][InputNodes] = {
  { 0, 0 },  // 0
  { 0, 1 },  // 1
  { 1, 0 },  // 2
  { 1, 1 }   // 3
};

const byte Target[PatternCount][OutputNodes] = {
  { 0 },
  { 1 },
  { 1 },
  { 0 }
};
Sources
https://www.the-diy-life.com/running-an-artificial-neural-network-on-an-arduino-uno/
https://en.wikipedia.org/wiki/XOR_gate