Implementation of Artificial Neural Network Training Data in Micro-Controller Based Embedded System

Jnana Ranjan Tripathy1, Hrudaya Kumar Tripathy2, S. S. Nayak3

1Department of Computer Science & Engineering, Biju Pattnaik University of Technology, Orissa Engineering College, Bhubaneswar, Odisha 752050 (India).

2KIIT University, Chandaka Industrial Estate, Patia, Bhubaneswar, Odisha 751024 (India).

3Centurion University of Technology, Paralakhemundi, Gajapati, Odisha (India).

Article Published: 01 Jul 2014
ABSTRACT:

The Neural Network Trainer (NNT) was originally developed as a tool for training neural networks for use on a PC or comparable computing machine. NNT originally produced an array of weights corresponding to the weights in a neural network architecture designed by the user. From that point, it was the user's responsibility to create a neural network that could utilize these weights. This paper transforms the original tool into a complete neural network implementation package for microcontrollers. The software package includes the trainer, an assembly language based generic neural network for the PIC18 series microcontroller, an 8-bit neural network simulator, a microcontroller communication interface for testing embedded neural networks, and a C implemented neural network for any microcontroller with a C compiler.

KEYWORDS: NNT; Neural networks; Simulator; Microcontroller


INTRODUCTION

Neural networks are employed in various areas, but their effective use in real-world applications requires efficient hardware implementations. Compared to analog implementations, digital realizations of neural networks offer advantages in dynamic range, accuracy, modularity, scalability, and programmability. From the architecture design view, digital implementations of neural networks fall into three general categories: custom implementations, systolic-based implementations, and SIMD/MIMD-based implementations. Custom and systolic implementations benefit from high performance but suffer from inflexibility. Programmable implementations such as MIMD/SIMD-based designs offer more flexibility, but they cannot achieve the performance of a well-designed custom realization. The architecture proposed in this work is a programmable stream processor for neural networks. The central idea of stream processing is to organize the computations of an application into streams of data, an idea recently employed in multimedia applications. This software allows the user to create, train, test, and implement a neural network on a microcontroller for his purpose in an automated process.

2. NEURAL NETWORK TRAINER

The user interface for the Neural Network Trainer is shown in Figure 1. The left side of the trainer holds an initially empty plot where iterations versus mean squared error are displayed, and the training parameters appear on the right-hand side. The user must follow a few simple steps before training a network: prepare a data file that contains the training data and an input file that describes the network connections, and then set the training parameters.

Figure 1: Front end of Neural Network Trainer (NNT)


3. TRAINING DATA

The user must create a training file containing all the data sets required to train the neural network. This data may be created in various ways, such as by hand, in a spreadsheet, or directly through Matlab. A simple parity-3 problem will be used for demonstration purposes. This demonstration uses bipolar neurons, so the extremes for the data are +1 and -1. The training data for parity-3 is represented by the following matrix:

in1 in2 in3 out
 -1  -1  -1  -1
 -1  -1  +1  +1
 -1  +1  -1  +1
 -1  +1  +1  -1
 +1  -1  -1  +1
 +1  -1  +1  -1
 +1  +1  -1  -1
 +1  +1  +1  +1

As with any parity-N problem there are 2^N possible patterns. As the top row indicates, the first three columns are the inputs and the last column is the output for that row. The top row of the matrix is for demonstration purposes only and is not needed in the actual data file. The data is copied to a text file and saved with the file extension .dat. No delimiters other than white space are required. Once the data file is finished, it can be referenced by numerous architecture input files.
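Although the paper's data files are produced by hand, spreadsheet, or Matlab, the format is simple enough to generate programmatically. The following C sketch illustrates one way to write a bipolar parity-N .dat file in the format described above; it is not part of the NNT package.

#include <stdio.h>

/* Write a bipolar parity-N training file: N inputs, one output.
   Output is +1 when an odd number of inputs are +1, else -1. */
int main(void)
{
    const int n = 3;                      /* parity-3 */
    FILE *f = fopen("parity3.dat", "w");
    if (!f) return 1;
    for (int p = 0; p < (1 << n); p++) {  /* 2^N patterns */
        int ones = 0;
        for (int b = n - 1; b >= 0; b--) {
            int bit = (p >> b) & 1;
            ones += bit;
            fprintf(f, "%+d ", bit ? 1 : -1);
        }
        fprintf(f, "%+d\n", (ones % 2) ? 1 : -1);  /* odd parity output */
    }
    fclose(f);
    return 0;
}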

4. INPUT FILE

The input file contains the network architecture, neuron models, the data file reference, and optional initial weights. Each input file is unique to a specific architecture but not necessarily to each data set. In other words, the same data set can be used for several different architectures simply by creating a new input file. The input file contains three sections: the architecture, the model parameters, and the data file definition. The following is an example of an input file for the parity-3 problem discussed in the previous section.

// Parity-3 input file (parity3.in)
n 4 mbip 1 2 3
n 5 mbip 1 2 3
n 6 mbip 1 2 3 4 5
W 5.17 20.08 -10.01 -4.23
W 1.0 10.81 2.20 19.84
model mbip fun=bip, der=0.01
model mu fun=uni, der=0.01
model mlin fun=lin, der=0.05
datafile=parity3.dat

The first line is a comment. Either a double forward slash (//), as in C, or a percent sign (%), as in Matlab, is acceptable as a comment delimiter. After the comment comes the network architecture for a 3-neuron fully connected network, as shown in Figure 2.

Figure 2: Three neuron architecture for the parity-3 problem

The neurons are listed in a netlist-style layout very similar to a SPICE program. The listing is node based, and the first nodes are reserved for the input nodes. The first character of each neuron line is an n, to signify that the line describes a neuron, followed by the neuron's output node number. Looking at Figure 2, the first neuron is neuron 4 because 4 is the first available number after the three inputs, and it is connected to nodes 1, 2, and 3, which are inputs. The same is true for neuron 5, the second neuron, which is also connected to all three inputs. The output neuron is slightly different but follows the same concept: it is connected to all three inputs as well as to the outputs of the first two neurons. From this it should be straightforward to see the connection between the input file listed above and Figure 2.

Each neuron line also lists the model of the neuron, allowing the user to specify a unique model for each one. This network is designed to solve the parity-3 problem using three bipolar neurons. This is not the minimal architecture for the problem, but it serves as a good demonstration of the tool.

Following the architecture of the network are the optional starting weights. If no starting weights are given, the trainer chooses random weights. The weights must be listed in the same format as the architecture, and each line of weights starts with the capital letter W. The biasing weight goes in place of the output node of the neuron; in other words, the first weight listed for a particular neuron is the biasing weight, followed by the remaining input weights in their respective order.

The user specifies a model for each neuron, and each model is defined on a single line. The user can set the activation function and neuron type (unipolar, bipolar, or linear) for each model, and may include neurons with different activation functions in the same network. The final line of the input file is the reference to the data file: it simply reads datafile followed by the file name. In this example it is parity3.dat, which can be seen on the last line of the example input file.
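To make the netlist concrete, the following C sketch performs the forward calculation this input file describes. It is a minimal illustration assuming a tanh bipolar activation; the function and argument names are ours, not NNT's.

#include <math.h>

/* Forward pass for the 3-neuron fully connected parity-3 network of
   Figure 2. node[1..3] are the inputs; node[4..6] are neuron outputs.
   Each weight row is bias first, then the input weights in order. */
static double bip(double net) { return tanh(net); }  /* bipolar activation */

double parity3_forward(const double in[3],
                       const double w4[4],   /* neuron 4: bias + 3 inputs */
                       const double w5[4],   /* neuron 5: bias + 3 inputs */
                       const double w6[6])   /* neuron 6: bias + 5 inputs */
{
    double node[7];
    node[1] = in[0]; node[2] = in[1]; node[3] = in[2];

    node[4] = bip(w4[0] + w4[1]*node[1] + w4[2]*node[2] + w4[3]*node[3]);
    node[5] = bip(w5[0] + w5[1]*node[1] + w5[2]*node[2] + w5[3]*node[3]);
    node[6] = bip(w6[0] + w6[1]*node[1] + w6[2]*node[2] + w6[3]*node[3]
                        + w6[4]*node[4] + w6[5]*node[5]);
    return node[6];
}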

5. TRAINING PARAMETERS

Once the network architecture has been decided and the input files created, the next step is to select the training algorithm and its parameters. When NNT is loaded, there is an orange panel full of adjustable parameters on the right side of the window. These parameters change for each algorithm, so they are addressed accordingly in the following sections. Several independent algorithms can be used for training neural networks.

5.1. IMPLEMENTED ALGORITHMS

The algorithm is chosen from the pull-down menu in the training parameters. Four of the parameters are the same for all algorithms: Print Scale, Max. Iterations, Max. Error, and Gain. Print Scale controls how often the mean squared error is printed to the Matlab command window. This can be important because in certain situations the longest calculation time is that of displaying the data, so increasing this number can significantly decrease training time. Max. Iterations is the number of times the algorithm will adjust the weights before the attempt is considered a failure; an iteration is defined as one adjustment of the weights, which includes calculating the error for every training pattern and adjusting at the end. Max. Error is the mean squared error that the user considers acceptable; when this value is reached, the algorithm stops calculating and displays the final weights.
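These four shared parameters map naturally onto a small configuration record, sketched below purely for illustration. The field names are not NNT's internal ones, and the comment on Gain is an assumption, since the paper does not define that parameter.

/* Common training parameters shared by all NNT algorithms.
   Field names are illustrative, not NNT's internal ones. */
typedef struct {
    int    print_scale;     /* print the MSE every print_scale iterations   */
    long   max_iterations;  /* give up after this many weight adjustments   */
    double max_error;       /* stop once the MSE falls below this value     */
    double gain;            /* assumed: slope multiplier of the activation  */
} train_params;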

5.2. ERROR BACK PROPAGATION (EBP)

This algorithm is the traditional EBP with the ability to handle fully connected neural networks. The Alpha parameter is the learning constant, a multiplier that sets the step size taken in the direction of the gradient. If alpha is too large, the algorithm can oscillate instead of reducing the error; if alpha is too small, the algorithm moves toward the solution too slowly and can prematurely level off. The user should adjust this parameter until an optimal value is found, one that shows some oscillation which diminishes while the error continues to decrease.
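As a reference point, the core EBP update for a single neuron has the textbook form sketched below; this is an illustration, not the trainer's actual Matlab code.

/* One EBP weight adjustment for a single neuron.
   delta is the back-propagated error times the activation derivative;
   x[0] is fixed at 1 so that w[0] acts as the biasing weight. */
void ebp_update(double *w, const double *x, int n,
                double delta, double alpha)
{
    for (int i = 0; i < n; i++)
        w[i] += alpha * delta * x[i];  /* step of size alpha along the gradient */
}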

5.3. Neuron By Neuron (NBN)

NBN is a modified Levenberg-Marquardt (LM) algorithm for arbitrarily connected neural networks. It has two training parameters, μ and μ Scale. The learning parameter of the LM algorithm is μ, and its use can be seen in Equation 1, where w denotes the weights, J is the Jacobian matrix, e is the error vector, and I is the identity matrix.

w_{k+1} = w_k - (J_k^T J_k + \mu I)^{-1} J_k^T e_k        (1)

If μ = 0, the algorithm becomes the Gauss-Newton method. For very large values of μ, the algorithm becomes the steepest descent method, or EBP. The μ parameter is automatically adjusted at each iteration to ensure convergence; the factor by which it is adjusted each time is μ Scale, the last parameter of the NBN algorithm.
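The automatic adjustment of μ typically follows the standard Levenberg-Marquardt accept/reject rule. The skeleton below sketches one NBN-style iteration under that assumption; solve_lm_step and the other names here are placeholders, not functions from NNT.

/* One LM-style iteration in the spirit of NBN. solve_lm_step() stands
   for computing dw = (J'J + mu*I)^(-1) * J' * e per Equation 1; its
   implementation is omitted. Assumes s->n <= 64 for the local buffers. */
typedef struct { double *w; int n; double mu, mu_scale; } lm_state;

void lm_iterate(lm_state *s,
                double (*error_of)(const double *w),
                void (*solve_lm_step)(const double *w, double mu, double *dw))
{
    double dw[64], trial[64];
    double err = error_of(s->w);

    solve_lm_step(s->w, s->mu, dw);           /* dw = (J'J + mu*I)^-1 J'e */
    for (int i = 0; i < s->n; i++)
        trial[i] = s->w[i] - dw[i];

    if (error_of(trial) < err) {
        for (int i = 0; i < s->n; i++) s->w[i] = trial[i];
        s->mu /= s->mu_scale;                 /* success: toward Gauss-Newton   */
    } else {
        s->mu *= s->mu_scale;                 /* failure: toward steepest descent */
    }
}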

5.4. Self-Aware (SA)

The SA algorithm is a modification of NBN. It evaluates the progression of training and determines whether the algorithm is failing to converge. If the algorithm begins to fail, the weights are reset and another trial is attempted; in this situation the program displays its progress to the user as a dotted red line on the display and begins again. The algorithm continues attempting to solve the problem until either it succeeds or the user cancels the process. The SA algorithm uses the same training parameters as NBN.

5.5. Enhanced Self Aware algorithm (ESA)

ESA is also a modification of the NBN algorithm, designed to increase the chances of convergence. The modification was made to the Jacobian matrix to allow the algorithm to be much more successful in solving very difficult problems with deep local minima. The algorithm is also aware of its current solving status and resets when necessary.

The ESA algorithm uses a fixed value of 10 for the μ Scale parameter and allows the user to adjust the LM parameter. The LM parameter is essentially a scale factor applied to the Jacobian matrix before it is used in calculating the weight adjustment. This scale factor is typically a positive number between 1 and 10, or possibly greater; the more local minima the problem has, the larger the LM factor should be.

5.6. Evolutionary Gradient

Evolutionary Gradient is a newly developed algorithm that evaluates gradients from randomly generated weight sets and uses the gradient information to generate a new population of weights. It is a hybrid algorithm combining the use of random populations with an approximated gradient approach. Like standard methods of evolutionary computation, the algorithm is better suited to avoiding local minima than common gradient methods such as EBP. What sets the method apart is the use of an approximated gradient calculated from each population; by generating successive populations in the gradient direction, the algorithm converges much faster than other forms of evolutionary computation. This combination of gradient and evolutionary methods essentially offers the best of both worlds. The training parameters are very different from those of the LM-based algorithms previously discussed: they include Alpha, Beta, Min. Radius, Max. Radius, and Population.

6. NNT Adaptations

The trainer was adapted to aid the process of creating neural networks at the embedded level. NNT trains the network as it would any neural network, and then the embedded network verification begins: the trainer makes a forward calculation on the network using the 8-bit neural network simulator.

It essentially does all of the arithmetic that the neural network would do for one pattern. At every step of the way it rounds all of the digits in the same way the 8-bit microcontroller does. This calculation is performed as a sanity check and debugging step for the system. Step by step results from beginning to end of the network calculation are stored in hex and decimal format in an organized text file for the user.

After the training process, the trainer exports the weights, the architecture, and the other parameters into assembly and C files for microcontroller implementation. These files can be copied directly into the microcontroller IDE and immediately assembled or compiled, respectively.

The trained and verified network can then be further tested at the embedded level using the neural network communication software, which communicates with the microcontroller via a serial port. This allows the user to supply the data that the neural network would receive from an external source, such as an analog-to-digital converter. The microcontroller performs the network forward calculation and sends the result back through the serial port for verification, simulating a network output such as a value for a pulse width modulation module. At this step the user can run as many test patterns as necessary to validate proper performance with hardware-in-the-loop simulation.
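A minimal MCU-side sketch of this hardware-in-the-loop exchange is given below. The UART helpers and nn_forward are placeholders; the actual serial routines depend on the microcontroller and compiler, and the paper does not list them.

#include <stdint.h>

extern uint8_t uart_read_byte(void);          /* blocking read  (placeholder) */
extern void    uart_write_byte(uint8_t b);    /* blocking write (placeholder) */
extern void    nn_forward(const uint8_t *in, uint8_t *out);  /* embedded net */

#define N_IN  2
#define N_OUT 1

void hil_loop(void)
{
    uint8_t in[N_IN], out[N_OUT];
    for (;;) {
        for (int i = 0; i < N_IN; i++)        /* test pattern arrives from Matlab */
            in[i] = uart_read_byte();
        nn_forward(in, out);                  /* network forward calculation */
        for (int i = 0; i < N_OUT; i++)       /* result goes back to the PC  */
            uart_write_byte(out[i]);
    }
}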

7. 8-Bit Neural Network Simulator

The simulator is written in Matlab to operate in the same fashion as an 8-bit microcontroller, introducing rounding errors in the appropriate places so that it behaves exactly like the hardware. This was accomplished by creating a set of functions that operate identically to the PIC microcontroller's instructions, including commands that round, multiply, add, subtract, and perform the tanh approximation, all using the pseudo floating point arithmetic. There are also special instructions to detect any overflows.
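The paper does not document the pseudo floating point format itself. Purely to illustrate the kind of rounding the simulator must reproduce, the sketch below assumes a signed 8-bit mantissa with an 8-bit exponent; the real NNT protocol may differ.

#include <stdint.h>

/* Illustrative pseudo floating point value: value = man * 2^exp.
   This format is an assumption for demonstration only. */
typedef struct { int8_t man; int8_t exp; } pf8;

/* Multiply two pf8 numbers, renormalizing so the 16-bit product fits
   back into 8 bits -- the same truncation an 8-bit PIC performs. */
pf8 pf8_mul(pf8 a, pf8 b)
{
    int16_t p = (int16_t)a.man * (int16_t)b.man;  /* 8x8 -> 16-bit product */
    int     e = a.exp + b.exp;
    while (p > 127 || p < -128) {                 /* shift down into 8 bits */
        p /= 2;                                   /* rounding error enters here */
        e++;
    }
    return (pf8){ (int8_t)p, (int8_t)e };
}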

8. PIC Simulator Software (PicSim)

The simulator and verification tool was just as important to the overall project as the microcontroller implementation itself. The simulator engine was described previously; this section discusses how that engine is interfaced with the user as well as with the microcontroller. The PicSim software was written before the assembly version to verify that it was possible to use 8-bit math and obtain usable results.

The software is designed to plot a two-input system with one or more outputs. PicSim has two main input requirements: a network architecture and a weight array text file. To simplify the process, PicSim reads the same input file used by NNT for training and the weight file generated by NNT; the user simply points the software to these files. Four graphs are displayed in the user interface. The figure in the top left corner is always the network being used as the reference, and the top right is the network being calculated. The bottom two figures are the differences between the top two surfaces: the one on the left is on the same scale as the top two, and the one on the right uses a tighter axis to show the specific location of the error.

The user has a few other options as to what type of network to simulate. The first option is the ideal neural network, a neural network running on a PC using standard IEEE 754 floating point precision. This allows the user to compare the quality of the trained network to that of the training patterns before any error from the microcontroller is introduced. The user can simulate the error produced by the microcontroller by selecting the simulation button, which compares the simulated network to the ideal network. The simulator engine discussed previously uses a configurable number of patterns for testing. At this point any possible overflows or other errors should be caught, before hardware is introduced.

The last option is the hardware-in-the-loop setup. This option is the final stage of testing for the embedded neural network: it allows the user to program and test the microcontroller while still using the features of Matlab for verifying the data.

Matlab still produces the test patterns and sends them via the serial port to the microcontroller, simulating data the microcontroller would receive from another source such as the analog-to-digital converter. Once the embedded network has all of the inputs it needs, it performs the neural network forward calculations and produces one or more outputs. These outputs would typically drive an external peripheral such as a pulse width modulator, but in this instance they are transmitted back to the PC via the serial port. This allows the user to send and receive data from the microcontroller in real time, verify the hardware calculations and the time they require, and check the data easily using Matlab's graphing tools. This mode can be used for embedded networks written in assembly or C; the difference is the format of the test data being sent and received, but both operate under the same principle. This is the final step before the microcontroller is configured to operate in its embedded application with real inputs and outputs, and at this point the neural network operation has been thoroughly verified.

The tools created to build the neural network on the microcontroller were as challenging a project as the embedded network itself. However, creating and debugging the assembly version of the neural network would never have been possible without them. With the automated system, almost any trained network can now be implemented on the microcontroller in a matter of seconds.

9. Hardware Implementation

Implementing neural networks on an 8-bit microcontroller with limited computing power presents several programming challenges. So that the network performs as quickly as possible, the software was written at the assembly level, which allows a degree of customization that cannot be achieved in C. However, hardware portability was also a motivating factor, so a more generic C implementation was created as well. It was also very important to manually manage the very limited data memory, and several assembly routines were created with this purpose in mind. A pseudo floating point arithmetic protocol was created exclusively for neural network calculations, along with a multiplication routine for multiplying large numbers. A tanh-compatible activation function was also needed. The final procedure is capable of implementing any neural network architecture on a single operating platform; this robust base removes the need to modify the structure of the software to make network architecture changes.
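The exact tanh approximation used in the assembly routine is not given in the paper. A common choice on small microcontrollers is a piecewise-linear fit such as the following sketch, shown only to illustrate the idea.

/* Piecewise-linear tanh approximation of the kind suitable for a small
   MCU. This specific fit is illustrative; the NNT assembly routine's
   actual approximation is not documented here. */
double tanh_approx(double x)
{
    double ax = (x < 0) ? -x : x;
    double y;
    if      (ax > 2.5)  y = 0.9866;                      /* saturated region   */
    else if (ax > 1.0)  y = 0.7616 + (ax - 1.0) * 0.15;  /* shallow segment    */
    else                y = ax * 0.7616;                 /* steep region near 0 */
    return (x < 0) ? -y : y;   /* tanh is odd */
}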

10. Matlab’s Peaks Surface

The next example is generated by the common Matlab function peaks. The surface has several peaks and valleys and is a rather complicated nonlinear control surface, requiring significantly more neurons to solve to a comparable accuracy. The training surface is shown in Figure 3 and the architecture in Figure 4. The network architecture is a hybrid between the common MLP networks and the cascade network shown in the previous example: it has two hidden layers, but all neurons are connected directly to the inputs and to all preceding layers.
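For reference, Matlab's peaks function is the closed-form surface below, transcribed to C so the training target is concrete; the sampling range noted in the comment is Matlab's default.

#include <math.h>

/* Matlab's peaks(x, y), transcribed to C. Matlab samples this function
   over [-3, 3] x [-3, 3] by default. */
double peaks(double x, double y)
{
    return 3.0 * pow(1.0 - x, 2) * exp(-x*x - (y + 1.0)*(y + 1.0))
         - 10.0 * (x/5.0 - pow(x, 3) - pow(y, 5)) * exp(-x*x - y*y)
         - (1.0/3.0) * exp(-(x + 1.0)*(x + 1.0) - y*y);
}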

Figure 3: Training data used for Matlab's peaks surface.

Figure 4: Eight neuron network used for solving the Matlab peaks surface.

11. Experimental Data Summary

Comparing the results of the two implementations makes it clear that the C version is much more accurate; however, this accuracy comes at a cost in performance. The C version was significantly slower due to the complexity of its calculations and its need to store all weights and nodes in program memory, because they are too large to fit in RAM. The accuracies and speeds can be seen in Table 1.

Table 1: Accuracy and speed of the assembly and C implementations.

Conclusion

This paper presents a solution for embedded neural networks across many types of hardware and for many applications. The software package presented here allows the user to develop a neural network for a desired application, train the network, embed it in any platform, and verify its functionality. This software package is a complete embedded neural network solution.

This package offers the user far more capable neural network architectures than other training software. The user has the freedom to customize the network for the application at hand: traditional multilayer perceptron networks or the superior arbitrarily connected networks, including fully connected and cascade networks. Most other software and research trains only with error back propagation or other first order algorithms. This paper gives the user the choice of traditional EBP as well as the faster and more efficient second order algorithms such as the Neuron by Neuron algorithm and the Enhanced Self Aware algorithm.

The software offers the user the option of installing the network on a Microchip PIC18Fxxxx series microcontroller using custom neural network software written in assembly language and optimized for both the microcontroller and the neural network application. This version offers a very fast and accurate solution on a very inexpensive microcontroller.

References

1. B. K. Bose, "Neural Network Applications in Power Electronics and Motor Drives—An Introduction and Perspective," IEEE Transactions on Industrial Electronics, vol. 54, pp. 14-33, 2007.
2. M. A. El-Sharkawi, "Neural network application to high performance electric drives systems," in Proc. IEEE IECON 21st Int. Industrial Electronics, Control, and Instrumentation Conf., 1995, pp. 44-49.
3. L. M. Grzesiak and B. Ufnalski, "Neural stator flux estimator with dynamical signal pre-processing," in Proc. 7th AFRICON Conf. in Africa, 2004, pp. 1137-1142.
4. Y. Yusof and A. H. M. Yatim, "Simulation and modeling of stator flux estimator for induction motor using artificial neural network technique," in Proc. National Power Engineering Conf. PECon 2003, 2003, pp. 11-15.
5. A. Ba-Razzouk, A. Cheriti, G. Olivier, and P. Sicard, "Field-oriented control of induction motors using neural-network decouplers," Proc. IEEE IECON 21st Int. Industrial Electronics, Control, and Instrumentation Conf., vol. 12, pp. 752-763, 1997.
6. S. M. Gadoue, D. Giaouris, and J. W. Finch, "Sensorless Control of Induction Motor Drives at Very Low and Zero Speeds Using Neural Network Flux Observers," IEEE Transactions on Industrial Electronics, vol. 56, pp. 3029-3039, 2009.
7. C. Hudson, N. S. Lobo, and R. Krishnan, "Sensorless control of single switch based switched reluctance motor drive using neural network," in Proc. IEEE IECON 30th Annual Conf., 2004, pp. 2349-2354, vol. 3.
8. J. F. Martins, P. J. Santos, A. J. Pires, L. E. B. da Silva, and R. V. Mendes, "Entropy-Based Choice of a Neural Network Drive Model," IEEE Transactions on Industrial Electronics, vol. 54, pp. 110-116, 2007.
9. H. Zhuang, K.-S. Low, and W.-Y. Yau, "A Pulsed Neural Network With On-Chip Learning and Its Practical Applications," IEEE Transactions on Industrial Electronics, vol. 54, pp. 34-42, 2007.
10. J. Mazumdar and R. G. Harley, "Recurrent Neural Networks Trained With Backpropagation Through Time Algorithm to Estimate Nonlinear Load Harmonic Currents," IEEE Transactions on Industrial Electronics, vol. 55, pp. 3484-3491, 2008.

This work is licensed under a Creative Commons Attribution 4.0 International License.