Ke Yong
WISDRI Engineering & Research Incorporation Limited, Electrical Automation Design Institute, Wuhan, Hubei 430023, China
Abstract
Controllers with no learning capability produce the same tracking error on every trial, wasting the wealth of information contained in the inputs, outputs and error signals of previous trials. To exploit this information and improve control performance, researchers proposed learning-type control concepts such as Repetitive Control (RC) and Iterative Learning Control (ILC). ILC and RC have been widely used in industrial control since their introduction, and in recent years their applications in health care have developed rapidly. One of the main health-care applications of ILC and RC is stroke rehabilitation, which aims to restore, fully or partially, the function of a patient's limbs; research into conventional therapy and motor learning theory provides evidence that the intensity of practice of a task and feedback are important. For stroke patients, rehabilitation robots equipped with iterative learning control or repetitive control algorithms are used to train the affected limbs by repeating the same action along a task trajectory.
This thesis studies a previously developed ILC algorithm implemented within an iterative model reference adaptive control (MRAC) framework, with a discussion of the key parameters that affect its performance.
Key words: Iterative Learning Control, Repetitive Control, Model Reference Adaptive Control
1 Introduction
Nowadays, a large number of people suffer strokes and some of them are left permanently disabled. A stroke is a rapid loss of brain function caused by a disturbance in the blood supply to the brain or by leakage of blood [1]. As a result, some of the connecting nerve cells die and the person commonly suffers partial paralysis on one side of the body, termed hemiplegia [2]. However, the brain has some spare capacity: new connections can be made by learning new skills, even though the dead cells cannot re-grow. Hence, the main task in stroke treatment is to train the patient by repeating a specific movement, which can be learnt and remembered by the brain so that new connections are built, possibly rehabilitating, fully or partially, the function of the paralysed body.
In traditional stroke rehabilitation, medical staff can only design the training programme from clinical evidence, which makes performance difficult to monitor. The training effect is unpredictable, and failed attempts may undermine the patients' confidence, psychologically weakening the training effect.
In recent years, research into conventional therapy and motor learning theory has provided evidence that the intensity of practice of a task and feedback are important [3]-[5]. Against this background, techniques based on iterative learning control (ILC) and repetitive control (RC) have been introduced into this area, and rehabilitation robots equipped with ILC and RC algorithms have been designed and applied in stroke rehabilitation. Iterative learning and repetitive control methods update the control input using data collected in previous attempts, exploiting the repeating nature of the patients' tasks to improve performance [2]. Unlike other intelligent algorithms, ILC and RC never change the structure of the system; they only adjust the input signal from trial to trial.
Because rehabilitation robots can track how well patients perform the tasks, the error between the patients' movement trajectories and the ideal trajectories is observable. Hence, training programmes can be designed individually for different patients according to their own condition. This avoids futile exercise, makes rehabilitation training more efficient, and gives the staff a clear view of whether the training is effective. For the patients, a suitable task is easier to complete, which encourages them psychologically and improves their confidence, helping to obtain a better training result.
This thesis is a study of iterative learning and repetitive control applied to limb rehabilitation, mainly following [6]. An iterative model reference adaptive control (MRAC) framework with the advanced ILC algorithm reported in [6] is implemented. The objective is to test the performance of the algorithm and to study how its parameters affect that performance.
2 Background
The concept of repetitive control originated in 1981 [7, 8] and is mainly used in continuous processes for tracking or rejecting known periodic exogenous signals [9]. In 1984, S. Arimoto et al. first introduced the method of iterative learning control (ILC) in English [10]. ILC focuses mainly on batch processes [9] and is used in robot control, where repeating the same task from trial to trial provides a very simple way to deal with dynamic systems of high uncertainty. In industrial control, learning-type control is often a natural choice, and ILC and RC methods are widely used in this area.
2.1 Iterative learning control
Iterative learning control (ILC) is a tracking-control method for systems that operate in a repetitive mode; the objective is to improve performance from trial to trial by using information from previous trials in the construction of the current trial input. ILC algorithms construct a sequence of input functions that gradually reduces the error and improves the actual output with each successive trial, normally converging adequately within just a few iterations [11]. The main difference between ILC and other learning control methods is that ILC changes the control input rather than the controller.
One of the most widely used linear ILC algorithms is P-type ILC, which uses a proportional term of the error signal to generate the input for the next trial. Since ILC can be realised with a PID structure, there is likewise a D-type ILC algorithm, which uses the derivative of the error signal to compute the next input, and a more complex PID-type ILC can be applied in some cases. Apart from these, another common linear ILC scheme uses the error at advanced time steps and is called phase-lead ILC.
Consider a system with the input on trial k denoted $u_k(t)$ over a finite duration $T$, producing the output $y_k(t)$. Let $r(t)$ represent the desired signal, which does not depend on the trial number. The error signal for trial k is [2]

$$e_k(t) = r(t) - y_k(t), \quad t \in [0, T] \qquad (2.6)$$
In fact, in most situations iterative learning control can be formulated as an optimization problem. The objective is to obtain a convergence condition on the error signals of the form

$$\lim_{k \to \infty} e_k(t) = 0, \quad t \in [0, T] \qquad (2.7)$$

or, equivalently,

$$\lim_{k \to \infty} y_k(t) = r(t), \quad t \in [0, T] \qquad (2.8)$$
A linear ILC updating law constructs the next-trial input from the current input and a correction term built from the current error; in the simplest P-type form,

$$u_{k+1}(t) = u_k(t) + \beta e_k(t) \qquad (2.13)$$

where $\beta$ is the learning gain.
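As a concrete illustration, the sketch below applies a phase-lead variant of the P-type update to a simple discrete-time first-order plant. The plant parameters and learning gain are illustrative assumptions, not the model used later in this thesis.

```python
import numpy as np

# Illustrative discrete-time first-order plant: y[t+1] = a*y[t] + b*u[t].
# For this plant the phase-lead P-type law converges monotonically in the
# frequency domain when beta*b < 2*(1 - a); here beta*b = 0.1 < 0.2.
a, b = 0.9, 0.5
T = 100                        # samples per trial
r = np.ones(T)                 # desired output: unit step
beta = 0.2                     # learning gain

def run_trial(u):
    """Simulate one trial; every trial starts from rest (y[0] = 0)."""
    y = np.zeros(T)
    for t in range(T - 1):
        y[t + 1] = a * y[t] + b * u[t]
    return y

u = np.zeros(T)                # first-trial input
for k in range(15):
    y = run_trial(u)
    e = r - y                  # error signal e_k = r - y_k, cf. (2.6)
    # Phase-lead update: u(t) affects y(t+1), so the input at time t is
    # corrected with the error one step ahead:
    # u_{k+1}(t) = u_k(t) + beta * e_k(t+1).
    u[:-1] = u[:-1] + beta * e[1:]
    print(f"trial {k}: max |e| = {np.abs(e).max():.4f}")
```

Running the loop prints a trial-by-trial error that shrinks toward zero, which is exactly the convergence condition (2.7) observed in simulation.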
2.2 Repetitive control
The theoretical foundation of repetitive control (RC) is the internal model principle (IMP) proposed by Francis and Wonham [12]: a periodic exogenous signal can be tracked or rejected without steady-state error only if a model of its generator is embedded in the control loop. Consider a linear system in state-space form,

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t)$$

tracking a reference of known period; the repetitive controller embeds the periodic internal model, a delay line of one period, in the loop.
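To make the delay-line internal model concrete, the sketch below implements a discrete-time repetitive controller u(t) = u(t − N) + K e(t − N + 1) on an illustrative first-order plant running continuously (no resetting between periods, in contrast to ILC). The plant, period and gain are assumptions for illustration only.

```python
import numpy as np

# Illustrative plant y[t+1] = a*y[t] + b*u[t] tracking a periodic reference.
a, b = 0.9, 0.5
N = 50                                   # known period of the exogenous signal
steps = 20 * N                           # run continuously for 20 periods
t = np.arange(steps)
r = np.sin(2 * np.pi * t / N)            # periodic reference to track

u, y, e = np.zeros(steps), np.zeros(steps), np.zeros(steps)
K = 0.2                                  # repetitive learning gain

for i in range(steps - 1):
    if i >= N:
        # Delay-line internal model: repeat the input of one period ago,
        # corrected by the error observed one period ago (one step ahead).
        u[i] = u[i - N] + K * e[i - N + 1]
    y[i + 1] = a * y[i] + b * u[i]
    e[i] = r[i] - y[i]
e[-1] = r[-1] - y[-1]

# Peak error per period shrinks from one period to the next.
print(np.round(np.abs(e).reshape(-1, N).max(axis=1), 4))
```

Because the controller contains the period-N delay line required by the IMP, the tracking error decays period by period as long as the period-to-period update is a contraction.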
For nonlinear systems, RC can be designed by two major approaches: the linearization approach and the adaptive-control-like approach [13]. In the linearization approach, feedback linearization transforms the nonlinear system into a linear one, to which existing linear RC design methods are then applied. The adaptive-control-like approach transforms the nonlinear tracking problem into a rejection problem for the nonlinear error dynamics [13]; it includes two main methods, the Lyapunov-based approach [14, 15] and the evaluation-function-based approach [16, 17].
3 Experiment
3.1 Modelling
![](/userUpload/6(16134).png)
Figure 3-1. The iterative model reference adaptive control framework [6]
Figure 3-1 shows the model reference adaptive control framework used in this thesis, modelled as a two-link manipulator system. It embodies the ideas of the Equilibrium Point Hypothesis (EPH) and is controlled by an iterative learning controller that learns the dynamics of the system and generates the desired input after a finite number of iterations [6].
3.1.1 The ideal trajectory model
Given a reaching task, the ideal trajectory model generates the signal that the CNS (central nervous system) wants the body to execute. Here the ideal trajectory model is represented by a first-order transfer function of the form

$$G(s) = \frac{1}{\tau s + 1} \qquad (3.1)$$

where $\tau$ is the time constant.
The ideal trajectory model generates the desired input r for the reference model. Here r is a two-dimensional signal: it remains zero in the x direction and is assumed to be a first-order step response in the y direction.
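A minimal sketch of this desired signal is given below. The sampling interval matches the 600 ms simulations of Section 3.3, while the time constant and reach amplitude are illustrative assumptions, since their exact values are not fixed here.

```python
import numpy as np

dt = 1e-3                         # 1 ms sampling
t = np.arange(0.0, 0.6, dt)       # 0 to 600 ms, as in the simulations below
tau = 0.1                         # assumed time constant of (3.1), in seconds
target_y = 1.0                    # assumed reach distance in the y direction

# Desired input r(t): zero in x, first-order step response in y.
r = np.zeros((t.size, 2))
r[:, 1] = target_y * (1.0 - np.exp(-t / tau))
```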
3.1.2 The reference model
![](/userUpload/7(13888).png)
The reference model specifies the ideal dynamics that the trained limb should follow; it is rewritten in state-space form as (3.4) in the next subsection.
3.2 The controller algorithm
In [6], the objective of the simulation is to find a desired input u for which the output tracks the ideal dynamics generated by the reference model. The reference model can be rewritten as a state-space equation,

$$\dot{x}_m(t) = A_m x_m(t) + B_m r(t), \quad t \in [0, T] \qquad (3.4)$$
In this equation, T is the finite time interval of each trial. The plant system with a disturbance at each iteration can be written as:
![](/userUpload/8(12290).png)
The trial-to-trial updating law (3.6) and the initial feedback gains (3.8) referenced below follow [6].
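Since the exact matrices and updating law of [6] are not reproduced above, the following sketch only illustrates the trial-to-trial structure: the reference model (3.4) is simulated once, and a phase-lead proportional update stands in for the law (3.6). All numerical values (A_m, B_m, A, B and the gain) are assumptions.

```python
import numpy as np

dt, T = 1e-3, 0.6
n = int(T / dt)
A_m = np.array([[-20.0, 0.0], [0.0, -20.0]])   # assumed reference-model dynamics
B_m = 20.0 * np.eye(2)
A = np.array([[-15.0, 2.0], [2.0, -15.0]])     # assumed plant (CNS) dynamics
B = 15.0 * np.eye(2)

def simulate(Asys, Bsys, u):
    """Forward-Euler simulation of x' = Ax + Bu over one trial, x(0) = 0."""
    x = np.zeros((n, 2))
    for t in range(n - 1):
        x[t + 1] = x[t] + dt * (Asys @ x[t] + Bsys @ u[t])
    return x

r = np.zeros((n, 2))
r[:, 1] = 1.0 - np.exp(-np.arange(n) * dt / 0.1)
x_m = simulate(A_m, B_m, r)                    # ideal dynamics, cf. (3.4)

u = r.copy()                                   # initial input guess
for k in range(50):                            # 50 trials, as in Section 4
    x_k = simulate(A, B, u)
    e_k = x_m - x_k                            # trial-k tracking error
    u[:-1] = u[:-1] + 1.0 * e_k[1:]            # stand-in for the law (3.6)
```

With these assumed matrices the update is a contraction, so the trial error decreases monotonically; the actual law (3.6) of [6] also adapts feedback gains, which this sketch does not attempt to reproduce.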
3.3 Simulation
For all the simulations in this thesis the simulated time runs from 0 to 600 ms, and no disturbance is considered.
We generate a 2-D plant model, a double-input double-output system represented by the transfer-function matrix (3.10).
In real applications the plant model is a linear approximation of the nonlinear system of the patient's CNS, usually written in the state-space form (3.3). The parameters in (3.3) are unknown but can be estimated by experiment. In the simulation, the 2-D plant model is an estimate of the CNS system.
The initial output of this 2-D model is the response of the plant to r, from which we compute the initial error e0.
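Reusing the names from the sketch in Section 3.2, the initial error can be formed directly; e0 here is an assumed stand-in for the e0 of the text.

```python
y0 = simulate(A, B, r)        # initial plant output: the response to r
e0 = x_m - y0                 # initial error between ideal and actual output
print(f"initial max error: {np.abs(e0).max():.4f}")
```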
The 2-D simulation includes the following parts:
A. Performance of the algorithm, and how initial values affect it
Implement the introduced algorithm in the 2-D simulation with the initial values defined in equation (3.8); both the original and the reformed control laws are applied, to compare their performance.
To find out how different initial values of K affect the performance, three groups of initial values are used:
i. the initial values defined in (3.8) with;
4 Results and discussion
4.1 Results simulated by the introduced algorithm with different initial values of K
First, the original control law (3.6) was implemented in the 2-D system. The output trajectory after 50 iterations is shown in Figure 4-1:
![](/userUpload/10(8876).png)
Figure 4-1. Output trajectory after 50 iterations (using law 3.6)
Here the thick blue line is the ideal trajectory generated by the reference model, while the black and red curves represent the trajectories before and after training respectively. It can be seen that the final point of the trajectory after 50 iterations is still far from the target in the y direction.
![](/userUpload/11(9387).png)
Figure 4-2. Squared errors before and after training (using law 3.6)
Considering the squared error (the squared distance between the real and ideal trajectories at the same time instant) in Figure 4-2 above, the blue curve is the error of the trajectory before training and the red curve is the error after 50 iterations. Although the squared error is reduced, meaning the performance improved after training, the error at the end point is still large, between 0.15 and 0.20.
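For reference, the squared-error measure plotted in these figures can be computed as below, again reusing the trajectories x_k and x_m from the sketch in Section 3.2; this is a sketch of the metric, not the original plotting code.

```python
import numpy as np

# Squared error at each time step: the squared Euclidean distance between
# the actual and ideal trajectories taken at the same time instant.
sq_err = np.sum((x_k - x_m) ** 2, axis=1)
print(f"squared error at the end point: {sq_err[-1]:.6f}")
```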
![](/userUpload/12(8244).png)
Figure 4-3. Output trajectories before and after training (using the reformed law 4.1)
Here we use the initial values computed from equation (3.8). Figure 4-3 above shows the trajectories: the thick blue line is the ideal trajectory, the black curve is the initial plant output before training, and the red curve is the trajectory after training by the ILC for 50 iterations. The trajectory after training is clearly closer to the ideal trajectory than the one before training. To make this clear, the errors are shown in Figure 4-4.
In Figure 4-4, the blue curve represents the squared error before training while the red curve is the squared error after 50 iterations. The error after training is much smaller than before training, which shows that training with the introduced algorithm does indeed reduce the error and improve the performance.
Comparing Figure 4-1 with Figure 4-3, the reformed law (4.1) does not outperform the original law (3.6) in the x direction, but it performs far better in the y direction. As the comparison of Figure 4-2 and Figure 4-4 shows, the squared error at the end point after 50 iterations using the reformed law is under 0.05, much smaller than with the original law.
![](/userUpload/13(7699).png)
Figure 4-5. Squared errors after training with a different value of β
In Figure 4-5, the error after training is so small that it nearly approaches zero, much smaller than the error shown in Figure 4-4. Hence, the performance of the algorithm changes substantially with the value of β chosen: β is the key parameter of the algorithm and largely determines its performance.
With different initial values applied in the simulation, the performance of the algorithm differs. Figures 4-6 and 4-7 below show the squared errors obtained with different initial values.
![](/userUpload/14(7341).png)
Figure 4-6. Squared errors after 50 iterations with different initial values of K
Figure 4-7. Squared errors after 50 iterations with different initial values of K
The colour coding of the curves is the same as in Figure 4-6, and it is clear that with the initial values defined in equation (3.8) the performance is much better than with the other two groups, with an error at the end point very close to zero.
According to the simulation results, because the initial values defined in equation (3.8) are correlated with the value of β used in the updating law, they adapt to different values of β, unlike the fixed initial values; initial values defined in this way are therefore likely to give better performance than fixed ones.
4.2 Results simulated with increasing values of β
From the previous simulations we see that β is the key parameter in the updating equations, directly affecting the feedback gain, and the algorithm performs differently for different β. The figure below shows the errors for three increasing values of β, each after 10 iterations:
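Such a sweep can be reproduced with the stand-in simulation of Section 3.2; the three gains below are illustrative values chosen to lie inside the convergence range of that assumed model.

```python
# Sweep the learning gain: within the convergence range, a larger beta
# contracts the error faster, so fewer iterations are needed.
for beta in (0.25, 0.5, 1.0):
    u = r.copy()
    for k in range(10):                        # 10 iterations per gain
        x_k = simulate(A, B, u)
        e_k = x_m - x_k
        u[:-1] = u[:-1] + beta * e_k[1:]
    print(f"beta={beta}: end-point squared error "
          f"= {np.sum((x_k - x_m) ** 2, axis=1)[-1]:.6f}")
```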
5 Conclusion
The given algorithm reduces the error over the iterations but does not perform very well: the final error is still somewhat large. However, when the control law (3.6) is reformed into law (4.1), the performance improves greatly and the error converges to a very small value.
Comparing different initial values of the feedback gains shows that the algorithm's performance depends on them: suitable initial values of the feedback gains can have a positive effect on the accuracy of the training, and the initial values defined in equation (3.8) were shown to be better than fixed values.
In the updating law, β is the key parameter: different values of β directly affect the number of iterations and the accuracy, and a suitable value of β can improve the performance considerably. With a large β the feedback gains are large and fewer iterations are needed than with a smaller β. However, large feedback gains are costly, so the key to improving the performance of the algorithm is to find the most cost-effective operating point, which matters both in theory and in applications.
Phase-lead ILC is normally applied to non-minimum-phase systems; in this problem, where the system is minimum-phase, phase-lead ILC offers no advantage.
Bibliography
[1] N. R. Sims and H. Muyderman, "Mitochondria, oxidative metabolism and cell death in stroke," Biochimica et Biophysica Acta, vol. 1802, no. 1, pp. 80–91, Sep. 2009.
[2] C. T. Freeman, E. Rogers, A.-M. Hughes, J. H. Burridge, and K. L. Meadmore, "Iterative learning control in health care: Electrical stimulation and robotic-assisted upper-limb stroke rehabilitation," IEEE Control Systems, vol. 32, no. 1, pp. 18–43, 2012.
[3] C. J. Winstein, D. K. Rose, S. M. Tan, R. Lewthwaite, H. C. Chui, and S. P. Azen, "A randomized controlled comparison of upper-extremity rehabilitation strategies in acute stroke: A pilot study of immediate and long-term outcomes," Arch. Phys. Med. Rehab., vol. 85, pp. 620–628, 2004.
[4] J. R. De Kroon, M. J. Ijzerman, J. J. Chae, G. J. Lankhorst, and G. Zilvold, "Relation between stimulation characteristics and clinical outcome in studies using electrical stimulation to improve motor control of the upper extremity in stroke," J. Rehab. Med., vol. 37, no. 2, pp. 65–74, 2005.
[5] R. A. Magill, Motor Learning: Concepts and Applications. New York: McGraw-Hill, 1998.
[6] S.-H. Zhou, D. Oetomo, Y. Tan, E. Burdet, and I. Mareels, "Human motor learning through iterative model reference adaptive control," in Preprints of the 18th IFAC World Congress, Milano, Italy, Aug. 28–Sep. 2, 2011.
[7] T. Inoue, S. Iwai, and M. Nakano, "High accuracy control of a proton synchrotron magnet power supply," in Proc. 8th IFAC World Congress, 1981, Part 2, pp. 3137–3142.
[8] T. Inoue, M. Nakano, and S. Iwai, "High accuracy control of servomechanism for repeated contouring," in Proc. 10th Annual Symposium on Incremental Motion Control Systems and Devices, 1981, pp. 258–292.
[9] Y. Wang, F. Gao, and F. J. Doyle III, "Survey on iterative learning control, repetitive control, and run-to-run control," Journal of Process Control, vol. 19, pp. 1589–1600, 2009.
[10] S. Arimoto, S. Kawamura, and F. Miyazaki, "Bettering operations of robots by learning," J. Robot. Syst., vol. 1, pp. 123–140, 1984.
[11] D. A. Bristow, M. Tharayil, and A. G. Alleyne, "A survey of iterative learning control," IEEE Control Systems, vol. 26, no. 3, pp. 96–114, 2006.
[12] B. A. Francis and W. M. Wonham, "The internal model principle for linear multivariable regulators," Appl. Math. Optim., vol. 2, pp. 170–194, 1975.
[13] Q. Quan and K.-Y. Cai, "A survey of repetitive control for nonlinear systems."
[14] W. E. Dixon, E. Zergeroglu, D. M. Dawson, and B. T. Costic, "Repetitive learning control: A Lyapunov-based approach," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 32, no. 4, pp. 538–545, 2002.
[15] J.-X. Xu and R. Yan, "On repetitive learning control for periodic tracking tasks," IEEE Trans. Autom. Control, vol. 51, no. 11, pp. 1842–1848, 2006.
[16] J.-X. Xu and Y.-P. Tian, "A composite energy function-based learning control approach for nonlinear systems with time-varying parametric uncertainties," IEEE Trans. Autom. Control, vol. 47, no. 11, pp. 1940–1945, 2002.
[17] A. Tayebi and C.-J. Chien, "A unified adaptive iterative learning control framework for uncertain nonlinear systems," IEEE Trans. Autom. Control, vol. 52, no. 10, pp. 1907–1913, 2007.
About the author: Ke Yong (b. 1989), male, from Wuhan, Hubei; master's degree; engineer; mainly engaged in industrial automation and process control.