Oberon.Loop
Posted: Sun May 16, 2021 6:07 am
The following applies to both Project Oberon and Embedded Oberon. Maybe I am preaching to the choir here and everyone has already made the fix on their systems, as this must be a known issue. I am documenting it here with the suggestion to consider the fix for Embedded Oberon, which already uses a substantially customised Oberon.Loop.
Oberon.Loop uses this code for activating tasks:
Code:
    CurTask := CurTask.next; t := Kernel.Time();
    IF t >= CurTask.nextTime THEN
      CurTask.nextTime := t + CurTask.period; (* ... *)
    END
As soon as CurTask.nextTime becomes negative while Kernel.Time() is still positive, the task may be run before its period has passed, as 't >= CurTask.nextTime' will be true upon the very next check in the Loop.
If I have done the math right, Kernel.Time() overflows at close to 25 days of uptime on a 32-bit system (2^31 ticks, which is about 24.86 days assuming the counter ticks once per millisecond). Depending on the type of system and how it is used, the overflow may never happen to begin with. But consider a server, or a control or monitoring system, which is supposed to run non-stop, and the overflow will occur. Kernel.Time() reads a hardware counter ('cnt1' in EO's RISC5Top.v), which is not reset upon processor reset, so a system reset will not prevent the issue. The FPGA board needs to be power-cycled or reprogrammed.
With tasks that have short periods and for which "stray runs" are of no consequence, such as the garbage collector, the impact might be minor or negligible. But consider an hourly backup task, and "stray runs" can become impactful. The longer the task's period, the higher the odds of this occurring, as the window in which stray runs can happen is up to one period long.
A better solution would be:
Code:
    CurTask := CurTask.next; t := Kernel.Time();
    IF t - CurTask.lastTime >= CurTask.period THEN
      CurTask.lastTime := t; (* ... *)
    END
Note that I have renamed 'nextTime' to 'lastTime' to be more meaningful. The subtraction wraps around together with Kernel.Time(), so the difference is the elapsed number of ticks even across the overflow, as long as the period is shorter than 2^31 ticks.
Allow me to demo the issue using simple test cases. For simplicity, the test cases assume that the task's run-time is shorter than one tick of Kernel.Time(), and that the check is run on every tick.
1) Current Oberon.Loop:
Case 1-1: the baseline, with both Kernel.Time() and CurTask.nextTime staying positive. All good. We see the execution pattern as expected, i.e. 't >= CurTask.nextTime' is true every CurTask.period ticks of Kernel.Time().
Code:
    CurTask.period = 4
    t := Kernel.Time()
    CurTask.nextTime := t + CurTask.period when the task runs

    t                        CurTask.nextTime         t >= CurTask.nextTime
    7FFFFFF1   2147483633    7FFFFFF0   2147483632    true
    7FFFFFF2   2147483634    7FFFFFF5   2147483637    false
    7FFFFFF3   2147483635    7FFFFFF5   2147483637    false
    7FFFFFF4   2147483636    7FFFFFF5   2147483637    false
    7FFFFFF5   2147483637    7FFFFFF5   2147483637    true
    7FFFFFF6   2147483638    7FFFFFF9   2147483641    false
    7FFFFFF7   2147483639    7FFFFFF9   2147483641    false
    7FFFFFF8   2147483640    7FFFFFF9   2147483641    false
    7FFFFFF9   2147483641    7FFFFFF9   2147483641    true
    7FFFFFFA   2147483642    7FFFFFFD   2147483645    false
    7FFFFFFB   2147483643    7FFFFFFD   2147483645    false
    7FFFFFFC   2147483644    7FFFFFFD   2147483645    false
Case 1-2: CurTask.nextTime becomes negative while Kernel.Time() is still positive.
Code:
    CurTask.period = 4
    t := Kernel.Time()
    CurTask.nextTime := t + CurTask.period when the task runs

    t                        CurTask.nextTime         t >= CurTask.nextTime
    7FFFFFFC   2147483644    7FFFFFFB   2147483643    true
    7FFFFFFD   2147483645    80000000  -2147483648    true
    7FFFFFFE   2147483646    80000001  -2147483647    true
    7FFFFFFF   2147483647    80000002  -2147483646    true
    80000000  -2147483648    80000003  -2147483645    false
    80000001  -2147483647    80000003  -2147483645    false
    80000002  -2147483646    80000003  -2147483645    false
    80000003  -2147483645    80000003  -2147483645    true
    80000004  -2147483644    80000007  -2147483641    false
    80000005  -2147483643    80000007  -2147483641    false
    80000006  -2147483642    80000007  -2147483641    false
    80000007  -2147483641    80000007  -2147483641    true
The task is run on every tick of t until t becomes negative as well. Thereafter we again see the regular pattern. Taking the hourly task example above, if it is executed at, say, 59 minutes before the overflow, it will be activated upon each and every check of 't >= CurTask.nextTime' for this task, for close to one hour.
2) Now the proposed Oberon.Loop:
Case 2-1: again, the baseline, with both Kernel.Time() and CurTask.lastTime staying positive. We see the execution pattern as expected, i.e. 't - CurTask.lastTime >= CurTask.period' is true every CurTask.period ticks of Kernel.Time().
Code:
    CurTask.period = 4
    t := Kernel.Time()
    CurTask.lastTime := t when the task runs

    t                        CurTask.lastTime         t - CurTask.lastTime >= CurTask.period
    7FFFFFF1   2147483633    7FFFFFF0   2147483632    false
    7FFFFFF2   2147483634    7FFFFFF0   2147483632    false
    7FFFFFF3   2147483635    7FFFFFF0   2147483632    false
    7FFFFFF4   2147483636    7FFFFFF0   2147483632    true
    7FFFFFF5   2147483637    7FFFFFF4   2147483636    false
    7FFFFFF6   2147483638    7FFFFFF4   2147483636    false
    7FFFFFF7   2147483639    7FFFFFF4   2147483636    false
    7FFFFFF8   2147483640    7FFFFFF4   2147483636    true
    7FFFFFF9   2147483641    7FFFFFF8   2147483640    false
    7FFFFFFA   2147483642    7FFFFFF8   2147483640    false
    7FFFFFFB   2147483643    7FFFFFF8   2147483640    false
    7FFFFFFC   2147483644    7FFFFFF8   2147483640    true
Case 2-2: CurTask.lastTime stays positive while Kernel.Time() becomes negative.
Code:
    CurTask.period = 4
    t := Kernel.Time()
    CurTask.lastTime := t when the task runs

    t                        CurTask.lastTime         t - CurTask.lastTime >= CurTask.period
    7FFFFFFB   2147483643    7FFFFFFA   2147483642    false
    7FFFFFFC   2147483644    7FFFFFFA   2147483642    false
    7FFFFFFD   2147483645    7FFFFFFA   2147483642    false
    7FFFFFFE   2147483646    7FFFFFFA   2147483642    true
    7FFFFFFF   2147483647    7FFFFFFE   2147483646    false
    80000000  -2147483648    7FFFFFFE   2147483646    false
    80000001  -2147483647    7FFFFFFE   2147483646    false
    80000002  -2147483646    7FFFFFFE   2147483646    true
    80000003  -2147483645    80000002  -2147483646    false
    80000004  -2147483644    80000002  -2147483646    false
    80000005  -2147483643    80000002  -2147483646    false
    80000006  -2147483642    80000002  -2147483646    true
The execution pattern is as expected also across the overflow point of Kernel.Time().
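The test cases above can be replayed on any host machine with a small simulation. The following is only a sketch in Python, not Oberon, and is not taken from Project Oberon or Embedded Oberon: the helper names to_int32, run_current and run_proposed are made up for illustration, and 32-bit wrap-around is emulated by hand because Python integers do not overflow. It makes the same simplifying assumption as the test cases, namely one check per tick.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit two's complement value."""
    return (x + 0x80000000) % 0x100000000 - 0x80000000


def run_current(t0, next_time, period, ticks):
    """Current check 't >= nextTime'. Returns the values of t at which the task runs."""
    runs = []
    t = t0
    for _ in range(ticks):
        if t >= next_time:
            next_time = to_int32(t + period)  # may wrap to a negative value
            runs.append(t)
        t = to_int32(t + 1)                   # the counter wraps like Kernel.Time()
    return runs


def run_proposed(t0, last_time, period, ticks):
    """Proposed check 't - lastTime >= period', using a wrapping subtraction."""
    runs = []
    t = t0
    for _ in range(ticks):
        if to_int32(t - last_time) >= period:
            last_time = t
            runs.append(t)
        t = to_int32(t + 1)
    return runs


# Case 1-2: stray runs on every tick until t itself wraps
case_1_2 = run_current(0x7FFFFFFC, 0x7FFFFFFB, 4, 12)
# Case 2-2: one run every 4 ticks, also across the overflow
case_2_2 = run_proposed(0x7FFFFFFB, 0x7FFFFFFA, 4, 12)
print(case_1_2)
print(case_2_2)
```

With the start values of Case 1-2, the current check fires on four consecutive ticks before the overflow, whereas the proposed check with the start values of Case 2-2 fires exactly every fourth tick.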
To be complete, there is a second potential issue with the current Oberon.Loop. If CurTask.nextTime is still just positive, but Kernel.Time() has run into negative space before the Loop gets to check the task, e.g. due to system load, 't >= CurTask.nextTime' will not be true again until Kernel.Time() has "run around" into positive space up to CurTask.nextTime, i.e. for almost a full cycle of the counter. This potential problem is also resolved with the proposed solution. I spare you the demo data.
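For those who would like to see numbers for this second case anyway, here is a minimal sketch in Python (again not Oberon, with made-up helper names and sample values). It assumes a task that was due three ticks before the overflow but is only checked four ticks late, and it assumes one tick per millisecond when converting the waiting time to days.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit two's complement value."""
    return (x + 0x80000000) % 0x100000000 - 0x80000000


def current_due(t, next_time):
    """Current check: compare against an absolute activation time."""
    return t >= next_time


def proposed_due(t, last_time, period):
    """Proposed check: compare the wrapping elapsed time against the period."""
    return to_int32(t - last_time) >= period


def ticks_until_current_due(t, next_time):
    """Ticks until 't >= next_time' holds while t counts up from its current value."""
    return 0 if t >= next_time else next_time - t


period = 4
next_time = 0x7FFFFFFE            # hypothetical: due shortly before the overflow
last_time = next_time - period    # the same schedule expressed as lastTime
t = to_int32(0x80000002)          # the check happens 4 ticks late, after the overflow

late_current = current_due(t, next_time)             # activation is missed
late_proposed = proposed_due(t, last_time, period)   # task still runs, just late
wait_ticks = ticks_until_current_due(t, next_time)
wait_days = wait_ticks / (1000 * 60 * 60 * 24)       # assumption: 1 tick = 1 ms

print(late_current, late_proposed, wait_ticks, round(wait_days, 2))
```

Under these assumptions the current check would skip the task for close to 50 days, while the proposed check runs it at the first opportunity after the delay.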