Watchdog prevents start of timetable: Watchdog times out on timetable prerun phase
#1
Posted 02 February 2019 - 03:22 AM
My laptop is not the fastest computer in the world, the route I'm running is quite large, and I am running a very extensive, full 24-hour timetable. When I start a train later in the day, OR takes several minutes to start. But, alas, the watchdog function has no patience for this and aborts the program on time-out of the loader process.
Each of the main process threads sends a signal to the watchdog each time the process loop ends. The watchdog checks on these reports and aborts the program if, after a given time, a report is missing, assuming the process is 'hanging' in an endless loop.
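As a rough illustration of the mechanism described above (this is not the actual Open Rails code), the sketch below records the last report of each process and lists those whose report is overdue. The class name SketchWatchdog, its methods and the 10-second timeout are all assumptions made for the example; times are passed in explicitly so the behaviour is easy to follow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a watchdog; not the Open Rails Watchdog class.
class SketchWatchdog
{
    readonly TimeSpan timeout;
    readonly Dictionary<string, long> lastPing = new Dictionary<string, long>();
    readonly object sync = new object();

    public SketchWatchdog(TimeSpan timeout)
    {
        this.timeout = timeout;
    }

    // Called by a process thread each time its process loop ends.
    public void Ping(string processName, long nowTicks)
    {
        lock (sync)
        {
            lastPing[processName] = nowTicks;
        }
    }

    // Returns the processes whose last report is older than the timeout.
    public List<string> FindStalled(long nowTicks)
    {
        var stalled = new List<string>();
        lock (sync)
        {
            foreach (var entry in lastPing)
            {
                if (nowTicks - entry.Value > timeout.Ticks)
                    stalled.Add(entry.Key);
            }
        }
        return stalled;
    }
}

static class WatchdogDemo
{
    static void Main()
    {
        var watchdog = new SketchWatchdog(TimeSpan.FromSeconds(10));
        watchdog.Ping("Loader", 0);
        watchdog.Ping("Updater", 0);

        // The updater reports again after 8 s; the loader stays silent,
        // as it would while busy in a long prerun.
        watchdog.Ping("Updater", TimeSpan.FromSeconds(8).Ticks);

        foreach (var name in watchdog.FindStalled(TimeSpan.FromSeconds(12).Ticks))
            Console.WriteLine(name + " has not reported within the timeout");
    }
}
```

A real watchdog would run the check on its own thread and abort or log instead of printing; the point here is only that a process which does not return to its loop end looks exactly like one that hangs.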
The problem which occurs when starting a timetable is due to the 'prerun' process (method PreRunAI in AI.cs).
When a train is started, the prerun phase runs through the time up to the start of this train.
The prerun phase consists of three loops. The first loop runs from 00:00 (midnight) up to the activation of the first train. This process is limited: it only places trains which are already started (e.g. in pools), and checks if any train must be activated.
As soon as a train is activated, the second and main loop starts. This loop performs a full normal update, from the activation of the first train up to the defined (booked) start time of the player train. This is the normal update process, except there are no graphics and it runs with a larger time interval. But for large routes with lots of signals and many trains, it will take a long time, in my case several minutes. Unlike the normal updates, the prerun does not exit this loop between updates, so the 'top' method of the route never gets the chance to send a signal to the watchdog. So during the prerun no signals are sent, and this causes the watchdog to time out.
The third loop is only run if the player train is not ready at its booked time. The processing of the third loop is similar to that of the second loop, and it runs until the player train is started or the end of the timetable has been reached. This third loop can also cause a time-out.
Looking at my own private version, I avoided this problem by setting the time out for the watchdog to 100000 or so seconds, thus practically disabling the watchdog. Obviously this is no solution for the proper version.
What needs to be done is that signals need to be sent to the watchdog from within the prerun loops, but I do not know enough of this logic to know if this is possible or how it can be done.
So I hope someone can help to sort this out.
If not, it's back to my private version again.
Regards,
Rob Roeterdink
#2
Posted 02 February 2019 - 06:35 AM
I have long been quite convinced that the watchdog causes more problems than it prevents. By coincidence, just yesterday I generated a version of OR MG where I added a general options checkbox that allows disabling the watchdog.
Coming to your problem, I see that method PrerunAI (the one for timetable) already has, within its main loop, a watchdog call that should restart the watchdog count, that is
if (cancellation.IsCancellationRequested) return; // ping watchdog process
Doesn't this call work?
#3
Posted 02 February 2019 - 10:13 AM
roeter, on 02 February 2019 - 03:22 AM, said:
What needs to be done is that signals need to be sent to the watchdog from within the prerun loops, but I do not know enough of this logic to know if this is possible or how it can be done.
You should pass the CancellationToken "cancellation" into any methods that could take too long; they can then call "cancellation.IsCancellationRequested" which serves two purposes:
- Allows the code to exit faster when the user has tried to close the application
- Pings the watchdog to indicate things are still alive
However, there are some strict requirements for calling this documented on the "Ping" method itself:
Quote
The requirements on when this should be or not be called are important: failure to call this (directly or indirectly) during a long but guaranteed-to-terminate loop will cause unnecessary hang reports, and calling this during a potentially infinite loop will result in hangs that are not reported.
So as long as the code is running a loop that is known to finish (even in the face of completely bogus values from the user or content), you can check the cancellation status at the start of each iteration.
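A minimal sketch of that pattern, under stated assumptions: PingingToken below is a hypothetical stand-in written for this example. It is not the Open Rails token type, and it is not System.Threading.CancellationToken, which does not ping anything. The loop uses integer seconds and a strictly positive step, so it is guaranteed to finish whatever values are passed in.

```csharp
using System;

// Hypothetical stand-in: reading IsCancellationRequested both reports
// whether the user asked to quit and counts as a sign of life.
class PingingToken
{
    public int PingCount { get; private set; }
    public bool Cancelled { get; set; }

    public bool IsCancellationRequested
    {
        get
        {
            PingCount++;        // stands in for resetting the watchdog timer
            return Cancelled;
        }
    }
}

static class PrerunSketch
{
    // Advances a clock (in seconds) from startTime to endTime.
    // Bounded loop: step must be positive and the clock is a long,
    // so it cannot overflow before passing an int end time.
    public static long RunUntil(int startTime, int endTime, int step, PingingToken cancellation)
    {
        if (step <= 0) throw new ArgumentOutOfRangeException(nameof(step));

        long clock = startTime;
        while (clock < endTime)
        {
            // Checked at the start of each iteration, as suggested above.
            if (cancellation.IsCancellationRequested) return clock;

            clock += step;      // a real update pass would run here
        }
        return clock;
    }

    static void Main()
    {
        var token = new PingingToken();
        long reached = RunUntil(0, 3600, 5, token);
        Console.WriteLine("Reached " + reached + " s after " + token.PingCount + " pings");
    }
}
```

The same check placed inside a loop that might never end would hide a genuine hang, which is why the bound on the loop matters more than the check itself.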
#4
Posted 02 February 2019 - 10:57 AM
Csantucci, on 02 February 2019 - 06:35 AM, said:
:(
Do you know what cases it is causing problems with? If it is crashing the program unexpectedly, the code is probably in need of checking cancellation too so it exits faster.
#5
Posted 03 February 2019 - 01:54 PM
Csantucci, on 02 February 2019 - 06:35 AM, said:
if (cancellation.IsCancellationRequested) return; // ping watchdog process
Doesn't this call work?
The call is in the second loop. The problem occurred on a test run for a freight train which was formed of multiple entries. The start times for these entries were not yet properly set, and when I tried to start the last 'leg' of this train, the start was delayed. Therefore the third loop was started, and as this has no watchdog reset, it timed out before the train was started.
So, an additional watchdog reset must be inserted into the third loop as well.
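To make the shape of the fix concrete, here is a simplified sketch of a three-loop prerun in which every loop performs the check, including the third. It is not the PreRunAI method: the method name, all parameters and the isCancelled delegate (standing in for the "cancellation.IsCancellationRequested" check) are assumptions for illustration, and the real update work is reduced to advancing a clock.

```csharp
using System;

static class ThreeLoopPrerunSketch
{
    // All times are seconds since midnight; every name here is hypothetical.
    public static long Run(int firstActivation, int bookedStart, int playerStart,
                           int endOfTimetable, int step, Func<bool> isCancelled)
    {
        if (step <= 0) throw new ArgumentOutOfRangeException(nameof(step));
        long clock = 0;

        // Loop 1: limited processing from midnight up to the first activation.
        while (clock < firstActivation)
        {
            if (isCancelled()) return clock;    // also pings the watchdog
            clock += step;
        }

        // Loop 2: full update up to the booked start of the player train.
        while (clock < bookedStart)
        {
            if (isCancelled()) return clock;    // also pings the watchdog
            clock += step;
        }

        // Loop 3: only runs if the player train starts later than booked;
        // ends when it starts or when the timetable ends.
        while (clock < playerStart && clock < endOfTimetable)
        {
            if (isCancelled()) return clock;    // the reset missing in loop 3
            clock += step;
        }

        return clock;
    }

    static void Main()
    {
        int pings = 0;
        long clock = Run(6 * 3600, 14 * 3600, 14 * 3600 + 600, 24 * 3600, 10,
                         () => { pings++; return false; });
        Console.WriteLine("Prerun stopped at " + clock + " s after " + pings + " watchdog pings");
    }
}
```

The third loop stays bounded by the end of the timetable, so adding the check there remains within the requirement that pinged loops must be guaranteed to terminate.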
I have not had any (further) troubles with the watchdog in normal running, but the intervention of the watchdog can be quite frustrating when debugging.
When stepping through code with 'step' (F10 or F11), one can suddenly find oneself stepping through the watchdog instead of the code which was being debugged. That's not only quite confusing, but it's easy to lose track that way, which means starting all over again.
When stopping often in debug, the watchdog can actually time out. And if an actual loop has accidentally been created, it is clearly very frustrating when the watchdog kills the program while you are debugging the code to find the cause of the loop.
So, I would like to make a strong plea to disable the watchdog when the debugger is attached.
This can be done quite easily by replacing the lines at the head of the watchdog process:
    void WatchdogThread()
    {
        Profiler.SetThread();
        Game.SetThreadLanguage();
        while (true)
        {
with these lines:
    void WatchdogThread()
    {
        Profiler.SetThread();
        Game.SetThreadLanguage();
        while (true && !Debugger.IsAttached)
        {
Regards,
Rob Roeterdink
#6
Posted 03 February 2019 - 02:26 PM
roeter, on 03 February 2019 - 01:54 PM, said:
When stepping through code with 'step' (F10 or F11), one can suddenly find oneself stepping through the watchdog instead of the code which was being debugged. That's not only quite confusing, but it's easy to lose track that way, which means starting all over again.
I don't think I've ever seen this happen while debugging, so I'm curious: what exactly is it jumping to, which line of code? And which version of Visual Studio are you using?
roeter, on 03 February 2019 - 01:54 PM, said:
The watchdog won't kill the program when a debugger is attached, but it will pause in the debugger ("Debugger.Break()") after logging details (line 140, just down from the code you quoted).
roeter, on 03 February 2019 - 01:54 PM, said:
I'd rather not, because then people will simply start creating code that triggers the watchdog but only for users and only know to fix it once bug reports start rolling in.
If it is the "Debugger.Break()" that is causing you issues while debugging, we can look for alternatives, but I do not want it to become silent. Developers do need to be alerted to possible hangs.
We might be able to figure out a way to only count time when not paused in the debugger, if that's not the case currently, so it does not trigger just because you stopped on a breakpoint or similar.
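One possible way to approximate "only count time when not paused", sketched under assumptions and not taken from the Open Rails code: have the watchdog measure time in short intervals and cap how much any single interval may contribute. A long stop on a breakpoint then counts as at most one capped step. The class name PauseAwareTimer and the 2-second cap are hypothetical.

```csharp
using System;

// Hypothetical helper: accumulates elapsed time, but treats any single
// interval longer than maxStepMs as a pause and counts only the cap.
class PauseAwareTimer
{
    readonly long maxStepMs;

    public long CountedMs { get; private set; }

    public PauseAwareTimer(long maxStepMs)
    {
        if (maxStepMs <= 0) throw new ArgumentOutOfRangeException(nameof(maxStepMs));
        this.maxStepMs = maxStepMs;
    }

    // Call once per watchdog interval with the wall-clock time since the last call.
    public void Tick(long elapsedMs)
    {
        CountedMs += Math.Min(Math.Max(elapsedMs, 0), maxStepMs);
    }

    // Call when the watched process reports that it is still alive.
    public void Reset()
    {
        CountedMs = 0;
    }
}

static class PauseAwareDemo
{
    static void Main()
    {
        var timer = new PauseAwareTimer(2000);

        timer.Tick(1000);       // normal interval
        timer.Tick(1000);       // normal interval
        timer.Tick(300000);     // five minutes stopped on a breakpoint
        timer.Tick(1000);       // normal interval

        // 1000 + 1000 + 2000 (capped) + 1000
        Console.WriteLine("Counted " + timer.CountedMs + " ms");
    }
}
```

The trade-off is that the cap cannot tell a debugger pause from any other stall of the watchdog thread itself, for example the machine going to sleep, so those gaps are discounted as well.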
#7
Posted 03 February 2019 - 11:50 PM
SystemInfo.WriteSystemDetails(Console.Out);
because - at least on my quite slow computer - that call may take many seconds.
#8
Posted 04 February 2019 - 06:39 AM
James Ross, on 03 February 2019 - 02:26 PM, said:
You mean to say that developers always run OR with the debugger attached?
I certainly don't - it's way too slow. Of course I can't speak for other developers, but this seems a bit far-fetched.
Anyway - even if it is so, it doesn't prevent developers from switching it off. That's the problem with developers - they know how to develop the program, so they also know how to alter things they don't like.
The first thing I do after getting 'official' code is to switch to 'console application' and disable the watchdog for debugging.
Regards,
Rob Roeterdink
#9
Posted 04 February 2019 - 07:17 AM
That does interest me. I assume you switch to console application in order to see log messages in real time. How do you perform the switch?
#10
Posted 04 February 2019 - 11:30 AM
Csantucci, on 04 February 2019 - 07:17 AM, said:
That does interest me. I assume you switch to console application in order to see log messages in real time. How do you perform the switch?
Hello Carlo,
That is indeed exactly why.
It's easy: in Visual Studio, select "Project" (make sure RunActivity is set as the project), select "Run Activity Properties", then "Application", and in the application window, for "Output type", select 'console application' instead of 'windows application'.
This does not change anything w.r.t. building or running the program, except that it shows the console window so you can see any messages the moment they are generated.
Regards,
Rob Roeterdink