Elvas Tower: x2620 - Error: ORTS.Processes.ThreadHangException - Elvas Tower


x2620 - Error: ORTS.Processes.ThreadHangException Thread 'Loader Process

#21 disc

  • Foreman Of Engines
  • Group: Private - Open Rails Developer
  • Posts: 818
  • Joined: 07-October 12
  • Gender:Male
  • Simulator:OpenRails

Posted 04 November 2014 - 11:15 AM

James Ross, on 04 November 2014 - 11:01 AM, said:

A hang reported once the game has started is probably very bad; the render, updater and sound processes certainly should get nowhere near the limit and the loader isn't doing nearly as much as during initial start-up. I'd like to see some logs from runtime hang reports, preferably in bug reports.

When that happened, nothing unusual was written to the log. Then I changed the path of the problematic service (just removed some passing paths), and the hang turned into the usual dictionary exception. When the hang happened, the game started the pre-run, "Run AI : 182 01:00 02:00 03:00 04:00 05:00 06:00", and after this nothing more; everything stopped there, so I killed RunActivity after 10 min. The same happened if I chose another service that starts before the time the hang happened: when the in-game clock hit ~6:33, everything stopped, just as in the pre-run.

#22 Csantucci

  • Member, Board of Directors
  • Group: Status: Elite Member
  • Posts: 7,000
  • Joined: 31-December 11
  • Gender:Male

Posted 04 November 2014 - 11:30 PM

James Ross, on 04 November 2014 - 11:01 AM, said:

...
As for the problems during loading, I will be increasing the limit either to the loader generally or at least during start-up to 30s+ later this week. Feel free to try changing that locally and seeing how well it works [1]. Long-term, I'd like to see us work towards having all the pre-run stuff happen as normal (but high-time-step) updates.

[1] In Source\RunActivity\Processes\WatchdogProcess.cs on line 152 change "MaximumLaxnessS" to be whatever limit you wish to test. "MinimumLaxnessS" doesn't have to be adjusted but performance may be a little better if it is closer to "MaximumLaxnessS" (i.e. min/max of 20s/30s is better than 2s/30s).

As not everyone has the capabilities to generate modified releases of OR, I attach here a version of Runactivity.exe and RunactivityLAA.exe derived from release 2625 2626 2627 with MaximumLaxnessS set to 40s, so we have feedback from a broader range of users. Simply unzip and replace the two original .exe files.
Attached File  RunActivity.zip (1.14MB)
Number of downloads: 178

#23 JohnS

  • Fireman
  • Group: Status: Active Member
  • Posts: 231
  • Joined: 03-July 12
  • Gender:Male
  • Location:Portage, IN
  • Simulator:OR, MSTS, Railworks

Posted 05 November 2014 - 03:54 AM

It solved the loading problem for me on the MLT LS&I route. Thank you very much.

#24 elvasleis

  • Fireman
  • Group: Status: First Class
  • Posts: 236
  • Joined: 30-June 08
  • Gender:Male
  • Location:Dubbo

Posted 05 November 2014 - 04:46 AM

Version X2618 seems to run OK on my PC.
None of the versions above X2618 run.
Even with the patch added to X2625, it still does not run reliably for me.

#25 roeter

  • Vice President
  • Group: Status: Elite Member
  • Posts: 2,424
  • Joined: 25-October 11
  • Gender:Male

Posted 05 November 2014 - 03:23 PM

Some further points where the watchdog causes serious problems during start-up.

  • Initial loading of tiles and world-files around player train.
    This process is not part of any start-up logic, but of the standard viewer routines.
    The process checks the required tiles and world-files, then checks this list against tiles and world-files already loaded, and loads those still required. All this is done in a simple, single loop statement.
    On start-up, that obviously means all files must be loaded within that loop.
    If max. view distance is set (10 km), that means 5 tiles / world-files all round (in a square), so that can be a maximum of 11x11 = 121 files. The actual number depends on the route configuration, but worst case 121 tiles and 121 world-files must be loaded.
    Many systems will not be able to do this within 10 secs. Clearly, how long this will take depends on both route configuration and system performance.
    As this process is not part of the start-up logic, any additional 'pings' would always be active.

  • In debug mode, the processing of the signal-script is much slower than in normal mode. Again depending on configuration, this can also take quite a bit longer than 10 secs.
    Although this is part of the start-up logic, the file is processed as a single item and any additional 'pings' would have to be within the processing of the file, e.g. after each signal-related script.
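
The worst-case file count in the first point can be checked with a few lines of arithmetic. This is only a sketch: the 2048 m tile edge length is an assumption (it is not stated in this thread), and `max_files_in_view` and `per_file_budget_s` are hypothetical helpers, not Open Rails code.

```python
import math

TILE_SIZE_M = 2048  # assumption: tile edge length in metres (not stated in the thread)


def max_files_in_view(view_distance_m, tile_size_m=TILE_SIZE_M):
    """Worst-case number of tiles (or world-files) in the square around the player."""
    # Tiles needed on each side of the player's own tile.
    tiles_each_way = math.ceil(view_distance_m / tile_size_m)
    # The square spans the player's tile plus that many tiles in both directions.
    side = 2 * tiles_each_way + 1
    return side * side


def per_file_budget_s(timeout_s, view_distance_m):
    """Average time available per file if all tiles and world-files load without a ping."""
    total_files = 2 * max_files_in_view(view_distance_m)  # tiles + world-files
    return timeout_s / total_files


print(max_files_in_view(10_000))  # 121 with the assumed tile size
print(per_file_budget_s(10, 10_000))  # seconds per file under a 10 s limit
```

With these assumptions, a 10 s limit leaves only a few tens of milliseconds per file in the worst case, which is why the result depends so heavily on route configuration and system performance.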


The first problem was sorted by increasing the time-out value to 30 secs., but for the second problem I had to disable the watchdog action in order to be able to start in debug-mode. That also applies to resume in debug, as this process is run in both full start and resume.

I have worked out where to add additional 'pings' to overcome most start-up problems. However, I will not commit these changes until there are proper solutions to the problems mentioned above, so as to avoid 'false hope' for those users who cannot load their routes at the moment. In particular the first problem will still cause many routes to crash on loading, more so as long as the time-out value is 10 secs.

Regards,
Rob Roeterdink

#26 disc

  • Foreman Of Engines
  • Group: Private - Open Rails Developer
  • Posts: 818
  • Joined: 07-October 12
  • Gender:Male
  • Simulator:OpenRails

Posted 06 November 2014 - 02:23 AM

What about making this Windows-style? It waits for x seconds, and then a window appears saying "application not responding", where you can select "wait more" or close.

Or just have the waiting time read from openrails.ini/registry, with a default value that is enough for the bigger routes at the default 2 km viewing distance; those who use a longer distance or debug mode can set the value higher... or scale the value by the viewing distance and debug mode.
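
As a rough illustration of the second suggestion, the sketch below reads a base timeout from INI-style text and scales it by viewing distance and debug mode. Everything named here is hypothetical: the `[Watchdog]` section, the `BaseTimeoutS` and `DebugFactor` keys, and the linear scaling rule are assumptions made for the example, not existing Open Rails settings.

```python
import configparser

# Hypothetical settings; neither the section nor the key names exist in Open Rails.
EXAMPLE_INI = """
[Watchdog]
BaseTimeoutS = 10
DebugFactor = 3
"""

DEFAULT_VIEW_DISTANCE_M = 2000  # the default 2 km viewing distance mentioned above


def hang_timeout_s(ini_text, view_distance_m, debug_mode):
    """Return the hang timeout in seconds, scaled by viewing distance and debug mode."""
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    # Fall back to built-in defaults when the section or key is missing.
    base = config.getfloat("Watchdog", "BaseTimeoutS", fallback=10.0)
    debug_factor = config.getfloat("Watchdog", "DebugFactor", fallback=1.0)

    # Never scale below the base value for short viewing distances.
    distance_scale = max(1.0, view_distance_m / DEFAULT_VIEW_DISTANCE_M)
    timeout = base * distance_scale
    if debug_mode:
        timeout *= debug_factor
    return timeout


print(hang_timeout_s(EXAMPLE_INI, 10_000, debug_mode=False))
```

Linear scaling is only one possible choice; since the number of tiles grows with the square of the distance, a quadratic rule might match the loading cost more closely.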

#27 James Ross

  • Open Rails Developer
  • Group: Status: Elite Member
  • Posts: 5,491
  • Joined: 30-June 10
  • Gender:Not Telling
  • Simulator:Open Rails

Posted 06 November 2014 - 06:37 AM

roeter, on 05 November 2014 - 03:23 PM, said:

  • The process checks the required tiles and world-files, then checks this list against tiles and world-files already loaded, and loads those still required. All this is done in a simple, single loop statement.



These should have terminated checks and I will be adding them with other updates this evening.

roeter, on 05 November 2014 - 03:23 PM, said:

  • In debug mode, the processing of the signal-script is much slower than in normal mode. Again depending on configuration, this can also take quite a bit longer than 10 secs.



We could either relax the debugger-attached case so that it only does the logging part instead of breaking into the debugger, or the signalling code could set (inside the relevant #ifdef) a "special dispensation", a feature I'll be adding this evening. Which is better depends on how you see it being used.

roeter, on 05 November 2014 - 03:23 PM, said:

I have worked out where to add additional 'pings' to overcome most start-up problems. However, I will not commit these changes until there are proper solutions to the problems mentioned above, so as to avoid 'false hope' for those users who cannot load their routes at the moment. In particular the first problem will still cause many routes to crash on loading, more so as long as the time-out value is 10 secs.


The timeout is going up tonight and I will be adding documentation on how and when code should be checking for termination so as to get the most from the watchdog without unnecessary hang reports. If all your changes agree with my documentation, I'd like you to commit them. If there are any disagreements with the documentation, let me know here.

#28 James Ross

  • Open Rails Developer
  • Group: Status: Elite Member
  • Posts: 5,491
  • Joined: 30-June 10
  • Gender:Not Telling
  • Simulator:Open Rails

Posted 06 November 2014 - 02:31 PM

roeter, on 05 November 2014 - 03:23 PM, said:

  • Initial loading of tiles and world-files around player train.
    This process is not part of any start-up logic, but of the standard viewer routines.
    The process checks the required tiles and world-files, then checks this list against tiles and world-files already loaded, and loads those still required. All this is done in a simple, single loop statement.



So I thought these cases were already covered but couldn't immediately see how earlier, because I was looking at the hang-specific commits. They are both in fact covered properly and have been since the first hang-related commit, because in an earlier commit I made the following places all check LoaderProcess.Terminated, which implicitly calls Ping():

  • Road car loading per road car
  • Scenery loading per world file and per scenery item
  • Terrain loading per tile file
  • Train car loading per train car
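
The pattern in the list above, where checking for termination once per loaded item doubles as a ping, can be sketched roughly as follows. This is a Python illustration of the idea only, not the Open Rails C# code: the class and method bodies are assumptions, and `load_tiles` and `load_one` are hypothetical names.

```python
import time


class WatchdogToken:
    """Records the last time the owning thread reported that it is alive."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.last_ping = clock()
        self.ping_count = 0

    def ping(self):
        self.last_ping = self._clock()
        self.ping_count += 1


class LoaderProcess:
    """Stand-in for a loader whose termination check doubles as a ping."""

    def __init__(self, token):
        self._token = token
        self._stop_requested = False

    def stop(self):
        self._stop_requested = True

    @property
    def terminated(self):
        # Checking for termination also tells the watchdog we are still alive.
        self._token.ping()
        return self._stop_requested


def load_tiles(loader, tile_names, load_one):
    """Load tiles one at a time, checking for termination (and pinging) per tile."""
    loaded = []
    for name in tile_names:
        if loader.terminated:
            break
        loaded.append(load_one(name))
    return loaded


token = WatchdogToken()
loader = LoaderProcess(token)
load_tiles(loader, ["tile_a", "tile_b", "tile_c"], str.upper)
print(token.ping_count)  # one ping per tile checked
```

Because the ping rides along with a check the loop needs anyway, long loops stay responsive to shutdown and keep the watchdog informed without separate ping calls.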


Feel free to check out the code for these items for how I would recommend doing checks elsewhere as well.

I have now added the following in X2629:

  • WatchdogToken.SpecialDispensationFactor - this sets a multiplication factor for the time a thread must not respond for it to be considered as hung. I have set this to 6 for the loader process (giving a default time of 60s) and left it on the default 1 for render, updater and sound processes (so they have the same 10s limit which should be plenty).
  • A bunch of documentation comments for WatchdogToken.Ping and WatchdogToken.SpecialDispensationFactor - specifically when you should and should not call Ping() and how to use SpecialDispensationFactor to temporarily run a long process without hampering hang detection elsewhere.
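
A minimal sketch of how such a multiplication factor could enter the hang check is shown below. The 10 s base limit and the factor of 6 come from the post above; the class layout, the `is_hung` method and the `temporary_dispensation` helper are assumptions made for illustration and are not the actual WatchdogToken implementation.

```python
from contextlib import contextmanager

BASE_LIMIT_S = 10.0  # the 10 s limit mentioned for render, updater and sound


class WatchdogToken:
    """Illustrative token: a thread is 'hung' after limit * factor without a ping."""

    def __init__(self, special_dispensation_factor=1):
        self.special_dispensation_factor = special_dispensation_factor
        self.last_ping_s = 0.0

    def ping(self, now_s):
        self.last_ping_s = now_s

    def limit_s(self):
        return BASE_LIMIT_S * self.special_dispensation_factor

    def is_hung(self, now_s):
        return now_s - self.last_ping_s > self.limit_s()


@contextmanager
def temporary_dispensation(token, factor):
    """Hypothetical helper: raise the factor for one long operation, then restore it."""
    previous = token.special_dispensation_factor
    token.special_dispensation_factor = factor
    try:
        yield token
    finally:
        token.special_dispensation_factor = previous


loader = WatchdogToken(special_dispensation_factor=6)
render = WatchdogToken()
print(loader.limit_s(), render.limit_s())  # 60.0 10.0
```

Restoring the previous factor after the long operation keeps the stricter limit in force everywhere else, which matches the stated aim of not hampering hang detection elsewhere.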


Please let me know if either 1) the new loader default is insufficient (and where/why it isn't) or 2) you have any comments on the documentation/explanations for SpecialDispensationFactor and Ping().

#29 roeter

  • Vice President
  • Group: Status: Elite Member
  • Posts: 2,424
  • Joined: 25-October 11
  • Gender:Male

Posted 07 November 2014 - 01:32 AM

In version 2630, additional watchdog pings have now been added to the loops mentioned in post #16 above.

The comment for this is in version 2631 :bigboss: - sorry about that, I was concentrating so hard on making sure I had all required files that I completely forgot to set the comment.

I have not yet looked at the dispensation for the sigscr file in Debug mode.

Odd about loading those tiles - I did have consistent crashes at that point before I set the time-out to 30 secs., but could not reproduce this anymore.

Regards,
Rob Roeterdink

#30 James Ross

  • Open Rails Developer
  • Group: Status: Elite Member
  • Posts: 5,491
  • Joined: 30-June 10
  • Gender:Not Telling
  • Simulator:Open Rails

Posted 07 November 2014 - 04:06 AM

roeter, on 07 November 2014 - 01:32 AM, said:

Odd about loading those tiles - I did have consistent crashes at that point before I set the time-out to 30 secs., but could not reproduce this anymore.


I think I see what could go wrong: we ping for every tile loaded but after loading all the tiles we create the actual terrain from the tiles - this is obviously going to be all tiles at once and I guess it may be possible for that to exceed 10 seconds. Hopefully with the new 60s limit for the loader process this won't be a problem.
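
The point about the terrain step can be made concrete with a small sketch: what matters to the watchdog is the longest stretch between two pings, not the total loading time. The model below assumes one ping just before each tile load and none during the final terrain build; the function names and the timings in the example are made up for illustration.

```python
def longest_gap_s(tile_load_times_s, terrain_build_time_s):
    """Longest stretch without a ping, pinging once before each tile load."""
    # Every tile except the last is followed by the next tile's ping.
    gaps = list(tile_load_times_s[:-1])
    # After the last ping come the last tile load and the un-pinged terrain build.
    last_tile = tile_load_times_s[-1] if tile_load_times_s else 0.0
    gaps.append(last_tile + terrain_build_time_s)
    return max(gaps)


def would_be_reported_hung(tile_load_times_s, terrain_build_time_s, limit_s):
    """True if any gap between pings exceeds the watchdog limit."""
    return longest_gap_s(tile_load_times_s, terrain_build_time_s) > limit_s


# Made-up timings: 121 quick tile loads followed by one slow terrain build.
tile_times = [0.05] * 121
print(would_be_reported_hung(tile_times, 12.0, limit_s=10.0))
print(would_be_reported_hung(tile_times, 12.0, limit_s=60.0))
```

Under these made-up numbers the per-tile pings never come close to the limit, yet the single terrain step alone exceeds 10 s while staying well inside 60 s.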
