High Frame Rates and Heat!
#21
Posted 10 October 2014 - 12:42 PM
Dennis
#22
Posted 10 October 2014 - 12:52 PM
#23
Posted 10 October 2014 - 04:16 PM
#24
Posted 11 October 2014 - 02:09 AM
Cheers, Markus
#25
Posted 11 October 2014 - 07:07 AM
Genma Saotome, on 10 October 2014 - 09:14 AM, said:
It doesn't work like that at all, luckily for us. If you can't make the 60FPS needed (for a 60Hz monitor), you just miss the scans/updates as needed. I.e. you might get a new frame shown every scan for 3, then miss one, then for 3, then miss one, etc. (that would be for around 45FPS). If you turn vsync off, I believe you get the exact same number of unique frames displayed, but you may get tearing as updates happen mid-scan (so you'll potentially see half of one frame and half of another). The vsync output will not look quite as smooth, but won't have any tearing - it's a compromise, as always.
The NVIDIA adaptive option is only useful if you are noticing tearing at FPS below the monitor refresh rate; if you are not seeing tearing, it'll be the same as vsync on.
G-SYNC is more interesting, if you have a graphics card and (more importantly) monitor that supports it, because it allows the monitor's scans to be aligned with the frames being generated. So, if you're making 45FPS, instead of having 3 updates, 1 skip, etc., you can have the monitor update at 45Hz meaning you always get a whole frame with no tearing and no update skips.
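As a rough illustration of the two cases above (fixed 60Hz scans versus scans aligned with the frames), here is a minimal Python sketch. It assumes an idealized renderer that finishes frame k at exactly k/fps seconds and a monitor that shows the newest finished frame at each scan; `scan_pattern` and `as_text` are hypothetical helper names, and real drivers, compositors and buffering modes will behave less tidily.

```python
def scan_pattern(fps, refresh_hz, scans):
    """Return one bool per monitor scan: True if a new frame is shown."""
    pattern = []
    last_shown = None
    for n in range(scans):
        # Index of the newest frame finished by the time of scan n,
        # assuming frame k is ready at k/fps seconds (steady rendering).
        newest = (n * fps) // refresh_hz
        pattern.append(newest != last_shown)
        last_shown = newest
    return pattern


def as_text(pattern):
    """'N' marks a scan showing a new frame, '-' a repeated frame."""
    return "".join("N" if new else "-" for new in pattern)


if __name__ == "__main__":
    # 45FPS on a fixed 60Hz monitor: three updates, then a skipped scan.
    print(as_text(scan_pattern(45, 60, 12)))
    # Refresh matched to the frame rate (the G-SYNC idea): no skips.
    print(as_text(scan_pattern(45, 45, 12)))
```

Under these assumptions, 45 of every 60 scans carry a new frame in the fixed-refresh case, while every scan does when the refresh rate follows the frame rate.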
Enabling vertical sync in OR would be a sensible change IMHO; people can still change it, and driver-specific control panels will always be able to provide more advanced options (which could never be provided in-game).
#26
Posted 11 October 2014 - 10:28 AM
James Ross, on 11 October 2014 - 07:07 AM, said:
How many buffers does OR use when sending data to the GPU -- two, three... four?
#27
Posted 11 October 2014 - 11:53 AM
Genma Saotome, on 11 October 2014 - 10:28 AM, said:
DirectX (at least on Windows Vista and up) defaults to a maximum frame queue of 3 and Open Rails has an extra frame queued (it prepares one while another is being rendered). But things are never that simple...
Using GPUView (more details and screenshots) we can see that there is never more than a single frame queued up here (running at an easy, smooth 60FPS in Open Rails in a window). I have seen them queued up in other scenarios, though.
http://james-ross.co.uk/temp/orts_122.png
- The vertical blue lines are vsyncs.
- The light green boxes in the "Context CPU Queue" and "Hardware Queue" are the work being created and then processed, respectively.
- The dark green and white boxes are the threads using CPU time.
- The boxes with a checkered pattern are presents.
- The "Flip Queue" is the frames waiting on the GPU to be shown.
You can see the checkered light green box in RunActivityLAA showing when OR calls "Present", which is immediately followed by the render process (thread 4816) calling graphics APIs, giving the NVIDIA user-mode driver (thread 11644) something to think about, and causing it to spit out GPU tasks (light green boxes in the "Context CPU Queue") which are then executed on the GPU (in the "Hardware Queue").
So, if you have a specific scenario I can test (windowed, fast fullscreen, slow fullscreen + route, settings, etc.) we can figure out how many frames are buffered, but I don't think you can ever just say there is a particular number.
#28
Posted 11 October 2014 - 01:10 PM
James Ross, on 11 October 2014 - 11:53 AM, said:
Oh, I'd kinda like to see something from Goose Island or the Cal-P but I already have a pretty good idea of what both would look like so there is really no need on my part to bother... tho if a high load example would help you in any way I could point out a couple situations of very high loading.
FWIW, after your first reply I went looking for vsync info and found that it is true that fps gets cut in half w/ vsync when only two buffers are used which is why three or four are much better. I could locate the url again if you want to read it.
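A back-of-the-envelope Python sketch of that halving effect follows. It assumes a simplified model in which, with two buffers and vsync, the renderer cannot start the next frame until the swap at a vsync, so every frame occupies a whole number of refresh periods; with extra buffers it is assumed to keep rendering. The function names are hypothetical and this is not how any particular driver is implemented.

```python
import math


def double_buffered_fps(render_ms, refresh_hz=60):
    """Frame rate when each frame must wait for the next vsync to swap."""
    period_ms = 1000 / refresh_hz
    # Each frame takes a whole number of refresh periods.
    periods_per_frame = math.ceil(render_ms / period_ms)
    return refresh_hz / periods_per_frame


def multi_buffered_fps(render_ms, refresh_hz=60):
    """Frame rate when rendering never stalls, capped by the refresh rate."""
    return min(1000 / render_ms, refresh_hz)


if __name__ == "__main__":
    for render_ms in (10.0, 22.2, 40.0):
        print(render_ms,
              double_buffered_fps(render_ms),
              round(multi_buffered_fps(render_ms), 1))
```

In this model a frame that takes 22.2 ms (about 45FPS worth of work) drops to 30FPS with two buffers, but stays at roughly 45FPS when more buffers are available.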
Speaking of the chart which you kindly posted, do you know if the packets are roughly equivalent to individual draw calls or all calls for a model... or something else?
#29
Posted 11 October 2014 - 01:44 PM
Genma Saotome, on 11 October 2014 - 01:10 PM, said:
I wouldn't say 'much better'. Now the problem is jitter. VSYNC with triple or quad buffering results in frames being presented to the screen at intervals that switch between 1/60th and 1/30th of a second in order to not overrun the back buffer chain. For example, if the CPU is generating frames at 45 FPS, then the GPU will present one, wait 1/30th of a second, present the next, wait 1/60th of a second, present the next, wait 1/60th of a second, etc. for an average FPS of 45. There is no magic. The CPU and graphics logic have to prepare the frame in advance of when it is presented to the user and have to guess when that will be. They assume the frames are presented at a uniform pace. Use of VSYNC and triple or quad buffering results in frames being presented earlier or later than intended, and jitter is the visual result.
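To make the uneven pacing concrete, here is a small Python sketch that lists the gaps between presented frames, measured in refresh periods. It assumes frame k is ready at exactly k/fps seconds and is shown at the first scan at or after that time; `present_intervals` is a hypothetical name and the model ignores driver and compositor behaviour.

```python
def present_intervals(fps, refresh_hz, frames):
    """Gaps between consecutive presented frames, in refresh periods."""
    # First scan at or after frame k is ready: ceil(k * refresh_hz / fps).
    scans = [(k * refresh_hz + fps - 1) // fps for k in range(frames + 1)]
    return [later - earlier for earlier, later in zip(scans, scans[1:])]


if __name__ == "__main__":
    gaps = present_intervals(45, 60, 6)
    # A gap of 2 is a 1/30 s wait on a 60Hz display, a gap of 1 is 1/60 s.
    print(gaps)
    # Uniform pacing when the frame rate matches the refresh rate.
    print(present_intervals(60, 60, 6))
```

For 45 FPS on a 60Hz display this model gives one two-scan wait for every two one-scan waits, which averages out to 45 frames per 60 scans even though the frames were prepared for evenly spaced times.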
G-Sync is the best solution by far for smooth motion and maximum frame rates.
#30
Posted 11 October 2014 - 02:01 PM
Genma Saotome, on 11 October 2014 - 01:10 PM, said:
I've used the London & Port Stanley Railway below as my typical example where I cannot maintain 60FPS on my normal settings. It was managing 40-45FPS.
Genma Saotome, on 11 October 2014 - 01:10 PM, said:
I'd be interested in the link since more information is always good - this is an area where most people, myself included, have to make plenty of guesses at what is going on.
I do have an image from running Open Rails with vsync on, in a window, with a scene where it cannot manage 60FPS:
http://james-ross.co.uk/temp/orts_123.png
What I see as most interesting here is in the "Flip Queue", where you can clearly see that it is presenting on many but not all vsyncs - but it is nevertheless presenting right on cue at those vsyncs it can. This matches what I said earlier about vsync with <60FPS, though it's entirely possible that different things happen with different OSes, drivers and fullscreen/windowed. Certainly, going full screen (with fast full screen disabled) will be different because the game will be presenting directly to the hardware, instead of going through the desktop composition (called Aero in Windows Vista/7 and always-on in Windows 8+).
Genma Saotome, on 11 October 2014 - 01:10 PM, said:
They're much more than individual draw calls... the scene in the earlier post was drawing ~1000 primitives (~430 shadow, ~570 normal) but there are only 9 light green blocks (plus two for the present). These blocks are whatever the NVIDIA (in my case) driver decides to package up for sending from user-mode to kernel-mode, basically. I don't think there are any requirements on how much or little can go in each packet, except that present has to be separate AFAIK.