Elvas Tower: Run through polys (was part of ESD_Complex) - Elvas Tower

Run through polys (was part of ESD_Complex)

#1 User is offline   Genma Saotome 

  • Owner Emeritus and Admin
  • Group: ET Admin Group
  • Posts: 15,651
  • Joined: 11-January 04
  • Gender:Male
  • Location:United States
  • Simulator:Open Rails
  • Country:

Posted 03 October 2013 - 09:22 AM

Another thought on this: I think there are many situations where all of the decisions are best left to the modeler... consider the simple example of two identically sized polys, one facing to the ground and the other to the sky -- a car roof vs floor for example. The modeler will know the ground facing face should have a much shorter LOD than the sky facing one because it's far less likely to be seen at any distance other than rather close up.

Another example is what I call a run thru: You have symmetrical features on opposite sides of something... F3 portholes for instance. It's cheaper (poly-wise) to run thru the locomotive body those faces of the porthole that are at right angles to the body... they simply pop out the other side to be used by the porthole over there. That makes the face rather large but only a small percentage of it is visible outside of the carbody -- it's the GPU that does the culling, not the modeler.

Could the software figure out these conditions? Do they occur frequently enough to do the work to figure them out? I have my doubts.

What would be nice is a spreadsheet page that would have columns of screen resolutions and rows of inches (or cm) with the data in the array showing at what distance from the camera that distance for that screen resolution falls below 1 pixel. IOW, model for 1980x1020, have a 4 inch wide face, run your fingers across the array and see at n meters it disappears.

I seem to recall James posted a formula to calculate that... I should go search for it... see what I can do with it.
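The table described above can be sketched with a simple perspective-projection estimate. This is only an illustration, not the formula James posted: it assumes a face turned square to the camera near the centre of the screen, and the 60 degree horizontal field of view, the function names, and the sample resolutions are all assumptions chosen for the example.

```python
import math

INCH_M = 0.0254  # metres per inch


def projected_pixels(feature_width_m, distance_m, screen_width_px, hfov_deg=60.0):
    """Approximate on-screen width, in pixels, of a camera-facing feature."""
    # Width of the scene visible at this distance (assumed horizontal FOV).
    visible_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return feature_width_m * screen_width_px / visible_width_m


def one_pixel_distance(feature_width_m, screen_width_px, hfov_deg=60.0):
    """Distance in metres at which the feature shrinks to exactly one pixel."""
    return feature_width_m * screen_width_px / (
        2.0 * math.tan(math.radians(hfov_deg) / 2.0)
    )


def build_table(widths_in, screen_widths_px, hfov_deg=60.0):
    """Rows are feature widths in inches, columns are screen widths in pixels."""
    return {
        w: [round(one_pixel_distance(w * INCH_M, px, hfov_deg), 1)
            for px in screen_widths_px]
        for w in widths_in
    }


if __name__ == "__main__":
    columns = [1280, 1920, 2560]  # assumed sample screen widths
    print("inches", columns)
    for width_in, row in build_table([1, 2, 4, 8, 12], columns).items():
        print(width_in, row)
```

Under these assumptions a 4 inch face on a 1920 pixel wide screen drops below one pixel at roughly 169 m. A narrower field of view pushes that distance out, so the numbers are only as good as the FOV you plug in.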

#2 User is offline   wacampbell 

  • Member since Nov. 2003
  • Group: Fan: Traction Nuts
  • Posts: 2,415
  • Joined: 22-November 03
  • Gender:Male
  • Location:British Columbia, Canada
  • Country:

Posted 03 October 2013 - 10:14 AM

Genma Saotome, on 03 October 2013 - 09:22 AM, said:

Another example is what I call a run thru: You have symmetrical features on opposite sides of something... F3 portholes for instance. It's cheaper (poly-wise) to run thru the locomotive body those faces of the porthole that are at right angles to the body... they simply pop out the other side to be used by the porthole over there. That makes the face rather large but only a small percentage of it is visible outside of the carbody -- it's the GPU that does the culling, not the modeler.


I see this a lot but its economy is questionable. First you have to guarantee that the "run thru" is rendered after the "body". Failing to do so results in the pixel shader running for all the hidden pixels on the "run thru", looking up the texture, computing color, lighting and shadows. Those pixels will only later be overwritten when the body is rendered. When there is a high ratio of hidden to visible pixels this can have a huge impact on performance relative to just adding the necessary vertices to eliminate the hidden pixels. If you do find a way to get the "run thru" to render last, the gains are still questionable as much of the pixel processor still has to be run just to determine if the pixel is hidden, although it could save the texture lookup and lighting calculations. Just an FYI ...

#3 User is offline   alkomv 

  • Hostler
  • Group: Status: Inactive
  • Posts: 52
  • Joined: 29-June 13
  • Gender:Male
  • Simulator:Open Rails
  • Country:

Posted 03 October 2013 - 11:55 PM

Genma Saotome, on 03 October 2013 - 09:22 AM, said:

Another thought on this: I think there are many situations where all of the decisions are best left to the modeler... consider the simple example of two identically sized polys, one facing to the ground and the other to the sky -- a car roof vs floor for example. The modeler will know the ground facing face should have a much shorter LOD than the sky facing one because it's far less likely to be seen at any distance other than rather close up.

Another example is what I call a run thru: You have symmetrical features on opposite sides of something... F3 portholes for instance. It's cheaper (poly-wise) to run thru the locomotive body those faces of the porthole that are at right angles to the body... they simply pop out the other side to be used by the porthole over there. That makes the face rather large but only a small percentage of it is visible outside of the carbody -- it's the GPU that does the culling, not the modeler.

Could the software figure out these conditions? Do they occur frequently enough to do the work to figure them out? I have my doubts.

What would be nice is a spreadsheet page that would have columns of screen resolutions and rows of inches (or cm) with the data in the array showing at what distance from the camera that distance for that screen resolution falls below 1 pixel. IOW, model for 1980x1020, have a 4 inch wide face, run your fingers across the array and see at n meters it disappears.

I seem to recall James posted a formula to calculate that... I should go search for it... see what I can do with it.


Just the bold bit.

I've also been told that for game models this is not much use. The porthole windows on the SOU Tobacco car I'm playing with are just a separate mesh, so I can drop them as part of the LOD.

#4 User is offline   Genma Saotome 

Posted 04 October 2013 - 11:59 AM

wacampbell, on 03 October 2013 - 10:14 AM, said:

I see this a lot but its economy is questionable. First you have to guarantee that the "run thru" is rendered after the "body". Failing to do so results in the pixel shader running for all the hidden pixels on the "run thru", looking up the texture, computing color, lighting and shadows. Those pixels will only later be overwritten when the body is rendered. When there is a high ratio of hidden to visible pixels this can have a huge impact on performance relative to just adding the necessary vertices to eliminate the hidden pixels. If you do find a way to get the "run thru" to render last, the gains are still questionable as much of the pixel processor still has to be run just to determine if the pixel is hidden, although it could save the texture lookup and lighting calculations. Just an FYI ...


Wayne, where does the pixel shader do its work, CPU or GPU? Is there a way to measure the load? I ask as I do a lot of run thru's in models for urban architecture...factories, lofts, offices, skyscrapers... most have 2 symmetrical sides, many have 4. Doing run-thru's cuts the poly cost of 3d geometry on both sides... removing them and going with a manifold model would be a 50% increase of polys for every item so changed. Added up that can represent a lot of polys.

#5 User is offline   Genma Saotome 

Posted 04 October 2013 - 03:36 PM

ADMIN EDIT

Thread split off from ESD_Complex at this point in time.


#6 User is offline   wacampbell 

Posted 04 October 2013 - 03:54 PM

Genma Saotome, on 04 October 2013 - 11:59 AM, said:

Wayne, where does the pixel shader do its work, CPU or GPU? Is there a way to measure the load? I ask as I do a lot of run thru's in models for urban architecture...factories, lofts, offices, skyscrapers... most have 2 symmetrical sides, many have 4. Doing run-thru's cuts the poly cost of 3d geometry on both sides... removing them and going with a manifold model would be a 50% increase of polys for every item so changed. Added up that can represent a lot of polys.


The pixel shader is run entirely in the GPU. The other key shader is the vertex shader - also in the GPU - which runs for each vertex on your object. So your method cuts the vertex shader load in half, but increases the pixel shader load by many hundreds or possibly thousands of times ( ie the ratio of hidden to visible pixels ). How that plays out in terms of FPS impact is difficult to predict. Most GPUs have many more pixel shader processors than vertex shader processors so to some extent, vertices are more expensive than pixels. So saving vertices is generally a good idea. But when the tradeoff is potentially 100's of times more pixels, then the gains are lost. Careful testing is the only sure proof, and of course different video cards would respond differently. If as an artist you are looking for some guidance, I would say perhaps 10:1 - ie if the hidden pixel count ( or face area ) is 10 x the visible pixel count, it would negate a 50% gain in vertices.
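The 10:1 guide above is easy to turn into a quick check a modeller could run on face areas. This is a minimal sketch of that rule of thumb only: the function names and the default threshold parameter are made up for illustration, and it says nothing about how any particular GPU will actually behave.

```python
def hidden_to_visible_ratio(hidden_area, visible_area):
    """Ratio of hidden to visible face area (any consistent unit)."""
    if visible_area <= 0:
        raise ValueError("visible_area must be positive")
    return hidden_area / visible_area


def run_through_looks_worthwhile(hidden_area, visible_area, break_even_ratio=10.0):
    """True if the run-through stays under the rough 10:1 break-even guide.

    break_even_ratio defaults to the 10:1 figure suggested in the post;
    it is a rule of thumb, not a measured value.
    """
    return hidden_to_visible_ratio(hidden_area, visible_area) < break_even_ratio


if __name__ == "__main__":
    # A face with 5 units hidden per 1 visible: under the guide.
    print(run_through_looks_worthwhile(5.0, 1.0))
    # A face with 200 units hidden per 1 visible: well over the guide.
    print(run_through_looks_worthwhile(200.0, 1.0))
```

Face area is used here as a stand-in for pixel count, as the post allows. The real pixel ratio also depends on viewing angle and distance, so treat the result as a prompt to test rather than an answer.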

#7 User is offline   Genma Saotome 

Posted 04 October 2013 - 04:06 PM

Out of curiosity I decided to see what differences might be present in Open Rails with a model containing many poly run-thru's vs. the same model, edited, with all of them removed.

The test environment:
Inside the route editor I drove the camera out to a completely blank tile and placed 1 instance of the model in question -- it's a large warehouse, 42000 square foot footprint, 6 floors having many 3d features in place such as wall pilasters (they look like support posts along the exterior wall), window and door insets, window frames, window sills. As built this is a high poly model, even with the use of run thrus. Shapeviewer reports it as:
Attached Image: Render05-1.jpg



I rotated 90d to the right and moved to the next tile where I placed 60 instances of the same model. I then saved RE, launched Open Rails and from the start location of the Activity rolled the camera out to the solo building and from there out to the end of the world (there be Dragons beyond) and took a screenshot of the OR data at that point:
Attached Image: Render0.jpg
The lack of work to do is evidenced by the high fps and the ultra low primitive count.



I pulled the camera back from the edge a bit to capture a little bit of terrain, just to see what changes:
Attached Image: Render01.jpg



A bit more for more terrain:
Attached Image: Render02.jpg



I then pulled the camera back to where the building was placed and screen shot that:

Attached Image: Render03.jpg



Rolling right I moved the camera over to where the 60 were placed and did a save. Rerunning OR using the resume command I took this screenshot:
Attached Image: Render05.jpg
My purpose in doing a save & restore was to ensure a like environment for comparison with the revised model, since testing that model would require a resume.

I then edited the model and removed all of the run thrus. Of note, all of the horizontal faces used faces sized at 42000 square feet. For each of those I retained 15 inches around the outer edge and removed everything else -- a 41000 square foot reduction, all of which had previously been hidden inside the walls. The revised model, in Shapeviewer, now shows this (an increase of about 1200 polys):
Attached Image: Render05-2.jpg



After changing the model name in the .w files I fired up OR again and resumed, as before, taking this screenshot of the modified building:
Attached Image: Render06.jpg



Examining the data I see an increase in fps of 2 for the no run thrus model, no difference in the number of primitives (expected), and an increase of 1 in objects being managed (unexpected and not explained).

So what does it mean? I'm not sure. Might the slight increase in fps be accounted for by the extra polys? Perhaps. What is not evident is any performance difference that I can see that is tied to the work the pixel shader is doing... the surface area of the run-thru model is vastly greater, at least a million square feet in area. I am aware the run thrus can occasionally lead to an unusual illumination of faces appearing out of a building on the far side of the sun... it's subtle but present. I did not look closely to see what differences, if any, the revised model might have.

Anybody else able to draw any conclusions from all of this?

**edited to correct image sequence and verbal clarity.
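The area figures in this test can be sanity-checked with a little arithmetic. The post gives only the 42000 square foot footprint and the 15 inch retained border, so the 210 ft x 200 ft rectangle below is an assumed shape picked to match that footprint; the function name is likewise just for illustration.

```python
def border_strip_area(length_ft, width_ft, border_ft):
    """Area of a strip of the given depth kept around a rectangle's edge."""
    inner_length = max(length_ft - 2.0 * border_ft, 0.0)
    inner_width = max(width_ft - 2.0 * border_ft, 0.0)
    return length_ft * width_ft - inner_length * inner_width


# Assumed footprint: 210 ft x 200 ft = 42000 sq ft (actual dimensions not stated).
LENGTH_FT, WIDTH_FT = 210.0, 200.0
BORDER_FT = 15.0 / 12.0  # the 15 inches retained around the outer edge

kept = border_strip_area(LENGTH_FT, WIDTH_FT, BORDER_FT)
removed = LENGTH_FT * WIDTH_FT - kept
ratio = removed / kept  # removed (previously hidden) area per unit of kept area

if __name__ == "__main__":
    print(f"kept {kept:.2f} sq ft, removed {removed:.2f} sq ft, ratio {ratio:.1f}:1")
```

With those assumed dimensions each horizontal face keeps about 1019 square feet and loses about 40981, which agrees with the 41000 square foot reduction quoted above. That is roughly 40:1 removed to kept area per face, well past the 10:1 guide Wayne suggested in post #6, yet the measured difference was only about 2 fps.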

#8 User is offline   wacampbell 

Posted 04 October 2013 - 04:12 PM

Genma Saotome, on 04 October 2013 - 04:06 PM, said:


Examining the data I see a reduction in fps of 2 (run thrus vs. no run thrus), no difference in the number of primitives (expected), and an increase of 1 in objects being managed (unexpected and not explained).



This is pretty good work and clearly refutes any negative impact from the run thrus. My theoretical knowledge is lacking somewhere ( many places probably ). Sorry for misleading you ... carry on ...

#9 User is offline   Genma Saotome 

Posted 04 October 2013 - 04:16 PM

wacampbell, on 04 October 2013 - 04:12 PM, said:

My theoretical knowledge is lacking somewhere ( many places probably ) Sorry for misleading you - ... carry on ...


Not in the least Wayne. Had you not mentioned it I'd have never done the test. We still are lacking in a good understanding of what is really going on... is the drop in fps due to polys... a change in what the vertex shader has to do... something else we're not even considering? I dunno (I'm just always looking for better fps).

#10 User is offline   alkomv 

Posted 04 October 2013 - 05:35 PM

wacampbell, on 04 October 2013 - 03:54 PM, said:

The pixel shader is run entirely in the GPU. The other key shader is the vertex shader - also in the GPU - which runs for each vertex on your object. So your method cuts the vertex shader load in half, but increases the pixel shader load by many hundreds or possibly thousands of times ( ie the ratio of hidden to visible pixels ). How that plays out in terms of FPS impact is difficult to predict. Most GPUs have many more pixel shader processors than vertex shader processors so to some extent, vertices are more expensive than pixels. So saving vertices is generally a good idea. But when the tradeoff is potentially 100's of times more pixels, then the gains are lost. Careful testing is the only sure proof, and of course different video cards would respond differently. If as an artist you are looking for some guidance, I would say perhaps 10:1 - ie if the hidden pixel count ( or face area ) is 10 x the visible pixel count, it would negate a 50% gain in vertices.


A question: how do smoothing groups, i.e. the number of them, affect both the pixel shader and, in particular, the vertex shader in OR [ MSTS ? ]

Thanks

#11 User is offline   captain_bazza 

  • Chairman, Board of Directors
  • Group: ET Admin Group
  • Posts: 13,931
  • Joined: 21-February 06
  • Gender:Male
  • Location:Way, way, way, South
  • Simulator:MSTS & OR
  • Country:

Posted 04 October 2013 - 07:48 PM

May I suggest you look at the interior of the model and see if the reverse side, or inner facing polys are visible. If so then the modeling prog' hasn't culled these, hence a lot of invisible polys may be rendered anyway.

Cheers Bazza

#12 User is offline   rdamurphy 

  • Open Rails Developer
  • Group: Private - Open Rails Developer
  • Posts: 1,199
  • Joined: 04-May 06
  • Gender:Male
  • Location:Thornton, CO
  • Simulator:MSTS - OR
  • Country:

Posted 05 October 2013 - 01:52 AM

Back in the days of yore, before modern GPUs, a computer game had to draw each and every poly on the screen. It would start with those farthest away, and draw towards you, hopefully rendering the correct ones as it went along, to create your screen.

Then, graphics card manufacturers figured out that they could save a lot of clock cycles by simply NOT drawing any pixels that weren't visible, because they were hidden, or pointed at a direction at right angles to the camera.

Which works well... Until you put two polys through each other, say, at right angles, like, as you mentioned, a "run through" pilaster. Now, the GPU has to draw the polys because they are visible, well, a small part of them is, and it has to figure out where in the z-buffer to put them, and then draw the correct polys over top of them. This is multipass rendering, which also helps to draw the shadows in the scene.

Having the pilasters on the outside of the building - completely - means one pass per surface, the wall, the pilaster. When you run them through the building to "save polys" what you're doing is requiring the GPU to sort the polys, drawing the interior portion of the run-through and then the wall, then the exterior, or two passes per surface.

Are you saving GPU time by having less polys? No, because the ones on the opposite side of the building aren't rendered anyway, since they're entirely not visible, or facing away and invisible to begin with. There are two methods of Occlusion Culling known as Occlusion Query and Early-Z Rejection.

Basically, anything that isn't visible, occluded, is simply ignored and not rendered at all. Early-Z is a bit more complicated, since the GPU starts the rasterization process, but as soon as it decides it's not visible, it rejects the poly, and at that point, saves the GPU work and memory by not loading the texture.

Ah, but those run through polys can't be rejected because they are partially visible, and do have to be rendered, and then z-buffered over. If you had "extra polys" on the back side of the building, they're rejected by the GPU's occlusion culling, and the geometry isn't even loaded.

So, it's not surprising at all that you saw a slight rise in FPS, because by using more polys you're reducing the workload on the GPU.

Please, I'm not an "expert" on graphics cards, so don't take this as gospel, someone like James can probably explain it an awful lot better than I can, but it is a pretty basic explanation of why you're getting the results you are.

Robert

#13 User is online   James Ross 

  • Open Rails Developer
  • Group: Posts: Elite Member
  • Posts: 5,508
  • Joined: 30-June 10
  • Gender:Not Telling
  • Simulator:Open Rails
  • Country:

Posted 05 October 2013 - 01:57 AM

Genma Saotome, on 04 October 2013 - 04:06 PM, said:

Anybody else able to draw any conclusions from all of this?


Unfortunately, as Wayne has noted, it is a very complex pipeline of work going on rendering models, so it's pretty tricky to come to any conclusions from the data we have. The 12.4ms down to 12.1ms frame times do indicate something was running quicker without the run-throughs, so it's possible that for this model on this graphics card the extra vertices are quicker than the extra pixels.

The problem is that the cost of individual vertices or pixels is basically unmeasurable, and can vary between GPU architecture/model.

So, I think it basically comes down to a judgement call from the modeller; my advice would be that, if the run-throughs have an area that is significant relative to the whole model, consider not doing the run-through. I don't think we can do better than that.
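For anyone comparing the frame times quoted above with the fps readout, the conversion is just a reciprocal. A small sketch, with the function name chosen here purely for illustration:

```python
def fps_from_frame_time(frame_time_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


with_run_thrus = fps_from_frame_time(12.4)     # frame time with run-throughs
without_run_thrus = fps_from_frame_time(12.1)  # frame time without them
gain = without_run_thrus - with_run_thrus

if __name__ == "__main__":
    print(f"{with_run_thrus:.1f} fps -> {without_run_thrus:.1f} fps "
          f"(gain {gain:.1f} fps, {0.3 / 12.4:.1%} less time per frame)")
```

So 12.4 ms to 12.1 ms is about 80.6 fps to 82.6 fps, the same 2 fps gain reported in the test, or roughly 2.4% less time per frame.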

#14 User is offline   rdamurphy 

Posted 05 October 2013 - 02:01 AM

James, according to NVidia, larger numbers of small polys are best for Occlusion Culling, large polys tend to slow down the process.

Robert

#15 User is offline   wacampbell 

Posted 05 October 2013 - 05:28 AM

rdamurphy, on 05 October 2013 - 02:01 AM, said:

James, according to NVidia, larger numbers of small polys are best for Occlusion Culling, large polys tend to slow down the process.


Just to clarify, Open Rails does not use Occlusion Culling. It's a CPU intensive technique with marginal benefits on modern hardware. Open Rails uses Distance Culling, Frustum Culling, Backface Culling, and for the shaders that allow it, Early Z Culling. Note that the latter depends on the draw order of the polygons: hidden pixels won't be culled if they are drawn before the occluding polys ( ie it's not hugely effective ).
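The draw-order dependence of an early depth test can be shown with a toy model. This is not Open Rails code and not how a real GPU is organised: it is a one-row "screen" in plain Python, with made-up pixel spans and depths, that simply counts how often the pixel shader would run if each pixel is depth-tested before shading.

```python
def count_shaded_pixels(draw_order, screen_width=100):
    """Count pixel-shader runs on a one-row screen with an early depth test.

    Each item in draw_order is (first_px, end_px, depth); smaller depth is
    closer to the camera. A pixel is shaded only if it passes the depth test
    at the moment it is drawn.
    """
    depth_buffer = [float("inf")] * screen_width
    shaded = 0
    for first_px, end_px, depth in draw_order:
        for x in range(first_px, end_px):
            if depth < depth_buffer[x]:  # early depth test passes
                depth_buffer[x] = depth
                shaded += 1              # pixel shader would run here
    return shaded


# Hypothetical scene: the body hides 90 of the run-through's 100 pixels.
BODY = (0, 90, 1.0)       # nearer surface
RUN_THRU = (0, 100, 2.0)  # farther surface, 10 pixels poke out past the body

body_first = count_shaded_pixels([BODY, RUN_THRU])
run_thru_first = count_shaded_pixels([RUN_THRU, BODY])

if __name__ == "__main__":
    print("body drawn first:", body_first)
    print("run-through drawn first:", run_thru_first)
```

In this toy scene drawing the body first shades 100 pixels, while drawing the run-through first shades 190, because the 90 hidden pixels are shaded and then overwritten. That is the ordering issue raised in post #2. How much it costs on real hardware is exactly what the thread could not pin down.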
