OpenSceneGraph Forum
Official forum which mirrors the existent OSG mailing lists. Messages posted here are forwarded to the mailing list and vice versa.
 

Dynamic VBO Performance Drop


 
ravidavi
Posted: Thu Dec 06, 2018 8:36 pm
Post subject: Dynamic VBO Performance Drop

Hello all,

I'm running into a strange performance drop issue when using dynamic VBOs that change frequently. I am measuring performance using framerate with vsync turned off. I know that framerate isn't always the best performance measurement, but my example is simple enough and the performance drop is significant and repeatable, so I feel comfortable using framerate. 


The issue: Suppose I have a Geometry that will hold lots of points (e.g. 100k or more). If I choose to pre-define all points in its vertex array, then a certain framerate is achieved. However, if I choose to add a batch of points during each update traversal, up to the same total number of points, then after all points have been added the framerate is much lower than in the pre-defined model. Note that "much lower" means over 30% lower.
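To put the "over 30% lower" figure in frame-time terms, here is a minimal sketch of the conversion. The function names are made up for illustration, and no absolute framerates are given in this thread, so any concrete fps value plugged in is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Frame time in milliseconds for a given framerate (frames per second).
inline double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Relative increase in frame time caused by a fractional framerate drop.
// drop = 0.30 means the framerate fell by 30%.
inline double frameTimeIncrease(double drop) {
    assert(drop >= 0.0 && drop < 1.0);
    return 1.0 / (1.0 - drop) - 1.0;
}
```

A 30% framerate drop corresponds to each frame taking roughly 43% longer (1 / 0.7 is about 1.43), whatever the starting framerate is.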



Note that in both cases, the same number of points are being drawn, and the Geometry and its vertex array are created once and modified (I'm not creating new Geometry objects at every update). All that changes is whether I added the points all at once before rendering or a few at a time while rendering.


I wrote a small standalone osg example (attached). Compile, run, and show stats using:
Quote:
 .osgdynamicvbotest.exe --numPoints 100000 --batchSize 100000
   * If batchSize = 100000 (same as numPoints) then you'll see the case where all points are pre-defined.
   * As you reduce batchSize (e.g. 100), it will take longer to add the total number of points, but after all points have been added and the framerate stabilizes, you'll see it is much lower than the pre-defined case above.
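The attached example is not reproduced here, so the following is only a stand-in for the two cases above. It uses a plain std::vector instead of an OSG array and never touches OpenGL; Point, addBatch and framesToFill are hypothetical names, and the coordinates are placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x, y, z; };

// Appends up to batchSize points without exceeding numPoints in total.
// Returns how many points were added in this simulated update traversal.
inline std::size_t addBatch(std::vector<Point>& points,
                            std::size_t numPoints, std::size_t batchSize) {
    const std::size_t remaining =
        numPoints > points.size() ? numPoints - points.size() : 0;
    const std::size_t count = std::min(batchSize, remaining);
    for (std::size_t i = 0; i < count; ++i) {
        const float t = static_cast<float>(points.size());
        points.push_back(Point{t, 0.0f, 0.0f});  // placeholder coordinates
    }
    return count;
}

// Runs simulated update traversals until all points exist.
// Returns the number of traversals (frames) that modified the array.
inline std::size_t framesToFill(std::size_t numPoints, std::size_t batchSize) {
    assert(batchSize > 0);
    std::vector<Point> points;
    std::size_t frames = 0;
    while (points.size() < numPoints) {
        addBatch(points, numPoints, batchSize);
        ++frames;
    }
    return frames;
}
```

With numPoints = 100000, a batchSize of 100000 fills the array in a single traversal, while a batchSize of 100 modifies it on 1000 consecutive traversals before the point count stops changing.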


My question is, why is this happening? Is it related to intermediate VBOs being kept in memory and slowing down the GPU? All the other forum posts I see on the topic are either about VBOs not displaying properly (not the case here) or about memory usage (not the case here). 
 

Any thoughts on what's going on here would be very much appreciated.


Thank you!
Ravi

------------------
Post generated by Mail2Forum
Wojtek
Posted: Mon Dec 10, 2018 1:35 pm
Post subject: Dynamic VBO Performance Drop

Hi Ravi,
We don't usually investigate in this much depth, but we were debugging another interesting VBO problem, so we also looked at yours. A few observations:


0. I noticed you used a multithreaded configuration, so I switched to SingleThreaded. A multithreaded config creates 2 instances of GL resources, and I thought that might affect your measurements, so we continued with SingleThreaded from then on.


1. The line where you set DYNAMIC_DRAW is followed by setVertexArray, and setVertexArray resets the usage to STATIC_DRAW. You will get better results if you call setUsage after all arrays have been defined (like this; note that I made numPoints and batchSize global):



[...]
  geom->setColorArray(lineColors, osg::Array::BIND_OVERALL);
  geom->addPrimitiveSet(new osg::DrawArrays(osg::PrimitiveSet::LINE_STRIP, 0, 0));


  if ( numPoints > batchSize )
    geom->getOrCreateVertexBufferObject()->setUsage(GL_DYNAMIC_DRAW);
  else
    geom->getOrCreateVertexBufferObject()->setUsage(GL_STATIC_DRAW);

[...]


2. Once we set GL_DYNAMIC_DRAW, we see similar performance in both versions (on an Nvidia GTX 1080, Windows 10).


3. So in your code the VBO was always refreshed with GL_STATIC_DRAW. We suspect the problem is actually related to memory management in the OpenGL driver. My friend Marcin Hajder checked the underlying OpenGL calls with CodeXL: both versions made exactly the same calls per frame after updates stopped, and the buffer and array sizes were the same too. So we concluded that it must be some memory fragmentation/thrashing issue in the OpenGL driver. This suspicion was somewhat confirmed when we checked memory use. After updates stabilized, the dynamic version was still taking 10 MB more GPU/RAM than the static version. See the attached screenshots from ProcessExplorer: the picture with larger memory use is the dynamic version, and the one with smaller memory use is the static version. Note the drop in MB usage in the dynamic version a minute or so after updates stopped. I suspect the driver compacted the memory when it noticed the resources were no longer being updated.








Cheers,
Hope this helps,
Wojtek Lewandowski & Marcin Hajder



ravidavi
Posted: Wed Dec 12, 2018 6:48 am
Post subject: Dynamic VBO Performance Drop

Thank you so much for the detailed response Wojtek! I had incorrectly assumed that getOrCreateVertexBufferObject() "assigns" the new VBO to the Geometry, similarly to what happens with getOrCreateStateSet() for a Node. But after reading your response, digging into the OSG code, and reading up more on VBOs, I understand why I was mistaken. A VBO is naturally associated with an array so only gets created when the first array is added to the Geometry.
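The ordering pitfall can be illustrated with a toy model. This is not OSG source: ToyBuffer, ToyArray and ToyGeometry are hypothetical stand-ins that model only the relationship described above, namely that the buffer belongs to the array, and that asking for a buffer before any array is attached yields an object the geometry never uses. The static default usage is assumed from the behaviour reported in this thread.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-ins, NOT the OSG classes.
enum class Usage { StaticDraw, DynamicDraw };

struct ToyBuffer {
    Usage usage = Usage::StaticDraw;  // assumed default
};

struct ToyArray {
    std::shared_ptr<ToyBuffer> buffer;  // the buffer belongs to the array
};

class ToyGeometry {
public:
    // Attaching an array gives it a fresh buffer with the default usage.
    void setVertexArray(std::shared_ptr<ToyArray> array) {
        assert(array != nullptr);
        if (!array->buffer) array->buffer = std::make_shared<ToyBuffer>();
        vertices_ = std::move(array);
    }

    // Returns the attached array's buffer, or a detached temporary
    // when no array has been attached yet.
    std::shared_ptr<ToyBuffer> getOrCreateBuffer() const {
        if (vertices_ && vertices_->buffer) return vertices_->buffer;
        return std::make_shared<ToyBuffer>();
    }

    // Usage that would actually apply to the vertex data.
    Usage effectiveUsage() const {
        return vertices_ ? vertices_->buffer->usage : Usage::StaticDraw;
    }

private:
    std::shared_ptr<ToyArray> vertices_;
};

// Original order: usage is set first, then the array is attached.
inline Usage usageSetBeforeArray() {
    ToyGeometry geom;
    geom.getOrCreateBuffer()->usage = Usage::DynamicDraw;  // lands on a temporary
    geom.setVertexArray(std::make_shared<ToyArray>());
    return geom.effectiveUsage();
}

// Suggested order: the array is attached first, then usage is set.
inline Usage usageSetAfterArray() {
    ToyGeometry geom;
    geom.setVertexArray(std::make_shared<ToyArray>());
    geom.getOrCreateBuffer()->usage = Usage::DynamicDraw;  // lands on the array's buffer
    return geom.effectiveUsage();
}
```

In this model, only the second ordering leaves the vertex data marked as dynamic, which matches the fix of calling setUsage after the arrays are assigned.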

After moving the setUsage code after array assignment as you suggested, I did see a slight increase in framerate for the batched version, but not up to par with the preallocated version. I'm using an Nvidia GTX 980 on Windows 10. However, it does seem to be a driver issue since memory usage settles down a bit if I leave the application running for a while after all points have been added.


The real lesson I learned here is not to abuse OSG's automatic VBO management by growing the buffer in small increments every frame. In my actual application I run the update callback at a fixed rate and grow the arrays in larger chunks using resize() instead of repeated push_back() calls. This seems to have fully addressed the issue.
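A rough sketch of the chunked-growth idea, using a plain std::vector rather than an OSG array. ChunkedPoints and sizeChangesFor are hypothetical names, and the sketch assumes (it is not confirmed in this thread) that what matters for the GPU buffer is how often the array's size changes, so it simply counts those changes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x, y, z; };

// Hypothetical helper: storage grows in fixed-size chunks via resize(),
// while 'used' tracks how many points are actually valid.
class ChunkedPoints {
public:
    explicit ChunkedPoints(std::size_t chunkSize) : chunk_(chunkSize) {
        assert(chunkSize > 0);
    }

    void append(const Point& p) {
        if (used_ == storage_.size()) {
            storage_.resize(storage_.size() + chunk_);  // one size change per chunk
            ++sizeChanges_;
        }
        storage_[used_++] = p;
    }

    std::size_t used() const { return used_; }                // valid points
    std::size_t allocated() const { return storage_.size(); } // array size
    std::size_t sizeChanges() const { return sizeChanges_; }

private:
    std::vector<Point> storage_;
    std::size_t chunk_;
    std::size_t used_ = 0;
    std::size_t sizeChanges_ = 0;
};

// Counts how many times the array size changes while appending 'total' points.
inline std::size_t sizeChangesFor(std::size_t total, std::size_t chunkSize) {
    ChunkedPoints points(chunkSize);
    for (std::size_t i = 0; i < total; ++i) {
        points.append(Point{static_cast<float>(i), 0.0f, 0.0f});
    }
    return points.sizeChanges();
}
```

Appending 100000 points with a chunk size of 10000 changes the array size 10 times, whereas a chunk size of 1 (equivalent to growing on every push_back) changes it 100000 times. In a real scene the drawn primitive count would presumably follow used() rather than allocated(), so the unused tail of the array is never rendered.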


Thanks again!
Ravi

