vpbmaster appears to slow down after thousands of tasks completed


 
etunko
Newbie


Posted: Wed Aug 31, 2016 9:35 am

Hi,

We have set up vpbmaster to use a machinepool to distribute the work of building a large terrain database. This works fairly well, and we get the expected results after about 40 hours of work. However, after a few thousand of the 32000 tasks have been completed, it appears that the master can't provide tasks fast enough to the workers. The number of running tasks drops from ~100 to 3-4 for a long time. We have ~100 processes configured across 6 machines (40-40-8-8-8-8), and when the number of running tasks reported in the vpbmaster output is 3-4, the load on the machines is very low.

See the htop snapshot from the master to see the situation.

After some investigation it appears (and I'm just speculating) that the main vpbmaster process writes an enormous amount of data to the "terrainname.ive.0.added" file in the output folder.

This file gets an increasing number of file names written to it, at a rate of 600 MB in a few seconds.

The lines in the file are like this:
Quote:
/mnt/master/vpb/PlanetSAT150m_Mexico15m_vpbmaster/output/PlanetSAT150m_Mexico15m_vpbmaster_subtile_L3_X1_Y2/PlanetSAT150m_Mexico15m_vpbmaster_subtile_L8_X56_Y77/PlanetSAT150m_Mexico15m_vpbmaster_L12_X905_Y1246_subtile.ive
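One way to check whether the file is growing because the same tile names are written over and over is to summarize a copied snapshot of it. The following is a minimal sketch, assuming the lines all follow the pattern quoted above; the function name `summarize_added_lines` and the regular expression are illustrative and not part of VPB.

```python
import re
from collections import Counter

# Matches the final tile component, e.g. "..._L12_X905_Y1246_subtile.ive".
# The pattern is inferred from the single example line in the post.
TILE_RE = re.compile(r"_L(\d+)_X(\d+)_Y(\d+)_subtile\.ive$")


def summarize_added_lines(lines):
    """Return (total lines, distinct lines, line count per tile level)."""
    counts = Counter()
    levels = Counter()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        counts[line] += 1
        match = TILE_RE.search(line)
        if match:
            levels[int(match.group(1))] += 1
    total = sum(counts.values())
    return total, len(counts), dict(levels)
```

Passing an open file object (for example a copy of the `.added` file taken while it is large) works because the function only iterates over lines. A total far above the distinct count would suggest repeated entries rather than genuinely new ones.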



It appears to add 600 MB worth of these lines every few seconds, which saturates the disk I/O and keeps one of the processes at 100%.

After a few seconds this message is appended:

Quote:
"PlanetSAT150m_Mexico15m_vpbmaster.ive.0.added: file truncated"


The file is then reset to 0 MB and the writing starts again.
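The grow-then-truncate cycle described here can be recorded with a small polling loop. This is only a sketch using the Python standard library; the names `classify_change` and `watch_file`, the one-second interval, and the number of polls are assumptions chosen for illustration.

```python
import os
import time


def classify_change(prev_size, new_size):
    """Label a size change between two polls of the same file."""
    if new_size < prev_size:
        return "truncated"
    if new_size > prev_size:
        return "grew"
    return "unchanged"


def watch_file(path, interval_s=1.0, polls=10, sleep=time.sleep):
    """Poll the file size and return a list of (event, bytes_per_second)."""
    events = []
    prev = os.path.getsize(path)
    for _ in range(polls):
        sleep(interval_s)
        size = os.path.getsize(path)
        rate = (size - prev) / interval_s  # negative when the file shrank
        events.append((classify_change(prev, size), rate))
        prev = size
    return events
```

Calling `watch_file` on the `.added` file in the output folder while the slowdown is happening would give a rough write rate and show how often the file is reset. The file must exist when polling starts, otherwise `os.path.getsize` raises `OSError`.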

If I cancel the vpbmaster run and resubmit the tasks, it starts with normal efficiency, but after a few thousand tasks this behaviour starts again.

Has anyone seen this behaviour before? We are using vpbmaster 0.9.11.

...


Thank you!

Cheers,
Knut
robertosfield
OSG Project Lead


Posted: Wed Aug 31, 2016 11:51 am

Hi Knut,

It's a number of years since I worked on VirtualPlanetBuilder, so I'm
afraid I don't recall the details. Problems with handling many thousands
of tasks don't ring any bells; it could easily be a problem that
wasn't picked up during development and testing. I don't recall the
function of the .added file off the top of my head. I also don't have
a cluster available to test things right now.

The best I can recommend would be to put some debugging into the
vpbmaster run to see what file writes are happening and what the overall
memory consumption on the master is. If it's just progress logs that
are being generated and this is becoming a bottleneck, then this part
could possibly be adapted to avoid the bottleneck or made optional.
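Before changing the vpbmaster code, memory use and write volume of the running master can be sampled from outside the process. The sketch below assumes a Linux master (the paths in the original post suggest one) and reads the standard `/proc/<pid>/status` and `/proc/<pid>/io` files; the function names are illustrative, and reading `/proc/<pid>/io` normally requires running as the same user or as root.

```python
def parse_proc_fields(text):
    """Parse 'Key: value' lines as found in /proc/<pid>/status and /proc/<pid>/io."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sample_process(pid):
    """Return (resident memory in kB, cumulative bytes written) for a Linux process."""
    with open(f"/proc/{pid}/status") as f:
        status = parse_proc_fields(f.read())
    with open(f"/proc/{pid}/io") as f:
        io = parse_proc_fields(f.read())
    rss_kb = int(status["VmRSS"].split()[0])  # value looks like "2048 kB"
    return rss_kb, int(io["write_bytes"])
```

Sampling the vpbmaster PID every few seconds and logging the two numbers would show whether memory keeps climbing and how many bytes the master itself writes between samples.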

Another approach you could take is to change the build so that the
individual tasks take on the generation of more levels, so that the
number of tasks drops below the level that is causing the
bottleneck. This is a bit of a hack but might get you further without
the need to.
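As a rough illustration of the effect, assume the build is a full quadtree (each extra level has four times as many tiles) and that most tasks sit at the finest split level. Under those assumptions, folding one more level into each task divides the task count by about four. The function name and both assumptions are illustrative and not taken from the VPB sources.

```python
import math


def estimated_tasks(current_tasks, levels_absorbed):
    """Estimate the task count after each task absorbs extra quadtree levels.

    Assumes every absorbed level merges four child tasks into one parent.
    """
    return math.ceil(current_tasks / 4 ** levels_absorbed)
```

With the 32000 tasks mentioned in the original post, `estimated_tasks(32000, 1)` gives 8000 and `estimated_tasks(32000, 2)` gives 2000, at the cost of each task running correspondingly longer.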

Robert.


Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=68495#68495




Attachments:
http://forum.openscenegraph.org//files/htop_265.jpg





etunko
Newbie


Posted: Thu Sep 01, 2016 8:33 am

Hi Robert,

Thanks for your quick response, that is appreciated!

We will do some investigating on how to get the debug information added and possibly look through the code to see what is going on.

...


Thank you!

Cheers,
Knut
etunko
Newbie


Posted: Fri Sep 16, 2016 7:55 am

Hi,

We have not solved the issue, but we make do with monitoring the situation and doing a restart now and then. I'll post an update if we get any further with debugging.

Thanks!
...


Thank you!

Cheers,
Knut