It's been a number of years since I worked on VirtualPlanetBuilder, so I'm
afraid I don't recall the details. Problems with handling many thousands
of tasks don't ring any bells; it could easily be a problem that
wasn't picked up during development and testing. I don't recall the
function of the .added file off the top of my head. I also don't have
a cluster available to test things right now.
The best I can recommend would be to put some debugging into the
vpbmaster run to see what file writes are happening and what the overall
memory consumption on the master is. If it's just progress logs that
are being generated and this is becoming a bottleneck, then this part
could possibly be adapted to avoid the bottleneck, or made optional.
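As a rough illustration of the kind of measurement I mean, a small external script along these lines could log how fast the .added file grows and when it gets truncated, without touching the vpbmaster source. This is only a sketch: it is not part of VirtualPlanetBuilder, the function names (growth_rate, watch) are made up for the example, and the file path is whatever you pass on the command line.

```python
import os
import sys
import time


def growth_rate(prev_size, cur_size, seconds):
    """Return (bytes_per_second, truncated) between two size samples."""
    if cur_size < prev_size:
        # The file shrank, so it was truncated or recreated between samples.
        return 0.0, True
    return (cur_size - prev_size) / seconds, False


def file_size(path):
    """Size of path in bytes, or 0 if it does not exist (yet)."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def watch(path, interval=1.0, samples=None):
    """Print the growth rate of path every interval seconds.

    samples=None means run until interrupted with Ctrl-C.
    """
    prev = file_size(path)
    count = 0
    while samples is None or count < samples:
        time.sleep(interval)
        cur = file_size(path)
        rate, truncated = growth_rate(prev, cur, interval)
        note = " (truncated)" if truncated else ""
        print(f"{time.strftime('%H:%M:%S')} size={cur / 1e6:.1f} MB "
              f"rate={rate / 1e6:.1f} MB/s{note}", flush=True)
        prev = cur
        count += 1


if __name__ == "__main__":
    # Hypothetical usage: python watch_added.py <path to the .added file>
    if len(sys.argv) > 1:
        watch(sys.argv[1])
```

Running this next to htop on the master, and noting how many tasks have completed when the write rate jumps, should show whether the slowdown lines up with the file writes or with something else such as memory pressure.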
Another approach you could take is to change the build so that the
individual tasks take on the generation of more levels so that the
number of tasks reduces below the level that is causing the
bottleneck. This is a bit of a hack but might get you further without
the need to.
On 31 August 2016 at 10:35, Knut Karlsen wrote:
We have set up vpbmaster to use a machine pool to distribute the work of building a large terrain database. This works fairly well, and we get the expected results after about 40 hours of work. However, after a few thousand of the 32000 tasks have been completed, it appears that the master can't provide tasks fast enough to the workers. The number of running tasks drops from ~100 to 3-4 for a long time. We have ~100 processes configured across 6 machines (40-40-8-8-8-), and when few tasks are running the other machines have almost no load.
See the htop snapshot from the master to see the situation.
After some investigation it appears (and I'm just speculating) that the main vpbmaster process writes an enormous amount of data to the
"terrainname.ive.0.added" file in the output folder.
This file gets an increasing number of task names written to it, at a rate of 600 MB in a few seconds.
The lines are like this:
It appears to add 600 MB worth of these lines every few seconds, which really saturates the disk I/O and keeps one of the processes at 100%.
At some point this line is added:
PlanetSAT150m_Mexico15m_vpbmaster.ive.0.added: file truncated
And the file is set to 0 MB and it starts to write to it again.
If I cancel the vpbmaster run and resubmit the tasks, it starts with normal efficiency, but after a few thousand tasks this behaviour starts again.
Has anyone seen this behaviour before?