GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Fri Sep 25, 2009 9:46 pm Post subject: PostProcessGL |
|
|
First of all, great work on bringing CUDA to OSG! I'm just getting up to speed on it right now, with the main goal of duplicating what the PostProcessGL example in the SDK does. According to its comments, this requires the following:
1 - render the scene to the framebuffer
2 - copy the image to a PBO (pixel buffer object)
3 - map this PBO so that its memory is accessible from CUDA
4 - run CUDA to process the image, writing to memory mapped from a second PBO
6 - copy from result PBO to a texture
7 - display the texture
I'm mainly struggling with steps 1, 2, and 3. What's the best way to do this?
Michael Guerrero
art (Art Tevs) Site Admin

Joined: 20 Dec 2008 Posts: 414 Location: Saarbrücken, Germany
|
Posted: Sat Sep 26, 2009 10:06 am Post subject: Re: PostProcessGL |
|
|
Hi Michael,
GMan wrote:
|
1 - render the scene to the framebuffer
|
Set up a camera with FRAME_BUFFER_OBJECT as its render target. Then use the texture you have attached to the camera as input to your CUDA environment.
Quote:
|
2 - copy the image to a PBO (pixel buffer object)
3 - map this PBO so that its memory is accessible from CUDA
|
I am not sure whether osgCompute offers this out of the box. However, in the osgPPU::Unit::DrawCallback::drawImplementation() method you can find the function calls needed in OSG to map a texture into CUDA space.
Quote:
|
4 - run CUDA to process the image, writing to memory mapped from a second PBO
|
It is up to you what you implement here.
Quote:
|
6 - copy from result PBO to a texture
|
Again, you can find the function calls for this in the same method as above.
Quote:
|
7 - display the texture
|
Render a screen sized quad with the resulting texture.
As for the mapping step: up to CUDA 2.2, I think, mapping an OpenGL texture into CUDA space is not fast. Only the newest drivers and CUDA releases allow some kind of zero-copy to map textures into CUDA space efficiently. However, you will have to look at the newest CUDA release to see whether it really works as I described.
Next, if you just want to post-process textures, I think it would be easier to take a look at osgPPU. It does exactly what you are trying to do manually. What is more, the newest osgPPU version has CUDA support, so you can process the textures in CUDA and not only in GLSL. Take a look at the CUDA examples of osgPPU.
osgCompute is a good library for general computations inside OSG's scene graph. If, for example, you want to process vertices or run a simulation, that is where you need osgCompute. I am not sure how well osgCompute supports simple texture processing. I guess you can use it for this kind of computation, but you may have to take care of specifying inputs and outputs and rendering the results yourself.
However, it would be nice to hear from the authors of osgCompute how to solve your needs with this library.
cheers,
art
jens.svt User
Joined: 16 Mar 2009 Posts: 30
|
Posted: Mon Sep 28, 2009 1:59 pm Post subject: Re: PostProcessGL |
|
|
Hi Michael,
In osgCompute it is very easy to set up this example. OK, let's go through all the steps you have to do:
GMan wrote:
|
1 - render the scene to the framebuffer
|
As Art already said, you have to render the scene with a camera. Before that, you need to attach an osgCuda::Texture2D to it, which is nothing more than an extended osg::Texture2D. (If you are not familiar with this, take a look at the osgprerender example of OSG.)
GMan wrote:
|
2 - copy the image to a PBO (pixel buffer object)
3 - map this PBO so that its memory is accessible from CUDA
|
This is already provided by the osgCuda texture classes. You have to call the map() function, which is available for all osgCompute::Buffers (osgCuda::Texture2D is an osgCompute::Buffer as well). map() returns a pointer to the GPU memory and hides both the memory copy and the mapping from you. For all types of buffers (e.g. osgCuda::Geometry) this is exactly the function to call in order to receive a pointer to the device or host memory. For textures this requires an internal memory copy on the GPU, which a zero-copy function cannot remove, since you want to map texture memory on the GPU.
GMan wrote:
|
4 - run CUDA to process the image, writing to memory mapped from a second PBO
|
Now you have to call your own kernel module. You can also look at our osgTexDemo example.
GMan wrote:
|
6 - copy from result PBO to a texture
7 - display the texture
|
You need to specify an output osgCuda::Texture2D (you can use the same texture as for your camera if your algorithm allows this). To render the result back to your screen, render a screen-sized quad with this texture attached. For this step, see also our osgTexDemo example.
I hope this answers your question.
Best regards,
Jens
--
SVT Group
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Mon Sep 28, 2009 8:59 pm Post subject: |
|
|
Thanks for the replies guys! My confusion mostly centered around the copies to and from the PBO. I've done my best to modify the texture streaming demo that comes with osgCompute but I'm getting an error on the call to mapStream:
In osgCuda namespace / Texture.cpp
Code:
|
osgCuda::Texture::mapStream()
{
    ...
    /////////////
    // MAP PBO //
    /////////////
    if( NULL == stream._devPtr )
    {
        cudaError res = cudaGLMapBufferObject( &stream._devPtr, stream._bo );
        if( cudaSuccess != res )
        {
            osg::notify(osg::WARN)
                << "osgCuda::Texture::mapStream() for texture \""<< asObject()->getName()
                << "\": error during cudaGLMapBufferObject() for context \""
                << stream._context->getId()<<"\"."
                << " " << cudaGetErrorString( res ) << "."
                << std::endl;
            return NULL;
        }
    }
    ...
}
|
So my code never successfully executes cudaGLMapBufferObject.
I know it's difficult to communicate these things, but I've included my source files plus the CMake file to generate the project (basically a copy of osgTexDemo). I'm going to have a look at osgPPU in the meantime.
Oh, I just wanted to add that the reason I'm doing this is to compute the intervisibility between waypoints located in various rendered scenes. This data is then used to determine which waypoints provide the best cover from gunfire. Currently, I'm pulling the rendered texture from the GPU to main memory and doing the computations on the CPU. Unfortunately this is crazy slow, so I'm looking to speed up the processing by doing all the computations on the GPU.
jens.svt User
Joined: 16 Mar 2009 Posts: 30
|
Posted: Wed Sep 30, 2009 4:53 pm Post subject: |
|
|
Hi Michael,
We could not reproduce the error in the mapStream() function.
However, we found some other parts which might have caused the error.
An error might have occurred due to a "rendertarget" flag. This flag identifies textures that are targets for camera objects. We had to introduce this flag because otherwise we do not know when to copy the texture memory to a PBO.
OpenGL textures can only be mapped into CUDA as linear memory (in the TexDemo example we use an array object, for which texture sampling is possible). That is why, in your version, we changed the cudaBindTextureToArray() call in "TexStreamer.cu" to cudaBindTexture(). We also had to replace the tex2D() call within the kernel with tex1Dfetch(). We have extended the streamer example with two kernels: one computes a 5x5 Gaussian-filtered image and the other uses the Sobel operator to detect edges.
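Because the mapped texture is plain linear memory, a kernel that used to sample with 2D coordinates has to compute the linear index itself. The following is only a CPU-side sketch in plain C++ of that addressing combined with a Sobel operator; it is not the code from the attached files. The helper names pixelAt and sobelAt are hypothetical, and a single-channel, 8-bit, row-major image without row padding is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: read pixel (x, y) from a row-major linear buffer,
// clamping the coordinates to the image border.
inline int pixelAt(const std::vector<std::uint8_t>& img, int width, int height, int x, int y)
{
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    if (x < 0) x = 0;
    if (x >= width) x = width - 1;
    if (y < 0) y = 0;
    if (y >= height) y = height - 1;
    // Linear index: one full row of 'width' pixels per step in y.
    return img[static_cast<std::size_t>(y) * width + x];
}

// Hypothetical helper: Sobel response |gx| + |gy| at (x, y), clamped to 255.
inline int sobelAt(const std::vector<std::uint8_t>& img, int width, int height, int x, int y)
{
    const int tl = pixelAt(img, width, height, x - 1, y - 1);
    const int tm = pixelAt(img, width, height, x,     y - 1);
    const int tr = pixelAt(img, width, height, x + 1, y - 1);
    const int ml = pixelAt(img, width, height, x - 1, y);
    const int mr = pixelAt(img, width, height, x + 1, y);
    const int bl = pixelAt(img, width, height, x - 1, y + 1);
    const int bm = pixelAt(img, width, height, x,     y + 1);
    const int br = pixelAt(img, width, height, x + 1, y + 1);

    const int gx = (tr + 2 * mr + br) - (tl + 2 * ml + bl);  // horizontal gradient
    const int gy = (bl + 2 * bm + br) - (tl + 2 * tm + tr);  // vertical gradient
    const int mag = std::abs(gx) + std::abs(gy);
    return mag > 255 ? 255 : mag;
}
```

In a CUDA kernel the same y * width + x index would be the argument passed to tex1Dfetch(), with x and y derived from the block and thread indices.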
You will find our files attached. We hope this helps. Please let us know if it works.
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Wed Sep 30, 2009 6:18 pm Post subject: |
|
|
Looks good!
With the sobel filter I get ~360 fps.
With the 5x5 Gaussian blur I get ~300 fps.
The graphics hardware is a GeForce GTX 280.
Skylark (Jean-Sébastien Guay) Professional

Joined: 05 Jan 2009 Posts: 2249
|
Posted: Wed Sep 30, 2009 7:01 pm Post subject: PostProcessGL |
|
|
Hello Michael,
Quote:
|
With the sobel filter I get ~360 fps.
With the 5x5 Gaussian blur I get ~300 fps.
The graphics hardware is a GeForce GTX 280.
|
Just curious, to put that into perspective, what frame rate do you get
without filtering on the same scene? (just so we can get an idea of the
overhead it introduces)
J-S
--
______________________________________________________
Jean-Sebastien Guay
http://www.cm-labs.com/
http://whitestar02.webhop.org/
------------------
Post generated by Mail2Forum
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Wed Sep 30, 2009 8:55 pm Post subject: |
|
|
Sure, after removing the filter I was getting ~530 fps. Since fps is nonlinearly related to frame time, here's the breakdown:
530 fps = ~1.89 ms (no filter)
360 fps = ~2.78 ms (Sobel)
300 fps = ~3.33 ms (5x5 Gauss)
So the slightly more expensive 5x5 filter kernel costs about 3.33 ms - 1.89 ms = 1.44 ms per frame.
For reference, 60 fps = ~16.67 ms
(I assume you already know all of this but it may be of interest to other readers)
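For anyone who wants to redo the arithmetic above with their own measurements, here is a small C++ sketch. The function names frameTimeMs and filterCostMs are made up for illustration; the only formula used is frame time in ms = 1000 / fps.

```cpp
#include <cassert>

// Convert a frame rate in frames per second to a frame time in milliseconds.
inline double frameTimeMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Per-frame cost of a filter, given the frame rates measured without it
// (baselineFps) and with it (filteredFps).
inline double filterCostMs(double baselineFps, double filteredFps)
{
    return frameTimeMs(filteredFps) - frameTimeMs(baselineFps);
}
```

With the numbers from this thread, filterCostMs(530.0, 300.0) comes out at roughly 1.45 ms for the 5x5 Gauss filter (1.44 ms if the already rounded frame times are subtracted), and filterCostMs(530.0, 360.0) at roughly 0.89 ms for the Sobel filter.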
Skylark (Jean-Sébastien Guay) Professional

Joined: 05 Jan 2009 Posts: 2249
|
Posted: Wed Sep 30, 2009 9:48 pm Post subject: PostProcessGL |
|
|
Hi Michael,
Quote:
|
Sure, after removing the filter I was getting ~530 fps. Since fps is nonlinearly related to frame time, here's the breakdown:
|
Thanks for the breakdown. Since the filtering is a post-process
operation it's basically a fixed cost per frame... 1.44ms is not bad.
J-S
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Wed Sep 30, 2009 10:21 pm Post subject: |
|
|
So I've been studying the example code and the first thing that sticks out to me is that they require some default shaders to be set even though they do nothing special. Why doesn't this work with the fixed function pipeline?
Mick User
Joined: 11 Mar 2009 Posts: 31
|
Posted: Thu Oct 01, 2009 8:46 am Post subject: |
|
|
Hi Michael,
GMan wrote:
|
...they require some default shaders to be set even though they do nothing special...
|
Of course you don't need to set up the shaders. We used the shaders in addition to osgCompute just for demonstration purposes.
If you would rather not sample your texture via a shader, just use the following code:
Code:
|
osg::Geometry* geom = osg::createTexturedQuadGeometry( llCorner, width, height );
geode->addDrawable( geom );
geode->getOrCreateStateSet()->setTextureAttributeAndModes( 0, targetTexture, osg::StateAttribute::ON );
|
I also modified the example which now runs without shaders (see attachment).
Best regards,
Mick
_________________
SVT Group
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Thu Oct 01, 2009 4:34 pm Post subject: |
|
|
OK, I've discovered the seemingly innocuous problem. When I tried to comment out all the shader-related code in the getGeode function, I was left with this:
Code:
|
osg::Geode* getGeode( osg::Texture2D& targetTexture )
{
    osg::Geode* geode = new osg::Geode;
    geode->setName("quad");
    osg::Vec3 llCorner = osg::Vec3(-0.5,0,-0.5);
    osg::Vec3 width = osg::Vec3(1,0,0);
    osg::Vec3 height = osg::Vec3(0,0,1);
    //////////
    // QUAD //
    //////////
    osg::Geometry* geom = osg::createTexturedQuadGeometry( llCorner, width, height );
    geode->addDrawable( geom );
    geode->getOrCreateStateSet()->setTextureAttribute( 0, &targetTexture, osg::StateAttribute::ON );
    return geode;
}
|
This code resulted in this: [image not preserved]
Instead of this: [image not preserved]
The problem was my use of the setTextureAttribute function. I saw the osg::StateAttribute::ON parameter and was fooled into thinking it was doing something, but setTextureAttribute only attaches the texture and does not enable texturing for that unit. Substituting setTextureAttributeAndModes did the trick. Lesson learned. Thanks Mick.
GMan User
Joined: 23 Jan 2009 Posts: 29
|
Posted: Fri Oct 02, 2009 10:44 pm Post subject: |
|
|
So on to the next hurdle... It seems that the post process example only works for a single render. I didn't notice this at first since nothing was moving in the scene. However, if we make it match what the osgPrerender example does by having an update callback rotate the cow, nothing happens.
What I did was add the following to createPreRenderSubGraph(...)
Code:
|
osg::NodeCallback* nc = new osg::AnimationPathCallback(
loadedModelTransform->getBound().center(),osg::Vec3(0.0f,0.0f,1.0f),osg::inDegrees(45.0f));
loadedModelTransform->setUpdateCallback(nc);
|
To verify that this was not a problem with my code, I applied the initially rendered texture to the quad instead of the final CUDA-processed texture. This consisted of:
Code:
|
///////////
// SCENE //
///////////
osg::Group* scene = new osg::Group;
scene->addChild(getGeode(*sourceTexture));
|
instead of this:
Code:
|
///////////
// SCENE //
///////////
osg::Group* scene = new osg::Group;
scene->addChild(getGeode(*targetTexture));
|
I'm looking into the cause right now but maybe this is something obvious for you guys.
Mick User
Joined: 11 Mar 2009 Posts: 31
|
Posted: Mon Oct 05, 2009 12:28 pm Post subject: |
|
|
Hi Michael,
GMan wrote:
|
So on to the next hurdle... It seems that the post process example only works for a single render....
I'm looking into the cause right now but maybe this is something obvious for you guys.
|
In addition, you need to call unmap() on srcArray after you have called the kernel/filter:
Code:
|
void TexStreamer::launch( const osgCompute::Context& context ) const
{
    if( isClear() )
        return;
    _trgBuffer->setMemory( context, 0x0, osgCompute::MAP_DEVICE_TARGET );
    // map params
    void* srcArray = _srcArray->map( context, osgCompute::MAP_DEVICE_SOURCE );
    void* trgBuffer = _trgBuffer->map( context, osgCompute::MAP_DEVICE_TARGET );
    // KERNEL CALL 0
    filter( _numBlocks,
            _numThreads,
            trgBuffer,
            srcArray,
            _srcArray->getByteSize(),
            _filter );
    _srcArray->unmap( context );
}
|
The unmap function ensures that the texture is mapped back from the computation context into the render (GL) context. This is necessary for render targets and has to be done manually, since in OSG you don't know when a camera has updated a texture object. We are sorry for the lack of documentation so far, but we are working on it.
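Since a forgotten unmap is easy to miss, one option is to tie the unmap call to scope exit. The following is only a sketch of that idea and not part of osgCompute: ScopedUnmap, FakeBuffer, FakeContext and launchWithGuard are all hypothetical names, and the stand-in types exist only so the example compiles on its own. The one thing it assumes about a real buffer is an unmap(context) member, as used in the launch() code above.

```cpp
#include <cassert>

// Hypothetical RAII helper (not an osgCompute class): calls
// buffer.unmap(context) when the guard goes out of scope.
template <typename Buffer, typename Context>
class ScopedUnmap
{
public:
    ScopedUnmap(Buffer& buffer, const Context& context)
        : _buffer(buffer), _context(context) {}
    ~ScopedUnmap() { _buffer.unmap(_context); }

    ScopedUnmap(const ScopedUnmap&) = delete;
    ScopedUnmap& operator=(const ScopedUnmap&) = delete;

private:
    Buffer& _buffer;
    const Context& _context;
};

// Stand-in types so the sketch compiles without osgCompute.
struct FakeContext {};

struct FakeBuffer
{
    int mapCount = 0;
    int unmapCount = 0;
    void* map(const FakeContext&) { ++mapCount; return this; }
    void unmap(const FakeContext&) { ++unmapCount; }
};

// Mirrors the shape of launch(): map, run the kernel, unmap on every exit path.
inline bool launchWithGuard(FakeBuffer& src, const FakeContext& context, bool kernelFails)
{
    void* srcPtr = src.map(context);
    ScopedUnmap<FakeBuffer, FakeContext> guard(src, context);
    assert(src.mapCount > src.unmapCount);  // buffer is mapped at this point
    if (srcPtr == nullptr || kernelFails)
        return false;  // the guard still unmaps here
    // ... the kernel call would go here ...
    return true;
}
```

The point of the guard is that an early return between map and the end of the function cannot skip the unmap, so the render target is always handed back to the GL context.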
Best regards,
Mick
_________________
SVT Group
jorgea Newbie
Joined: 11 Feb 2012 Posts: 3
|
Posted: Sat Jul 12, 2014 4:29 pm Post subject: Issues |