Fast pixel transfer to and from graphics cards in OpenVG?



john_h
06-18-2009, 10:04 AM
Hi,

Does anybody know how to speed up pixel data transfer to and from graphics cards without spending CPU cycles?

I am trying to develop a test application on TI OMAP3530 beagleboard that loads 24-bit bmp images, rotates each bmp image, and displays the rotated images.

I modified one of the OpenVG training samples (OVGPatternFill.cpp) that came with OMAP35x_Graphics_SDK_3_00_00_06. To load bmp files, I added the loadBMP() function from
http://www.videotutorialsrock.com/opengl_tutorial/draw_text/home.php
I built and ran the code on the beagleboard and found that the bottleneck is the transfer from local memory to GPU memory (done by vgImageSubData()). Transferring a 512x512 bmp image from local memory to the GPU took around 500 msec.
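
To put the 500 msec figure in perspective, the effective upload rate can be estimated from the numbers above. This is only a rough sketch: it assumes the image is uploaded at 4 bytes per pixel (as with the VG_sRGBX_8888 format used in the code later in the thread), and `uploadThroughputMiBps` is a hypothetical helper, not part of the SDK.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: effective transfer rate in MiB/s for one image upload.
// bytesPerPixel = 4 is an assumption matching a 32-bit format such as VG_sRGBX_8888.
inline double uploadThroughputMiBps(std::size_t width, std::size_t height,
                                    std::size_t bytesPerPixel, double milliseconds)
{
    assert(milliseconds > 0.0);
    const double mib =
        static_cast<double>(width * height * bytesPerPixel) / (1024.0 * 1024.0);
    return mib / (milliseconds / 1000.0);
}
```

For a 512x512 image at 4 bytes per pixel the payload is 1 MiB, so `uploadThroughputMiBps(512, 512, 4, 500.0)` works out to about 2 MiB/s.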

I heard Pixel Buffer Objects (PBO) are often suggested for faster image transfer (http://www.songho.ca/opengl/gl_pbo.html), but it seems that PBOs are not supported by OpenGL ES / OpenVG.

Any help would be appreciated !

Thank you.

john

Ivo Moravec
06-19-2009, 10:47 AM
VGImages should be created in graphics memory. If you upload your image to a VGImage, then you should be incurring no subsequent bandwidth penalties after the first frame. The one-time vgImageSubData() call is unavoidable, as the GPU must get the data once no matter what happens (you should definitely not be calling it every frame though). Just do it once at program load time and reuse the VGImage over and over again.

Likewise, a bound OpenGL ES texture will also be in GPU memory by necessity (I assume you're rendering with a textured quad) [and will also take about the same amount of time for the initial upload (if it is indeed a bandwidth issue)].

john_h
06-23-2009, 10:40 AM
Hi Ivo,

Thank you for your reply !

I am a newbie to OpenGL and OpenVG... and it looks like there should be no subsequent delays (500 msec for each bmp transfer, in my case) after the first frame.

"VGImages should be created in graphics memory. If you upload your image to a VGImage, then you should be incurring no subsequent bandwidth penalties after the first frame."

Yes, the VGImage is created once by CPatternFill::CreatePaints().

"The one-time vgImageSubData() call is unavoidable, as the GPU must get the data once no matter what happens (you should definitely not be calling it every frame though). Just do it once at program load time and reuse the VGImage over and over again."

Actually, I need to load a sequence of bmp images and rotate each one by a certain angle, so I am loading each bmp image into CPU memory and calling vgImageSubData() every time through CPatternFill::RenderScene()!

When loading a sequence of bmp images, is calling vgImageSubData() each time unavoidable?

Thank you.

john

Ivo Moravec
06-24-2009, 08:42 AM
If all you're doing is rotating, you should be doing that by doing a vgRotate() on the VG_MATRIX_IMAGE_USER_TO_SURFACE matrix. You should even be able to fake perspective correction this way as the image matrix does not need to be affine (use vguComputeWarpQuadToQuad() / vgLoadMatrix() for that).

If you want to use the image as a texture paint, you should be able to use the VG_MATRIX_FILL_PAINT_TO_USER and/or VG_MATRIX_STROKE_PAINT_TO_USER matrices to do the rotation.

If you're animating or something and the above is not possible, is it possible to load all the Images at the start and just use the correct one?

john_h
06-24-2009, 11:33 AM
Hi Ivo,

Thank you for your reply !!

"If all you're doing is rotating, you should be doing that by doing a vgRotate() on the VG_MATRIX_IMAGE_USER_TO_SURFACE matrix."

I am rotating the bmp images with vgRotate() using VG_MATRIX_PATH_USER_TO_SURFACE, not VG_MATRIX_IMAGE_USER_TO_SURFACE. That is only because the sample code I am referring to for rotation uses VG_MATRIX_PATH_USER_TO_SURFACE.

"If you want to use the image as a texture paint, you should be able to use the VG_MATRIX_FILL_PAINT_TO_USER and/or VG_MATRIX_STROKE_PAINT_TO_USER matrices to do the rotation."

I have not used those matrices, but I feel I should try them for images.

I have attached the code I am working on. It basically creates a rectangular path (512x512 square), fills the path with image data, and rotates the current matrix. Can you please have a look at the code?

Thanks.

john
--------
#include "OVGTools.h"

#include "PVRShell.h"

#include "vg/openvg.h"

#include "GLES/egl.h"

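// Mask each channel to 8 bits so that a negative (signed) char cannot sign-extend into the other channels
#define PVRTRGBA(r, g, b, a) ((VGuint) ((((r) & 0xFFu) << 24) | (((g) & 0xFFu) << 16) | (((b) & 0xFFu) << 8) | ((a) & 0xFFu)))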
#define IMG_WIDTH 512
#define IMG_HEIGHT 512

class Image {
public:
Image(char* ps, int w, int h);
~Image();

char* pixels;
int width;
int height;
};

Image::Image(char* ps, int w, int h) : pixels(ps), width(w), height(h) {

}

Image::~Image() {
delete[] pixels;
}


class CPatternFill : public PVRShell

{

private:

VGPath m_vgPath;

VGPaint m_vgImagePaint;

VGImage m_vgImage;

CPVRTPrintVG m_PrintVG;

unsigned int m_ui32AbsTime;
unsigned int fnum;
Image *image;

public:

CPatternFill() {}



/* PVRShell functions */

virtual bool InitApplication();

virtual bool InitView();

virtual bool ReleaseView();

virtual bool QuitApplication();

virtual bool RenderScene();



/****************************************************************************
** Function Definitions
****************************************************************************/

void CreatePath();

void CreatePaints();

};


void CPatternFill::CreatePath()

{

/*

Create the shape to pattern fill

*/



static VGubyte aui8PathSegments[] = { //commands for path

VG_MOVE_TO_ABS,

VG_VLINE_TO_REL,

VG_HLINE_TO_REL,

VG_VLINE_TO_REL,

VG_CLOSE_PATH

};



static VGfloat afPathCoords[] = { //data for path

0.0f, 0.0f,

512.0f, //0.21f,

512.0f, //0.21f,

-512.0f,

};



// Create a path handle...

m_vgPath = vgCreatePath(

VG_PATH_FORMAT_STANDARD,

VG_PATH_DATATYPE_F,

1.0f, 0.0f,

5,

5,

VG_PATH_CAPABILITY_APPEND_TO);



// ... and populate it with data

vgAppendPathData(m_vgPath, 5, aui8PathSegments, afPathCoords);

}


void CPatternFill::CreatePaints()

{
int pel;


//Create the image we are going to use as the pattern

m_vgImage = vgCreateImage(VG_sRGBX_8888, IMG_WIDTH, IMG_HEIGHT, VG_IMAGE_QUALITY_NONANTIALIASED);



VGuint* pui32ImgData = new VGuint[IMG_WIDTH * IMG_HEIGHT];

/* load bmp image */
image = loadBMP("lilies512x512.bmp");


for (int i = 0; i < image->height; ++i)

{
pel = i*image->width;

for (int j = 0; j < image->width; ++j)

{
pui32ImgData[pel+j] = PVRTRGBA(image->pixels[3*(pel+j)],image->pixels[3*(pel+j)+1],image->pixels[3*(pel+j)+2],255);

}

}
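// Upload first: image->width and image->height are still needed here
vgImageSubData(m_vgImage,pui32ImgData, sizeof(VGuint) * image->width,VG_sRGBX_8888, 0, 0, image->width, image->height);

// Free the CPU-side bitmap only after its dimensions have been used
delete image;
image = 0;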


// Delete the image data as it is now in OpenVG memory

delete[] pui32ImgData;

pui32ImgData = 0;



// Create a paint

m_vgImagePaint = vgCreatePaint();



// Set its type to pattern

vgSetParameteri(m_vgImagePaint, VG_PAINT_TYPE, VG_PAINT_TYPE_PATTERN);



//Set the image for the paint to use as a pattern

vgPaintPattern(m_vgImagePaint, m_vgImage);



//Set the tiling mode of the pattern. In this case it is set to repeat endlessly

vgSetParameteri(m_vgImagePaint, VG_PAINT_PATTERN_TILING_MODE, VG_TILE_REPEAT);

}


bool CPatternFill::InitApplication()

{

// This sets up PVRShell to use an OpenVG context

PVRShellSet(prefOpenVGContext, true);

return true;

}


bool CPatternFill::QuitApplication()

{

PVRShellOutputDebug("leaving...");

return true;

}



bool CPatternFill::InitView()

{
m_ui32AbsTime = 0;
fnum = 0;


//Create the path

CreatePath();



//Create the paints

CreatePaints();



//Set the clear colour

VGfloat afClearColour[] = { 0.6f, 0.8f, 1.0f, 1.0f };

vgSetfv(VG_CLEAR_COLOR, 4, afClearColour);



m_PrintVG.Initialize(PVRShellGet(prefWidth), PVRShellGet(prefHeight));


return true;

}



bool CPatternFill::ReleaseView()

{

//destroy the path

vgDestroyPath(m_vgPath);



//destroy the paint

vgDestroyPaint(m_vgImagePaint);



//destroy the image

vgDestroyImage(m_vgImage);



m_PrintVG.Terminate();

return true;

}



bool CPatternFill::RenderScene()

{
int pel2;
m_ui32AbsTime += 12;

/*** I added the code below to load a new bmp image every time ***/
/*** (actually it's same image in this test) ***/

if(fnum > 0) {
VGuint* pui32ImgData = new VGuint[IMG_WIDTH * IMG_HEIGHT];

/* load new bmp image */

image = loadBMP("lilies512x512.bmp");


for (int i = 0; i < image->height; ++i)

{
pel2 = i*image->width;

for (int j = 0; j < image->width; ++j)

{
pui32ImgData[pel2+j] = PVRTRGBA(image->pixels[3*(pel2+j)],image->pixels[3*(pel2+j)+1],image->pixels[3*(pel2+j)+2],255);

}

}
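// Upload first: image->width and image->height are still needed here
vgImageSubData(m_vgImage,pui32ImgData, sizeof(VGuint) * image->width,VG_sRGBX_8888, 0, 0, image->width, image->height);

// Free the CPU-side bitmap only after its dimensions have been used
delete image;
image = 0;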



// Delete the image data as it is now in OpenVG memory

delete[] pui32ImgData;

pui32ImgData = 0;
}
/****************************************/


// Clear the screen with the clear colour.

vgClear(0, 0, PVRShellGet(prefWidth), PVRShellGet(prefHeight));



// Switch the matrix mode to VG_MATRIX_PATH_USER_TO_SURFACE

vgSeti(VG_MATRIX_MODE, VG_MATRIX_PATH_USER_TO_SURFACE);

vgLoadIdentity();



// Set the paint you would like to fill the shape with

vgSetPaint(m_vgImagePaint, VG_FILL_PATH);

/*** I added the code below to rotate the current bmp image ***/

float afUnitMatrix[3*3];

vgGetMatrix(afUnitMatrix);


float fRotationAngle = m_ui32AbsTime;

vgTranslate((VGfloat)(IMG_WIDTH/2), (VGfloat)(IMG_HEIGHT/2));
vgRotate(fRotationAngle); // rotate curr Matrix by angle.
vgTranslate(-(VGfloat)(IMG_WIDTH/2), -(VGfloat)(IMG_HEIGHT/2));
/**************************/


// Draw the path with stroke and fill

vgDrawPath(m_vgPath, VG_STROKE_PATH | VG_FILL_PATH);



m_PrintVG.DisplayDefaultTitle("PatternFill+Rotate", "", ePVRTPrint3DLogoIMG);


return true;

}

------

Ivo Moravec
06-25-2009, 11:13 AM
Yeah. Don't do that.

This is what you should be doing:

At initialization time:
- create a VGImage - vgCreateImage()
- add the unrotated image data to it with vgImageSubData()

And in the render loop:
- set the matrix mode to VG_MATRIX_IMAGE_USER_TO_SURFACE
- do any rotates you want there (vgLoadIdentity, vgTranslate, vgRotate, etc)
- call vgDrawImage()

Only if you needed some independent movement of the texture from the geometry should you use vgDrawPath and a VG_PAINT_TYPE_PATTERN type paint (vgPaintPattern()) [this would allow you to set the rotations on the geometry and texture separately using the paint and path matrices].
In any event, you should never be doing the bitmap rotations yourself though (and especially not per frame).
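
The translate / rotate / translate sequence used with the matrix calls above can be checked on the CPU without any OpenVG context. The following is a minimal sketch in plain C++ under stated assumptions: `Mat3`, `multiply`, `translation`, `rotationDegrees`, `rotateAboutCenter`, `apply` and `checkCenterIsFixed` are hypothetical names for illustration only, and the composition order mirrors the way vgTranslate() and vgRotate() multiply the current matrix on the right, with angles in degrees.

```cpp
#include <cassert>
#include <cmath>

// Minimal 3x3 matrix, row-major, acting on column vectors (x, y, 1).
struct Mat3 { double m[3][3]; };

inline Mat3 multiply(const Mat3& a, const Mat3& b)
{
    Mat3 r{};  // zero-initialised accumulator
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

inline Mat3 translation(double tx, double ty)
{
    return Mat3{{{1, 0, tx}, {0, 1, ty}, {0, 0, 1}}};
}

inline Mat3 rotationDegrees(double degrees)
{
    const double r = degrees * 3.14159265358979323846 / 180.0;
    return Mat3{{{std::cos(r), -std::sin(r), 0},
                 {std::sin(r),  std::cos(r), 0},
                 {0, 0, 1}}};
}

// Same order as: vgLoadIdentity(); vgTranslate(cx, cy); vgRotate(deg); vgTranslate(-cx, -cy);
inline Mat3 rotateAboutCenter(double cx, double cy, double degrees)
{
    Mat3 m = translation(cx, cy);
    m = multiply(m, rotationDegrees(degrees));
    m = multiply(m, translation(-cx, -cy));
    return m;
}

// Transforms the point (x, y) by m.
inline void apply(const Mat3& m, double x, double y, double& outX, double& outY)
{
    outX = m.m[0][0] * x + m.m[0][1] * y + m.m[0][2];
    outY = m.m[1][0] * x + m.m[1][1] * y + m.m[1][2];
}

// Usage: the image center must stay where it is, whatever the angle.
inline void checkCenterIsFixed()
{
    double x = 0, y = 0;
    apply(rotateAboutCenter(256, 256, 30), 256, 256, x, y);
    assert(std::fabs(x - 256) < 1e-9 && std::fabs(y - 256) < 1e-9);
}
```

Because only this small matrix changes from frame to frame, the pixel data itself never has to be touched or re-uploaded to rotate the image.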

Ivo Moravec
06-25-2009, 11:19 AM
Oh, and btw, it's simpler to use vguRect() to make a rectangular path than using vgAppendPathData().

john_h
06-25-2009, 11:32 AM
Thank you for your suggestion !!

I got it. I will try the way you suggested and update.

Thanks.

john

john_h
06-30-2009, 12:50 PM
Hi Ivo,

I modified the code and tested it for the same images.

In the InitView,
- I created a vgImage one time using vgCreateImage() in sRGBA_8888 format
- loaded the first RGB 24-bit bmp image into CPU memory
- converted 24-bit rgb data into 32-bit rgba data
- transferred the 32-bit image data to GPU by using vgImageSubData()
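
The 24-bit to 32-bit conversion step can be sketched as below. This is not the code that was actually run: `packRgbToRgba` is a hypothetical helper, it assumes tightly packed R, G, B bytes with no row padding, and it uses the same channel layout as the PVRTRGBA macro in the earlier listing. Working on unsigned bytes avoids sign extension on platforms where plain `char` is signed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs tightly packed 24-bit R,G,B bytes into one 32-bit word per pixel,
// laid out as (R << 24) | (G << 16) | (B << 8) | A.
inline std::vector<std::uint32_t> packRgbToRgba(const std::vector<std::uint8_t>& rgb,
                                                std::size_t width, std::size_t height,
                                                std::uint8_t alpha = 255)
{
    assert(rgb.size() == width * height * 3);  // no row padding assumed
    std::vector<std::uint32_t> out(width * height);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::uint8_t* p = &rgb[3 * i];
        out[i] = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                 (std::uint32_t(p[2]) << 8) | std::uint32_t(alpha);
    }
    return out;
}
```

The resulting buffer has a stride of `width * sizeof(std::uint32_t)` bytes, which is the value the listing passes to vgImageSubData() as the data stride.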

In the render module,
i. set the matrix mode to VG_MATRIX_IMAGE_USER_TO_SURFACE
ii. set vgLoadIdentity() and did vgRotate()
iii. did vgDrawImage() for the first image
iv. loaded new bmp into CPU, convert to 32-bit rgba data, transfer to GPU, and repeated i~iii again

I executed the modified code on my beagleboard but got no speed-up. For a 512x512 24-bit bmp image, I got similar results:

- load bmp: 40msec
- convert to rgba 32 bit format: 13 msec
- vgImageSubData(): 500~512 msec

I understand that transferring a texture to the GPU side takes time, but 500 msec for each frame looks like too much. I may need to contact the hardware manufacturer. Thank you for your help, Ivo. Any feedback would be greatly appreciated.
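
Adding up the measured stages shows how far the upload dominates. A small sketch, assuming the three stages run back to back on every frame with no overlap; `achievableFps` is a hypothetical helper name.

```cpp
#include <cassert>

// Hypothetical helper: frames per second when load, convert and upload
// run one after another for every frame (no overlap assumed).
inline double achievableFps(double loadMs, double convertMs, double uploadMs)
{
    const double totalMs = loadMs + convertMs + uploadMs;
    assert(totalMs > 0.0);
    return 1000.0 / totalMs;
}
```

With the figures above, `achievableFps(40.0, 13.0, 500.0)` is 1000 / 553, or roughly 1.8 frames per second, and the upload alone accounts for about 90% of the frame time.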

john

Ivo Moravec
07-01-2009, 12:55 PM
Why are you trying to upload a new bmp on each frame when you said all you want to do is rotate it?
If that is all you are doing, you should not be doing step iv in the render loop at all (only steps i.-iii.).


i. set the matrix mode to VG_MATRIX_IMAGE_USER_TO_SURFACE
ii. set vgLoadIdentity() and did vgRotate()
iii. did vgDrawImage() for the first image
iv. loaded new bmp into CPU, convert to 32-bit rgba data, transfer to GPU, and repeated i~iii again

Are you trying to use OpenVG to play back decoded movie frames or something? (in which case, why do you need the rotate? If you don't need the rotate, try using vgWritePixels() and bypass VGImages altogether).
If you're cycling between only a few different bmp files, then load them all up into different VGImages, and just draw the correct one. vgImageSubData(), however, should not be called from within the rendering loop on every frame, and there is usually very little need to do so. (In general, it's not unusual for the bandwidth to the GPU to be quite limited, especially on embedded devices.)

Also for a speed up, try using nearest neighbor sampling on your image (create the image with VG_IMAGE_QUALITY_NONANTIALIASED) and see if that speeds things up for you any.

john_h
07-02-2009, 10:20 AM
Hi Ivo,

Thank you for your reply !


"Why are you trying to upload a new bmp on each frame when you said all you want to do is rotate it? If that is all you are doing, you should not be doing step iv in the render loop at all (only steps i.-iii.)."

Right. If I work on a single image, step iv will not be used.
The reason I am uploading new bmp images is that I need to work on an image sequence. My application will read a bmp image from storage, do some manipulations (such as rotation), and send it to the next module for further processing. This procedure should be performed for each image in real time (e.g. 30 frames/sec).
Doing all of this on an ARM-only platform cannot achieve our target frame rate. That's why I turned to the GPU.

The image sequences (size 512x512) are typically more than 10 minutes long, so it would not be possible to load all of them to the GPU side at the initial stage.
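
A quick estimate backs this up. The sketch below assumes 4 bytes per pixel after conversion, 30 frames/sec, and exactly 10 minutes of footage; `sequenceBytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to keep every frame of a sequence resident at once.
inline std::uint64_t sequenceBytes(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t bytesPerPixel,
                                   std::uint64_t framesPerSecond, std::uint64_t seconds)
{
    assert(bytesPerPixel > 0);
    return width * height * bytesPerPixel * framesPerSecond * seconds;
}
```

`sequenceBytes(512, 512, 4, 30, 600)` is 18,874,368,000 bytes: 18,000 frames of 1 MiB each, or about 17.6 GiB.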


"Also for a speed up, try using nearest neighbor sampling on your image (create the image with VG_IMAGE_QUALITY_NONANTIALIASED) and see if that speeds things up for you any."

Yes, I will try this sampling mode and compare the speed and quality.

john