Thread: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

  1. #1
    Junior Member
    Join Date
    Mar 2011
    Posts
    4

    VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    Hi,

    I am John, and I am new to this forum. I want to start by saying that I love OpenGL ES. There are few things that I fall more in love with the more I use them; C and OpenGL ES are among them.

    Here is my question:
    I have been running a lot of tests to decide how I should set up my base rendering engine. Right now I am not using buffers, but I have been considering them for quite a while. I have found that glMapBufferOES is slower than glBufferSubData, which in turn is slower than glBufferData, and I am wondering why that is.

    My test environment:
    OpenGL ES 1.1
    iPad (first generation) - PowerVR SGX
    iPhone 3G - PowerVR MBX


    To my understanding, glBufferSubData should always be faster than glBufferData, because glBufferData reallocates the buffer's memory each time it is called. So if the size doesn't change, use glBufferSubData; otherwise use glBufferData. What I have found is that glBufferSubData runs at about 68% of the speed of using only glBufferData.

    I also understand that glMapBufferOES is an extension, but I have found that it runs slower than both glBufferSubData and glBufferData. It is, in fact, the slowest way of updating vertex information.

    Overall:
    iPad speed: glMapBufferOES < glBufferSubData < glBufferData <= No buffers
    iPhone speed: glMapBufferOES ≈ glBufferSubData ≈ glBufferData ≈ No buffers


    Is this normal?

    Thanks for your help!

  2. #2
    Senior Member
    Join Date
    May 2008
    Posts
    102

    Re: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    There might be inefficiencies in the implementation of glMapBufferOES and glBufferSubData on the iPad. If things are optimal I would expect:

    glMapBufferOES ? glBufferSubData ? glBufferData

    Note that some drivers optimize the case where glBufferData is called with the same size as the previous upload, which allows them to skip the re-allocation. You're likely hitting that optimized case. Otherwise, it would be much slower than glBufferSubData and glMapBufferOES. You can try removing or adding one vertex each frame, and I bet you'd see a difference.

  3. #3
    Junior Member
    Join Date
    Mar 2011
    Posts
    4

    Re: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    Thanks for the reply!

    I took your suggestion and randomized the quantity of data I sent to OpenGL (seeding prior to each test, of course). Unfortunately, I came up with the same results.

    Note: The values below are frames per second, averaged over 300 tests.

    Key:
    Code :
    I = Not buffered         P = Points     N = Not textured
    M = glMapBufferOES       Q = Quad       Y = Textured
    S = glBufferSubData
    V = glBufferData

    iPad

    Code :
    Count   IPN     IPY     IQN     IQY     MPN     MPY     MQN     MQY     SPN     SPY     SQN     SQY     VPN     VPY     VQN     VQY
    256     60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00
    384     60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00
    512     60.00   60.00   60.00   60.00   60.00   51.51   60.00   60.00   60.00   51.79   60.00   60.00   60.00   60.00   60.00   60.00
    768     60.00   53.95   60.00   60.00   60.00   40.53   60.00   56.76   60.00   40.54   60.00   57.18   60.00   54.07   60.00   60.00
    1024    60.00   42.78   60.00   60.00   60.00   33.26   60.00   48.07   60.00   33.41   60.00   48.31   60.00   42.74   60.00   60.00
    1532    60.00   29.96   60.00   48.71   60.00   24.60   60.00   36.71   60.00   24.69   59.87   36.72   60.00   30.09   60.00   48.23
    2048    60.00   23.13   60.00   37.88   54.96   19.55   51.04   29.70   55.59   19.65   51.45   29.77   60.00   23.00   60.00   37.86
    3072    60.00   15.85   60.00   26.11   43.30   13.82   39.87   21.40   43.71   13.87   39.97   21.35   60.00   15.81   60.00   26.18
    4096    54.32   12.09   49.16   18.79   35.81   10.74   32.81   16.83   36.28   10.78   33.06   16.82   54.30   12.08   49.17   19.98
    6144    40.51    8.19   33.97   12.68   26.44    7.40   23.86   11.71   26.76    7.43   24.09   11.74   40.36    8.18   36.56   13.60
    8192    31.51    6.19   26.08    9.58   21.02    5.65   19.00    9.06   21.27    5.67   19.21    9.08   31.47    6.19   28.72   10.27
    10500   25.23    4.85   20.42    7.53   17.02    4.46   15.12    7.14   17.28    4.47   15.31    7.18   25.14    4.86   22.84    8.05
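
    For reading the table, frame rates can be turned into frame times and relative speeds. The following is only a minimal sketch; the function names frame_ms and relative_speed are made up for illustration and are not part of the test code. Entries pinned at 60.00 are presumably capped by the display refresh, so ratios are probably only meaningful for rows below that ceiling.

```c
/* Frame time in milliseconds for a measured frame rate (fps must be > 0). */
double frame_ms(double fps)
{
    return 1000.0 / fps;
}

/* Speed of update path A relative to path B; 1.0 means both are equally fast. */
double relative_speed(double fps_a, double fps_b)
{
    return fps_a / fps_b;
}
```

    For example, relative_speed(17.28, 25.14) compares the SPN and VPN columns of the 10500 row and comes out at roughly 0.69, which is in line with the "about 68%" figure from the first post.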


    iPad (Count - rand() % 10)

    Code :
    Count   IPN     IPY     IQN     IQY     MPN     MPY     MQN     MQY     SPN     SPY     SQN     SQY     VPN     VPY     VQN     VQY
    256     60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00
    384     60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00   60.00
    512     60.00   60.00   60.00   60.00   60.00   51.56   60.00   60.00   60.00   51.89   60.00   60.00   60.00   60.00   60.00   60.00
    768     60.00   54.64   60.00   60.00   60.00   40.56   60.00   56.29   60.00   40.53   60.00   56.74   60.00   53.90   60.00   60.00
    1024    60.00   42.79   60.00   60.00   60.00   33.30   60.00   47.73   60.00   33.47   60.00   47.97   60.00   42.92   60.00   60.00
    1532    60.00   30.14   60.00   48.39   60.00   24.61   59.68   36.39   60.00   24.70   59.83   36.41   60.00   30.20   60.00   48.81
    2048    60.00   22.96   60.00   37.58   54.67   19.54   50.84   29.34   55.50   19.64   51.20   29.48   60.00   23.01   60.00   37.14
    3072    60.00   15.88   60.00   25.78   43.05   13.81   39.71   21.14   43.64   13.87   39.71   21.12   60.00   15.87   60.00   25.69
    4096    53.66   12.08   48.94   18.50   35.64   10.73   32.70   16.62   36.15   10.77   32.96   16.70   54.47   12.09   48.96   19.65
    6144    40.48    8.17   33.59   12.55   26.38    7.40   23.80   11.57   26.68    7.43   23.87   11.61   40.33    8.20   36.45   13.37
    8192    31.54    6.20   25.88    9.46   20.93    5.65   18.92    8.93   21.19    5.67   19.09    8.97   31.30    6.18   28.50   10.14
    10500   24.97    4.85   20.38    7.43   16.98    4.45   15.12    7.04   17.20    4.46   15.12    7.07   25.24    4.85   22.82    7.94


    iPhone

    Code :
    Count   IPN     IPY     IQN     IQY     MPN     MPY     MQN     MQY     SPN     SPY     SQN     SQY     VPN     VPY     VQN     VQY
    256     30.00   29.23   20.77   20.12   30.00   29.38   20.22   19.18   30.00   29.24   20.22   19.29   30.00   29.33   20.11   20.18
    384     30.00   27.21   14.82   14.47   30.00   27.29   15.60   14.46   30.00   27.36   15.41   14.36   30.00   27.12   14.84   14.38
    512     30.00   24.60   12.11   11.32   30.00   24.77   12.09   11.29   30.00   24.70   12.10   11.28   30.00   24.63   12.08   11.26
    768     28.61   18.41    8.21    7.74   28.58   18.63    8.22    7.74   28.64   18.30    8.21    7.73   28.66   18.30    8.20    7.72
    1024    27.97   17.07    6.17    5.85   27.67   17.06    6.17    5.84   27.36   17.08    6.15    5.81   27.29   16.99    6.15    5.81
    1532    26.06   12.54    4.06    3.86   25.70   12.53    4.06    3.86   26.19   12.55    4.06    3.85   25.69   12.47    4.05    3.84
    2048    21.28    9.79    2.94    2.80   21.40    9.80    2.94    2.79   21.40    9.77    2.93    2.79   21.26    9.76    2.91    2.79

  4. #4
    Junior Member
    Join Date
    Mar 2011
    Posts
    4

    Re: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    Whoops, I hit submit instead of preview and it won't let me edit. I was trying to align the tables properly.

    *I should also note that 'quads' refers to GL_TRIANGLE_STRIP with 6 * count - 2 vertices.*
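
    The 6 * count - 2 figure is consistent with one common way of packing quads into a single strip: 4 vertices per quad, plus 2 repeated (degenerate) vertices stitching each quad to the next. That stitching scheme is an assumption here, since the post does not show how the strip is built, and the helper name below is hypothetical.

```c
#include <stddef.h>

/* Number of vertices in one GL_TRIANGLE_STRIP holding `quads` separate quads,
 * ASSUMING 4 vertices per quad and 2 degenerate vertices between neighbours. */
size_t strip_vertex_count(size_t quads)
{
    if (quads == 0) {
        return 0;                        /* nothing to draw */
    }
    return 4 * quads + 2 * (quads - 1);  /* algebraically 6 * quads - 2 */
}
```

    Under that assumption, the 256-quad row corresponds to 1534 vertices per draw call.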

    I am just really confused about why glBufferSubData could ever be slower than glBufferData.

  5. #5
    Senior Member
    Join Date
    May 2006
    Posts
    353

    Re: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    Quote Originally Posted by John
    I am just really confused about why glBufferSubData could ever be slower than glBufferData.
    Are you replacing the entire buffer contents or just a subset?
    Georg Kolling, Imagination Technologies
    Please ask questions specific to PowerVR hardware or SDKs on the PowerVR Insider Forum
    DevTech@imgtec.com | http://www.powervrinsider.com

  6. #6
    Junior Member
    Join Date
    Mar 2011
    Posts
    4

    Re: VBO Test - glBufferData vs glBufferSubData vs glMapBufferOES

    Whoops, sorry, I forgot to explain.

    The count in my tests is the quantity that I am actually updating and drawing. The real amount in memory is always the "next power of two".

    Ex:
    Count (update and send) -> in Memory
    256 -> 256
    384 -> 512
    512 -> 512
    768 -> 1024
    1024 -> 1024
    ...
    etc.

    info->_sendCount = count;
    info->_maxCount = nextPowerOfTwo(count);
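
    The nextPowerOfTwo helper itself is not shown in the thread. A minimal sketch of what such a helper might look like follows; the name next_power_of_two and the handling of out-of-range input are assumptions, not the original implementation.

```c
/* Smallest power of two that is >= n.
 * Returns 1 for n == 0 and 0 if the result would not fit in 32 bits.
 * Hypothetical stand-in for the nextPowerOfTwo() used in the post. */
unsigned int next_power_of_two(unsigned int n)
{
    unsigned int p = 1;

    if (n > 0x80000000u) {
        return 0;          /* would overflow: no representable power of two */
    }
    while (p < n) {
        p <<= 1;           /* double until p reaches or passes n */
    }
    return p;
}
```

    With this sketch, 384 maps to 512 and 768 maps to 1024, matching the mapping listed above.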

    Code :
    void render()
    ...
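    /* Capacity (next power of two) changed: reallocate the store and upload the whole array. */
    if (info->_lastMaxCount != info->_maxCount)
    {
        info->_lastMaxCount = info->_maxCount;
        glBufferData(GL_ARRAY_BUFFER, dataSize * info->_maxCount, info->_vertices, GL_DYNAMIC_DRAW);
    }
    /* Capacity unchanged: overwrite only the part that is actually drawn. */
    else
    {
        glBufferSubData(GL_ARRAY_BUFFER, 0, dataSize * info->_sendCount, info->_vertices);
    }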
    ...
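
    To see how many bytes each branch hands to the driver, the decision above can be separated from the GL calls. This is only a sketch: the UploadPlan type, the plan_upload function, and the 16-byte vertex size used in the note below are hypothetical and do not come from the original test code.

```c
#include <stddef.h>

typedef enum { UPLOAD_REALLOCATE, UPLOAD_SUBRANGE } UploadKind;

typedef struct {
    UploadKind kind;   /* which GL call the branch would issue */
    size_t     bytes;  /* how many bytes that call would upload */
} UploadPlan;

/* Mirrors the if/else in render(): reallocate when the capacity changed,
 * otherwise update only the range that is drawn. Updates *last_max_count
 * the same way the original code updates info->_lastMaxCount. */
UploadPlan plan_upload(size_t *last_max_count, size_t max_count,
                       size_t send_count, size_t data_size)
{
    UploadPlan plan;

    if (*last_max_count != max_count) {
        *last_max_count = max_count;
        plan.kind  = UPLOAD_REALLOCATE;       /* glBufferData path */
        plan.bytes = data_size * max_count;
    } else {
        plan.kind  = UPLOAD_SUBRANGE;         /* glBufferSubData path */
        plan.bytes = data_size * send_count;
    }
    return plan;
}
```

    With an assumed 16-byte vertex and a count of 384, the first call plans a full upload of 8192 bytes (capacity 512) and every later call plans a 6144-byte sub-range update. Note that the glBufferData branch reads dataSize * _maxCount bytes from info->_vertices, so that array has to be at least as large as the capacity.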

    This should give glBufferSubData an advantage on half of the tests (unless glBufferData is optimized as jpilon mentioned). Even if it is the full buffer, though, shouldn't it be at least equal in speed?

    If you want, I can post the code; I don't mind sharing.
