Thread: how to use OpenCL to process video sequence one by one?

  1. #1
    Junior Member
    Join Date
    Mar 2011
    Posts
    2

    how to use OpenCL to process video sequence one by one?

    Hello everyone,
    I am an OpenCL learner, and I want to process YUV data (format 4:1:1, planar) with the full-search motion estimation algorithm. The test video is CITY.yuv (size: 352 x 288).
    I know how to process one frame or image in one command queue, but I don't know how to process a frame sequence one by one with OpenCL.
    What I tried is a single for-loop over all the frames in the video. The code is as follows:


    // --------------------------------------------------------
    // Core sequence: copy input data to the GPU, compute, copy results back.
    // The loop is capped at 1000 frames and stops early at end of file.
    for (num = 0; num < 1000; num++)
    {
        int i, j;
        // Read the luma plane of frame num+1 and copy it into frame_original
        fseek(fp_cur, XX*YY*3/2*(num+1), SEEK_SET);
        if (fread(current_frame[0], XX*YY, 1, fp_cur) == 0) break;
        memcpy(frame_original, current_frame[0], width * height * sizeof(cl_uchar));
        // Read the luma plane of frame num and copy it into frame_ref
        fseek(fp_ref, XX*YY*3/2*(num+0), SEEK_SET);
        if (fread(ref_frame[0], XX*YY, 1, fp_ref) == 0) break;
        memcpy(frame_ref, ref_frame[0], width * height * sizeof(cl_uchar));

        // Blocking write of the current frame to the GPU device
        ciErrNum = clEnqueueWriteBuffer(cqCommandQue, cm_original, CL_TRUE, 0, width*height*sizeof(cl_uchar), frame_original, 0, NULL, &H2D);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

        // Non-blocking write of the reference frame to the GPU device
        ciErrNum = clEnqueueWriteBuffer(cqCommandQue, cm_ref, CL_FALSE, 0, width*height*sizeof(cl_uchar), frame_ref, 0, NULL, NULL);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

        // Create the kernel
        ckKernel = clCreateKernel(cpProgram, "motion_estimation", &ciErrNum);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);
        // Set the global size: one work-item per block in the frame
        global[0] = width/block_size;
        global[1] = height/block_size;

        // Set the kernel arguments (all three are buffer objects, so each has size sizeof(cl_mem))
        int n = 0;
        printf("clSetKernelArg...\n\n");
        ciErrNum  = clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_original);
        ciErrNum |= clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_ref);
        ciErrNum |= clSetKernelArg(ckKernel, n++, sizeof(cl_mem), (void*)&cm_mv_output);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

        // Launch kernel
        ciErrNum = clEnqueueNDRangeKernel(cqCommandQue, ckKernel, 2, NULL, global, NULL, 0, NULL, &ceEvent);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);

        // Wait for the commands to finish
        clFinish(cqCommandQue);

        // Blocking read of the motion vectors back to the host
        ciErrNum = clEnqueueReadBuffer(cqCommandQue, cm_mv_output, CL_TRUE, 0, X*Y*sizeof(MV), MV_output, 0, NULL, &GPUDone);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);
        // Release event
        ciErrNum = clReleaseEvent(ceEvent);
        shrCheckErrorEX (ciErrNum, CL_SUCCESS, pCleanup);
        ceEvent = 0;
    }
    // Clean up once, after the last frame; inside the loop this would end the run after the first frame
    Cleanup (EXIT_SUCCESS);

    I know there must be a better way to process the YUV data. If you have any ideas or opinions, please let me know.
    Thanks for your help.

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: how to use OpenCL to process video sequence one by one?

    Each iteration of the loop does two things near the end:

    a) Calls clFinish(), which is not necessary.
    b) Makes a blocking call to clEnqueueReadBuffer().

    This means that each iteration of the loop submits a very small amount of work (CIF resolution) to the GPU and then waits until it's done. It's not the most suitable way to submit work to a GPU.

    There are several things you could try, such as using double or triple buffering, or submitting multiple frames' worth of data in each NDRangeKernel, given that CIF is so small.
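
    To make the multi-frame idea concrete, here is a minimal plain-C sketch of the host-side bookkeeping. It is not code from this thread. It assumes 8-bit planar YUV with 3/2 bytes per pixel per frame (as in the fseek() calls in the first post) and a kernel that takes the frame index as a third NDRange dimension. The names frame_offset, batch_global_size and load_luma_batch are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Byte offset of frame n in a planar YUV file with 3/2 bytes per pixel. */
static long frame_offset(int width, int height, int n)
{
    return (long)width * height * 3 / 2 * n;
}

/* Global work size for one batch: one work-item per block,
 * with the third dimension selecting the frame inside the batch. */
static void batch_global_size(int width, int height, int block_size,
                              int frames_per_batch, size_t global[3])
{
    assert(block_size > 0 && frames_per_batch > 0);
    global[0] = (size_t)(width / block_size);
    global[1] = (size_t)(height / block_size);
    global[2] = (size_t)frames_per_batch;
}

/* Read the luma planes of up to `count` consecutive frames, starting at
 * frame `first`, into dst, packed back to back.
 * Returns the number of complete luma planes read. */
static int load_luma_batch(FILE *fp, int width, int height, int first,
                           int count, unsigned char *dst)
{
    size_t luma = (size_t)width * height;
    int i;
    for (i = 0; i < count; i++) {
        if (fseek(fp, frame_offset(width, height, first + i), SEEK_SET) != 0)
            break;
        if (fread(dst + (size_t)i * luma, 1, luma, fp) != luma)
            break;
    }
    return i;
}
```

    With this layout a single write would upload the whole packed buffer, and the kernel could use get_global_id(2) to pick its frame. Since each frame is matched against the previous one, a batch of N motion-estimation results needs N+1 luma planes.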

    Also, do not call clCreateKernel() in each iteration. Create the kernel outside of the loop and reuse it.
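
    Putting the suggestions in this reply together, the loop structure might look like the sketch below. To keep it compilable without an OpenCL SDK, the OpenCL calls are replaced by hypothetical stubs: create_kernel stands in for clCreateKernel(), submit_frame for the non-blocking write, kernel launch and non-blocking read of one frame, and wait_frame for clWaitForEvents() on that frame's read event. The Trace type only records the order of operations. It assumes one set of host and device buffers per slot, so that a frame's data is not overwritten while it is still in flight.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_SLOTS 2      /* 2 = double buffering, 3 = triple buffering */
#define MAX_LOG   256

/* Records the order of operations so the schedule can be inspected. */
typedef struct {
    char log[MAX_LOG];
    int  len;
    int  kernels_created;
} Trace;

static void trace_reset(Trace *t) { memset(t, 0, sizeof *t); }

static void trace_op(Trace *t, char op, int frame)
{
    if (t->len < MAX_LOG - 16)   /* leave room for one more entry */
        t->len += snprintf(t->log + t->len, (size_t)(MAX_LOG - t->len),
                           "%c%d ", op, frame);
}

/* Hypothetical stand-in for clCreateKernel(). */
static void create_kernel(Trace *t) { t->kernels_created++; }

/* Hypothetical stand-in for enqueueing one frame without blocking:
 * write inputs, launch the kernel, read results into the slot's buffers. */
static void submit_frame(Trace *t, int frame, int slot)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    trace_op(t, 'S', frame);
}

/* Hypothetical stand-in for waiting on the frame's read event
 * and then consuming its motion vectors. */
static void wait_frame(Trace *t, int frame, int slot)
{
    assert(slot == frame % NUM_SLOTS);
    trace_op(t, 'W', frame);
}

static void process_sequence(Trace *t, int num_frames)
{
    int n;
    create_kernel(t);                 /* once, outside the loop */
    for (n = 0; n < num_frames; n++) {
        int slot = n % NUM_SLOTS;
        if (n >= NUM_SLOTS)           /* slot must be free before reuse */
            wait_frame(t, n - NUM_SLOTS, slot);
        submit_frame(t, n, slot);     /* no clFinish(), no blocking read */
    }
    /* Drain the frames that are still in flight. */
    for (n = (num_frames > NUM_SLOTS ? num_frames - NUM_SLOTS : 0);
         n < num_frames; n++)
        wait_frame(t, n, n % NUM_SLOTS);
}
```

    The host only waits for frame n-2 before submitting frame n, so the queue always has work pending. How much transfer and compute actually overlap depends on the device, the driver and the queue setup.
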
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Junior Member
    Join Date
    Mar 2011
    Posts
    2

    Re: how to use OpenCL to process video sequence one by one?

    Thanks for your advice, david.garcia.
    I will try it soon.
