Khronos Public Bugzilla
Bug 358 - CL_BUILD_PROGRAM_FAILURE while building simple program
Status: NEW
Product: OpenCL
Classification: Unclassified
Component: Sample Implementation
Version: 1.0
Hardware: PC Mac OS
Importance: P3 normal
Target Milestone: ---
Assigned To: Neil Trevett
QA Contact: OpenCL Working Group
Reported: 2010-09-15 23:58 PDT by Bogdan Opanchuk
Modified: 2010-09-15 23:58 PDT (History)

Attachments
C program which reproduces the bug (4.47 KB, application/octet-stream)
2010-09-15 23:58 PDT, Bogdan Opanchuk

Description Bogdan Opanchuk 2010-09-15 23:58:03 PDT
Created attachment 53
C program which reproduces the bug

Environment: iMac 2009, OSX 10.6.4, ATI Radeon 4670, OpenCL 1.0
Building the program below with clBuildProgram() fails with CL_BUILD_PROGRAM_FAILURE, and the build log is empty. Removing any single line of the source (or the definition of the empty fftFwd function) makes the error disappear. Moreover, the same program builds successfully on a MacBook 2009, OSX 10.6.4, GF9600.

The full text of the C program, which creates the context and builds the CL program, is attached. It produces the following output:
$ ./test
Failed to build program; error -11
FFT program build log on device Radeon HD 4670:
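
For reference, the -11 in the output above is the numeric value of CL_BUILD_PROGRAM_FAILURE in cl.h. A minimal sketch of a lookup helper that prints the name instead of the number could look like this; the name cl_error_name is hypothetical, only a handful of codes are covered, and the values are hard-coded so that the sketch compiles without the OpenCL headers:

```c
/* Hypothetical helper (not part of the attached program): maps a few
   OpenCL 1.0 error codes to the names they have in cl.h. */
static const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    default:  return "unknown OpenCL error"; /* code not in this short table */
    }
}
```

In a host program this would be used as printf("error %d (%s)\n", err, cl_error_name(err)); it only decodes the return value and does not explain why the build log is empty.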

The CL program in question:

__kernel void fftInv(__global float *in)
{
   __local float smem[768];
   size_t smem_store_index, smem_load_index;
   float2 a[8] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
   int thread_id = get_local_id(0);
   int blocks_num = get_num_groups(0);
   int jj = thread_id >> 4;
   smem_store_index = jj;

   a[0] = in[0];
   a[1] = in[64];
   a[2] = in[128];
   a[3] = in[192];
   a[4] = in[256];
   a[5] = in[320];

   smem_load_index = thread_id;

   a[0].x = smem[smem_load_index + 0];
   a[1].x = smem[smem_load_index + 1];
   a[2].x = smem[smem_load_index + 6];
   a[3].x = smem[smem_load_index + 7];
   a[4].x = smem[smem_load_index + 12];
   a[5].x = smem[smem_load_index + 13];
   a[6].x = smem[smem_load_index + 18];
   a[7].x = smem[smem_load_index + 19];

   barrier(CLK_LOCAL_MEM_FENCE);

   smem[smem_store_index + 0] = a[0].y;
   smem[smem_store_index + 3] = a[1].y;
   smem[smem_store_index + 6] = a[2].y;
   smem[smem_store_index + 9] = a[3].y;
   smem[smem_store_index + 12] = a[4].y;
   smem[smem_store_index + 15] = a[5].y;
   smem[smem_store_index + 18] = a[6].y;
   smem[smem_store_index + 21] = a[7].y;

   barrier(CLK_LOCAL_MEM_FENCE);

   smem[smem_load_index + 0] = a[0].x;
   smem[smem_load_index + 2] = a[2].x;
   smem[smem_load_index + 4] = a[4].x;
   smem[smem_load_index + 6] = a[6].x;
   smem[smem_load_index + 8] = a[1].x;
   smem[smem_load_index + 10] = a[3].x;
   smem[smem_load_index + 12] = a[5].x;
   smem[smem_load_index + 14] = a[7].x;
}

__kernel void fftFwd(__global float2 *in, __global float2 *out, int S)\
{

}