address space qualifiers and pointers to arrays

romein
11-09-2012, 03:03 AM
The following code works on Nvidia, but the AMD and Intel OpenCL compilers complain about the absence of a valid address space qualifier:


__kernel void foo(__global float (*pointer_to_matrix)[3][4])
{
}

I think this code is legal. Any opinion?

utnapishtim
11-09-2012, 03:46 AM
According to the OpenCL specification: "arguments to kernel functions in a program cannot be declared as a pointer to a pointer".

romein
11-09-2012, 03:57 AM
But this is not a pointer to a pointer, this is a pointer to a 3x4 matrix (i.e., a pointer to a memory area holding 12 consecutive floats).
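
To make the distinction concrete, here is a minimal sketch in plain C (no __kernel or __global), assuming the usual C99 type rules that OpenCL C builds on. The helper names are hypothetical and only serve the illustration. It compares what each kind of pointer points at: a pointer to a 3x4 matrix points at 12 floats, while a pointer to a pointer points at a single pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of the object that a pointer to a 3x4
   matrix points at. sizeof does not evaluate its operand here, so the
   null pointer is never dereferenced. */
size_t pointee_size_matrix(void)
{
    float (*pointer_to_matrix)[3][4] = NULL;
    return sizeof(*pointer_to_matrix);
}

/* Hypothetical helper: size of the object that a pointer to a pointer
   points at, which is just one float pointer. */
size_t pointee_size_ptrptr(void)
{
    float **pointer_to_pointer = NULL;
    return sizeof(*pointer_to_pointer);
}

/* Self-check of the two claims above. */
void check_pointee_sizes(void)
{
    assert(pointee_size_matrix() == 12 * sizeof(float));
    assert(pointee_size_ptrptr() == sizeof(float *));
}
```

So, at least under these assumptions, the declaration contains only one level of indirection.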

utnapishtim
11-13-2012, 05:35 AM
When you declare "float matrix[3][4]", matrix is a pointer to an array of 12 floats.
So "float (*pointer_to_matrix)[3][4]" is obviously a pointer to a pointer to floats.

If you intend to use a pointer to an array of 12 floats, you should declare your kernel as

__kernel void foo(__global float matrix[3][4])
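
For reference, a small plain C sketch of how an array-typed parameter is treated, assuming the C99 parameter adjustment rule carries over to OpenCL C (the helper names are hypothetical). A parameter written as float matrix[3][4] is adjusted to float (*matrix)[4], so it points at one row of 4 floats; the explicit float (*pointer_to_matrix)[3][4] form points at the whole 3x4 array. Neither is a pointer to a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: inside the function, matrix has type
   float (*)[4], so *matrix is a single row of 4 floats.
   sizeof does not evaluate its operand, so passing NULL is harmless. */
size_t row_bytes(float matrix[3][4])
{
    return sizeof(*matrix);
}

/* Hypothetical helper: here the pointee is the full 3x4 array. */
size_t whole_bytes(float (*pointer_to_matrix)[3][4])
{
    return sizeof(*pointer_to_matrix);
}

/* Self-check of the two sizes. */
void check_parameter_types(void)
{
    assert(row_bytes(NULL) == 4 * sizeof(float));
    assert(whole_bytes(NULL) == 12 * sizeof(float));
}
```

Both forms address the same 12 consecutive floats when given the same starting address; only the type of the pointee differs.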

romein
11-14-2012, 08:27 PM
Both declarations declare a pointer to an array of 12 floats, even though the syntax for dereferencing the pointer is different.

Consider the five examples below. They are fully equivalent and should lead to the same generated binary code: remove the __kernel and __global, and each one is legal C code, for which an optimizing C compiler will generate exactly the same instructions in all five cases.


__kernel void foo1(__global float *ptr) // ok on NVIDIA, AMD, and Intel
{
    ptr[6] = 42;
}

__kernel void foo2(__global float ptr[12]) // gives error on Intel
{
    ptr[6] = 42;
}

__kernel void foo3(__global float (*ptr)[12]) // gives error on AMD and Intel
{
    (*ptr)[6] = 42;
}

__kernel void foo4(__global float ptr[3][4]) // gives error on AMD and Intel
{
    ptr[1][2] = 42;
}

__kernel void foo5(__global float (*ptr)[3][4]) // gives error on AMD and Intel
{
    (*ptr)[1][2] = 42;
}

As far as I can see, none of these constructs is forbidden by the OpenCL specification, so I think that all five functions should be accepted by the compiler (only NVIDIA does so).
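
The equivalence claim can be checked on the host with a rough plain C sketch: the five functions with __kernel and __global removed, plus a hypothetical union and helper (written_index, not part of the original code) that report which element of a zeroed 12-float buffer each variant writes. This only shows that all five store to the same location; it does not inspect the generated instructions.

```c
#include <string.h>

/* foo1..foo5 from the post, with __kernel and __global removed. */
void foo1(float *ptr)         { ptr[6] = 42; }
void foo2(float ptr[12])      { ptr[6] = 42; }
void foo3(float (*ptr)[12])   { (*ptr)[6] = 42; }
void foo4(float ptr[3][4])    { ptr[1][2] = 42; }
void foo5(float (*ptr)[3][4]) { (*ptr)[1][2] = 42; }

/* Hypothetical helper type: the same 12 floats, viewed flat or as 3x4. */
union buffer {
    float flat[12];
    float matrix[3][4];
};

/* Hypothetical helper: run one variant on a zeroed buffer and return the
   flat index of the single element it set to 42, or -1 otherwise. */
int written_index(int variant)
{
    union buffer b;
    int i, found = -1;

    memset(&b, 0, sizeof b);
    switch (variant) {
    case 1: foo1(b.flat);    break;
    case 2: foo2(b.flat);    break;
    case 3: foo3(&b.flat);   break;
    case 4: foo4(b.matrix);  break;
    case 5: foo5(&b.matrix); break;
    default: return -1;
    }
    for (i = 0; i < 12; i++) {
        if (b.flat[i] != 0.0f) {
            if (b.flat[i] != 42.0f || found != -1)
                return -1;   /* unexpected value or more than one write */
            found = i;
        }
    }
    return found;
}
```

Calling written_index with 1 through 5 yields the same flat index for every variant, since ptr[1][2] in a 3x4 layout is element 1*4 + 2 = 6.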