Search:

Type: Posts; User: Maxim Milakov

  1. CL_MAP_WRITE: avoid copying data from device to host

    - The cl_map_flags enumeration should be extended with an additional value: CL_MAP_FULL. This value means that the user plans to operate on all values of the buffer being mapped.
    - map_flags...
  2. Replies
    4
    Views
    1,446

    Re: async_work_group_strided_copy

    Thanks a lot! My submatrix width is variable; right now it is 5 in one kernel and 6 in another. I don't think there are built-in types with such a width.

    I already tried using async_work_group_copy...
  3. Replies
    4
    Views
    1,446

    async_work_group_strided_copy

    Is anyone using this function? Or does anyone understand what exactly it does?

    I have a matrix in global memory:


    ooooooooooo
    ooooooooooo
    ooooXXXXXoo
    ooooXXXXXoo
    ooooXXXXXoo
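To make the question concrete, here is a minimal plain-C sketch of the access pattern, assuming the OpenCL 1.1 description of the global-to-local form of async_work_group_strided_copy (a gather in which src_stride is the distance, in elements, between consecutive elements read from src). The helper names, the float element type, and the submatrix height of 3 are assumptions for illustration; this emulates the semantics on the host and is not the built-in itself.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C emulation of a strided gather: dst[i] = src[i * src_stride]. */
static void strided_gather(float *dst, const float *src,
                           size_t num_elements, size_t src_stride)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < num_elements; ++i)
        dst[i] = src[i * src_stride];
}

/* Hypothetical helper: copy a sub_w x sub_h submatrix column by column.
   Each column is one strided gather whose stride is the full row width.
   The result is stored column-major in dst. */
static void gather_submatrix(float *dst, const float *matrix, size_t row_width,
                             size_t row0, size_t col0,
                             size_t sub_w, size_t sub_h)
{
    for (size_t c = 0; c < sub_w; ++c)
        strided_gather(dst + c * sub_h,
                       matrix + row0 * row_width + col0 + c,
                       sub_h, row_width);
}
```

For the matrix drawn above (11 columns, submatrix starting at row 2, column 4, width 5), a call such as gather_submatrix(sub, m, 11, 2, 4, 5, 3) would issue five gathers, one per column. Under this reading, a strided copy walks down a column, so a row of the submatrix, being contiguous, does not need a stride at all.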
  4. Replies
    7
    Views
    2,274

    Re: OCL programm without *.cl file

    [Non-English reply; the original text was lost to a character-encoding error.]
  5. Replies
    7
    Views
    2,274

    Re: OCL programm without *.cl file

    You don't need an external file. Check clCreateProgramWithSource: it takes the kernel source as a string parameter, and clBuildProgram then compiles the resulting program.
  6. Replies
    7
    Views
    2,274

    Re: OCL programm without *.cl file

    source_str = "__kernel void your_kernel_name(parameters) { code }";
  7. Replies
    7
    Views
    2,274

    Re: OCL programm without *.cl file

    Of course, it can. It is obvious even from the code you supplied. Just initialize source_str the way you wish.
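As a rough sketch of what "initialize source_str the way you wish" can look like, the kernel text below lives in the host program as an ordinary string literal. The kernel name scale, its body, and the helper kernel_source_length are placeholders invented for this example; only the host-side string handling is shown, so it compiles without the OpenCL headers.

```c
#include <assert.h>
#include <string.h>

/* Kernel source kept in the host program as a string literal.
   The kernel name and body are placeholders for illustration. */
static const char *source_str =
    "__kernel void scale(__global float *data, float factor)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    data[i] *= factor;\n"
    "}\n";

/* Length of the source, as could be passed in the `lengths` argument of
   clCreateProgramWithSource(context, 1, &source_str, &length, &err). */
static size_t kernel_source_length(void)
{
    assert(source_str != NULL);
    return strlen(source_str);
}
```

The string and its length are what the program-creation call consumes; where the characters come from (a file, a literal, or text generated at run time) makes no difference to the OpenCL runtime.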
  8. Re: Some newbie questions about workitems and workgroup size

    > This is how it works: each compute unit in your hardware can execute one work-group at a time.

    Actually, no. AMD and NVIDIA GPUs run several work-groups on a single compute unit at the same time.
  9. Replies
    4
    Views
    1,857

    Re: __constant array initialization

    4xxx were not designed with OpenCL in mind. 5xxx (and later) were.
  10. Replies
    4
    Views
    1,857

    Re: __constant array initialization

    > My GPU is 4850, Local memory works, but limits max

    Actually, no. Local memory is emulated in global memory on ATI 4xxx devices. You can figure this out by querying the device for...
  11. Re: Performance on APU with different buffer creation strate

    Yes.
  12. Re: Performance on APU with different buffer creation strate

    You are using the AMD OpenCL driver, right? Then you need to check the AMD APP OpenCL Programming Guide; zero-copy buffers are thoroughly covered in that document.
  13. Replies
    17
    Views
    3,262

    Re: Buffer with USE_HOST_PTR doesn't work

    I checked Intel's OpenCL Optimization tutorial. It says:


    //min alignment query returns value in bits
    cl_uint min_align = 0; clGetDeviceInfo(g_dev, CL_DEVICE_MEM_BASE_ADDR_ALIGN…, &min_align,…);...
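Since CL_DEVICE_MEM_BASE_ADDR_ALIGN is reported in bits, the value has to be converted to bytes before it is used for a host allocation. The sketch below shows one way that might be done with C11 aligned_alloc; the helper names and the example value of 1024 bits are assumptions, not values taken from the tutorial or from any particular device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert an alignment reported in bits to bytes. */
static size_t align_bits_to_bytes(unsigned int align_bits)
{
    assert(align_bits % 8 == 0);
    return align_bits / 8;
}

/* Allocate at least `size` bytes aligned to `align_bytes` (a power of two).
   aligned_alloc (C11) expects the size to be a multiple of the alignment,
   so the request is rounded up first. Release the result with free(). */
static void *alloc_aligned(size_t size, size_t align_bytes)
{
    assert(align_bytes > 0);
    size_t padded = (size + align_bytes - 1) / align_bytes * align_bytes;
    return aligned_alloc(align_bytes, padded);
}
```

With an assumed query result of 1024 bits, align_bits_to_bytes(1024) gives 128 bytes, and a buffer obtained from alloc_aligned(size, 128) could then be handed to clCreateBuffer together with CL_MEM_USE_HOST_PTR.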
  14. Replies
    17
    Views
    3,262

    Re: Buffer with USE_HOST_PTR doesn't work

    Oh, yeah. You need to create bs as "new cl_int4[2000]";
  15. Replies
    17
    Views
    3,262

    Re: Buffer with USE_HOST_PTR doesn't work

    Well, I don't see any other problems in this code. Maybe you removed the wrong code when removing implementation details.
  16. Replies
    17
    Views
    3,262

    Re: Buffer with USE_HOST_PTR doesn't work

    Well, your kernel is indeed not initialized.

    You call setArg on scaleKernel, but you are executing testKernel.
  17. Replies
    17
    Views
    3,262

    Re: Buffer with USE_HOST_PTR doesn't work

    Most probably you get an access violation when calling clWaitForEvents, for two reasons:
    1) You don't check the error codes (in the 'status' variable).
    2) 2064 is not evenly divisible by 64.
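The second point can be checked on the host before the kernel is enqueued. In OpenCL 1.x the global work size must be a whole multiple of the local work size; otherwise the enqueue call reports CL_INVALID_WORK_GROUP_SIZE, which goes unnoticed if the status code is ignored. A minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when global_size splits into whole work-groups of local_size,
   and 0 otherwise. */
static int divides_evenly(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    return global_size % local_size == 0;
}
```

For the sizes mentioned above, 2064 % 64 leaves a remainder of 16, so divides_evenly(2064, 64) is 0, while 2048 or 2112 would pass.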
  18. Replies
    4
    Views
    1,326

    Re: opencl questions !!

    > Just one quick question: do I have to write precisely in the portable version to be able to run on both GPUs? Or can I open it in the regular version installed on another PC?

    There is no any...
  19. Replies
    2
    Views
    909

    Re: local_work_size question

    No, the local work size is actually the size of a work-group. A work-group is a bunch of work-items which share the same local buffers.
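The relationship between the two sizes can be sketched in plain C. Assuming a one-dimensional range with no global offset, a work-item's global ID splits into the index of its work-group and its position inside that group; the struct and function names here are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Position of a work-item: which work-group it belongs to and where it
   sits inside that group. */
typedef struct {
    size_t group_id;
    size_t local_id;
} work_item_pos;

/* Mirrors what get_group_id(0) and get_local_id(0) would report for a
   1-D range with zero global offset. */
static work_item_pos locate(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    work_item_pos pos;
    pos.group_id = global_id / local_size;
    pos.local_id = global_id % local_size;
    return pos;
}
```

With a local work size of 64, the work-item with global ID 130 lands in work-group 2 at local position 2, next to the other 63 items that share that group's local buffers.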
  20. Replies
    5
    Views
    1,779

    Re: casting char* to struct

    The OpenCL extension cl_khr_byte_addressable_store removes certain restrictions on built-in types char, uchar, char2, uchar2, short, and half. An application that wants to be able to write to...
  21. Replies
    2
    Views
    1,433

    Re: Too many images?

    CL_DEVICE_MAX_WRITE_IMAGE_ARGS
  22. Replies
    12
    Views
    2,481

    Re: Workgroups and global IDs

    One more piece of advice: if you need to enqueue a bunch of kernels, it is better to issue all the enqueueNDRangeKernel calls first and only then call flush (or finish).
  23. Replies
    12
    Views
    2,481

    Re: Workgroups and global IDs

    1) 1000 work-items is most probably not enough to fully load high-end GPUs.
    2) Make the global work size a multiple of 128. So if it is 10,000, change it to 10,112. In fact, you can play here...
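The rounding in point 2 is a standard round-up to the next multiple. A small sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the nearest multiple of `multiple`. */
static size_t round_up(size_t value, size_t multiple)
{
    assert(multiple > 0);
    return (value + multiple - 1) / multiple * multiple;
}
```

round_up(10000, 128) gives 10112, matching the figure above. The padded range contains 112 extra work-items, so the kernel would typically need a guard (for example, returning early when get_global_id(0) is at or beyond the real element count) to keep them from touching memory past the end of the data.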
  24. Replies
    12
    Views
    2,481

    Re: Workgroups and global IDs

    What is the global work size?
  25. Re: How does the OpenCL Platform model map to actual hardwar

    PREFERRED_WORK_GROUP_SIZE_MULTIPLE was introduced in OpenCL 1.1. It is better to make the work-group size a multiple of this value to avoid wasting the device's resources. For GPU this value shows...
Results 1 to 25 of 42