
Thread: OpenCL memory model

  1. #1
    Junior Member
    Join Date
    Sep 2008
    Posts
    11

    OpenCL memory model

    OpenCL describes a "relaxed" memory model.
    Where can I find more information on this?

    Specifically, what is the expected architecture between the host and the device?
    Is it anticipated to be a pipeline over a bus, or is actual shared memory expected
    between the host and the device?


    Is there any provision to take advantage of true shared memory between the host and the device?
    Do the clEnqueueMapBuffer() and clEnqueueUnmapMemObject() functions work with actual shared memory between the host and device?
    Would this eliminate the need for block copies using the read and write functions?

    Please refer to slide 19 in the OpenCL overview:
    http://www.khronos.org/developers/libra ... erview.pdf
    I am curious to know if the red block labeled "Global Memory" is on the host or on the device,
    and what the bidirectional arrow between the "Compute Device" block and the "Compute Device Memory" block represents.


    Thank you

  2. #2
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: OpenCL memory model

    I would suggest looking at the OpenCL spec. It covers a fair amount of detail as to what the memory model consists of.

    With regard to performance optimizations for mapping vs. reading, those are up to the vendors to implement for their platforms as needed, so the spec doesn't really say anything about them.

  3. #3
    Junior Member
    Join Date
    Feb 2010
    Posts
    2

    Re: OpenCL memory model

    re: dbs2's comment that the spec "covers a fair amount of detail"

    Perhaps it covers a fair amount of detail, but it is not sufficient.

    In particular, for the operations relating to CL_MEM_USE_HOST_PTR and
    CL_MEM_COPY_HOST_PTR, the spec does not state the obligation of the
    implementation with respect to updates made to the host buffer on the
    host side of a host-GPU interface, nor does it state adequately or
    unambiguously the obligation of the implementation with respect to
    updates made on the GPU side to the memory object.

    This should be evident from the number of queries in these forums relating
    to CL_MEM_USE_HOST_PTR and the like.

    Specifying a memory consistency model is hard. I know, because I've been involved in several such efforts. The spec needs to avoid assumptions around the interpretation of words like "MAP" or even "READ" and "WRITE." For instance:

    1. In a clEnqueueMapBuffer operation, if the mapped address region pointed to by the return value is updated by the host [before the call, during the map operation, after the map operation completes], what is the obligation of the implementation to reflect that update in the GPU's global memory?

    2. Similar to case 1, if the GPU makes an update to a mapped region, what is the obligation of the implementation to update the host memory if the update happens [before, during, after] the call to clEnqueueMapBuffer?

    3. If CL_MEM_COPY_HOST_PTR is set for a clCreateBuffer call, when does the copy operation actually take place? What are the obligations of the implementation relative to host or device updates to the memory object? (Again, this may need to be specified in terms of before, after, and during various other operations.)

    The specification needs work.

    How are specification issues actually resolved? Who is the source of definitive
    interpretations? Why isn't this documented in the specification itself?

  4. #4
    Senior Member
    Join Date
    Jul 2009
    Location
    Northern Europe
    Posts
    311

    Re: OpenCL memory model

    I believe the answers to your questions are:
    1: before: undefined; during: undefined; after: update on unmap
    2: before: at next map; during: undefined; after: undefined
    3: It's a copy. It should take place when the call is made.

    (I'm not the definitive one to answer these, but those are my interpretations of the spec.)
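
    To make the three interpretations above concrete, here is a toy model in plain C (it is not OpenCL code and calls no OpenCL functions). It assumes one possible implementation: a discrete device where mapping hands the host a staging copy and unmapping copies it back. All names (toy_buffer, toy_create_copy, toy_map, toy_unmap) are made up for illustration. An implementation with true shared memory could behave differently in the cases I marked as undefined.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 4

/* Hypothetical model of a buffer on a discrete device. */
typedef struct {
    int device[BUF_SIZE];   /* stands in for device global memory */
    int staging[BUF_SIZE];  /* host-visible copy handed out while mapped */
    int mapped;             /* nonzero between toy_map and toy_unmap */
} toy_buffer;

/* Models creation with a copy-host-pointer flag: the host data is
   copied once, at the time of the call, and never consulted again. */
toy_buffer toy_create_copy(const int *host_ptr)
{
    toy_buffer b;
    memcpy(b.device, host_ptr, sizeof b.device);
    memset(b.staging, 0, sizeof b.staging);
    b.mapped = 0;
    return b;
}

/* Models a blocking map: device contents are snapshotted into the
   staging area, so device updates made before the map are visible. */
int *toy_map(toy_buffer *b)
{
    assert(!b->mapped);
    memcpy(b->staging, b->device, sizeof b->device);
    b->mapped = 1;
    return b->staging;
}

/* Models unmap: host writes made through the mapped pointer reach
   device memory only at this point. */
void toy_unmap(toy_buffer *b)
{
    assert(b->mapped);
    memcpy(b->device, b->staging, sizeof b->device);
    b->mapped = 0;
}
```

    In this model, changing the original host array after toy_create_copy() has no effect on the buffer, a device-side update made before toy_map() shows up through the mapped pointer, and a host write through that pointer reaches the device array only at toy_unmap().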

    Here's what the spec says about mapping:

    5.2.8.1 Behavior of OpenCL commands that access mapped regions of a memory object
    The contents of the regions of a memory object mapped for writing (i.e. CL_MAP_WRITE is set in map_flags argument to clEnqueueMapBuffer or clEnqueueMapImage) are considered to be undefined until this region is unmapped. Reads and writes by a kernel executing on a device to a memory region(s) mapped for writing are undefined.
    Multiple command-queues can map a region or overlapping regions of a memory object for reading (i.e. map_flags = CL_MAP_READ). The contents of the regions of a memory object mapped for reading can also be read by kernels executing on a device(s). The behavior of writes by a kernel executing on a device to a mapped region of a memory object is undefined.
    Mapping (and unmapping) overlapped regions of a buffer or image memory object for writing is undefined.
    The behavior of OpenCL function calls that enqueue commands that write or copy to regions of a memory object that are mapped is undefined.
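
    As a reading aid, the rules quoted above can be written down as a small decision function. This is only my sketch of the quoted text in plain C, not part of any OpenCL API. The constants MAP_READ and MAP_WRITE are hypothetical stand-ins for CL_MAP_READ and CL_MAP_WRITE so that it compiles without the OpenCL headers, and the handling of a read mapping that overlaps a write mapping is a conservative assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the map_flags values. */
enum { MAP_NONE = 0, MAP_READ = 1 << 0, MAP_WRITE = 1 << 1 };

/* Kinds of access to a region that the quoted section talks about. */
typedef enum {
    ACCESS_KERNEL_READ,
    ACCESS_KERNEL_WRITE,
    ACCESS_ENQUEUED_WRITE_OR_COPY  /* a command that writes or copies to the region */
} access_t;

/* Returns true if, under the quoted rules, the access has defined
   behavior for a region currently mapped with map_flags. */
bool access_is_defined(unsigned map_flags, access_t access)
{
    assert((map_flags & ~(unsigned)(MAP_READ | MAP_WRITE)) == 0);

    if (map_flags == MAP_NONE)
        return true;   /* region is not mapped, so the section does not apply */
    if (map_flags & MAP_WRITE)
        return false;  /* mapped for writing: kernel reads and writes, and
                          enqueued writes or copies, are all undefined */
    /* Mapped for reading only: kernels may read; any write is undefined. */
    return access == ACCESS_KERNEL_READ;
}

/* Returns true if two overlapping mappings are both allowed. Assumption:
   any overlap that involves a write mapping is treated as undefined. */
bool overlapping_maps_defined(unsigned first_flags, unsigned second_flags)
{
    return ((first_flags | second_flags) & MAP_WRITE) == 0;
}
```

    For example, access_is_defined(MAP_READ, ACCESS_KERNEL_READ) is true, while the same call with ACCESS_KERNEL_WRITE is false, which matches the second paragraph of the quoted section.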
