Under IRIX, where a lot of the antecedents for OpenML originate, there is a sophisticated buffer management system called DMbuffers. A pool of buffers conforming to particular usage requirements can be requested (see the DMbuffer man page). My understanding is that each DMbuffer is backed by physically contiguous memory, depending on the underlying device requirements. From memory, on at least the O2, the memory comes from a special allocator.
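
For reference, the allocation pattern looks roughly like this -- from memory, so check the dmBuffer man pages for the exact signatures, and note that the step where devices merge their DMA constraints into the pool parameters is omitted:

    /* Sketch of the IRIX dmBuffer pool pattern (from memory --
     * see the dmBufferCreatePool man page for the real signatures). */
    #include <dmedia/dm_buffer.h>
    #include <dmedia/dm_params.h>

    DMbufferpool make_pool(int count, int size)
    {
        DMparams *p;
        DMbufferpool pool;

        dmParamsCreate(&p);
        /* count buffers of size bytes; cacheable, mapped into user space */
        dmBufferSetPoolDefaults(p, count, size, DM_TRUE, DM_TRUE);
        /* ...each device that will touch the pool would add its
         * constraints to p here, before the pool is created... */
        dmBufferCreatePool(p, &pool);
        dmParamsDestroy(p);
        return pool;
    }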

Providing physically contiguous (and possibly coherent as well?) memory simplifies DMA transfers to and from the various hardware subsystems. While it is certainly viable to use scatter-gather DMA facilities to communicate with most devices, I would imagine that a highly fragmented user-allocated buffer would result in a significant SG list length for something like realtime uncompressed HD content.
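
To put back-of-envelope numbers on it, for 8-bit 4:2:2 HD with 4 KB pages and worst-case fragmentation:

    1920 x 1080 x 2 bytes/pixel  ~= 4.1 MB per frame
    4.1 MB / 4 KB pages          ~= 1000+ SG entries per frame
    at 60 frames/s               ~= 60000 SG entries per second

which the driver has to build and the device has to walk, every second.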

Is there value in providing something like the DMbuffer library for OpenML? Is it considered outside the scope, or just too painful to implement on a cross-platform basis? Is it mandated that device drivers handle user-space buffers (thus pretty much requiring either an SG implementation or bounce buffers)?

Under Linux, up to the 2.4 kernels, the bigphysarea patch was one way to obtain a pool of memory suitable for this, as was reserving a region of memory at boot time. Under 2.6, hugetlbfs looks like a possible way to do it (although the fixed page size may be too inflexible).
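
Something along these lines is what I have in mind for the hugetlbfs route (a sketch only: it assumes hugetlbfs is already mounted and 2 MB huge pages; the /mnt/huge path and pool size are made up):

    /* Sketch: obtain physically contiguous memory from hugetlbfs.
     * Assumes hugetlbfs is mounted at /mnt/huge (hypothetical path). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define POOL_BYTES (8L * 2 * 1024 * 1024)  /* eight 2 MB huge pages */

    int main(void)
    {
        int fd = open("/mnt/huge/dmapool", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }

        void *pool = mmap(NULL, POOL_BYTES, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (pool == MAP_FAILED) { perror("mmap"); return 1; }

        /* Each huge page is physically contiguous, so DMA from this
         * region needs at most one SG entry per huge page rather than
         * one per 4 KB page. */

        munmap(pool, POOL_BYTES);
        close(fd);
        unlink("/mnt/huge/dmapool");
        return 0;
    }

The obvious drawback is the one mentioned above: the huge page size is fixed by the architecture, so buffer sizes have to be rounded up to it.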

Tom