Dear all,

My platform has one processor with 4 cores and a single GPU. The application I am running uses MPI, and I would like to modify it to also use the GPU. Is it possible to create a buffer on the GPU that can be accessed by all of the CPU cores, even if not simultaneously? All I need is to collect the parts of a large matrix that are distributed among the cores into the GPU buffer, do some computations on the GPU, and then distribute parts of the GPU result buffer back to the cores. This would avoid creating additional buffers on the CPU side and redundant MPI synchronization. Any advice is welcome.

Thank you.

Best regards,