Difference between revisions of "User talk:ElFarto/OpenGLBM"

From OpenGL Wiki

Revision as of 07:27, 4 October 2011

Let the discussion commence!

Wrong Design.

The very foundation of this idea is fundamentally flawed. You're basically saying that you want a console-style graphics API. Well that's just not going to happen. Not on PCs and not on mobile platforms.

Consoles are not PCs. Consoles are only doing one thing: playing one game. On PCs and on mobile platforms, multiple applications must exist simultaneously. They have to get along with one another.

Take GPU memory, for example. You don't own that. The driver doesn't even own that. The operating system owns GPU memory. Vista and Win7 both use a driver model that gives them complete ownership of video memory. They can (and do) virtualize it, paging it in and out as needed. So you can no more arbitrarily talk to GPU memory than you can CPU memory. Other OS's have similar issues.

The GPU itself is a shared resource. It has to share time with the OS, OpenGL applications, OpenCL applications, etc. You can't own the GPU.

This is a bad API because the very concept of it is flawed. You looked at a console graphics API and decided that you wanted that on the PC, without taking into account the different needs of the platforms. That isn't to say that OpenGL couldn't use an overhaul to improve performance and usability. But simply saying, "go as low level as possible" just doesn't work. Alfonse 11:19, 30 September 2011 (PDT)

Regarding memory, the design doesn't state how the GPU/driver/OS manages that memory. If it's virtualised, it doesn't matter: the OS can swap stuff in and out to its heart's content. This works perfectly well for system RAM, and I don't see why the same can't work for GPU RAM. We know GPUs can already do this, so I fail to see any problem there. I'm not trying to create an exclusive-access API. The same goes for the rest of the GPU: GPUs are already capable of handling multiple command streams at a time, and there's no need for one application to own the GPU or its memory.
As for OpenGL getting an overhaul, we both know that'll never happen. They had a chance to, and they ended up not doing it for whatever reason. elFarto 00:22, 4 October 2011 (PDT)

Wrong Abstraction

There is a reason that OpenGL hides implementation details. libgcm is only ever intended to work for a specific piece of hardware. OpenGL (and D3D for that matter) is an abstraction of hardware. This allows it to work for many kinds of hardware.

Would your API allow tile-based renderers to work at all? I highly doubt it. A tile-based renderer has different framebuffer needs from a regular rendering system.

My understanding of this is that you tell the GPU to render, and it creates a list of triangles, sorts and bins them, renders little tiles of the screen, then copies them to the framebuffer. I don't see any real issue there, though there are some concerns around the lack of a depth buffer (since the GPU sorts the triangles up front). Obviously some API will need to be provided so the application can skip allocating a depth buffer. The GPU also needs memory to store this list of triangles; that's a little trickier to solve, but not impossible. There's also some interaction with anti-aliasing (you don't need an AA'ed framebuffer to do anti-aliasing).
The API isn't meant to hide these details. elFarto 00:24, 4 October 2011 (PDT)
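For readers unfamiliar with the binning step described above, here is a rough sketch in C, assuming a conservative bounding-box test and a tiny fixed-size tile grid. Every name in it (Triangle, TileBin, binTriangle, the TILE_* constants) is hypothetical and made up for illustration; it is not part of the proposed API, and real tile-based hardware is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the example small. */
enum { TILE_SIZE = 32, TILES_X = 4, TILES_Y = 4, MAX_PER_TILE = 16 };

typedef struct { float x[3], y[3]; } Triangle;            /* screen-space vertices */
typedef struct { int count; int tri[MAX_PER_TILE]; } TileBin;

static int clampInt(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Record triangle `index` in every tile its bounding box overlaps.
 * This is conservative: a tile may list a triangle that does not
 * actually cover any of its pixels. */
void binTriangle(TileBin bins[TILES_Y][TILES_X], const Triangle *t, int index)
{
    assert(bins != NULL && t != NULL);

    float minX = t->x[0], maxX = t->x[0];
    float minY = t->y[0], maxY = t->y[0];
    for (int i = 1; i < 3; ++i) {
        if (t->x[i] < minX) minX = t->x[i];
        if (t->x[i] > maxX) maxX = t->x[i];
        if (t->y[i] < minY) minY = t->y[i];
        if (t->y[i] > maxY) maxY = t->y[i];
    }

    /* Convert the bounding box to a clamped range of tile coordinates. */
    int tx0 = clampInt((int)(minX / TILE_SIZE), 0, TILES_X - 1);
    int tx1 = clampInt((int)(maxX / TILE_SIZE), 0, TILES_X - 1);
    int ty0 = clampInt((int)(minY / TILE_SIZE), 0, TILES_Y - 1);
    int ty1 = clampInt((int)(maxY / TILE_SIZE), 0, TILES_Y - 1);

    for (int ty = ty0; ty <= ty1; ++ty) {
        for (int tx = tx0; tx <= tx1; ++tx) {
            TileBin *bin = &bins[ty][tx];
            if (bin->count < MAX_PER_TILE)   /* silently drop on overflow */
                bin->tri[bin->count++] = index;
        }
    }
}
```

The fixed MAX_PER_TILE cap is where the "memory to store this list of triangles" question shows up: someone has to decide how big the bins are and what happens when they fill.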

Your antialiasing "API" is another example. In trying to cover everything, it really covers nothing. You have these arbitrary numbers that map to hardware features, but nobody can tell exactly how. There's no API for asking what anti-aliasing level "3" means. This means there's no way for the application to select the antialiasing level explicitly. Alfonse 11:36, 30 September 2011 (PDT)

Anti-aliasing is probably the least thought-through section, and it shows. I stand by the idea of representing all anti-aliasing methods using a single value, but yes, better APIs for getting that number and for getting a description of each value are needed. This is an easy thing to solve:
 bmGetAntiAliasingLevel(minimumSamples);
 bmGetAntiAliasingLevelDescription(level, char **description);
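
A minimal sketch of how those two calls might behave, written in C. The return types, the const qualifier on the description pointer, the error codes, and the level table are all assumptions made up for illustration; the proposal above only gives the function names and rough parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table standing in for whatever a driver would report. */
typedef struct {
    int samples;             /* samples per pixel */
    const char *description; /* human-readable description */
} AaLevelInfo;

static const AaLevelInfo kLevels[] = {
    {1, "no anti-aliasing"},
    {2, "2x multisample"},
    {4, "4x multisample"},
    {8, "8x multisample"},
};
enum { kLevelCount = sizeof kLevels / sizeof kLevels[0] };

/* Assumed behaviour: return the lowest level that provides at least
 * minimumSamples samples, or -1 if no level does. */
int bmGetAntiAliasingLevel(int minimumSamples)
{
    assert(minimumSamples >= 1);
    for (int level = 0; level < kLevelCount; ++level) {
        if (kLevels[level].samples >= minimumSamples)
            return level;
    }
    return -1;
}

/* Assumed behaviour: store a pointer to a static string in *description
 * and return 0, or return -1 if the level is out of range. */
int bmGetAntiAliasingLevelDescription(int level, const char **description)
{
    assert(description != NULL);
    if (level < 0 || level >= kLevelCount)
        return -1;
    *description = kLevels[level].description;
    return 0;
}
```

With something like this, an application could ask for "at least 3 samples", get back a level number, and then look up what that number actually means before committing to it.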

Inconsistent Design

This design is inconsistent. It starts off by saying that it's supposed to be low-level, with no objects and direct access to GPU memory. You don't even have texture objects to manage texture memory.

But then all of a sudden, we have shader objects. OK, do those take up GPU memory? If not, where are they stored? And if so, where does the driver put them? After all, this is a low-level design; the driver doesn't allocate GPU memory. What if the shaders would be faster if they were in GPU memory? And why bother with having default uniform storage at all? Just force everyone to use uniform blocks (and get rid of the idiotic notion that samplers are uniforms).

This is just down to me not explaining it properly. The issue I came across while designing it was that a number of things need to be returned from the shader compiler: the micro-code of the shader itself, plus all the uniform locations. The object is located in system RAM. You are responsible for copying the shader to GPU RAM. The object is just a fat return value from the shader compiler. You can throw it away once you have all the information you need out of it.
re: uniform storage. At this point I can't tell if you're being serious. Reading through the ATI documents, they basically say to use the 'constant-file' if you're using DX9 (where you get to set x,y,z,w directly) or 'constant-buffers' if you're using DX10 (where you only specify a memory address). There don't appear to be regular old uniforms anymore, just constant/uniform buffers.
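
To make the "fat return value" idea concrete, here is a rough C sketch. The struct layout, the function names (bmCompileShaderStub, bmCopyShaderToGpu, bmFreeShaderResult), and the use of a plain byte array as a stand-in for GPU RAM are all hypothetical and invented for this example; the real compiler output would obviously not be dummy bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compiler result: lives entirely in system RAM. */
typedef struct {
    unsigned char *microcode;
    size_t microcodeSize;
    int *uniformLocations;
    size_t uniformCount;
} BmShaderResult;

/* Stand-in for the shader compiler: fills the result with dummy data.
 * Returns 0 on success, -1 on allocation failure. */
int bmCompileShaderStub(BmShaderResult *out)
{
    assert(out != NULL);
    out->microcodeSize = 16;
    out->uniformCount = 2;
    out->microcode = malloc(out->microcodeSize);
    out->uniformLocations = malloc(out->uniformCount * sizeof *out->uniformLocations);
    if (out->microcode == NULL || out->uniformLocations == NULL) {
        free(out->microcode);
        free(out->uniformLocations);
        out->microcode = NULL;
        out->uniformLocations = NULL;
        return -1;
    }
    for (size_t i = 0; i < out->microcodeSize; ++i)
        out->microcode[i] = (unsigned char)(i + 1);
    out->uniformLocations[0] = 0;
    out->uniformLocations[1] = 4;
    return 0;
}

/* The application, not the driver, picks where the micro-code goes.
 * Returns 0 on success, -1 if it would not fit at that offset. */
int bmCopyShaderToGpu(const BmShaderResult *r, unsigned char *gpuMem,
                      size_t gpuSize, size_t offset)
{
    assert(r != NULL && gpuMem != NULL);
    if (offset > gpuSize || r->microcodeSize > gpuSize - offset)
        return -1;
    memcpy(gpuMem + offset, r->microcode, r->microcodeSize);
    return 0;
}

/* Once the micro-code is copied and the locations noted, discard it. */
void bmFreeShaderResult(BmShaderResult *r)
{
    assert(r != NULL);
    free(r->microcode);
    free(r->uniformLocations);
    r->microcode = NULL;
    r->uniformLocations = NULL;
    r->microcodeSize = 0;
    r->uniformCount = 0;
}
```

The point of the sketch is that nothing here is a long-lived "object" in the OpenGL sense: after the copy, the only thing that persists is whatever the application wrote into its own memory.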

And then you have command buffers. That really makes no sense with the whole "low level" thing. Are these allocating CPU memory or GPU memory? I guess it has to be CPU memory, since the driver cannot allocate GPU memory at all (because you are free to stomp over anything, anywhere, at any time). Again, what if command buffers would be faster to execute if they were in GPU memory?

Then, there's uploading and downloading from GPU memory. DMAing usually has very strict alignment requirements. These are based on the GPU hardware and system the platform is running on. Yet, you provide no API for actually doing that. It also may require allocating special uncached memory; again, no API is provided. Nor is an API provided for forcing lines out of the cache, as one might also want for DMA. Alfonse 11:45, 30 September 2011 (PDT)
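
As an illustration of the alignment point only, here is a small C sketch that allocates a staging buffer on a caller-specified boundary using C11 aligned_alloc. The helper names (roundUpToAlignment, isAligned, allocStagingBuffer) are hypothetical, and the sketch assumes a power-of-two alignment. It does not address uncached memory or cache flushing, which need OS or driver support that standard C cannot provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to the next multiple of a power-of-two alignment. */
size_t roundUpToAlignment(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Check whether a pointer sits on the given power-of-two boundary. */
int isAligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocate a buffer whose address and size both honour the alignment.
 * C11 aligned_alloc expects the size to be a multiple of the alignment,
 * hence the rounding. Release the buffer with free(). */
void *allocStagingBuffer(size_t size, size_t alignment)
{
    return aligned_alloc(alignment, roundUpToAlignment(size, alignment));
}
```

An API along the lines being asked for would at minimum have to tell the application what alignment value to pass in, since that depends on the GPU and the platform.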