User talk:ElFarto/OpenGLBM

From OpenGL Wiki
Revision as of 19:06, 4 October 2011 by Alfonse (talk | contribs) (On buffer storage)

Let the discussion commence!

Wrong Design

The very foundation of this idea is fundamentally flawed. You're basically saying that you want a console-style graphics API. Well that's just not going to happen. Not on PCs and not on mobile platforms.

Consoles are not PCs. Consoles are only doing one thing: playing one game. On PCs and on mobile platforms, multiple applications must exist simultaneously. They have to get along with one another.

Take GPU memory, for example. You don't own that. The driver doesn't even own that. The operating system owns GPU memory. Vista and Win7 both use a driver model that gives them complete ownership of video memory. They can (and do) virtualize it, paging it in and out as needed. So you can no more arbitrarily talk to GPU memory than you can CPU memory. Other OSes have similar issues.

The GPU itself is a shared resource. It has to share time with the OS, OpenGL applications, OpenCL applications, etc. You can't own the GPU.

This is a bad API because the very concept of it is flawed. You looked at a console graphics API and decided that you wanted that on the PC, without taking into account the different needs of the platforms. That isn't to say that OpenGL couldn't use an overhaul to improve performance and usability. But simply saying, "go as low level as possible" just doesn't work. Alfonse 11:19, 30 September 2011 (PDT)

Regarding memory, the design doesn't state how the GPU/driver/OS manages that memory. If it's virtualised, it doesn't matter; the OS can swap stuff in and out to its heart's content. This works perfectly well for system RAM, and I don't see why the same can't work for GPU RAM. We know GPUs can already do this, so I fail to see any problem there. I'm not trying to create an exclusive-access API. The same goes for the rest of the GPU: they're already capable of handling multiple command streams at a time, and there's no need for one application to own the GPU or its memory.
Except that how all of that management works is hidden by the driver model. Virtualization does not necessarily mean actual virtual memory as implemented on CPUs, where virtual memory fetches can automatically reach out and pull from the hard disk with some minimal OS interaction. Exactly how it works is something that only driver developers and Microsoft are privy to, so making assumptions based on knowledge not in hand is... questionable.
As for multiple command streams, again, the driver gets in the way. We don't know if the GPU has actual hardware logic for multiple streams, or if the driver just does some magic to make it all seem to work. By making your API this low level, you take away the tools that the driver needs to be able to do that magic, like reasonable knowledge of what memory is in use and what memory is not. Alfonse 02:53, 4 October 2011 (PDT)
As for OpenGL getting an overhaul, we both know that'll never happen. They had a chance to, and they ended up not doing it for whatever reason. elFarto 00:22, 4 October 2011 (PDT)
And we both know that this proposal is even less likely to happen. So why talk about it? Alfonse 02:53, 4 October 2011 (PDT)
Just like your buffer storage idea? We talk about things we know will never get done because we want to see OpenGL improve, and we (I assume you're not on the ARB) have no other way of doing it. elFarto 05:02, 4 October 2011 (PDT)
My buffer storage proposal, or something like it, is something that they actually would do, if they saw the need for it. GL_ARB_texture_storage shows that the ARB is amenable to making these kinds of changes. Making an entirely new API, one that's so low level that it stops being an abstraction and becomes little more than GLide for modern GPUs, is not something the ARB would even consider. There's a big difference between proposing an extension and proposing that they make a whole new API that is fundamentally different from everything else.
I would never have proposed buffer storage if the ARB had not already demonstrated that they were willing to implement these kinds of purely API-based changes for the betterment of performance. In general, I only propose things that the ARB would actually consider implementing (based on past behavior), rather than simply whatever I would like OpenGL to do. It's also why I'm hard on those pie-in-the-sky proposals that most others throw out there. They lower the signal-to-noise ratio for little gain. Those are the kind of thing that makes the ARB not read that forum. Alfonse 12:06, 4 October 2011 (PDT)

Wrong Abstraction

There is a reason that OpenGL hides implementation details. libgcm is only ever intended to work for a specific piece of hardware. OpenGL (and D3D for that matter) is an abstraction of hardware. This allows it to work for many kinds of hardware.

Would your API allow tile-based renderers to work at all? I highly doubt it. A tile-based renderer has different framebuffer needs from a regular rendering system.

My understanding of this is that you tell the GPU to render, it creates a list of triangles, sorts and bins them, renders little tiles of the screen, then copies them to the framebuffer. I don't see any real issue there, though I have some concerns around the lack of a depth buffer (since the GPU sorts the triangles up front). Obviously some API will need to be provided so the application can skip allocating a depth buffer. The GPU also obviously needs memory to store this list of triangles; that's a little trickier to solve, but not impossible. There's also some interaction with anti-aliasing (you don't need an anti-aliased framebuffer to do anti-aliasing).
The API isn't meant to hide these details. elFarto 00:24, 4 October 2011 (PDT)
Then it's not an API, is it? What you seem to be talking about is just GLide for ATI hardware. Or NVIDIA hardware. If I can't write code on one platform and have it run on another platform with minimal if any code changes, then there's no point in having an API there at all. I may as well be talking to the registers directly in hardware-specific language. Alfonse 02:47, 4 October 2011 (PDT)

Your antialiasing "API" is another example. In trying to cover everything, it really covers nothing. You have these arbitrary numbers that map to hardware features, but nobody can tell exactly how. There's no API for asking what anti-aliasing level "3" means. This means there's no way for the application to select the antialiasing level explicitly. Alfonse 11:36, 30 September 2011 (PDT)

Anti-aliasing is probably the least thought-through section, and it shows. I stand by the idea of representing all anti-aliasing methods with a single value, but yes, a better API for getting that number, and for getting a description of each value, is needed. This is an easy thing to solve:
 int bmGetAntiAliasingLevelDescription(int level, const char **description);

Inconsistent Design

This design is inconsistent. It starts off by saying that it's supposed to be low-level, with no objects and direct access to GPU memory. You don't even have texture objects to manage texture memory.

But then all of a sudden, we have shader objects. OK, do those take up GPU memory? If not, where are they stored? And if so, where does the driver put them? After all, this is a low-level design; the driver doesn't allocate GPU memory. What if the shaders would be faster if they were in GPU memory? And why bother with having default uniform storage at all? Just force everyone to use uniform blocks (and get rid of the idiotic notion that samplers are uniforms).

This is just down to me not explaining it properly. The issue I came across while designing it was that there are a number of things that need to be returned from the shader compiler: the micro-code of the shader itself, plus all the uniform locations. The object is located in system RAM. You are responsible for copying the shader to GPU RAM. The object is just a fat return value from the shader compiler; you can throw it away once you have all the information you need out of it.
re: uniform storage. At this point I can't tell if you're being serious. Reading through the ATI documents, they basically say use the 'constant file' if you're using DX9 (where you get to set x,y,z,w directly) or 'constant buffers' if you're using DX10 (where you only specify a memory address). There don't appear to be regular old uniforms anymore, just constant/uniform buffers.
This shows another part of this design's problems: lack of information. You talk about the ATI documents as though they were the last word on how GPUs work. You take that one data point and assume that all GPUs work that way. Do you have equivalent documentation on NVIDIA GPUs? What about Intel's GPUs? What about PowerVR's? Without knowing that all GPUs do things that way, you cannot simply decree that it will be so. That's part of the reason why OpenGL is as high level as it is: to allow all kinds of hardware to implement it. Not just the GPUs that you happen to have low-level knowledge of. Alfonse 02:44, 4 October 2011 (PDT)

And then you have command buffers. That really makes no sense with the whole "low level" thing. Are these allocating CPU memory or GPU memory? I guess it has to be CPU memory, since the driver cannot allocate GPU memory at all (because you are free to stomp over anything, anywhere, at any time). Again, what if command buffers would be faster to execute if they were in GPU memory?

Then, there's uploading and downloading from GPU memory. DMAing usually has very strict alignment requirements. These are based on the GPU hardware and the platform the system is running on. Yet, you provide no API for actually doing that. It may also require allocating special uncached memory; again, no API is provided. Nor is an API provided for forcing lines out of the cache, as one might also want for DMA. Alfonse 11:45, 30 September 2011 (PDT)