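Multisampling is a process for reducing aliasing at the edges of rasterized primitives. It does this by storing multiple samples per pixel while executing the fragment shader less often than once per sample.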
Combating the visual effects of aliasing is simple, so long as performance is irrelevant. Aliasing is the effect of not using enough samples in analog-to-digital conversions. Thus, all true antialiasing techniques revolve around increasing the number of samples used.
Texture filtering is a form of antialiasing, applied specifically to aliasing caused by accessing textures. Linear filtering mixes together neighboring samples instead of just using one. Mipmaps of a texture are essentially a way of pre-computing an approximation of accessing a large area of a texture: each texel in a mipmap represents the average of several texels from the next higher-resolution mipmap. Anisotropic filtering's ability to access different locations from a texture is also a form of antialiasing, fetching multiple samples to compute a more reasonable value.
But texture filtering only deals with aliasing that results from accessing textures and computations based on such accesses. Aliasing at the edge of primitives is not affected by such filtering.
A more general form of antialiasing is the simplest: render at a higher resolution, then compute the final image by averaging the values from the higher resolution image that correspond to each pixel. This is commonly called "supersampling".
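As a rough illustration of the averaging step, the sketch below downsamples a small grayscale image by a whole-number factor using a plain box filter. The function name `resolve_box` and the list-of-lists image layout are assumptions made for this example; real implementations run on the GPU and may use other filters.

```python
def resolve_box(high_res, factor):
    """Average each factor x factor block of high_res into one output pixel."""
    height = len(high_res) // factor
    width = len(high_res[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0.0
            # Sum every high-resolution pixel that maps to destination pixel (x, y).
            for sy in range(factor):
                for sx in range(factor):
                    total += high_res[y * factor + sy][x * factor + sx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


# A 4x4 image whose edges cut through some of the 2x2 blocks.
high_res = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
]
low_res = resolve_box(high_res, 2)
```

Blocks that straddle an edge come out as intermediate values (0.5 here), which is the smoothing effect that antialiasing is after.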
In supersampling, each pixel in the eventual destination image gets its data from multiple pixels in the higher resolution image. The high-res pixels that correspond to a particular destination pixel are called "samples". Given this idea, we can think about a supersampled image as having the same pixel resolution as the destination, but with each pixel storing multiple samples of data.
An image where each pixel stores multiple samples is a "multisampled image".
When we do rasterization with supersampling, the primitive is broken down into multiple samples for each pixel. Each sample is taken at a different location within the pixel's area, and each sample carries all of the rasterization products and everything that follows them in the pipeline. So for each sample in the multisampled image, we must produce a Fragment, do various Per-Fragment Tests, execute a Fragment Shader on it to compute colors, perform blending, etc.
As previously stated, each sample within a multisampled pixel comes from a specific location within the area of that pixel. When we attempt to rasterize a primitive for a pixel, we sample the primitive at all of the sample locations within that pixel. If any of those locations fall outside of the primitive's area (because the pixel is at the edge of the primitive), then the samples outside of the area will not generate fragments.
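The per-sample test described here can be sketched on the CPU with edge functions. The 4-sample pattern below is hypothetical (actual sample locations are chosen by the implementation), the helper names are made up for this example, and boundary handling is simplified compared to a real rasterizer's fill rules.

```python
# Hypothetical 4-sample pattern: offsets inside a unit pixel (not any real GPU's pattern).
SAMPLE_OFFSETS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def edge(a, b, p):
    """Signed area test: positive when p is to the left of the directed edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside_triangle(tri, p):
    """True if p lies inside (or on the boundary of) a counter-clockwise triangle."""
    a, b, c = tri
    return edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0


def covered_samples(tri, pixel_x, pixel_y):
    """Return one boolean per sample: does that sample location hit the triangle?"""
    return [
        inside_triangle(tri, (pixel_x + ox, pixel_y + oy))
        for ox, oy in SAMPLE_OFFSETS
    ]


triangle = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))  # counter-clockwise, y pointing up
interior = covered_samples(triangle, 0, 0)       # pixel fully inside the triangle
on_edge = covered_samples(triangle, 1, 2)        # pixel crossed by the long edge
exterior = covered_samples(triangle, 3, 3)       # pixel fully outside the triangle
```

Only the pixel crossed by the edge ends up with a mix of covered and uncovered samples; samples that fail the test generate no fragments.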
So in every way, supersampling renders to a higher-resolution image; it's just easier to talk about it in terms of adding samples within a pixel.
While this is very simple and easy to implement, it's also obviously expensive. It has all of the downsides of rendering at high resolutions: lots of added rasterization, lots of shader executions, and those multisampled images consume lots of memory and therefore bandwidth. Plus, to compute the final image, we have to take time to average the colors from the multisampled image into its final resolution.
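To put a number on the memory cost, here is a back-of-the-envelope calculation. It assumes 4 bytes of color plus 4 bytes of depth/stencil per sample and no compression; both are assumptions for illustration, and real hardware may store multisampled images differently.

```python
def framebuffer_bytes(width, height, samples, bytes_per_sample):
    """Uncompressed storage for an image holding `samples` samples per pixel."""
    return width * height * samples * bytes_per_sample


BYTES_PER_SAMPLE = 4 + 4  # assumed: RGBA8 color plus 32 bits of depth/stencil

single = framebuffer_bytes(1920, 1080, 1, BYTES_PER_SAMPLE)
four_x = framebuffer_bytes(1920, 1080, 4, BYTES_PER_SAMPLE)
```

Under these assumptions a 1920x1080 target grows from roughly 16.6 MB to roughly 66.4 MB at 4 samples per pixel, and every write to it costs correspondingly more bandwidth.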
Reasonable use of texture filtering can reduce aliasing within the area of a primitive, so supersampling is only really needed for dealing with aliasing at the edges of primitives. Yet it affects everything and is quite expensive.
Multisampling is a small modification of the supersampling algorithm that is more focused on edge antialiasing.
In multisampling, everything is set up exactly like supersampling. We still render to multisampled images. The rasterizer still generates (most of) its rasterization data for each sample. Blending, depth testing, and the like still happen per-sample.
The only change with multisampling is that the fragment shader is not executed per-sample. It is executed at a lower frequency than the number of samples per pixel. Exactly how frequently it gets executed depends on the hardware. Some hardware may maintain a 4:1 ratio, such that the FS is executed once for each 4 samples in the multisample rendering.
That brings up an interesting question. If the FS is executed at a lower frequency, how do the other samples get fragment values? That is, for the samples that don't correspond to a FS execution, from where do they get their fragment values?
The answer is simple: the FS's fragment values are copied to multiple samples. So in our 4:1 example above, if the multisample image contains 4 samples per pixel, then those four samples will get the same fragment values.
In essence, multisampling is supersampling where the sample rate of the fragment shader (and all of its attendant operations) is at a lower rate than the number of samples per pixel.
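The difference in shading rate can be sketched as two small functions. This is a simplified model, not how any particular GPU works: it assumes one shader invocation per pixel for the multisampled path (the actual ratio is hardware-dependent), it omits per-sample depth testing and blending, and all names and sample positions are hypothetical.

```python
def shade_multisampled(samples, coverage, shader, shading_position):
    """Run the shader once and copy its result to every covered sample."""
    if not any(coverage):
        return 0  # no covered samples: no fragment, so no shader invocation
    color = shader(shading_position)  # a single invocation for the whole pixel
    for index, is_covered in enumerate(coverage):
        if is_covered:
            samples[index] = color  # broadcast the same value
    return 1


def shade_supersampled(samples, coverage, shader, sample_positions):
    """Run the shader separately at the location of every covered sample."""
    invocations = 0
    for index, is_covered in enumerate(coverage):
        if is_covered:
            samples[index] = shader(sample_positions[index])
            invocations += 1
    return invocations


def gradient(position):
    """Stand-in fragment shader: the output depends on the x coordinate."""
    return position[0]


# Hypothetical sample positions within one pixel, and a coverage for an edge pixel.
SAMPLE_POSITIONS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]
coverage = [True, False, True, True]

msaa_samples = [0.0] * 4
msaa_runs = shade_multisampled(msaa_samples, coverage, gradient, (0.5, 0.5))

ssaa_samples = [0.0] * 4
ssaa_runs = shade_supersampled(ssaa_samples, coverage, gradient, SAMPLE_POSITIONS)
```

In both cases the uncovered sample keeps its old value, so the edge is still resolved per-sample. The multisampled path simply pays for one shader invocation instead of three.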
There still remain a few details to cover. When discussing supersampling, it was mentioned that samples that are outside of the area of the primitive being rasterized don't get values. We can think of the samples in a fragment being rasterized as having a binary state of covered or not covered. The set of samples in a fragment area covered by the primitive represents the "coverage" of that fragment.
The concept of coverage is important because it can be accessed and manipulated.
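One common way to represent coverage is as a bitmask with one bit per sample; GLSL's `gl_SampleMaskIn` input is laid out along these lines. The sketch below only models that idea on the CPU, and the helper names are hypothetical.

```python
def pack_coverage(coverage):
    """Pack per-sample booleans into an integer, bit i standing for sample i."""
    mask = 0
    for index, is_covered in enumerate(coverage):
        if is_covered:
            mask |= 1 << index
    return mask


def unpack_coverage(mask, sample_count):
    """Expand a coverage bitmask back into a list of per-sample booleans."""
    return [bool((mask >> index) & 1) for index in range(sample_count)]


mask = pack_coverage([True, False, True, True])  # samples 0, 2 and 3 are covered
reduced = mask & ~0b0001                         # manipulate: drop sample 0
remaining = unpack_coverage(reduced, 4)
```

Clearing a bit in this way prevents the corresponding sample from being written, even though the primitive covers it.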
There is one major caveat to executing the fragment shader at a rate lower than the number of samples: the location within the pixel at which the FS is evaluated. In multisampling, because the result of an FS invocation is broadcast to multiple samples, the exact location of that invocation within the pixel is not as important; it will technically be wrong for every sample other than the one at that location.
As such, it is entirely possible that the sample location used by the FS invocation could be outside of the primitive's area, if the pixel is at the edge of the primitive. Normally, this is fine, as interpolation of values past the edge of a primitive can still work mathematically speaking.
What may not be fine is what you *do* with that interpolated value. A particular FS execution may not function correctly in the event that the interpolated values represent a location outside of the primitive area.
This is what the centroid interpolation qualifier fixes. Any fragment shader input qualified with centroid will be interpolated at a location that lies within both the pixel and the area of the primitive.
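The following sketch shows one way such a location could be chosen: use the pixel center when every sample is covered, otherwise average the covered sample positions. The exact location is up to the implementation, so this is an illustrative strategy rather than the one any given GPU uses. The sample pattern and the half-plane "primitive" are hypothetical.

```python
# Hypothetical 4-sample pattern: offsets inside a unit pixel.
OFFSETS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def centroid_location(pixel_x, pixel_y, offsets, coverage):
    """Pick an interpolation location that lies inside the covered part of the pixel."""
    if all(coverage):
        return (pixel_x + 0.5, pixel_y + 0.5)  # fully covered: the center is safe
    covered = [offset for offset, hit in zip(offsets, coverage) if hit]
    if not covered:
        return None  # nothing covered, so there is no fragment at all
    # The average of points inside a convex primitive is also inside it.
    cx = sum(offset[0] for offset in covered) / len(covered)
    cy = sum(offset[1] for offset in covered) / len(covered)
    return (pixel_x + cx, pixel_y + cy)


def inside(point):
    """Hypothetical primitive for the example: the half-plane x + y <= 3.9."""
    return point[0] + point[1] <= 3.9


pixel = (1, 2)
coverage = [inside((pixel[0] + ox, pixel[1] + oy)) for ox, oy in OFFSETS]
center = (pixel[0] + 0.5, pixel[1] + 0.5)
location = centroid_location(pixel[0], pixel[1], OFFSETS, coverage)
```

For this edge pixel the center falls outside the primitive, while the centroid-style location stays inside it, so values interpolated there never extrapolate past the primitive's edge.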
The whole idea of multisampling is that the FS should execute at a lower frequency than the sample count. However, it is sometimes useful to force the fragment shader to execute at the sample rate. For example, if you are doing post-processing effects on a multisample image before resolving it, then you need to execute those effects on each sample within each pixel. And the results could be very different depending on overlapping primitives and the like.
This is called "per-sample shading", and it effectively transforms multisampling into supersampling.