We know that on GPUs (Nvidia specifically) global memory access is a *lot* slower than local storage. Does anybody know how the other memory spaces, in particular constant memory, compare?

I have some routines that calculate values based on tables declared as static constants in the source, and I'm wondering whether copying these into local memory would give a speed increase. The OpenCL spec says that constants are allocated in a region of global memory, so should I expect to need the same caching techniques I use for global memory, or do constants get loaded into a faster store?
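
To make the question concrete, here is a rough sketch of the two variants I have in mind. The table contents, `TABLE_SIZE`, and the kernel names `scale_constant` and `scale_local` are made up for illustration. The `#ifndef __OPENCL_VERSION__` block (including the hypothetical `fake_id`) is only a host-side shim that emulates a single work-item so the sketch can be sanity-checked as plain C; a real OpenCL compiler skips it.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption, not part of OpenCL): lets the kernels below
   compile as plain C by emulating one work-item in a work-group of size 1. */
#include <stddef.h>
#define __kernel
#define __global
#define __local
#define __constant const
#define CLK_LOCAL_MEM_FENCE 0
static size_t fake_id;  /* hypothetical: global id of the emulated work-item */
static size_t get_global_id(unsigned d)  { (void)d; return fake_id; }
static size_t get_local_id(unsigned d)   { (void)d; return 0; }
static size_t get_local_size(unsigned d) { (void)d; return 1; }
static void barrier(int flags)           { (void)flags; }
#endif

#define TABLE_SIZE 4

/* Made-up table standing in for the static constants in my source. */
__constant float table[TABLE_SIZE] = { 1.0f, 2.0f, 4.0f, 8.0f };

/* Variant A: index the table directly in __constant memory. */
__kernel void scale_constant(__global const float *in, __global float *out)
{
    size_t gid = get_global_id(0);
    out[gid] = in[gid] * table[gid % TABLE_SIZE];
}

/* Variant B: each work-group first stages the table in __local memory,
   then every work-item reads from the local copy. */
__kernel void scale_local(__global const float *in, __global float *out)
{
    __local float cache[TABLE_SIZE];
    size_t lid = get_local_id(0);
    size_t lsz = get_local_size(0);

    /* Work-items cooperatively copy the table, strided by work-group size. */
    for (size_t i = lid; i < TABLE_SIZE; i += lsz)
        cache[i] = table[i];

    /* Make the whole copy visible to every work-item in the group. */
    barrier(CLK_LOCAL_MEM_FENCE);

    size_t gid = get_global_id(0);
    out[gid] = in[gid] * cache[gid % TABLE_SIZE];
}
```

Variant B costs an extra copy and a barrier per work-group, so it would only pay off if reads from `__constant` memory are noticeably slower than reads from `__local` memory, which is exactly what I'm unsure about.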