Difference between rated peak performance and actual performance
I am using a laptop with a Radeon HD 6290. The rated peak performance of the GPU is 44 GFLOPS (checked on Wikipedia). When I run the OpenCL sample programs provided by AMD on the GPU, I measure only 5-6 G instructions/sec. Why is there such a large difference between peak and actual performance?
Please find the attached file for more detail.
The second-to-last column is instructions/sec (calculated as total work items * ALU instructions * 1000 / time (ms)).
The last column is instructions/sec normalized to 100% ALU busy.
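For anyone who wants to check the arithmetic behind those two columns, here is a minimal sketch. The function names and the sample numbers are made up for illustration and are not taken from the attached file. I am also assuming that "normalized to 100% ALU busy" means dividing by the ALU-busy fraction.

```python
def instructions_per_second(work_items, alu_instructions, time_ms):
    """Total work items * ALU instructions per work item * 1000 / time (ms)."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    return work_items * alu_instructions * 1000.0 / time_ms


def normalized_to_full_alu_busy(instr_per_sec, alu_busy_percent):
    """Assumed normalization: scale the rate up to what 100% ALU busy would give."""
    if not 0 < alu_busy_percent <= 100:
        raise ValueError("alu_busy_percent must be in (0, 100]")
    return instr_per_sec * 100.0 / alu_busy_percent


# Hypothetical profiler row: 1M work items, 50 ALU instructions each,
# kernel time 10 ms, ALU busy 80%.
rate = instructions_per_second(1_048_576, 50, 10.0)
print(f"{rate / 1e9:.2f} G instructions/sec")                     # 5.24
print(f"{normalized_to_full_alu_busy(rate, 80.0) / 1e9:.2f} G instructions/sec at 100% ALU busy")  # 6.55
```

Note that the result is in instructions/sec, not FLOPS, so it is only a rough comparison against a rated GFLOPS figure.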
First of all, you should not blindly trust a Wikipedia entry.
But it is to be expected that the theoretical peak performance is much higher than what you measure, because it is normally calculated from the spec of the board without a single floating-point operation actually being run. Also keep in mind that the GPU timers are not that accurate, and on Windows there is a large overhead because of the WDDM driver model.
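To show what "calculated from the spec" usually means, here is a rough sketch of the standard peak formula: number of ALUs times operations per clock times clock rate. The 80 stream processors and 276 MHz used below are my assumptions for this GPU (they happen to reproduce the quoted 44 GFLOPS), and counting a multiply-add as 2 FLOPs per clock is the usual convention. Check the numbers against the vendor spec for your exact part.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_clock=2):
    """Theoretical peak: every ALU issues a multiply-add (2 FLOPs) on every clock."""
    return stream_processors * flops_per_clock * clock_mhz / 1000.0


# Assumed spec values, not measured: 80 stream processors at 276 MHz.
peak = peak_gflops(80, 276)
print(f"theoretical peak: {peak:.2f} GFLOPS")  # 44.16
```

This number assumes every ALU does a multiply-add on every cycle with no stalls for memory, scheduling, or kernel launch, which a real kernel does not achieve.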