Sunday, August 07, 2011

PDF slides completed for Chapter 19

PDF slides for Chapter 19 are now available at http://seyfarth.tv/asm for readers to download.  Chapter 19 is about computing correlation.  There is a C version, an SSE version and an AVX version.  The AVX version achieves 20.5 GFLOPS with 1 core of a Core i7 CPU.  This is a testament to the CPU design.  My code unrolled the loop and placed partial sums in different registers, which made it possible for the CPU to reorder the instructions and use multiple pipelines.  It achieved about 6 double precision results per cycle, even though the instructions performed no more than 4 operations each and took more than 1 cycle to complete.  The CPU filled multiple pipelines fairly well without me having to understand every small detail of the CPU operation.
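To give a rough idea of the partial-sum trick in plain C, here is a minimal sketch.  It is not the code from the book: the names corr_sums and accumulate_sums are made up for this illustration, and it unrolls by only 2 with scalar variables, where the AVX version works on wider registers.  The point is only that each set of partial sums has its own chain of additions, so the CPU is free to overlap them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the five sums a correlation is built from. */
typedef struct {
    double sum_x, sum_y, sum_xx, sum_yy, sum_xy;
} corr_sums;

/* Accumulate the sums with the loop unrolled by 2.  Set 0 handles the
   even indices and set 1 the odd ones, so neither set waits on the other. */
corr_sums accumulate_sums(const double *x, const double *y, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL));

    double sx0 = 0.0, sy0 = 0.0, sxx0 = 0.0, syy0 = 0.0, sxy0 = 0.0;
    double sx1 = 0.0, sy1 = 0.0, sxx1 = 0.0, syy1 = 0.0, sxy1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        sx0  += x[i];            sx1  += x[i + 1];
        sy0  += y[i];            sy1  += y[i + 1];
        sxx0 += x[i] * x[i];     sxx1 += x[i + 1] * x[i + 1];
        syy0 += y[i] * y[i];     syy1 += y[i + 1] * y[i + 1];
        sxy0 += x[i] * y[i];     sxy1 += x[i + 1] * y[i + 1];
    }

    /* One element is left over when n is odd. */
    if (i < n) {
        sx0  += x[i];
        sy0  += y[i];
        sxx0 += x[i] * x[i];
        syy0 += y[i] * y[i];
        sxy0 += x[i] * y[i];
    }

    /* Combine the partial sums only once, after the loop. */
    corr_sums s = { sx0 + sx1, sy0 + sy1, sxx0 + sxx1, syy0 + syy1, sxy0 + sxy1 };
    return s;
}
```

From these sums the usual Pearson formula gives r = (n*sum_xy - sum_x*sum_y) / sqrt((n*sum_xx - sum_x*sum_x) * (n*sum_yy - sum_y*sum_y)).  Note that splitting the sums changes the order of the floating point additions, so the result can differ from a straight loop in the last few bits.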

All 18 chapters now have slides prepared.

It's time to clean up some code and match it nicely with the book and upload source code.  I like working with code better than working with text, so this should be fairly fun.

You can find my book easily at Amazon or Barnes and Noble.  You can also order it directly from CreateSpace at https://www.createspace.com/3651611.
