Sunday, August 28, 2011

Should there be a limit to wealth?

Let's start by considering one extreme possibility:  one person owns everything on the planet.  Let's name this person Midas.  Midas owns all homes, so each other person must pay rent to Midas.  Midas owns all the farms, so each other person must buy food from Midas.  Midas owns all buildings and businesses, so each other person must work for Midas in order to survive.

Should anyone be allowed to own everything?  This seems like a possibly horrible life for the rest of humanity.  Of course, Midas could be a generous owner and perhaps he pays people enough for them to survive fairly well.  Eventually Midas will die and then I suppose Midas will grant the world to his first-born child.

Would Midas be just as happy owning enough to satisfy his every whim?  How much money is sufficient to satisfy his every whim?   One billion dollars?   One trillion dollars?

Wednesday, August 24, 2011

Started Blackboard Questions

I have decided to use the Blackboard courseware system to give a quiz for homework from each chapter I cover this semester in Assembly Language.  I believe that it will force my students to read the chapters fairly carefully which will help a lot.

So far I have prepared quizzes for the first 2 chapters.  It looks like it will take perhaps 1 to 2 hours per quiz to prepare objective questions.


The complete set of quizzes will be available via email to teachers who adopt the textbook in the future.


As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Saturday, August 20, 2011

Army Worms

My neighbor pointed out to me yesterday that my azaleas are being defoliated by an invasion of army worms.  I try to maintain my lawn and plants without using pesticides, so I will be killing some caterpillars by squishing them underfoot.

I decided to look them up on the internet.  They are indeed called army worms due to their tendency to show up in large numbers.

Apparently they like azaleas.  I considered just letting them eat all they wanted, but after they finish their favorite foods they will move on to other sources.  They even like to eat your grass.  In fact they can wipe out entire hay fields.

Surely there must be natural enemies to the munchers.  The little devils probably taste bad so that birds won't eat them.  I have lizards and geckos too.  These critters must be falling down on the job.

I guess for a few more days I will start the day on army worm patrol.  Who says I don't have an interesting life?

Saturday, August 13, 2011

No one stands alone

It bothers me that there are a large number of successful people who seem to lack gratitude for their success.  They think they did it "on their own".  No one stands alone.  We are all interdependent whether we like it or not.

Now I like to talk about how I have worked all my life, as do many others.  Well, I have worked fairly hard, but I haven't achieved my small measure of success all by myself.  To start with, my success is with computers and I certainly would be clueless about building my own chip factory and designing a computer from scratch.  My success is directly related to many thousands of people who have helped create a computer industry.

Friday, August 12, 2011

Pears, pears and more pears

Pears grow well in Mississippi if you keep them watered a bit through the dry spells.  This year it seemed we had almost 2 months with little rain.  Trees are fairly drought-resistant, but eventually there comes a point when you start to wonder how much your trees can take.  I waited until the grass started looking a bit stressed and started watering then.

I have tried a variety of spray nozzles and devices to spray large areas.  This year I tried just using the hose with no nozzle.  My theory is that the water comes out in greater volume without a nozzle.  This seems to be far quicker for watering small plants.  I think it is about twice as fast.  I have also tried more precise spot watering of stressed grass.  Overall I think this is a useful technique.

In the past I would leave a sprinkler in the yard, go inside for about 15 minutes and then go out to move the sprinkler.  Some areas would receive more water than necessary and it took over 2 hours to do the front and back yard.  With spot watering and no nozzle I can finish a little quicker, though I am standing outside the whole time.


Look inside my book!

It took a couple of weeks, but now you can use the Amazon "Look Inside" feature to view part of my book.  Amazon also matched B&N's price!  $21.56 is a fine price to pay for a textbook!  The title works as a link; the Blogger editor didn't allow me to fix a link for the picture.










Introduction to 64 Bit Intel Assembly Language Programming: Getting the most out of your computer
List Price: $29.95
Price: $21.56 & eligible for free shipping with Amazon Prime
You Save: $8.39 (28%)








Thursday, August 11, 2011

Demand side economics

It is currently popular in America to believe in supply side economics which basically means less government intervention in business.  The idea as I understand it is that by giving the people on the top of society more money they will have more money to employ the rest of society.  Given that there are smart people on both sides of this issue, I will certainly agree that this argument has some appeal to reason.

The other view of economics is demand side or Keynesian economics.  This view has it that government spending can rescue a failing economy.  This was arguably successful enough to prevent a peasant revolution during the Great Depression, though many people like to claim that it was World War II which successfully ended the depression.  There is some truth to that:  WWII did bring the US to a state of high employment, but it did so through government spending, which suggests that demand side economics worked when it was applied in a large enough dose.

Wednesday, August 10, 2011

Finished source code preparation

I have completed the organizing of code for the rest of the book.  I expect that there will be some changes made to the code as I teach the class and learn what people need in the code.  There is a fair amount of code written over 3 months as I studied and wrote about 64 bit assembly language.  I am sure there were some "style" changes during this time.  I have yet to see or think of a way to make assembly language look as stylish as C++.  No matter what you do, assembly code requires attention to details which can be partially handled by the syntax of the language in C++.  Things like forgetting which register is first, second and third aren't issues in C++.  In fact C++ can sometimes keep you out of trouble with attempts to pass data of the wrong type into a function.  In assembly there is hardly any concept of type.

As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Finished source code for Chapter 12

Chapter 12 is about system calls.  I have prepared example programs for all the code snippets in the chapter.  I also wrote a copy program which does a moderate amount of testing in addition to reading a file in 1 read and writing it with 1 write.  This should serve as a fairly good resource.  It uses command line parameters for file names and tests nearly all the system calls.

It would be more practical to write a copy program which allocates a fairly large array and copies a file in pieces in a loop, but that is one of the exercises at the end of the chapter.
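
For comparison, here is a rough sketch in C (not the assembly exercise itself) of what copying in pieces looks like.  The function name copy_in_pieces and the 64 KB buffer size are my own assumptions for illustration; the read and write wrappers are the standard ones.

```c
#include <sys/types.h>   /* ssize_t */
#include <unistd.h>      /* read, write */

/* Copy everything from in_fd to out_fd through a fixed-size buffer.
   Returns the number of bytes copied, or -1 on error.
   The name and the buffer size are illustrative assumptions. */
long copy_in_pieces(int in_fd, int out_fd)
{
    static char buffer[65536];
    long total = 0;
    ssize_t n;

    while ((n = read(in_fd, buffer, sizeof buffer)) > 0) {
        ssize_t done = 0;
        while (done < n) {   /* a single write may handle only part of the data */
            ssize_t w = write(out_fd, buffer + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

The loop keeps memory use constant no matter how large the file is, which is the practical advantage over reading the whole file with 1 read.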

As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Finished source code for Chapter 11

It took me a few hours to write and test all the code in Chapter 11.  There were some pieces of code which I thought I had tested, but obviously had not.  Now my src/ch11 directory has 14 different source files.

I had been too rash and had not really noticed that the instructions for floating point comparisons require using different conditional jumps than the integer comparisons.  I also read somewhere that test was a good instruction for comparing.  The test instruction performs an "and" of its two operands and sets the flags without storing the result.  If the "and" of two operands yields a 0, then the zero flag is set.  Using the zero flag in the most obvious test, je, ends up jumping when the two operands have no bits in common rather than when they are equal.  So as a programmer, I want to use je to test for equal objects and I need "cmp" rather than "test".  Maybe someday I will find a use for test, but I have spent too much time working with numbers to appreciate that anding has anything to do with testing...

Now there is a fairly complete set of errata for Chapter 11, in addition to the source code.
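
The difference is easy to see if you model just the zero flag in C.  This is only a sketch; the function names are made up for illustration and the other flags are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Zero flag after "test a, b": set when (a AND b) is 0. */
bool zf_after_test(uint64_t a, uint64_t b)
{
    return (a & b) == 0;
}

/* Zero flag after "cmp a, b": set when (a - b) is 0, that is when a == b. */
bool zf_after_cmp(uint64_t a, uint64_t b)
{
    return (a - b) == 0;
}
```

With a = 5 and b = 5 the cmp model sets the zero flag and the test model does not, so je after test would not jump for equal values.  With a = 4 and b = 3 it is the other way around.  One place where test does fit is testing a register against itself, which sets the zero flag exactly when the register is 0.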

You can access the source code, PDF slides and errata at http://seyfarth.tv/asm and you can reach the CreateSpace page for the book at  https://www.createspace.com/3651611.


Tuesday, August 09, 2011

Beware when comparing floating point numbers

I have been writing a lot of test code verifying some of the short segments of code in my assembly book.  I ran into a bizarre state of affairs with floating point comparisons.  I had read the Intel instruction manual too quickly and ASSUMED that the floating point ucomiss instruction, being fairly new, would be designed to make for easy programming.  Imagine my shock when I tried to compare with ucomiss and then use jle to jump on less than or equal and it did not work properly.

I wrote a C program with all 5 arithmetic comparisons using floats to study more carefully what gcc does to cope with ucomiss.  Here's what gcc did for a less than comparison on 2 registers:

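   ucomiss    %xmm1, %xmm0     # compare xmm0 with xmm1; sets ZF, PF and CF
   seta       %al              # al = 1 if xmm0 is above xmm1 (CF = 0 and ZF = 0)
   testb      %al, %al         # zero flag is set if al is 0
   je        .L3               # skip ahead when the comparison was false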
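
To see why jle misbehaves, here is a small C model of the flags.  It is a sketch based on the documented behavior of ucomiss (it reports its result in ZF, PF and CF and clears SF and OF); the struct and function names are hypothetical.

```c
#include <stdbool.h>

struct flags { bool zf, pf, cf, sf, of; };

/* Flags after "ucomiss %xmm_b, %xmm_a" (AT&T order): compares a with b. */
struct flags ucomiss_flags(float a, float b)
{
    struct flags f = { false, false, false, false, false };  /* SF and OF stay clear */
    if (a != a || b != b) {          /* unordered: at least one operand is NaN */
        f.zf = f.pf = f.cf = true;
    } else if (a < b) {
        f.cf = true;
    } else if (a == b) {
        f.zf = true;
    }                                /* a > b leaves ZF, PF and CF all clear */
    return f;
}

/* jle is the signed test: ZF set, or SF different from OF. */
bool jle_taken(struct flags f) { return f.zf || (f.sf != f.of); }

/* jbe is the unsigned test: CF set or ZF set. */
bool jbe_taken(struct flags f) { return f.cf || f.zf; }

/* ja is the unsigned test: CF clear and ZF clear. */
bool ja_taken(struct flags f) { return !f.cf && !f.zf; }
```

For a = 1.0 and b = 2.0 the model says jle is not taken while jbe is, which matches what I saw.  The unsigned-style jumps (ja, jae, jb, jbe) are the ones which line up with the flags ucomiss sets.  Note that jbe is also taken for a NaN operand, so careful code checks the parity flag too.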

Monday, August 08, 2011

More source code files available

My web site for the Assembly book, http://seyfarth.tv/asm, now has source code for Chapters 1-10.  During this process a few errors were found in the book and have been noted in the errata on the same site.

The process of organizing the source code is going quite well.  I ran into a program which persisted in not working properly with either gdb or ygdb.  The goal was to place a break point on main to simplify debugging.  I issued the command "b main" and it placed a break on a line in the copy_array function.  I don't understand this yet, but it was solved by making main appear before copy_array in the source code file.  Very strange...  I think I have debugged programs where main was at the bottom before, but it is so easy to be confused about precisely what you have done when using computers.

In any case I have 28 assembly source files in the current download and 3 C source files.  The various Makefiles build all the programs and I am pretty sure I have tested them all.  Testing is paramount.

You can find my book easily at Amazon or Barnes and Noble.  You can also find the direct link at  CreateSpace using https://www.createspace.com/3651611.

Sunday, August 07, 2011

Bug in bit field code at the end of Chapter 7

I have been testing all short sequences of code and found an error in the last part of Chapter 7 where I planned to use movsx with an immediate 32 bit field to prepare a mask.  The movsx instruction allows a register or memory 32 bit field to be sign-extended but not an immediate field.  The fix looked a little ugly and sign extension is not an elegant solution, so I posted a better solution on the Chapter 7 errata page.

The better solution is shifting the rax register right 29 bits, followed by shifting it left 29 bits.  This is more generally applicable.  More importantly it actually works.
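
In C terms the shift pair looks like the sketch below.  The function name clear_low_bits is my own; with count equal to 29 it does what the two shifts do to rax, which is to clear the low 29 bits and keep the rest.

```c
#include <stdint.h>

/* Clear the low `count` bits of value by shifting right and then left.
   count must be less than 64.  The name is an illustrative assumption. */
uint64_t clear_low_bits(uint64_t value, unsigned count)
{
    return (value >> count) << count;
}
```

For example, clear_low_bits applied to a register of all 1 bits with a count of 29 leaves 0xFFFFFFFFE0000000, without needing any 64 bit immediate mask.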


Started preparing source code for downloads

I started today on preparing a source code distribution for http://seyfarth.tv/asm. So far I have prepared a src directory in my own computer with ch01, ch02, ... ch19 as subdirectories.  There is a Makefile in each subdirectory which will manage building the demo programs for that chapter.

There is also a master Makefile in the src directory which by default will visit each chapter subdirectory and make its programs.  As an aid to myself, the master Makefile defines a tgz target which is used to build a clean src.tgz file to place on the web server.

So far I have completed programs for Chapters 1-6 and am writing some programs for Chapter 7.  Wherever the book has code without an accompanying gdb session it is a fair bet that I haven't yet written and tested the code.  So far I have added one errata entry for a programming typo in Chapter 7.

This is fairly quick to do.  I hope to finish it in 1 to 2 more days.

PDF slides completed for Chapter 19

PDF slides for Chapter 19 are now available at http://seyfarth.tv/asm for readers to download.  Chapter 19 is about computing correlation.  There is a C version, an SSE version and an AVX version.  The AVX version achieves 20.5 GFLOPS with 1 core of a Core i7 CPU.  This is a testament to the CPU design.  My code did unroll the loop and place partial sums in different registers, which made it possible for the CPU to re-order the instructions and use multiple pipelines.  It achieved about 6 double precision results per cycle, while the instructions performed no more than 4 operations each and they took more than 1 cycle.  The CPU filled multiple pipelines fairly well without me having to understand every small detail of the CPU operation.

All 19 chapters now have slides prepared.
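
The partial sums idea can be sketched in plain C.  This is not the book's code; the struct and function names are hypothetical and the loop is unrolled only by 2 to keep it short.  It gathers the five sums which the usual correlation formula needs (sum of x, y, x*x, y*y and x*y).

```c
#include <stddef.h>

struct corr_sums { double x, y, xx, yy, xy; };

/* Accumulate the sums needed for correlation using two independent
   sets of accumulators, so consecutive additions do not depend on
   each other.  Names are illustrative assumptions. */
struct corr_sums correlation_sums(const double *x, const double *y, size_t n)
{
    struct corr_sums a = { 0, 0, 0, 0, 0 };
    struct corr_sums b = { 0, 0, 0, 0, 0 };
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        a.x += x[i];      a.y += y[i];
        a.xx += x[i] * x[i];
        a.yy += y[i] * y[i];
        a.xy += x[i] * y[i];

        b.x += x[i + 1];  b.y += y[i + 1];
        b.xx += x[i + 1] * x[i + 1];
        b.yy += y[i + 1] * y[i + 1];
        b.xy += x[i + 1] * y[i + 1];
    }
    if (i < n) {                       /* leftover element when n is odd */
        a.x += x[i];      a.y += y[i];
        a.xx += x[i] * x[i];
        a.yy += y[i] * y[i];
        a.xy += x[i] * y[i];
    }

    /* Combine the partial sums at the end. */
    a.x += b.x;  a.y += b.y;  a.xx += b.xx;  a.yy += b.yy;  a.xy += b.xy;
    return a;
}
```

Because the a sums and the b sums never feed into each other inside the loop, the CPU is free to overlap them.  Note that splitting the sums changes the order of the floating point additions, so the result can differ in the last bits from a single running sum.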

It's time to clean up some code and match it nicely with the book and upload source code.  I like working with code better than working with text, so this should be fairly fun.

You can find my book easily at Amazon or Barnes and Noble.  You can also find the direct link at  CreateSpace using https://www.createspace.com/3651611.

PDF slides completed for Chapters 17 and 18

PDF slides for Chapters 17 and 18 have been prepared and uploaded to http://seyfarth.tv/asm for readers to download.  Chapter 17 is about counting 1 bits in an array.  It shows the effect of improved algorithms and shows a dramatic improvement when the popcnt instruction is used.

Chapter 18 is about the Sobel image filter.  A C version was written which processed 158 images per second.  An assembly version was written computing 14 Sobel results at once.  This processed 980 images per second, which was about 6.2 times as fast.
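
Here is a rough C sketch of the kind of progression involved.  These three functions are my own illustration, not the versions from the book.  The last one uses __builtin_popcountll, which is a GCC and Clang extension; whether it becomes a popcnt instruction depends on the compiler options.

```c
#include <stddef.h>
#include <stdint.h>

/* Examine one bit at a time, shifting until no set bits remain. */
size_t count_bits_naive(const uint64_t *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (uint64_t w = a[i]; w != 0; w >>= 1)
            count += w & 1;
    return count;
}

/* Clear the lowest set bit each pass, so the inner loop runs once per 1 bit. */
size_t count_bits_clear_lowest(const uint64_t *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (uint64_t w = a[i]; w != 0; w &= w - 1)
            count++;
    return count;
}

/* Let the compiler count the bits of each word (GCC/Clang builtin). */
size_t count_bits_builtin(const uint64_t *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)__builtin_popcountll(a[i]);
    return count;
}
```

All three return the same count; only the amount of work per 64 bit word changes.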
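
For readers who have not met the Sobel filter, a minimal scalar sketch in C follows.  It is not the book's code.  It assumes an 8 bit grayscale image stored row by row, skips the border pixels, and combines the two gradients as |gx| + |gy| rather than a square root, which is a simplification I chose for the example.

```c
#include <stdlib.h>   /* abs */

/* Sobel response for the interior pixels of a width x height image.
   Border entries of out are left untouched.
   The function name and the |gx| + |gy| magnitude are assumptions. */
void sobel(const unsigned char *in, int *out, int width, int height)
{
    for (int r = 1; r < height - 1; r++) {
        for (int c = 1; c < width - 1; c++) {
            const unsigned char *p = in + r * width + c;

            /* Horizontal gradient: right column minus left column. */
            int gx = -p[-width - 1] + p[-width + 1]
                     - 2 * p[-1]    + 2 * p[1]
                     - p[width - 1] + p[width + 1];

            /* Vertical gradient: bottom row minus top row. */
            int gy = -p[-width - 1] - 2 * p[-width] - p[-width + 1]
                     + p[width - 1] + 2 * p[width]  + p[width + 1];

            out[r * width + c] = abs(gx) + abs(gy);
        }
    }
}
```

Each output pixel needs the same handful of adds and subtracts on neighboring pixels, which is why the work maps so well onto instructions that process many pixels at once.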

Saturday, August 06, 2011

PDF slides prepared for Chapter 16

PDF slides are now available on http://seyfarth.tv/asm for "Chapter 16: High Performance Assembly Programming".  This chapter is about some general strategies, most of which are done quite well by modern compilers.

The last 3 chapters explore 3 different algorithms which have been implemented in assembly.  In some of these the assembly code is far faster than the compiled C code.  This is generally only possible if you employ specialized instructions which are usually harder for the compiler to apply given your high level language code.  These instructions usually require some rearranging of the algorithm which is hard for a compiler to do better than a human.

PDF slides prepared for Chapter 15

I have prepared PDF slides for "Chapter 15: Data Structures" and uploaded them to http://seyfarth.tv/asm for people to download for classes.  Chapter 15 is clearly an optional chapter for many courses.  It might be beyond the call of duty to introduce data structures first in an assembly language class.  It would clearly make the concepts of lists and other data structures more concrete.

The chapter includes singly linked lists, doubly linked lists, hash tables and binary trees.  There is no effort made to produce balanced binary trees which would certainly make for more opaque coding.  The hash table code is efficient and reasonably easy to understand.  Of course you might very well get adequate performance at a fraction of the programming time by using STL data structures.

Friday, August 05, 2011

PDF slides prepared for Chapters 13 and 14

I have prepared and uploaded PDF slides for "Chapter 13: Structs" and "Chapter 14: Using the C stream I/O functions" to http://seyfarth.tv/asm for users of my book to download.  Chapter 13 is a short chapter focusing on using an array of structs.  Chapter 15 discusses data structures where structs will be central to the discussion.

The stream I/O chapter discusses using fopen, fseek, fprintf, fscanf, fread, fwrite, and several more basic I/O functions.  These are more efficient for reading and writing small amounts of data.  They are also fairly efficient for arrays of data.  The last function uses fseek to position a file and fwrite to write a customer object to a file.
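
The fseek plus fwrite pattern looks roughly like this in C.  The customer struct fields and the function names are hypothetical; the book's customer object may be laid out differently.

```c
#include <stdio.h>

/* Hypothetical record layout for illustration only. */
struct customer {
    int id;
    char name[32];
    double balance;
};

/* Write one record into slot `index` of the file.
   Returns 0 on success, -1 on failure. */
int write_customer(FILE *fp, long index, const struct customer *c)
{
    if (fseek(fp, index * (long)sizeof *c, SEEK_SET) != 0)
        return -1;
    return fwrite(c, sizeof *c, 1, fp) == 1 ? 0 : -1;
}

/* Read the record stored in slot `index`.
   Returns 0 on success, -1 on failure. */
int read_customer(FILE *fp, long index, struct customer *c)
{
    if (fseek(fp, index * (long)sizeof *c, SEEK_SET) != 0)
        return -1;
    return fread(c, sizeof *c, 1, fp) == 1 ? 0 : -1;
}
```

Since every record has the same size, the slot number times the record size gives the file position directly, so any customer can be updated without reading the rest of the file.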

Only 5 more sets of slides to prepare!

Climbing rapidly on Amazon and Google

My book is now number 2 if you search in books for "64 bit assembly" on Amazon.  The number one book is a more thorough book describing 32 bit and 64 bit assembly and it was published in 2005.  I will be happy enough if I remain number 2 in that ranking.

I would like to make it to the first page when searching for "assembly language", but my book needs a second edition to be better than most of the books in that list and some good reviews to climb up the rankings.

I have also made it to number 3 on google when searching for "64 bit assembly language" and number 4 when searching for "64 bit assembly".

The current rankings are encouraging.  My book will be found by most people interested in a book of 64 bit assembly language programming.  Now if I get some good reviews, sales will increase.

You can find my book easily at Amazon or Barnes and Noble.  You can also find the direct link at  CreateSpace using https://www.createspace.com/3651611.

Thursday, August 04, 2011

PDF slides prepared for Chapter 12

PDF slides for Chapter 12: System Calls have been prepared and uploaded to http://seyfarth.tv/asm for downloading.

Chapter 12 is about system calls.  It describes how both 32 bit and 64 bit system calls are implemented on Linux.  Interestingly the numbers used for system calls for 32 bit calls and 64 bit calls bear no obvious relation.

Using these numbers for system calls is fairly pointless since there are C wrapper functions for each system call which have easy to remember names and a little added capability.  The time taken for a system call is far greater than the time taken up in using the wrapper functions, so there is little value in using system calls directly.  Furthermore there is probably no reason not to write nearly all code which uses system calls in C or C++ anyway.

12 of 19 chapters complete!
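
As an illustration, here are both routes to the write system call from C on Linux.  This is a sketch; the function names are made up, and it relies on the glibc syscall function and the SYS_write constant from sys/syscall.h.  On 64 bit Linux the write call is number 1, while the 32 bit interface uses number 4.

```c
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall, write */

/* Invoke the system call by number through the generic syscall function. */
long write_direct(int fd, const void *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Invoke the same system call through its named C wrapper. */
long write_wrapped(int fd, const void *buf, unsigned long len)
{
    return (long)write(fd, buf, len);
}
```

Both functions end up in the same kernel code and return the number of bytes written.  The wrapper is simply easier to read and remember.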

Slowly climbing up in the ranking

I've been monitoring my progress in various types of search.  My best success is on the Barnes and Noble web site.  My book shows up number 15 in a search for "assembly language" and number 1 for "64 bit assembly language".

Second best is searching for "64 bit assembly language" on Amazon's web site, where my book comes up number 6.

I need more effort and more time to climb up the ranks on google or bing.  I have had a little success with google finding one of my blog entries on this blog.  That is encouraging.  I need to figure out something people search for (that I know something about) and post a blog message.

PDF slides prepared for Chapter 11

I now have prepared and uploaded PDF slides for Chapter 11: Floating Point Instructions.  This might be enough assembly language for many people.  With the instructions discussed so far you could write fairly nice programs using arrays, functions, integer math, floating point math and simple I/O instructions.

This chapter also introduces the SSE instructions using 4 packed floats per XMM register or 2 packed doubles per XMM register.  This can be the start of some efficient programming.  Being able to issue one instruction which performs 4 floating point operations can require some re-engineering of your algorithm, but if you do this carefully you can unroll loops and keep the SIMD pipeline fairly full.  This could yield perhaps 4 float results per machine cycle, which is roughly 10-15 GFLOPS single precision.  Some of the latest CPUs can yield more than 1 SIMD instruction completion per cycle.
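
The packed idea can be tried from C with the SSE intrinsics in xmmintrin.h, which are available only on x86 compilers.  This is a sketch with a made-up function name, not code from the book.

```c
#include <stddef.h>
#include <xmmintrin.h>   /* SSE intrinsics, x86 only */

/* Add two float arrays, 4 elements per step where possible. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);               /* load 4 floats (unaligned) */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));    /* 4 additions in one operation */
    }
    for (; i < n; i++)                                 /* leftover elements one at a time */
        out[i] = a[i] + b[i];
}
```

The leftover loop is the sort of re-engineering the packed instructions force on you: the array length is rarely an exact multiple of 4.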

It will become more interesting with the later chapters on optimization.

I have also located a few more typos.  Check it all out at http://seyfarth.tv/asm

Wednesday, August 03, 2011

Warming up for Windows

Well I gave a little effort toward preparing to write a Windows version of my book today.  I wanted to keep it simple and use yasm, gcc and gdb.  Then the Windows book would be very easy to write.  Reality reared its ugly head when I stumbled into Windows territory.

I first tried installing Mingw and MSYS because I knew that Mingw has a 64 bit compiler.  I tried numerous times but never got a hello world program in C to compile.  I also had to install PKZip just to unpack one of the 2 zip files.  It was not looking good for describing in a Windows book how to install the tools.  It was looking impossible for me to do, although I am sure that it can be done and I would eventually stumble upon the secret.

Tuesday, August 02, 2011

Chapter 10 slides on seyfarth.tv/asm

I have completed the PDF slides for Chapter 10 and uploaded them to http://seyfarth.tv/asm for people to download as needed.  Chapter 10 is about arrays.  Prior to this chapter a little use was made of arrays, but this chapter shows a wide variety of memory addressing patterns which are applied to arrays.

There is some code to illustrate creating arrays using malloc.  This is far more convenient than using fixed size arrays in the data or bss segment.  It is faster and it allows using very large arrays, while the static arrays are limited to 2 GB.  This is the right way to work with arrays.

There is a sample program which allocates an array, fills it with random integers, and finds the minimum in the array.  There is also some code to copy one array to another using normal looping.
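
The same steps are short in C, which makes a handy reference when reading the assembly.  This is a sketch, not the book's program; the function signatures and the range of the random values are my own choices.

```c
#include <stdlib.h>   /* malloc, rand, size_t */

/* Allocate an array of n ints and fill it with random values below 1000.
   Returns NULL if the allocation fails; the caller frees the array. */
int *create_random_array(size_t n)
{
    int *a = malloc(n * sizeof *a);
    if (a != NULL)
        for (size_t i = 0; i < n; i++)
            a[i] = rand() % 1000;
    return a;
}

/* Return the smallest value in the array; n must be at least 1. */
int find_min(const int *a, size_t n)
{
    int min = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] < min)
            min = a[i];
    return min;
}

/* Copy n ints from src to dst with an ordinary loop. */
void copy_array(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Because the size is passed at run time, the same code works for 10 elements or for millions, which is the convenience malloc buys over a fixed size array.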

The chapter ends with a discussion of command line parameters which are critical for most Linux applications.

Only 9 more sets of slides to prepare!