Wednesday, December 28, 2011

DBE becomes EBE

I noticed that the console window in dbe did a fairly nice job of editing and after a little more tinkering I decided to bite the bullet yet again.  With a crude editor the debugger can perform a fairly good job as an integrated development environment.  I liked the name dbe, but that seems too much like a debugger.

The name which appealed to me was "ebe".  I suppose it most commonly refers to an "extraterrestrial biological entity", but I won't let that stop me.  It's easy to pronounce and it's catchy.  It could have a nice icon for people to click on.  But what should be the official reason for the 3 letters?

Monday, December 26, 2011

DBE Debugger Version 0.1

I have completed the first version of the dbe debugger.  I liked the name dbe - "db" is obviously for debug and I just like the final "e".  I will claim it stands for education, since its main purpose is to assist in teaching assembly language.

The debugger and the documentation are available at http://rayseyfarth.com/dbe.  The index page there is quite plain, but the user documentation is a decent looking LaTeX document converted to html using htlatex.

There are a few nice tweaks added to the first version which I had not considered beforehand.  First, it is possible to mark a variable name in the source code window and add it to the displayed variables in the data window.  The program will determine the address using gdb.  Second, the user can mark an address in the register window or the data window and easily create a user-defined variable with that address.  These features will help a lot.

Monday, December 19, 2011

Started a new debugger

I recently decided that I had spent too much time trying to cope with the problems involved with using various IDEs and debuggers.  I had good success with ddd, though it would have been nicer if my pygtk dialog could have added a variable to the ddd data window.  Then I started experimenting with Windows.
I found python along with 64 bit versions of gcc and gdb for windows.  The gdb did not have an internal python interpreter.  So I decided to bite the bullet.

Monday, December 12, 2011

Fluxbox rules!

I have gotten sick of taskbars.  You can autohide them and get pretty much complete use of all your screen space, but, no matter where you hide them - bottom, top, left or right, they will get in your way.  So I have tried quite a few window managers recently.  After much struggle with gnome, kde, xfce, blackbox, etc. I have come back several times to fluxbox.  I like the fact that you can set keyboard and mouse short-cuts in ~/.fluxbox/keys.  I have it set so that I can start nearly all the programs I normally want to run with no menu system at all.

So I've gone one step further: I have made the taskbar invisible.  I can bring up the fluxbox root menu using the windows-key with a right mouse click.  So I can always get to a menu without wasting screen space.  The biggest drawback I have noticed is the lack of a windows list in the taskbar.  Still I can use alt-tab to move through the windows rapidly enough when the window I want isn't visible.  The next problem is not having a desktop switcher app in the toolbar.  I can move through the desktops using alt-ctrl and the arrow keys, so this is not hard at all to do without.

So, after much struggle, I have settled on a minimal interface.  Not one pixel is wasted on a toolbar.  No toolbar jumps over my applications no matter where I move the mouse.  I can immediately get to the menu at any point to try to find one of the infrequently-used programs on my system.

There is no need for any more window manager development.  I was frustrated by the Ubuntu switch to Unity, but the net result is that I will probably stick with fluxbox and ignore all the nonsense with all the window managers which change yearly apparently just to confuse the users.

Did I mention that fluxbox is immediately ready when I finish entering my password?  Take that, KDE!  No more Unity with its confusing almost-menu system.  Who needs that junk?  Taskbars are a wicked Windows invention!


Saturday, December 10, 2011

Python commands added to gdb

I have managed to define several useful commands for gdb using python.  These commands work with both gdb and ddd.  The basic commands are text commands, but I did manage to create a gtk window for defining variables and watches.

Defining variables

The first command is "dv" which stands for define variable.  The basic idea is to associate a name with an address which can be either a scalar or an array.  Here is the command usage

    dv name address format size count
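
As a rough illustration of how the five arguments could be handled, here is a minimal Python sketch of the parsing step only.  It is not the actual dv code.  The VariableDef class, the parse_dv function and the meanings given to format, size and count are assumptions made for the example, and it runs outside gdb so it can be tried on its own.

```python
from dataclasses import dataclass


@dataclass
class VariableDef:
    """One user-defined variable: a name bound to a memory address."""
    name: str
    address: int
    fmt: str      # assumption: a format letter such as "d" or "x"
    size: int     # assumption: bytes per element
    count: int    # assumption: number of elements (1 for a scalar)


def parse_dv(args: str) -> VariableDef:
    """Parse the argument string of 'dv name address format size count'."""
    fields = args.split()
    if len(fields) != 5:
        raise ValueError("usage: dv name address format size count")
    name, address, fmt, size, count = fields
    # int(text, 0) accepts both decimal and 0x-prefixed hexadecimal text.
    return VariableDef(name, int(address, 0), fmt, int(size, 0), int(count, 0))


if __name__ == "__main__":
    # Hypothetical example: an array of 10 four-byte integers.
    print(parse_dv("data 0x601040 d 4 10"))
```

Inside gdb the same parsing function could be called from a command class, but that wiring is left out here.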

Friday, December 09, 2011

Python - the little language that can

This week I discovered that the gdb debugger has a built-in python interpreter.  You can issue python commands from the gdb command line using the "python" or "py" prefix.  In the simplest form

     py print("Hello World")

will print "Hello World" using python from within gdb.  But there is much more.

First it is possible to define commands which can be executed directly by gdb.  In my case I initially defined 2 commands "dv" and "pv" which allow defining a variable based on its address with "dv" and printing the variable using "pv".  Then I moved on to defining a command "wv" which will print a variable after every execution step.

Monday, November 28, 2011

The Battle over Assembly Comments

I've been working toward having a single file, asm64.tex, to serve as the source for the printed book (pdf format), the Kindle e-book (mobi format) and the Nook e-book (epub format).  My adventures with htlatex are starting to gel into a reasonable plan, except for assembly language comments.

Assembly code is normally divided into columns like this:

label:   instruction   operands      ; comments

My habit is to start the instructions in column 9 with operands starting in column 17.  This looks pretty good, but many of my lines of code are a little long.  I originally shrank them somewhat to fit the width of the printed book (about 6 inches).  Then I ran into the Kindle.

The Kindle is limited to 47 characters of monospaced text per line or 65 characters in landscape mode.  I was originally fairly happy when I discovered the landscape mode.  Nearly all my comments were 65 or fewer characters.

Then I borrowed my son's Nook.  The Nook apparently did not have a landscape mode.  I opted from the start to make the Nook version be the same as the Kindle version, so I have to live with the portrait mode.

I have gone through the code once (or twice) shrinking comments and spreading comments out onto 2 lines to make the book look fairly good on the Kindle with 47 characters, but I was not too happy with short lines for the printed version of the book.  So after some thought I decided to dive in and solve the problem with a bash/awk script.

I wrote a fairly good script which tries to break long comments near the middle, and apparently breaking one line into 2 lines is sufficient for my needs.  I was afraid that I would need more robust code, but I'll leave it as is for now.
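
The idea of breaking a comment near its middle can be sketched as follows.  This is a Python stand-in, not the bash/awk script itself.  The function name break_comment and the 47-column default are choices made for the example, and it assumes the first ';' on the line starts the comment, which would be wrong for a ';' inside a string.

```python
from typing import List


def break_comment(line: str, width: int = 47) -> List[str]:
    """Split an over-long assembly line at a space in its comment.

    The line is returned unchanged (as a one-element list) if it already
    fits or has no ';' comment.  Otherwise the comment is broken at the
    space closest to its middle and the second half becomes a
    comment-only line aligned under the first.
    """
    if len(line) <= width or ";" not in line:
        return [line]
    col = line.index(";")              # column where the comment starts
    comment = line[col + 1:].strip()
    spaces = [i for i, ch in enumerate(comment) if ch == " "]
    if not spaces:
        return [line]                  # no space to break on
    middle = len(comment) // 2
    cut = min(spaces, key=lambda i: abs(i - middle))
    first = line[:col] + "; " + comment[:cut]
    second = " " * col + "; " + comment[cut + 1:]
    return [first, second]


if __name__ == "__main__":
    sample = "        mov     rax, [rbx+rcx*8]  ; load the next array element into rax"
    for piece in break_comment(sample):
        print(piece)
```

One split does not guarantee that both pieces fit in the requested width when the code itself is long, so a more robust version would have to loop or shorten the comment.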

My script does allow a command-line parameter for the number of columns.  I need to borrow the Nook again and study it in detail.  Perhaps it has a smaller monospaced font which will allow wider lines.  I also don't know precisely the number of characters across the Nook screen.  I will be prepared for updating the Kindle and Nook books fairly soon.  I have also found a moderate number of typos, so I might update the printed book on Create Space as well.

Hopefully I will get around to doing some Windows assembly language during the Christmas break...

Friday, November 11, 2011

Getting Better Math from htlatex

I am happy to report that I have solved one more issue involved with producing good ebooks from a LaTeX source.  So far I have discovered how to use a configuration file to force htlatex to produce png images for displayed math and tables.  This can also be done to render arrays and eqnarrays as images.  Unfortunately when I view my equations and tables on the Kindle they look a little small and grayer than normal text.

The solution to these problems is to modify /etc/tex4ht/tex4ht.env changing the default parameters for the dvipng command.  The original line looks like this

    Gdvipng -T tight -x 1400 -D 72 -bg Transparent -pp %%2:%%2 %%1 -o %%3

My updated file has this line

    Gdvipng -T tight --gamma 1.5 -x 1600 -D 72 -bg Transparent -pp %%2:%%2 %%1 -o %%3

The -x option is a magnification so my output is about 14% larger.  The --gamma 1.5 option sets the gamma value (default 1) to 1.5 to force the text to appear darker.
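
The 14% figure is just the ratio of the two -x values, which a couple of lines of Python can confirm (the variable names are mine):

```python
old_x, new_x = 1400, 1600          # the -x values from the two dvipng lines
scale = new_x / old_x
print(f"relative size: {scale:.3f} ({(scale - 1) * 100:.1f}% larger)")
```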

Here is what I would normally get:


Here is what I get with the revised parameters for dvipng:


This is darker without being harsh and is large enough to make my rendered math a little more readable on Kindles, Nooks and other e-readers.

Thursday, November 10, 2011

The Winner is --- htlatex!

I have finally run into the right choices for using htlatex to produce a good file for the Kindle.  It turns out that Eitan Gurari (developer of tex4ht) took care of my needs after all.  You can supply the name of a configuration file on the command line for htlatex:

    htlatex asm64.tex asm64

In my case the latex file is "asm64.tex" and the configuration file is "asm64.cfg", which contains

    \Preamble{pic-tabular}

    \begin{document}

    \EndPreamble

The first line contains the magic incantation "pic-tabular" which transforms all latex tabular sections (tables) into pictures automatically.  There are also "pic-tabbing",  "pic-array", "pic-eqnarray",  "pic-fbox" and "pic-m" options which I may eventually need.

After changing all my \includegraphics commands to refer to .jpg files, I had a nice asm64.html file which looked fine with firefox.  I tried using kindlegen which did a poor job of generating a .mobi file.  Then I used calibre to create a .mobi and this was nearly perfect.  I have a few images made from xfig which have lines above and below them which I would prefer to eliminate and I need to include a cover image.

This is quite a good solution.  All the equations and tables looked great.  I would have liked larger rendering which I can probably arrange for, but overall this is better than my hand-written html code with very little additional effort after producing the original version.

Thanks to Eitan Gurari for his excellent work!

Friday, November 04, 2011

Portable Book Format

Now that I have published Kindle and Nook versions of my assembly book I have some experience with book formatting.  LaTeX produced an excellent printed book, but it required a huge number of changes to convert the book into HTML for conversion to Kindle (mobi) format.  Fortunately the Nook format (epub) is similar enough to mobi that the Calibre program produced an acceptable epub book automatically from the mobi book.

However, I now have 2 versions of the book to maintain, which is not desirable.  In the long run it will be better to find or create a single format for a book which can be converted automatically to multiple formats.

Tuesday, October 25, 2011

High Performance Assembly Language

The Core i7 CPU offers the Advanced Vector Extensions (AVX) which expand on the capabilities of the Streaming SIMD Extensions (SSE) instructions.  SSE added additional floating point and integer capabilities to the CPU.  In particular, the registers used by SSE (xmm0, xmm1,...) are 128 bits each and these bits can be used to store multiple integers of various lengths and also multiple floating point values.  The instructions operating on these multiple values are referred to as SIMD (single-instruction multiple-data) instructions.  With the AVX instructions the registers are extended to 256 bits, though the integer instructions only operate on 128 bit quantities.

Here I will present some simple code which will be extended profitably to improve performance.  The keys to success here are loop unrolling, using independent registers for the unrolling and using specialized instructions.  In the process there will be several versions of the basic function which perform better than the basic C code optimized with loop unrolling by the gcc compiler.
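
To show the shape of "unrolling with independent registers" without any assembly, here is a small Python sketch.  It is only an illustration of the structure: Python will not run faster this way, and the function name and the choice of 4 partial sums are assumptions for the example.

```python
def sum_unrolled(values):
    """Sum a sequence using four independent partial sums."""
    s0 = s1 = s2 = s3 = 0.0
    n = len(values)
    main = n - n % 4                  # largest multiple of 4 not exceeding n
    for i in range(0, main, 4):
        # Each partial sum depends only on its own previous value, which
        # is what would let a CPU overlap the four additions.
        s0 += values[i]
        s1 += values[i + 1]
        s2 += values[i + 2]
        s3 += values[i + 3]
    total = (s0 + s1) + (s2 + s3)
    for i in range(main, n):          # leftover elements
        total += values[i]
    return total


if __name__ == "__main__":
    print(sum_unrolled(list(range(10))))
```

In assembly each of s0 to s3 would live in its own register, so no addition has to wait for the one before it.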

Sunday, October 23, 2011

Nook Book now on Sale

From all I had seen I expected the Nook version (EPUB format) of my Assembly Language book to be a snap.  It was relatively easy, but not quite trivial.

Problem number 1 for the Nook was that the Nook didn't allow viewing in landscape mode.  My assembly code has a moderate number of end-of-line comments.  Many of these wrapped onto the next line.  With the Kindle I had been able to view my book in landscape mode which handled at least 3/4 of the lines properly.  I had quite a few lines to repair.  Some I fixed with creative substitutions, but some I had to extend onto the next line.

The next problem worth reporting is having a 2 appear at the top of one set of exercises.  This 2 came from the second list item.  It made no sense that the 2 would appear at the top of the page.  The same HTML rendered properly on the Kindle.  Eventually I removed all <p> and </p> tags from that set of exercises.  I also converted a <blockquote> section to preformatted.  Now the same HTML works for Nook and Kindle.

It was perhaps 4 hours work to repair things for the Nook.  It is now available for sale.

The Nook version is EPUB format which should be portable to iPad, Sony eReader and most other devices.  It is DRM-free so people shouldn't run into DRM roadblocks.

Friday, October 21, 2011

Finally Published a Kindle Version of the Assembly Language Book

I received a request from a person teaching 64 bit assembly language at another university for a Kindle version of my textbook.  I had been studying the issue off and on for a few months.  I knew that the Kindle format was based on HTML, but it was difficult to determine exactly what commands worked on the Kindle.  I had tried a few tools - Calibre and kindlegen on Linux and the mobibook creator for Windows.  It seemed like epub was going to work fairly well, while mobi was a bit of trouble.

I decided that I had to have a Kindle to test my work, so I ordered one for $79 and set about converting my book from LaTeX markup to HTML markup.  As I did this I previewed the results in Firefox and got things in fairly good order by Tuesday when the Kindle arrived.  Immediately I discovered that some of the problems I observed in previewers did not exist on the Kindle.  It looked possible.

One issue was displaying diagrams I had created using xfig.  I tried gif, png and jpeg and all were ugly in the previewers.  For the printed book I had used PDF files created by xfig for all the diagrams.  Eventually I tried using gimp to convert the images to jpeg.  This worked very well.  gimp asked for the number of dots per inch at the start of importing each PDF file and I used this to generate images with widths which are small enough for the Kindle.  The Kindle likes to scale images to fit the width of its screen which is not what the HTML standard dictates, but at least I have nice looking diagrams though frequently a bit large.

Most of my use of math mode in LaTeX was fairly simple and I converted these to HTML using <sup> and <sub> tags along with using tables for alignment.  This was fairly successful, but I had a few fairly complex formulas to render.

I copied the LaTeX commands for the complex formulas to separate files and ran pdflatex to convert to PDF.  Then I used gimp to crop the images for inclusion as jpeg images in my HTML book.  These images looked great on the Kindle though they are generally too large.

I used <blockquote> to indent quite a few tables in the book.  I decided that indenting looked better than centering.

Kindlegen has a sample document which includes a manifest file (OPF) and a table of contents file (NCX) which I adapted for my purposes to create a complete Kindle document matching the Amazon requirements.  I submitted the ebook on Thursday afternoon and by Friday morning it was up, complete with "Look Inside", and I had 1 sale.  Curiously the "Look Inside" preview displayed my indented table flush left, so I downloaded a sample of the book to my Kindle.  The tables were indented.  Still the preview looked fine otherwise and is good enough for my purposes.

I spent 7 days working on the conversion, averaging perhaps 4 hours each day.  That's a lot of markup changes.  Now I think I know enough to invent my own markup which could be used to generate LaTeX and HTML from a common source.

It's on sale here.

Now it's time for an epub version for Barnes and Noble...

Monday, September 12, 2011

Windows 7 Installation Woes

My computer at home has been serving as my router for several years and the only way to run Windows on the computer was using a virtual machine.  Rebooting into Windows would have prevented the other computers in the house from accessing the network.  Recently I changed the networking to use the DSL modem as the router for the house, which means that I could switch my computer to dual boot.

I also have a Netflix account which I would like to use with this computer.  I have tried using VMWare and VirtualBox with limited success, so it was time to install Windows 7 on the computer.

My /home directory consumed nearly all of a 1 TB drive, so I used resize2fs to change its size to about 800 GB.  /home was in partition /dev/sda5, so the new partition was /dev/sda6.  So I started trying to install Windows.  It complained about not being able to locate or create a partition when it clearly showed the partition in the partition choices.  I then consulted the internet.

Friday, September 09, 2011

Visual Studio

I am teaching CSC 101 - C++ Programming I for the first time this semester.  It's a busy semester.  I'm also teaching assembly language for the first time.  Fortunately I prepared for the assembly language class fairly adequately in the summer.

I decided to bite the bullet and use Blackboard which is Southern Miss's approved on-line instruction tool.  I am preparing materials and quizzes for both my C++ and assembly classes.  In addition I am preparing lab assignments for the CSC 101 Lab class.  That's a whole lot of Blackboard.  In the past I have used Moodle as an organizational tool for all my classes.  Blackboard seems to be roughly equivalent in capability to Moodle.  I prefer Moodle, but that may be biased based on experience.  One definite advantage to Moodle was that we run it on a local computer which is very seldom overloaded, while Blackboard is external and there are fairly common small delays.

I have previously decided that I would use Visual Studio for a Windows-specific version of my assembly book, so the experience learning to use VS will be somewhat beneficial to me when I tackle the Windows book.  I have yet to decide whether I will use masm, nasm or yasm.  I haven't tried any of them yet, but the deciding issue may be how well it integrates with the VS debugger.  I expect that masm will be better, but I don't yet know.  I would prefer nasm or yasm since they are simpler and I already know enough about them, but masm may get the nod due to utility.

Sunday, August 28, 2011

Should there be a limit to wealth?

Let's start by considering one extreme possibility:  one person owns everything on the planet.  Let's name this person Midas.  Midas owns all homes, so each other person must pay rent to Midas.  Midas owns all the farms, so each other person must buy food from Midas.  Midas owns all buildings and businesses, so each other person must work for Midas in order to survive.

Should anyone be allowed to own everything?  This seems like a possibly horrible life for the rest of humanity.  Of course, Midas could be a generous owner and perhaps he pays people enough for them to survive fairly well.  Eventually Midas will die and then I suppose Midas will grant the world to his first-born child.

Would Midas be just as happy owning enough to satisfy his every whim?  How much money is sufficient to satisfy his every whim?   One billion dollars?   One trillion dollars?

Wednesday, August 24, 2011

Started Blackboard Questions

I have decided to use the Blackboard courseware system to give a quiz for homework from each chapter I cover this semester in Assembly Language.  I believe that it will force my students to read the chapters fairly carefully which will help a lot.

So far I have prepared quizzes for the first 2 chapters.  It looks like it will take perhaps 1 to 2 hours per quiz to prepare objective questions.


The complete set of quizzes will be available via email to teachers who adopt the textbook in the future.


As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Saturday, August 20, 2011

Army Worms

My neighbor pointed out to me yesterday that my azaleas are being defoliated by an invasion of army worms.  I try to maintain my lawn and plants without using pesticides, so I will be killing some caterpillars by squishing them underfoot.

I decided to look them up on the internet.  They are indeed called army worms due to their tendency to show up in large numbers.

Apparently they like azaleas.  I considered just letting them eat all they wanted, but after they finish their favorite foods they will move on to other sources.  They even like to eat your grass.  In fact they can wipe out entire hay fields.

Surely there must be natural enemies to the munchers.  The little devils probably taste bad so that birds won't eat them.  I have lizards and geckos too.  These critters must be falling down on the job.

I guess for a few more days I will start the day on army worm patrol.  Who says I don't have an interesting life?

Saturday, August 13, 2011

No one stands alone

It bothers me that there are a large number of successful people who seem to lack gratitude for their success.  They think they did it "on their own".  No one stands alone.  We are all interdependent whether we like it or not.

Now I like to talk about how I have worked all my life as do many others.  Well, I have worked fairly hard, but I haven't achieved my small measure of success all by myself.  To start with my success is with computers and I certainly would be clueless about building my own chip factory and designing a computer from scratch.  My success is directly related to many thousands of people who have helped create a computer industry.

Friday, August 12, 2011

Pears, pears and more pears

Pears grow well in Mississippi if you keep them watered a bit through the dry spells.  This year it seemed we had almost 2 months with little rain.  Trees are fairly drought-resistant, but eventually there comes a point when you start to wonder how much your trees can take.  I waited until the grass started looking a bit stressed and started watering then.

I have tried a variety of spray nozzles and devices to spray large areas.  This year I tried just using the hose with no nozzle.  My theory is that the water comes out in greater volume without a nozzle.  This seems to be far quicker for watering small plants.  I think it is about twice as fast.  I have also tried more precise spot watering of stressed grass.  Overall I think this is a useful technique.

In the past I would leave a sprinkler in the yard, go inside for about 15 minutes and then go out to move the sprinkler.  Some areas would receive more water than necessary and it took over 2 hours to do the front and back yard.  With spot watering and no nozzle I can finish a little quicker though I am standing outside the whole time.


Look inside my book!

It took a couple of weeks, but now you can use the Amazon "Look Inside" feature to view part of my book.  Amazon also matched B&N's price!  $21.56 is a fine price to pay for a textbook!  The title works as a link; the Blogger editor didn't allow me to fix a link for the picture.

Introduction to 64 Bit Intel Assembly Language Programming: Getting the most out of your computer

List Price: $29.95
Price: $21.56 & eligible for free shipping with Amazon Prime
You Save: $8.39 (28%)








Thursday, August 11, 2011

Demand side economics

It is currently popular in America to believe in supply side economics which basically means less government intervention in business.  The idea as I understand it is that by giving the people on the top of society more money they will have more money to employ the rest of society.  Given that there are smart people on both sides of this issue, I will certainly agree that this argument has some appeal to reason.

The other view of economics is demand side or Keynesian economics.  This view has it that government spending can rescue a failing economy.  This was arguably successful enough to prevent a peasant revolution during the Great Depression, though many people like to claim that it was World War II which successfully ended the depression.  There is some truth to that claim: WWII did bring the US to a state of high employment, but it did so through government spending, which says that demand side economics worked when it was applied in a large enough dose.

Wednesday, August 10, 2011

Finished source code preparation

I have completed the organizing of code for the rest of the book.  I expect that there will be some changes made to the code as I teach the class and learn what people need in the code.  There is a fair amount of code written over 3 months as I studied and wrote about 64 bit assembly language.  I am sure there were some "style" changes during this time.  I have yet to see or think of a way to make assembly language look as stylish as C++.  No matter what you do assembly code requires attention to details which can be partially handled by the syntax of the language in C++.  Things like forgetting which register is first, second and third aren't issues in C++.  In fact C++ can sometimes keep you out of trouble with attempts to pass data of the wrong type into a function.  In assembly there is hardly any concept of type.

As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Finished source code for Chapter 12

Chapter 12 is about system calls.  I have prepared example programs for all the code snippets in the chapter.  I also wrote a copy program which does a moderate amount of testing in addition to reading a file in 1 read and writing it with 1 write.  This should serve as a fairly good resource.  It uses command line parameters for file names and tests nearly all the system calls.

It would be more practical to write a copy program which allocates a fairly large array and copies a file in pieces in a loop, but that is one of the exercises at the end of the chapter.
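
For comparison, the loop structure of such a copy program looks like this in Python, where os.open, os.read, os.write and os.close are thin wrappers over the same system calls.  This is only a sketch of the idea, not the assembly solution to the exercise.  The function name copy_file and the 1 MB default buffer size are assumptions.

```python
import os


def copy_file(src: str, dst: str, chunk_size: int = 1 << 20) -> int:
    """Copy src to dst in chunks; return the number of bytes copied."""
    total = 0
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, chunk_size)
                if not chunk:              # empty result means end of file
                    break
                view = memoryview(chunk)
                while len(view) > 0:       # write may accept fewer bytes than offered
                    written = os.write(out_fd, view)
                    view = view[written:]
                total += len(chunk)
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
    return total
```

The inner loop matters in assembly too: a write system call reports how many bytes it actually wrote, and that can be fewer than requested.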

As usual the source code can be downloaded from http://seyfarth.tv/asm and the CreateSpace site for the book is https://www.createspace.com/3651611.

Finished source code for Chapter 11

It took me a few hours to write and test all the code in Chapter 11.  There were some pieces of code which I thought I had tested, but obviously had not.  Now my src/ch11 directory has 14 different source files.

I had been too rash and had not really noticed that the instructions for floating point comparisons require using different conditional jumps than the integer comparisons.  I also read somewhere that test was a good instruction for comparing.  Test reports that it performs an "and" instruction.  If the "and" of two operands yields a 0, then the zero flag is set.  Using the zero flag in the most obvious test, je, ends up jumping when the two operands have nothing in common rather than when they are equal.  So as a programmer, I want to use je to test for equal objects and I need "cmp" rather than "test".  Maybe someday I will find a use for test, but I have spent too much time working with numbers to appreciate that anding has anything to do with testing...

Now there is a fairly complete set of errata for Chapter 11, in addition to the source code.
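
The difference between the two instructions is easy to see with a tiny Python model of the zero flag.  This models only ZF for 64 bit operands, and the function names are made up for the example.

```python
MASK64 = (1 << 64) - 1


def zf_after_test(a: int, b: int) -> bool:
    """Zero flag after 'test a, b': set when (a AND b) is zero."""
    return ((a & b) & MASK64) == 0


def zf_after_cmp(a: int, b: int) -> bool:
    """Zero flag after 'cmp a, b': set when (a - b) is zero."""
    return ((a - b) & MASK64) == 0


if __name__ == "__main__":
    # je jumps when the zero flag is set.
    print("test 5,5 -> je taken?", zf_after_test(5, 5))
    print("test 4,3 -> je taken?", zf_after_test(4, 3))
    print("cmp  5,5 -> je taken?", zf_after_cmp(5, 5))
```

With test, equal non-zero values do not set the zero flag, while 4 and 3, which share no bits, do.  One place where test does fit is checking a register against itself: the "and" of a value with itself is zero only when the value is zero.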

You can access the source code, PDF slides and errata at http://seyfarth.tv/asm and you can reach the CreateSpace page for the book at  https://www.createspace.com/3651611.


Tuesday, August 09, 2011

Beware when comparing floating point numbers

I have been writing a lot of test code verifying some of the short segments of code in my assembly book.  I ran into a bizarre state of affairs with floating point comparisons.  I had read the Intel instruction manual too quickly and ASSUMED that the floating point ucomiss instruction, being fairly new, would be designed to make for easy programming.  Imagine my shock when I tried to compare with ucomiss and then use jle to jump on less than or equal and it did not work properly.

I wrote a C program with all 5 arithmetic comparisons using floats to study more carefully what gcc does to cope with ucomiss.  Here's what gcc did for a less than comparison on 2 registers:

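   ucomiss    %xmm1, %xmm0     # compare xmm0 with xmm1; sets ZF, PF and CF
   seta       %al              # al = 1 if xmm0 > xmm1 (CF = 0 and ZF = 0)
   testb      %al, %al         # ZF = 1 when al is 0
   je        .L3               # jump when the comparison was false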

Monday, August 08, 2011

More source code files available

My web site for the Assembly book, http://seyfarth.tv/asm, now has source code for Chapters 1-10.  During this process a few errors were found in the book and have been noted in the errata on the same site.

The process of organizing the source code is going quite well.  I ran into a program which persisted in not working properly with either gdb or ygdb.  The goal was to place a break point on main to simplify debugging.  I issued the command "b main" and it placed a break on a line in the copy_array function.  I don't understand this yet, but it was solved by making main appear before copy_array in the source code file.  Very strange...  I think I have debugged programs where main was at the bottom before, but it is so easy to be confused about precisely what you have done when using computers.

In any case I have 28 assembly source files in the current download and 3 C source files.  The various Makefiles build all the programs and I am pretty sure I have tested them all.  Testing is paramount.

You can find my book easily at Amazon or Barnes and Noble.  You can also find the direct link at  CreateSpace using https://www.createspace.com/3651611.

Sunday, August 07, 2011

Bug in bit field code at the end of Chapter 7

I have been testing all short sequences of code and found an error in the last part of Chapter 7 where I planned to use movsx with an immediate 32 bit field to prepare a mask.  The movsx instruction allows a register or memory 32 bit field to be sign-extended but not an immediate field.  The fix looked a little ugly and sign extension is not an elegant solution so I posted a better solution in the Chapter 7 errata page.

The better solution is shifting the rax register right 29 bits, followed by shifting it left 29 bits.  This is more generally applicable.  More importantly it actually works.
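
The effect of the two shifts can be checked with a short Python model of a 64 bit register.  The function name and the explicit 64 bit mask are my additions.

```python
MASK64 = (1 << 64) - 1


def clear_low_bits(value: int, nbits: int = 29) -> int:
    """Shift right then left by nbits, as a 64 bit register would."""
    value &= MASK64
    return ((value >> nbits) << nbits) & MASK64


if __name__ == "__main__":
    print(hex(clear_low_bits(MASK64)))
```

The pair of shifts zeroes the low 29 bits and leaves the upper 35 bits in place, which is the same as anding with a mask whose low 29 bits are 0, but without needing to load that mask.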


Started preparing source code for downloads

I started today on preparing a source code distribution for http://seyfarth.tv/asm. So far I have prepared a src directory in my own computer with ch01, ch02, ... ch19 as subdirectories.  There is a Makefile in each subdirectory which will manage building the demo programs for that chapter.

There is also a master Makefile in the src directory which by default will visit each chapter subdirectory and make its programs.  As an aid to myself, the master Makefile defines a tgz target which is used to build a clean src.tgz file to place on the web server.

So far I have completed programs for Chapters 1-6 and am writing some programs for Chapter 7.  Wherever the book has code without an accompanying gdb session it is a fair bet that I haven't yet written and tested the code.  So far I have added one errata entry for a programming typo in Chapter 7.

This is fairly quick to do.  I hope to finish it in 1 to 2 more days.

PDF slides completed for Chapter 19

PDF slides for Chapter 19 are now available at http://seyfarth.tv/asm for readers to download.  Chapter 19 is about computing correlation.  There is a C version, an SSE version and an AVX version.  The AVX version achieves 20.5 GFLOPS with 1 core of a Core i7 CPU.  This is a testament to the CPU design.  My code did unroll the loop and place partial sums in different registers, which made it possible for the CPU to re-order the instructions and use multiple pipelines.  It achieved about 6 double precision results per cycle, while the instructions performed no more than 4 operations each and took more than 1 cycle.  The CPU filled multiple pipelines fairly well without me having to understand every small detail of the CPU operation.
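
The idea of unrolling with separate partial sums can be sketched in plain C.  This is not the Chapter 19 code, which uses AVX instructions and computes a full correlation; it only shows one of the sums such a computation needs, and the function name is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of x[i]*y[i] using four independent partial sums, so that
   consecutive additions do not have to wait on each other.
   dot_unrolled is a hypothetical name used only for this sketch. */
double dot_unrolled(const double *x, const double *y, size_t n)
{
    assert(x != NULL && y != NULL);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main loop: four elements per iteration, one accumulator each. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Tail loop for the remaining 0 to 3 elements. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    /* Combine the partial sums once, after the loops. */
    return (s0 + s1) + (s2 + s3);
}
```

Because each accumulator depends only on its own previous value, an out-of-order CPU is free to overlap the four multiply-add chains.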

All 19 chapters now have slides prepared.

It's time to clean up some code and match it nicely with the book and upload source code.  I like working with code better than working with text, so this should be fairly fun.

PDF slides completed for Chapters 17 and 18

PDF slides for Chapters 17 and 18 have been prepared and uploaded to http://seyfarth.tv/asm for readers to download. Chapter 17 is about counting 1 bits in an array.  It shows the effect of improved algorithms and shows a dramatic improvement when the popcnt instruction is used.
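
A rough C sketch of the progression, from a naive loop to a smarter loop to a hardware count, might look like the following.  The function names are made up, the algorithms are common textbook ones rather than necessarily the ones in Chapter 17, and __builtin_popcountll is a GCC and Clang builtin which typically compiles to the popcnt instruction only when the target allows it (for example with -mpopcnt).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Naive version: test each of the 64 bit positions in turn. */
unsigned count_bits_naive(uint64_t x)
{
    unsigned n = 0;
    for (int i = 0; i < 64; i++)
        n += (unsigned)((x >> i) & 1);
    return n;
}

/* Improved version: x & (x - 1) clears the lowest 1 bit, so the loop
   runs once per 1 bit instead of once per bit position. */
unsigned count_bits_clear_lowest(uint64_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Count the 1 bits in a whole array using the compiler builtin
   (GCC/Clang specific; an assumption of this sketch). */
unsigned long long count_bits_array(const uint64_t *a, size_t n)
{
    assert(a != NULL);
    unsigned long long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (unsigned long long)__builtin_popcountll(a[i]);
    return total;
}
```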

Chapter 18 is about the Sobel image filter.  A C version was written which processed 158 images per second. An assembly version was written operating on 14 Sobel results at once.  This processed 980 images per second, which was about 6.2 times as fast.
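
For readers who have not met the Sobel filter, here is a minimal C sketch for a single pixel.  It is not the book's C version: the image layout (row-major bytes), the sign convention and the function name are assumptions, and it returns the squared gradient magnitude to avoid a square root.

```c
#include <assert.h>

/* Squared Sobel gradient magnitude at interior pixel (r, c) of a
   row-major 8 bit image.  sobel_mag2 is a hypothetical name. */
int sobel_mag2(const unsigned char *img, int width, int height, int r, int c)
{
    assert(img != 0);
    assert(r >= 1 && r < height - 1 && c >= 1 && c < width - 1);

    const unsigned char *p = img + r * width + c;

    /* Horizontal gradient: right column minus left column, weights 1 2 1. */
    int gx = (p[-width + 1] + 2 * p[1] + p[width + 1])
           - (p[-width - 1] + 2 * p[-1] + p[width - 1]);

    /* Vertical gradient: bottom row minus top row, weights 1 2 1. */
    int gy = (p[width - 1] + 2 * p[width] + p[width + 1])
           - (p[-width - 1] + 2 * p[-width] + p[-width + 1]);

    return gx * gx + gy * gy;
}
```

Every output pixel uses the same pattern of neighbor loads and adds, which is what makes it practical to compute many results at once with packed instructions.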

Saturday, August 06, 2011

PDF slides prepared for Chapter 16

PDF slides are now available on http://seyfarth.tv/asm for "Chapter 16: High Performance Assembly Programming".  This chapter is about some general strategies, most of which are done quite well by modern compilers.

The last 3 chapters explore 3 different algorithms which have been implemented in assembly.  In some of these the assembly code is far faster than the compiled C code.  This is generally only possible if you employ specialized instructions which are usually harder for the compiler to apply given your high level language code.  These instructions usually require some rearranging of the algorithm, which a compiler is unlikely to do as well as a human.

PDF slides prepared for Chapter 15

I have prepared PDF slides for "Chapter 15: Data Structures" and uploaded them to http://seyfarth.tv/asm for people to download for classes.  Chapter 15 is clearly an optional chapter for many courses.  It might be beyond the call of duty to introduce data structures first in an assembly language class.  It would clearly make the concepts of lists and other data structures more concrete.

The chapter includes singly linked lists, doubly linked lists, hash tables and binary trees.  There is no effort made to produce balanced binary trees which would certainly make for more opaque coding.  The hash table code is efficient and reasonably easy to understand.  Of course you might very well get adequate performance at a fraction of the programming time by using STL data structures.

Friday, August 05, 2011

PDF slides prepared for Chapters 13 and 14

I have prepared and uploaded PDF slides for "Chapter 13: Structs" and "Chapter 14: Using the C stream I/O functions" to http://seyfarth.tv/asm for users of my book to download.  Chapter 13 is a short chapter focusing on using an array of structs.  Chapter 15 discusses data structures where structs will be central to the discussion.

The stream I/O chapter discusses using fopen, fseek, fprintf, fscanf, fread, fwrite, and several more basic I/O functions.  Because they buffer data, these are more efficient than direct system calls for reading and writing small amounts of data.  They are also fairly efficient for arrays of data.  The last function uses fseek to position a file and fwrite to write a customer object to a file.
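
The fseek plus fwrite pattern can be sketched in C as follows.  The layout of the customer struct and both function names are invented for this example; the book's version is in assembly and its record layout may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed-size record; the fields are assumptions. */
struct customer {
    int    id;
    char   name[32];
    double balance;
};

/* Write one record into slot `index` of the file.
   Returns 0 on success and -1 on failure. */
int write_customer(FILE *fp, long index, const struct customer *c)
{
    assert(fp != NULL && c != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *c, SEEK_SET) != 0)
        return -1;
    return fwrite(c, sizeof *c, 1, fp) == 1 ? 0 : -1;
}

/* Read the record in slot `index` back into *c.
   Returns 0 on success and -1 on failure. */
int read_customer(FILE *fp, long index, struct customer *c)
{
    assert(fp != NULL && c != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *c, SEEK_SET) != 0)
        return -1;
    return fread(c, sizeof *c, 1, fp) == 1 ? 0 : -1;
}
```

Since every record has the same size, the record number times sizeof gives the byte offset, so any record can be updated in place without rewriting the rest of the file.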

Only 5 more sets of slides to prepare!

Climbing rapidly on Amazon and Google

My book is now number 2 if you search in books for "64 bit assembly" on Amazon.  The number one book is a more thorough book describing 32 bit and 64 bit assembly and it was published in 2005.  I will be happy enough if I remain number 2 in that ranking.

I would like to make it to the first page when searching for "assembly language", but my book needs a second edition to be better than most of the books in that list and some good reviews to climb up the rankings.

I have also made it to number 3 on google when searching for "64 bit assembly language" and number 4 when searching for "64 bit assembly".

The current rankings are encouraging.  My book will be found by most people interested in a book of 64 bit assembly language programming.  Now if I get some good reviews, sales will increase.

Thursday, August 04, 2011

PDF slides prepared for Chapter 12

PDF slides for Chapter 12: System Calls have been prepared and uploaded to http://seyfarth.tv/asm for downloading.

Chapter 12 is about system calls.  It describes how both 32 bit and 64 bit system calls are implemented on Linux.  Interestingly the numbers used for system calls for 32 bit calls and 64 bit calls bear no obvious relation.

Using these numbers for system calls is fairly pointless since there are C wrapper functions for each system call which have easy to remember names and a little added capability.  The time taken for a system call is far greater than the time taken up in using the wrapper functions, so there is little value in using system calls directly.  Furthermore there is probably no reason not to write nearly all code which uses system calls in C or C++ anyway.
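
To make the comparison concrete, here is a small Linux-only C sketch which performs the same write two ways: once through the generic syscall() function with the system call number, and once through the ordinary write() wrapper.  The two helper names are made up for the example.  On Linux the write call is number 1 for x86-64 and number 4 for 32 bit x86, which shows how unrelated the two numberings are.

```c
#define _GNU_SOURCE          /* ensures syscall() is declared by glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>     /* SYS_write */
#include <unistd.h>          /* syscall, write */

/* Write using the raw system call number (hypothetical helper name). */
long write_raw(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}

/* Write using the usual C wrapper function (hypothetical helper name). */
long write_wrapped(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return (long)write(fd, buf, len);
}
```

Both return the number of bytes written or -1 with errno set, so the wrapper costs almost nothing and is much easier to read.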

12 of 19 chapters complete!

Slowly climbing up in the ranking

I've been monitoring my progress in various types of search.  My best success is on the Barnes and Noble web site.  My book shows up number 15 in a search for "assembly language" and number 1 for "64 bit assembly language".

Second best is searching for "64 bit assembly language" on Amazon's web site, where my book comes up number 6.

I need more effort and more time to climb up the ranks on google or bing.  I have had a little success with google finding one of my blog entries on this blog.  That is encouraging.  I need to figure out something people search for (that I know something about) and post a blog message.

PDF slides prepared for Chapter 11

I now have prepared and uploaded PDF slides for Chapter 11: Floating Point Instructions.  This might be enough assembly language for many people.  With the instructions discussed so far you could write fairly nice programs using arrays, functions, integer math, floating point math and simple I/O instructions.

This chapter also introduces the SSE instructions using 4 packed floats per XMM register or 2 packed doubles per XMM register.  This can be the start of some efficient programming.  Issuing instructions which each perform 4 floating point operations can require some re-engineering of your algorithm, but if you do this carefully you can unroll loops and keep the SIMD pipeline fairly full.  This could yield perhaps 4 float results per machine cycle, which is roughly 10-15 GFLOPS single precision.  Some of the latest CPUs can complete more than 1 SIMD instruction per cycle.
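
The packed float idea can be tried from C using the SSE intrinsics in xmmintrin.h, which map closely onto the assembly instructions.  This is only a sketch for x86 and x86-64 compilers, not code from the chapter, and the function name is invented.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* SSE intrinsics; x86 and x86-64 only */

/* c[i] = a[i] + b[i], four floats at a time where possible.
   add_floats_sse is a hypothetical name used for this sketch. */
void add_floats_sse(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    size_t i = 0;

    /* Each pass loads 4 packed floats from each input (unaligned
       loads), adds them with one packed add, and stores 4 results. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
    }

    /* Scalar loop for the last 0 to 3 elements. */
    for (; i < n; i++)
        c[i] = a[i] + b[i];
}
```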

It will become more interesting with the later chapters on optimization.

I have also located a few more typos.  Check it all out at http://seyfarth.tv/asm

Wednesday, August 03, 2011

Warming up for Windows

Well I gave a little effort toward preparing to write a Windows version of my book today.  I wanted to keep it simple and use yasm, gcc and gdb.  Then the Windows book would be very easy to write.  Reality reared its ugly head when I stumbled into Windows territory.

I first tried installing Mingw and MSYS because I knew that Mingw has a 64 bit compiler.  I tried numerous times but never got a hello world program in C to compile.  I also had to install PKZip just to unpack one of the 2 zip files.  It was not looking good for describing in a Windows book how to install the tools.  It was looking impossible for me to do, although I am sure that it can be done and I would eventually stumble upon the secret.

Tuesday, August 02, 2011

Chapter 10 slides on seyfarth.tv/asm

I have completed the PDF slides for Chapter 10 and uploaded them to http://seyfarth.tv/asm for people to download as needed.  Chapter 10 is about arrays.  Prior to this chapter a little use was made of arrays, but this chapter shows a wide variety of memory addressing patterns which are applied to arrays.

There is some code to illustrate creating arrays using malloc.  This is far more convenient than using fixed size arrays in the data or bss segment.  It is faster and it allows using very large arrays, while the static arrays are limited to 2 GB.  This is the right way to work with arrays.

There is a sample program which allocates an array, fills it with random integers, and finds the minimum in the array.  There is also some code to copy one array to another using normal looping.
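
In C the same steps look roughly like this.  It is a sketch of the idea rather than a translation of the book's program; the function names, the seed parameter and the 0 to 999 value range are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate an array of n ints with malloc and fill it with random
   values from 0 to 999.  Returns NULL if the allocation fails.
   The caller is responsible for calling free(). */
int *make_random_array(size_t n, unsigned seed)
{
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    srand(seed);
    for (size_t i = 0; i < n; i++)
        a[i] = rand() % 1000;
    return a;
}

/* Return the smallest value in a non-empty array. */
int array_min(const int *a, size_t n)
{
    assert(a != NULL && n > 0);
    int min = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] < min)
            min = a[i];
    return min;
}
```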

The chapter ends with a discussion of command line parameters which are critical for most Linux applications.

Only 9 more sets of slides to prepare!

Sunday, July 31, 2011

Chapter 9 slides on seyfarth.tv/asm

I have uploaded the PDF slides for Chapter 9 to http://seyfarth.tv/asm where they are now available for download.  Chapter 9 is about functions.  It discusses the registers involved with Linux and Windows function parameters along with the registers which must be preserved across function calls.

The standard way of creating a stack frame for a function is explained which is quite helpful in debugging.  The chapter ends with a recursive factorial function.
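
For reference, the C equivalent of a recursive factorial is only a few lines.  This is a sketch rather than the book's assembly; the limit of 20 is there because 21! no longer fits in a signed 64 bit integer.

```c
#include <assert.h>

/* Recursive factorial.  Each call gets its own stack frame, which is
   what makes this a handy example for stepping through in a debugger. */
long long factorial(long long n)
{
    assert(n >= 0 && n <= 20);   /* 21! overflows a signed 64 bit integer */
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);
}
```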

Amazon search now works for my Assembly TextBook

If you search for "64 bit assembly" you should find my book listed on the first page on Amazon.  Unfortunately you can't buy the book directly from Amazon.  I hope that soon becomes an option.  There are 4 resellers listed and I expect that all will deliver the book.  Three of the four offer the book for slightly less than CreateSpace, but not a lot less when considering shipping.  CreateSpace prints the books, so perhaps their shipping is quicker, but these days things could be automated so that an order placed is transferred to CreateSpace within a second or two.  In any case I have made several orders from CreateSpace and have been quite pleased with their delivery times.

Chapter 8 slides now on www.seyfarth.tv/asm

I have prepared PDF slides for Chapter 8 and placed them on www.seyfarth.tv/asm.  I have also added a few errata to the web site.

Chapter 8 is about branching and looping.  The strategy is to convert C if/else statements and loops to equivalent assembly instructions.  This chapter is where the optimizations employed by the compiler start to become more comprehensible.

One of the simple optimizations is to convert loops to test at the bottom of loops which can mean one less jump statement in a loop.  Another is to count down if the only purpose is to do the loop a certain number of times.
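
Both transformations can be written out by hand in C to see what the compiler is aiming for.  The two functions below are hypothetical examples which compute the same sum; the second tests at the bottom of the loop and counts down to zero.

```c
#include <assert.h>

/* Straightforward form: test at the top, count up. */
long sum_counting_up(const long *a, long n)
{
    assert(a != 0);
    long sum = 0;
    for (long i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Transformed form: one guard before the loop, then a loop which
   tests at the bottom and counts n down to zero. */
long sum_counting_down(const long *a, long n)
{
    assert(a != 0);
    long sum = 0;
    if (n <= 0)
        return 0;
    do {
        sum += *a++;
    } while (--n != 0);
    return sum;
}
```

In the second form the decrement itself sets the flags for the conditional jump, so the loop body needs no separate compare and no extra unconditional jump back to the top.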

This chapter discusses how the gcc compiler performs loop unrolling and why this matters.

The next chapter is functions, followed immediately by arrays.  After those chapters the student will be able to write some very interesting programs.

Saturday, July 30, 2011

Updated Bluehost to serve up http://rayseyfarth.com and updated DynDNS

I have used dyndns.com for my dns entries for a few years now.  I registered seyfarth.tv with them and have been moderately happy with that plan so far.

Unfortunately now I have to use WebHop DNS records in DynDNS to map my URLs to Blue Host's URLs.  I added rayseyfarth.com as my domain for Blue Host and updated the DynDNS entries to match the change.  This results in a request to http://seyfarth.tv/asm being redirected to http://rayseyfarth.com/asm which shows as the URL in the browser.

The main benefit to this is that further use of my web site will be independent of DynDNS.  This should improve the performance a little bit and avoid seeing horrible numeric IP addresses in the URL bar.

Chapter 7 slides now on www.seyfarth.tv/asm

I have completed the PDF slides for Chapter 7 which is about bit instructions.  It covers negation, and, or, exclusive-or, shifting, rotating and testing individual bits.  In all cases examples are given to try to point out a possible use for the instructions.

One use for these instructions is in extracting and filling bit fields in a register or in memory.  Extraction is documented fairly well with 2 techniques illustrated with examples.  Perhaps a little later I will add some illustrations of insertion of bit-fields.
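
The post does not say which 2 extraction techniques the chapter uses, so the C sketch below simply shows two common ones as an assumption: shift right then mask, and shift left then shift right.  Bits are numbered from 0 at the low end and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Technique 1: shift the field down to bit 0, then and with a mask. */
uint64_t extract_with_mask(uint64_t x, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (x >> lo) & mask;
}

/* Technique 2: shift left to push the bits above the field out of the
   register, then shift right to drop the bits below it. */
uint64_t extract_with_shifts(uint64_t x, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 64);
    unsigned up = 63 - hi;            /* number of unwanted high bits */
    return (x << up) >> (up + lo);
}
```

For example, bits 4 through 11 of 0xABCD are 0xBC with either function.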

A second use for these instructions is the  maintenance of a set implemented as an array of bits.  Some sample code is given in Chapter 7 giving the fundamental techniques for managing a bit set, but there is not a complete program given for managing a set.  This would make a good exercise for this chapter, except that arrays are not fully introduced yet.  They are used with an index register in Chapter 7 with little fanfare.
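
As a starting point for that exercise, here is a minimal C sketch of a set stored as an array of bits.  The capacity of 1024 members and all of the names are assumptions for the example; the book's sample code is in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SET_CAPACITY 1024   /* members are the integers 0..1023 */

typedef struct {
    uint64_t words[SET_CAPACITY / 64];   /* 1 bit per possible member */
} bitset;

/* Add member k: word k/64, bit k%64 is set with or. */
void bitset_add(bitset *s, size_t k)
{
    assert(s != NULL && k < SET_CAPACITY);
    s->words[k / 64] |= 1ULL << (k % 64);
}

/* Remove member k: the same bit is cleared with and-not. */
void bitset_remove(bitset *s, size_t k)
{
    assert(s != NULL && k < SET_CAPACITY);
    s->words[k / 64] &= ~(1ULL << (k % 64));
}

/* Test membership: returns 1 if k is in the set, otherwise 0. */
int bitset_contains(const bitset *s, size_t k)
{
    assert(s != NULL && k < SET_CAPACITY);
    return (int)((s->words[k / 64] >> (k % 64)) & 1);
}
```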

Friday, July 29, 2011

Slides for Chapters 5 and 6 Ready

I have uploaded PDF slides for "Chapter 5: Registers" and "Chapter 6: A Little Bit of Math" to http://www.seyfarth.tv/asm so now there are 6 of 19 chapters prepared.

Chapter 5 introduces how to move data between registers and memory.  It also introduces the conditional move instructions, which can add efficiency for fairly short if/else sequences.  They are a poor substitute for the branching statements which come later, but they can be more efficient.

Chapter 6 adds in integer mathematics instructions.  With those you can do quite a lot.  You still need branching and input/output statements to make real programs, but it takes some time to get enough things covered to write more realistic programs.

Thursday, July 28, 2011

Cloud adventure continued

Goodbye Amazon Cloud Front
I gave it a pretty good effort with Amazon's Cloud Front, but grew tired and confused.  If I used the S3 storage, I could not use http://www.seyfarth.tv/asm and get a default index.html.  If I set it up to feed from my own computer, it seemed to always fetch PDF files from my computer.  It also would not work with my computer down.  Fighting against down time was a big part of my reason for choosing Cloud Front.

I tried several times with both modes.  The S3 storage system uses a 24 hour time to live (TTL) by default.  I quickly grew tired of all the confusion over changing data in files.  Web sites are never static and I don't wait 24 hours very well.  I got a wide array of strange behavior during all this time.  I would use curl to view what html I should get and it would look great, but no amount of clearing cache would make any browser give me what I saw with curl.  I fail to comprehend the problem, so I have given up on Amazon's Cloud Front.

Cloud Front was dirt cheap for light usage and very fast, but at some point I decided that I had to start valuing my own time and decided to fork out money to a hosting service.  I selected Blue Host  which was rated very well for a cheap price.

Wednesday, July 27, 2011

Moving to the cloud

I have decided to move my web sites to the Amazon CloudFront service, using their S3 storage service as the mechanism for storing files.  They are supposed to automatically direct browsers to closer servers, so there will be much better response world wide.

I also have fairly regular DSL outages, which might frustrate potential customers.  I would not enjoy trying to browse somewhere and getting no response.  Maybe Amazon will be amazing.  I have been using S3 for a while now and paying a few cents a month (normally 2 cents).  I might have to start paying 10 or 20 cents a month.

I suppose ATT might be happier with me not running a web server from my house.  It should be a win all-around.

I'll post an update with my charges for the web site.  I really think it will be quite cheap.

Tuesday, July 26, 2011

Chapter 4 slides now on asm.seyfarth.tv

I have spent some time today touching up the seyfarth.tv web site.  I am trying to use simple HTML for some pages and htlatex for others.  I find htlatex better for writing report-like pages.  I find it easier to use HTML pages for pages with lots of links.

I found a picture of my book on Amazon earlier, but it's gone now.  It seems a bit mysterious to me, but I think it will be available on Amazon later this week.

I have completed the slides for Chapter 4 which is about how the CPU uses the 4 level page table hierarchy to do logical to physical address translation.  I initially thought I would skip it, but it seems fairly easy to understand, so I plan to teach it this fall.
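
The first step of that translation is easy to show in C.  With 4 KB pages a 48 bit x86-64 virtual address splits into four 9 bit table indexes and a 12 bit offset within the page.  The struct and function names below are made up for this sketch, and it only splits the address; it does not walk real page tables.

```c
#include <assert.h>
#include <stdint.h>

/* The five pieces of a 48 bit virtual address (4 KB pages). */
typedef struct {
    unsigned pml4;     /* bits 39-47: index into the top level table */
    unsigned pdpt;     /* bits 30-38: index into the second level    */
    unsigned pd;       /* bits 21-29: index into the third level     */
    unsigned pt;       /* bits 12-20: index into the page table      */
    unsigned offset;   /* bits 0-11:  byte offset within the page    */
} va_parts;

va_parts split_virtual_address(uint64_t va)
{
    va_parts p;
    p.offset = (unsigned)(va & 0xFFF);
    p.pt     = (unsigned)((va >> 12) & 0x1FF);
    p.pd     = (unsigned)((va >> 21) & 0x1FF);
    p.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    p.pml4   = (unsigned)((va >> 39) & 0x1FF);

    /* Sanity check: the pieces rebuild the low 48 bits of the address. */
    uint64_t rebuilt = ((uint64_t)p.pml4 << 39) | ((uint64_t)p.pdpt << 30)
                     | ((uint64_t)p.pd << 21) | ((uint64_t)p.pt << 12)
                     | p.offset;
    assert(rebuilt == (va & 0xFFFFFFFFFFFFULL));
    return p;
}
```

Each 9 bit index selects one of 512 entries in a table, and each entry supplies the physical address of the next table, or of the page itself at the last level.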

I updated my web sites to run on Amazon's Cloud Front with their S3 storage.  I was forced to change my plan a little since seyfarth.tv/asm would not load the default index.html file.  So the real web site is asm.seyfarth.tv which I hope will be quicker and more reliable.  My DSL service goes out every time it rains in Southern California and I live in Southern Mississippi.

My Book is almost on Amazon

You can find some information about the Assembly Language textbook by searching for "64 bit assembly language".  My book is number 7 in that list.  Amazon says it is unavailable but is available at CreateSpace.

I feel sure that Amazon will have details available very soon.  I hope it comes up with the "Look inside the book" option.  I think that is the way things work for CreateSpace books.

It's nice to be on Amazon.  I hope I get somewhat favorable reviews.  I will read all the reviews to learn more about what people prefer in a book.  If I can supply what people want, then the book will sell itself.

Monday, July 25, 2011

Chapter 3 slides now on asm.seyfarth.tv

I have completed the slides for Chapter 3 of the assembly language book.  I have updated the asm.seyfarth.tv web site to have links for the first 3 PDF files.

Sunday, July 24, 2011

No eBooks for Me

I have spent too many hours trying to get things in order for the Kindle.  I figured out how to get a nice HTML book using htlatex, but the mobi book creator screwed it up pretty bad.  I've also reviewed some software for cracking DRM.   It's not a pleasant prospect to work a week or two getting a few eBook choices available and having someone start giving the book away.

Sometimes it seems that you are swimming upstream and you might as well turn around and go with the flow.  CreateSpace makes it possible to print books fairly cheaply and perhaps I need to focus on printed books.  For the next edition I think I will choose bigger pages, since it was a little challenging fitting assembly code with comments in the 6.14 inch width.  Next time I will go for 7x10.   It will make it possible to produce higher quality.

I will make my slides next. I have 2 of the 19 chapters' slides prepared now.  It will probably take a day or two per chapter.  I would like to be done by August 24 when classes start.

Assembly Language Textbook Web Site Started

I have made a minimal web site for the assembly book at seyfarth.tv/asm.  The skeleton is there with a tiny bit of meat - I have links to download the PDF slides for the first 2 chapters along with a brief introduction to the book.

There will be no errata for a few more days - maybe.  I have made 1 sale, so I may start hearing about mistakes real soon now.  I expect to find many corrections during the fall semester and will hopefully incorporate them into an updated book in January.  I will probably spend some more time on graphics.  My schedule this summer was a little tight: publishing a book and preparing slides in about 3-4 months.

Assembly Language Textbook


Introduction to 64 Bit Intel Assembly Language Programming

I have written the first version of my 64 bit Assembly Language text book for Intel and AMD CPUs. I will be teaching from the textbook this fall (2011) and by that time I will have slides for the book, errata and code at my web site at seyfarth.tv/asm.

The web site is not very functional today. I have replaced my main computer with a new Core i7 computer which I have used for testing code for the text book. The web site needs a little work...

I managed to get 20.5 GFLOPS on my double precision correlation computation which is a testament to the design of the Core i series. I used the AVX instructions which allow me to specify instructions which can perform 4 double precision operations at a time. These operations take more than 1 cycle, so it is a little surprising to me to achieve about 6 double precision results per cycle. The CPU is performing out-of-order execution pretty nicely. I did make some effort to make a lot of the instructions within the main loop independent of each other to allow the CPU to do its job, but given the complexity of x86-64 instruction decoding I find it to be pretty impressive.