Thursday, November 2, 2017

Debugging without a debugger

Another retro post, from even earlier times than the last one.

This happened during my high school years when some of my friends already owned a computer but I had none. So we used to get together after school and binge play some video games -- starting with Lord of the Rings text adventures and Laser Squad hot-seats...

...and then, rummaging through then-abundant "bootleg software shops" (a post-USSR version of Game Stop, selling bunches of floppy disks with copied games or other software and with labels hand-printed on a 9-pin dot matrix printer) I discovered Eric the Unready (which you can actually play in an online emulator here).

I loved it at first sight. A wonderful piece of interactive fiction with hilarious jokes and puns, challenging puzzles and, for the avid English learner I was at the time, an invaluable learning resource.

Unfortunately, right after playing through the first chapter, we ran into something annoying: a copy protection feature. It was one of the "Prince of Persia" type, where you would be asked a few questions and would need the original printed game manual to answer them correctly and play on.



And needless to say we didn't have the manual -- and of course, neither did the shop we bought the game from.

Well, as a disclaimer... I do understand that software piracy is, in the big picture, bad bad BAD, but hey, we were 15 years old and had no idea of the big picture - nor any clue about copyrights and licensing. To us at that time, the fact that we had software on our computer meant that we could do as we pleased with it, especially given that we did buy it at a shop (sic!).

And even if someone were to lecture us on the proper course of action... At that time in that part of the world, an equivalent of $30 would be a decent monthly salary (yes, monthly, not daily, not hourly), and there was absolutely no way an ordinary person could possibly make a payment to a foreign country or, for that matter, pay in any tender other than cash - no credit cards, no wire transfers, no bank accounts...

...so yes, I admit we were stealing apples from somebody else's garden, but quite unknowingly, almost unavoidably, and without causing anyone any real harm.



But all these sentiments aside, we were already hooked, and needed a way to play on.

Of course, we had no Internet to look up the correct answers (there was no Google, no Chrome and hardly even any Internet Explorer yet!), and we had no one to ask (the game was so far out of the mainstream, and the command of English needed to play it so atypical, that we might well have been the only players in years). We also had nothing to tinker with the game with -- no hiew, no disassembler, no proper debugger.

We did have Game Wizard, a utility that you would normally use to save and reload in Tetris or give yourself infinite lives in Pacman. The way to do that is to search the memory for "3" when you have 3 lives, then for "2" when you have 2 lives, and so on; with some luck you find the address in memory that holds the game's lives variable. You can then set it to 99, or freeze it at 3, to get infinite lives.
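
For readers who never used such a tool, the search-and-narrow idea can be sketched roughly as follows. This is only an illustration, not Game Wizard's actual implementation: memory is modeled as a plain vector of bytes, and the helper names scan_all and narrow are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical first pass: collect every address whose byte equals `value`
// (e.g. 3 while the player has 3 lives).
std::vector<std::size_t> scan_all(const std::vector<std::uint8_t>& memory,
                                  std::uint8_t value) {
    std::vector<std::size_t> hits;
    for (std::size_t addr = 0; addr < memory.size(); ++addr)
        if (memory[addr] == value) hits.push_back(addr);
    return hits;
}

// Hypothetical follow-up pass: of the earlier candidates, keep only those
// whose byte now equals the new `value` (e.g. 2 after losing a life).
std::vector<std::size_t> narrow(const std::vector<std::uint8_t>& memory,
                                const std::vector<std::size_t>& candidates,
                                std::uint8_t value) {
    std::vector<std::size_t> kept;
    for (std::size_t addr : candidates) {
        assert(addr < memory.size());  // candidates must come from this memory
        if (memory[addr] == value) kept.push_back(addr);
    }
    return kept;
}
```

Each pass throws away the addresses that merely happened to hold the right number by coincidence; after a few rounds, usually only the real variable is left.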

And we gave it a try, for lack of anything better to do for the rest of the evening. We began by alternating memory searches in the state before vs. after the first question is answered -- hoping to reveal its correct answer by noticing different variable values depending on whether we chanced to answer correctly. Instead, we got:

XXXX:YYYY 01 02 01 02 01 02


As a wild guess I just set it to 4 instead...

...and got right through. (Apparently I inadvertently found the counter of the questions loop and moved right past it.)
All it took after that was to save the game (and the evening), granting us endless hours of fun.
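
Why would setting a counter to 4 skip the quiz? Nobody outside the game's authors knows what the real code looked like, so the following is purely a guess at the general shape: a loop over a small number of questions (three is an assumption here), with the counter stored somewhere an outside tool can overwrite. The name run_quiz and the ask callback are invented for this sketch.

```cpp
#include <cassert>
#include <functional>

// Hypothetical model of a copy-protection quiz. `counter` is passed by
// reference so that something outside the loop can overwrite it, much like
// a memory editor poking the game's RAM.
bool run_quiz(int& counter, int total,
              const std::function<bool(int)>& ask) {
    assert(total > 0);
    while (counter <= total) {             // questions 1..total
        if (!ask(counter)) return false;   // wrong answer: protection trips
        ++counter;
    }
    return true;                           // loop finished: player may go on
}
```

With counter starting at 1, a player without the manual fails on the first question. With counter already set to 4 and total equal to 3, the loop condition is false from the start, no question is ever asked, and the function reports success.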


The power of one-liners

In modern software engineering, most software is written by large groups of people, most of whom seem to dislike reading each other's code. As a result, most organizations have some style guidelines for their code: proper indentation, naming conventions, commenting, you name it.

And most developers have strong opinions about what code should (and even more so, shouldn't) look like. They will defend (and, their position permitting, enforce) those opinions with near-religious fervor. Most often, they will yell at you for writing this:

for (i=0;i<n;i++) for (j=i;j<n;j++) if (a[i]>a[j]) {double t=a[i];a[i]=a[j];a[j]=t;} // sort A (exchange sort)


and instead insist on something like this:

// Sort array A (exchange sort)

for (counter_rows = 0; counter_rows < n; counter_rows++) 
  {
    for (counter_columns = counter_rows; counter_columns < n; counter_columns++) 
      {
        if (a[counter_rows] > a[counter_columns])
          { 
            double temp_variable = a[counter_rows];
            a[counter_rows] = a[counter_columns];
            a[counter_columns] = temp_variable;
          }
      }
  }


Yes, it looks neat and tidy. But is it really all that readable?

No, not really.

It is very hard to honestly defend the standpoint that a piece of code is easier to read if you have to scroll through three screens to read it, as opposed to fitting it on one screen. "But this way it is more organized", comes the objection. Very true, but how is it organized?

It is organized by instructions.

But what for? By instruction is how the compiler looks at the code, but the compiler does not give the slightest damn about your code style. Humans are much more interested in what the code does than in how the code does it (assuming that you, a fellow developer, already have some knowledge of the "how" once you know the "what").

From this standpoint it makes much more sense to organize the code by logically distinct blocks -- the important steps of your algorithm that you would put on your flowchart or in your pseudocode (pseudocode, after all, was invented specifically for this purpose: to convey the meaning of complicated code in a simpler, more readable form).

And this means that simple, elegant one-liners -- once (and if) they are self-evident in what they do -- are far preferable to the same code expanded over two screens. See for yourselves:

//simple operations, e.g. sumproduct or matrix multiplication
double S=0; for (int i=0;i<N;i++) S+=a[i]*b[i];

for (int i=0;i<N;i++) for (int j=0;j<N;j++) for (int k=0;k<N;k++) c[i][j]+=a[i][k]*b[k][j];

// one-liner error checking, prep or boilerplate
if (failed || !solution_good) return false;

double param; if (!genericParam.Has_Double) throw Error; param = genericParam.Get_Double();

customVector<double> vec(vec1); int vec_n=vec.Count(); double* vec_ptr=vec.Data(); 

//getter/setter methods
double someClass::getSomeProperty() {return someProperty;}

void someClass::setSomeProperty(double arg) {someProperty=arg;}

//etc...


UPDATE: Following some discussion, I feel I need to clear up a confusion here. Any code, prettified, is more readable than the same code, minified. The idea of one-liners isn't about improving the readability of the one-liner code itself! It is about the exact opposite: the one-liner code is assumed to be trivial and therefore not worth going into any great detail about, so the idea is to minify the one-liner code so that it does not get in the way of what's really important and interesting in your code. In other words, it is about improving the readability of the code surrounding your one-liners.
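
To make the point about surrounding code concrete, here is a small sketch of a function laid out by logical steps, with the trivial parts squeezed into one line each. The function cosine and the convention it picks for zero vectors are assumptions made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical example: cosine similarity of two equal-length vectors.
double cosine(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();

    // Step 1: three trivial sums, one line each
    double ab = 0; for (std::size_t i = 0; i < n; i++) ab += a[i] * b[i];
    double aa = 0; for (std::size_t i = 0; i < n; i++) aa += a[i] * a[i];
    double bb = 0; for (std::size_t i = 0; i < n; i++) bb += b[i] * b[i];

    // Step 2: the degenerate case, the one place that deserves a second look
    // (returning 0 for a zero vector is a convention chosen for this sketch)
    if (aa == 0 || bb == 0) return 0;

    // Step 3: the formula, as it would appear on the blackboard
    return ab / std::sqrt(aa * bb);
}
```

Expanded in the "approved" style, the three sums alone would take a dozen lines and push the only non-obvious decision (step 2) further down the screen.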


As for naming conventions, they definitely make sense for anything that would be (re)used in several places throughout the code. If something is set up in one place and is used elsewhere, by all means make the name of the variable (class, object, ...) speak for itself.

That said, still try to keep it short. Calculations in the code are formulas, and formulas read much more easily with short variable names than with long, verbose ones; that's why textbook formulas use symbols in the first place, and why they write E = mgh instead of "Potential_energy = mass * gravitational_acceleration * height" anywhere beyond grade two at school.

For intermediate variables such as loop counters, simply don't bother. Mathematical names such as a, x, y, i, j, k will do perfectly well, and they will make your calculations so much easier to read.

There are exceptions: when naming is especially prone to confusion, do add some mnemonics, such as i_row and i_col rather than i and j, lest you mess up your array indices. But in most cases, formulas in the code need not look any more verbose than they do on your scrap paper. Sometimes it is even advisable to copy "long" mnemonic variables into short ones, do the math, and then assign the result back to the long variables.
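
The long-to-short-and-back trick from the last sentence might look like this. The struct projectile_state, its field names and the function update_potential_energy are all invented for the sake of the example; only the E = mgh formula comes from the text above.

```cpp
#include <cassert>

// Hypothetical record with deliberately verbose, self-describing field names.
struct projectile_state {
    double mass_kg;
    double height_above_ground_m;
    double potential_energy_joules;
};

const double standard_gravity_m_per_s2 = 9.80665;

void update_potential_energy(projectile_state& state) {
    assert(state.mass_kg >= 0);

    // long names -> short ones
    const double m = state.mass_kg;
    const double g = standard_gravity_m_per_s2;
    const double h = state.height_above_ground_m;

    // the math, exactly as in the textbook
    const double E = m * g * h;

    // short result -> long name
    state.potential_energy_joules = E;
}
```

The verbose names stay at the interface, where other code depends on them, while the formula itself stays recognizable at a glance.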


So -- no, I'm not saying your code should look like Toledo Picochess (see below). But do write in your IDE as you would write on the blackboard, and do write code as you would write pseudocode.

After all, making code readable is all about making it readable for a human.


P.S. Toledo Picochess looks like this: (now THAT's truly unreadable code!)
#define F (getchar()&15)
#define v main(0,0,0,0,
#define Z while(
#define P return y=~y,
#define _ ;if(
char*l="dbcefcbddabcddcba~WAB+  +BAW~              +-48HLSU?A6J57IKJT576,";B,y,
b,I[149];main(w,c,h,e,S,s){int t,o,L,E,d,O=*l,N=-1e9,p,*m=I,q,r,x=10 _*I){y=~y;
Z--O>20){o=I[p=O]_ q=o^y,q>0){q+=(q<2)*y,t=q["51#/+++"],E=q["95+3/33"];do{r=I[p
+=t[l]-64]_!w|p==w&&q>1|t+2<E|!r){d=abs(O-p)_!r&(q>1|d%x<1)|(r^y)<-1){_(r^y)<-6
)P 1e5-443*h;O[I]=0,p[I]=q<2&(89<p|30>p)?5^y:o;L=(q>1?6-q?l[p/x-1]-l[O/x-1]-q+2
:0:(p[I]-o?846:d/8))+l[r+15]*9-288+l[p%x]-h-l[O%x];L-=s>h||s==h&L>49&1<s?main(s
>h?0:p,L,h+1,e,N,s):0 _!(B-O|h|p-b|S|L<-1e4))return 0;O[I]=o,p[I]=r _ S|h&&(L>N
||!h&L==N&&1&rand())){N=L _!h&&s)B=O,b=p _ h&&c-L<S)P N;}}}t+=q<2&t+3>E&((y?O<
80:39<O)||r);}Z!r&q>2&q<6||(p=O,++t<E));}}P N+1e9?N:0;}Z I[B]=-(21>B|98<B|2>(B+
1)%x),++B<120);Z++m<9+I)30[m]=1,90[m]=~(20[m]=*l++&7),80[m]=-2;Z p=19){Z++p<O)
putchar(p%x-9?"KQRBNP .pnbrqk"[7+p[I]]:x)_ x-(B=F)){B+=O-F*x;b=F;b+=O-F*x;Z x-F
);}else v 1,3+w);v 0,1);}}