$T\mathcal{M}$

There is a connection somewhere

Jun 30, 2016

IOCCC 1987 korn.c Analysis

David Korn's 1987 entry to the International Obfuscated C Code Contest (IOCCC) is one of the most baffling one-liners I have ever encountered. It simply reads

    main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}

and obviously just outputs unix when compiled on a Linux system. The first step in analysing something like this is to run it through the preprocessor (and hope that it gets more readable for a change). After gcc -E, the single line from the main function becomes:

printf(&1["\021%six\012\0"],(1)["have"]+"fun"-0x60);

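which at least tells us that unix is a macro predefined by the compiler, with the value 1. This actually violates the standard, since unix is an ordinary identifier that belongs to the programmer; predefined macros should use reserved names, which start with two underscores or an underscore followed by an uppercase letter (such as __unix__). The macro is therefore switched off if an ISO standard is explicitly requested, for example with -std=c99. As an aside, gcc can be coerced into listing all macros defined by default with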

gcc -dM -E  -x c /dev/null

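The next thing to clean up is the use of a[b] == b[a] in C: unix["some string"] is the same as "some string"[1], or 'o', and the operator & then turns that element back into a pointer, which effectively chops off the leading 's'. In the actual format string the chopped-off character is \021, and \012 is the octal escape for a newline. Therefore the format string evaluates to (writing the type explicitly):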

char* f="%six\n"; // String literals are automatically null terminated, so the explicit \0 is redundant.

The second part, which supplies the argument for %s, is similarly obfuscated. In particular, (unix)["have"] == 'a'. It uses one additional trick: in ASCII, (int)'a' is 97 or 0x61, so the expression has the types char + char* + int, and Korn sneaked some pointer arithmetic into the program. So

char* s="fun" + (0x61 - 0x60); // s is "un"

and we end up with the cleaned code of

#include <stdio.h>

int main(){
   printf("%six\n", "fun" + 'a' - 0x60);
   return 0;
}

where the printf is basically

printf("%six\n", "un");