Skip to main content

Posts

Showing posts from January, 2022

Apple2TC: an Apple II Binary to C Decompiler - Part 7, Using The IR

 This is part 7 in a series of posts describing Apple2TC.  The start is in  Part 1 . Part 6 introduced our SSA-based Intermediate Representation. In this part we look at a running IR example and how it is transformed by our decompiler pipeline. Introducing The Example To demonstrate the utility of our IR, we have devised a slightly contrived example that nevertheless illustrates challenges encountered in real code.  The source of the example can be seen here:  https://github.com/tmikov/apple2tc/blob/master/blog/part6/ex.s . We assembled it with a6502: a6502 ex.s ex.b33 and disassembled it: apple2tc ex.b33 --asm > ex.lst Producing the following listing: /*0300*/ TSX /*0301*/ INX /*0302*/ INX /*0303*/ LDY M_0101,X ; $0101 /*0306*/ INY /*0307*/ STY M_90 ; $0090 /*0309*/ LDA M_91 ; $0091 /*030B*/ STA M_92 ; $0092 /*030D*/ L

Apple2TC: an Apple II Binary to C Decompiler - Part 6, The IR

 This is part 6 in a series of posts describing Apple2TC.  The start is in  Part 1 . Part 5 described getting two Apple II games to work when decompiled in "Simple C" mode. In this part, we introduce our new Intermediate Representation (IR) of the disassembled code, which unlocks much more sophisticated analysis and finally puts us on the road to real  decompilation. What is an IR? An IR is a way to represent executable code as a data structure in memory, in a manner that makes its semantics explicit, and thus easier to analyze and transform. Typically, one or more forms of IR are used by optimizing compilers to represent and optimize the source code they are given. We are coming from the opposite direction - we need to represent code that has already been compiled - but there is little difference. The same principles apply. Wikipedia has a high level article about Intermediate Representation , which, while not very informative on its own, can be used as a starting point by

Apple2TC: an Apple II Binary to C Decompiler - Part 5

 This is part 5 in a series of posts describing Apple2TC.  The start is in  Part 1 . Part 4 described the Apple II hacks we had to deal with, in order to get the decompiled version of Applesoft BASIC working. This part is a quick update on getting two Apple II games to work - Robotron 2084 and Snake Byte. Robotron 2048 After all the effort we went through for the BASIC ROM, Robotron 2048 was shockingly easy. First, we ran a2emu to collect runtime data: a2emu --collect --limit=30000000 robotron.b33 > robotron.json In the emulator, the game title screen appeared, we eagerly pressed spacebar, getting us to the control selection screen, selected keyboard, and we were off. The game is pretty challenging, but we managed to play a little before dying and closing the emulator. With the so collected runtime data, first we decompiled the binary to 6502 assembler: apple2tc --run-data=robotron.json robotron.b33 > robotron.lst Then to C (with the "simple" backend): apple2tc --simpl