Skip to main content

Posts

Showing posts from 2021

Apple2TC: an Apple II Binary to C Decompiler - Part 4

This is part 4 in a series of posts describing Apple2TC.  The start is in Part 1 .  In this part we finally get to the interesting part and describe the Apple II hacks we found while trying to run the decompiled version of Apple II BASIC. Running the Decompiled Code In the previous part we described our technique for generating a "Simple C" representation of the disassembled 6502 code. Now it is time to actually build and run the generated code! Using our a2io and decapplib libraries, it is trivial to link the generated Simple C code into an executable.  We started by decompiling and running the Apple II ROM code, containing Applesoft BASIC and Monitor, because its disassembled source is already available and heavily commented  (thanks to Andy McFadden ). We can "cheat" and look at the listing to check whether we are doing something wrong. Problem 1: Indirect Branches In our first attempt, we decompiled the ROM statically and used the Simple C backend to convert

Apple2TC: an Apple II Binary to C Decompiler - Part 3

 This is part 3 in a series of posts describing Apple2TC.  The start is in  Part 1 . In this part we get to actually generating some working C code. "Simple C" Backend As was established in the end of Part 2 , our jump tracing disassembler is successfully able to statically disassemble large portions of binaries given to it, generating a reasonable looking assembler listing. The problem is that there is no way to truly evaluate the quality of said listing - it looks reasonable, but we can't tell whether it is actually correct or complete, and most importantly whether it contains enough data to start working on decompilation to C. We could try to assemble it back into a binary and compare to the original, but that would be pointless because even a listing containing a sequence of byte values can be assembled correctly. DFB $AD, $ 34 , $ 12 ; This correctly assembles to LDA $1234, but is not really useful.   We need some way to execute  the disassembled code and ensure tha

Apple2TC: an Apple II Binary to C Decompiler - Part 2

This is part 2 of a series of posts describing Apple2TC. Check out Part 1 for an introduction. In this part we describe the different parts of Apple2TC. Apple2TC Components Apple2TC encompasses several sub-projects: id - a simple interactive assembler/disassembler/binary editor for quick exploration/patching during development. a6502 - a two pass 6502 assembler capable of assembling the Apple II+ ROM . a2emu - our custom Apple II emulator, feature complete with the exception of floppy disk support. apple2tc - our tracing disassembler/decompiler. a2emu and apple2tc are the two main components - one producing runtime data, the other consuming it. We will go into more detail about them, while just briefly describing the rest. id is a convenience tool for exploration and patching of Apple II binaries. It can load and save binary images, evaluate simple expressions, disassemble ranges of bytes or print them as words and bytes, or assemble individual instructions and patch bytes.  a6

Apple2TC: an Apple II Binary to C Decompiler - Part 1

This is a series of blog posts to serve as a log documenting my work on Apple2TC - an open source hobby project developed on GitHub:  https://github.com/tmikov/apple2tc . There are various interesting things that come up all the time and I thought it might be useful to record them for posterity. Part 2  describes the different components of the project Part 3 shows how we validate our disassembly by generating a working C representation of it. Part 4 shows the Apple II tricks we had to deal with to get to correct running code. Part 5 describes decompiling and running Robotron 2084 and Snake Byte. Part 6 introduces our SSA-based intermediate representation . Part 7 shows how to transform the IR to discover the original intent of the code.   What is Apple2TC? Apple2TC is an open source project to decompile original Apple II binaries, mostly games, into understandable and working modern C code,  completely automatically  by analyzing the runtime behavior of the code via custom softw