The Smallest Go Binary (5KB)

This whole adventure began because I wanted to write a C compiler in Go. I wanted to use Chibicc, a tiny C compiler, as a basis because each feature is a separate commit, so I could start with the first commit and add features one at a time. However, Chibicc uses GCC to assemble the final binary, and since a direct port had already been done, I wanted to do this using only the Go compiler. Sure, I could take the time to learn NASM or YASM or GAS or literally any other sane assembler. But I thought "How hard could it be to use just the Go assembler?"

The good thing about using Go's assembler is that I happen to be quite familiar with it because of the purego project. However, the assembly is quite quirky and limited, which created some challenges down the line. This isn't a post about assembly though, so read A Quick Guide to Go's Assembler if you want a better understanding of it.

Everyone loves a simple Hello World example. So let's start with one. I wrote up a basic Hello World in assembly based on HelloSilicon's version, since I'm not super familiar with the arguments to the macOS syscall interface. It was quite easy to translate. Then all I had to do was call the assembler directly, right?

Note that all assembly examples are macOS arm64. They won't work on other platforms.

// Set up the parameters to print hello world
// and then call the Kernel to do it.
TEXT _start(SB), 4, $0
    MOVD $1, R0              // 1 = StdOut
    MOVD $helloworld(SB), R1 // string to print
    MOVD $13, R2             // length of our string
    MOVD $4, R16             // Unix write system call
    SVC                      // Call kernel to output the string

    // Set up the parameters to exit the program
    // and then call the kernel to do it.
    MOVD $0, R0  // Use 0 return code
    MOVD $1, R16 // System call number 1 terminates this program
    SVC          // Call kernel to terminate the program

DATA helloworld+0(SB)/8, $"Hello Wo"
DATA helloworld+8(SB)/5, $"rld!\n"
GLOBL helloworld(SB), $13

But when I tried to assemble it I got this:

$ go tool asm asm.s
# smallestgo/asm
runtime.gcdata: missing Go type information for global symbol helloworld: size 13

What?! I thought I was writing assembly NOT Go? Surprisingly, the fix was super simple, just a single number "8".

GLOBL helloworld(SB), 8, $13
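
As an aside, the DATA lines in the listing lay the string out in chunks of at most 8 bytes, each with its own offset and width. Here is a minimal Go sketch that generates the same directives for a given string. The function name dataDirectives is made up for this illustration, and it assumes a plain ASCII string and simply copies the chunking the listing uses:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// dataDirectives is a hypothetical helper: it renders DATA and GLOBL lines
// for s, split into chunks of at most 8 bytes like the hand-written listing.
// flags is the numeric GLOBL flag (8 in the fix above). Assumes ASCII input.
func dataDirectives(name, s string, flags int) string {
	var b strings.Builder
	for off := 0; off < len(s); off += 8 {
		end := off + 8
		if end > len(s) {
			end = len(s)
		}
		chunk := s[off:end]
		// strconv.Quote escapes the newline as \n, matching the listing.
		fmt.Fprintf(&b, "DATA %s+%d(SB)/%d, $%s\n", name, off, len(chunk), strconv.Quote(chunk))
	}
	// The final GLOBL size is the total byte length of the string.
	fmt.Fprintf(&b, "GLOBL %s(SB), %d, $%d\n", name, flags, len(s))
	return b.String()
}

func main() {
	fmt.Print(dataDirectives("helloworld", "Hello World!\n", 8))
}
```

Running it with the string from the listing should reproduce the two DATA lines and the fixed GLOBL line, which is a handy check that the offsets and the total length of 13 line up.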

If you had taken the time to read the guide I mentioned above in its entirety (unlike me when I started this project) you would have noticed this:

There may be one or two arguments to the [GLOBL] directives. If there are two, the first is a bit mask of flags, which can be written as numeric expressions, added or or-ed together, or can be set symbolically for easier absorption by a human. Their values, defined in the standard #include file textflag.h
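
To make the "bit mask of flags" idea concrete, here is a small Go sketch that mirrors a handful of the values from textflag.h as constants and decodes a numeric mask back into names. The describeFlags helper is invented for this example and only knows the five flags listed; textflag.h defines more:

```go
package main

import (
	"fmt"
	"strings"
)

// A subset of the flag values defined in Go's textflag.h.
const (
	NOPROF  = 1
	DUPOK   = 2
	NOSPLIT = 4
	RODATA  = 8
	NOPTR   = 16
)

// flagNames lists the flags this sketch knows about, lowest bit first.
var flagNames = []struct {
	bit  int
	name string
}{
	{NOPROF, "NOPROF"},
	{DUPOK, "DUPOK"},
	{NOSPLIT, "NOSPLIT"},
	{RODATA, "RODATA"},
	{NOPTR, "NOPTR"},
}

// describeFlags is a hypothetical helper that turns a numeric mask into
// the symbolic names or-ed together.
func describeFlags(mask int) string {
	var names []string
	for _, f := range flagNames {
		if mask&f.bit != 0 {
			names = append(names, f.name)
		}
	}
	if len(names) == 0 {
		return "0"
	}
	return strings.Join(names, "|")
}

func main() {
	fmt.Println(describeFlags(4))              // the flag on the TEXT line
	fmt.Println(describeFlags(8))              // the flag added to GLOBL
	fmt.Println(describeFlags(RODATA | NOPTR)) // flags can be combined
}
```

In a real assembly file you would `#include "textflag.h"` and write the symbolic names directly instead of the raw numbers.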

What is 8, you ask? Well, it's RODATA or in English: read-only data. It's the portion of the binary that the process can only read; any attempt to write to it will segfault. This string is only ever read from and never written to, so it's best to place it there in the binary. I'm going to ignore the fact that it mentions Go type information in the assembler because that can't possibly come back to haunt me (hint hint). Therefore, the next thing to do is to...

From the last step, we've produced an asm.o file. This is an object file and not a runnable executable. I'm no expert on compilers, assemblers and such but I do know that the linker takes the object files and produces the final executable.

$ go tool link -E _start -o exec asm.o
link: unlinkable object (from package main) - compiler requires -p flag

What is the E flag? It marks the entry function of the final binary. For a normal arm64 macOS Go binary, that is _rt0_arm64_darwin.

Well, that didn't work. But it gave a helpful error message. So let's make that change.

$ go tool asm -p main asm.s
$ go tool link -E _start -o exec asm.o
.../pkg/tool/darwin_arm64/link: asm.o: not package main

What?! The package is set to main... Ugh. I didn't expect this to be so hard. The offending function is loadobjfile in cmd/link/internal/ld/lib.go with this single defer statement:

defer func() {
    if pkg == "main" && !lib.Main {
        Exitf("%s: not package main", lib.File)
    }
}()

It appears that although pkg in this context is set to "main", lib.Main is not true. Why exactly? I didn't feel like reading the entirety of the linker to find out. So I'm ignorantly going to claim that it is impossible to link an assembled file by itself. I'd love to be proven wrong though!

Then what solution is there? Well, there is a reason this is about the smallest Go binary and not about assembly. I ended up compiling a normal main package with my assembly function and overriding the entry of the program to be my assembly _start.

package main

func main() { /* doesn't ever get called */ }

$ go build -ldflags="-E _start"
Hello World!

Doing it this way works! However, it also means the entirety of the Go runtime is included in the binary. This means it is much bigger than the few instructions I wanted to run.

Since there is no way to link an assembly file by itself (probably) and I must include the runtime, how small can I get that runtime?

Build Flags

Go binaries are known for being noticeably larger than the average Hello World C program (see issue #6853). This is due to the extensive runtime that provides garbage collection, reflection, goroutines and much more. The most common method to shrink a binary is to use the linker flags to remove the symbol information.

$ go build -ldflags="-E _start -s -w"

These two flags are documented in cmd/link:

-s - Omit the symbol table and debug information.

-w - Omit the DWARF symbol table.

However, building an empty main function with these flags only gets us down to 863KB. This isn't bad and is where most people would likely stop. There is also the -trimpath build flag, which according to its documentation will

remove[s] all file system paths from the resulting executable. Instead of absolute file system paths, the recorded file names will begin either a module path@version (when using modules), or a plain import path (when using the standard library, or GOPATH).

This only affects the panic messages so it doesn't matter at all when all I want is the assembly instructions I wrote to run.

The -buildvcs=false flag is needed for reproducible builds. It leaves out the version control information, which can otherwise change from build to build. Although it doesn't make much of a difference at this stage, I will end up using it later.

There are some other third-party ways to compress the binary (see this blog post). However, since the goal is to only use assembly, there is no need for any of the Go code to be linked in. Now if only there was a way to edit the runtime...

-overlay

Oh boy, there is! You probably haven't heard of this flag before, but it's a useful one for testing changes to the runtime without changing the source directly. I first discovered this flag from the hajimehoshi/hitsumabushi project, which

aims to make Go programs work almost everywhere by overwriting system calls with C function calls.

I have no intention of replacing system calls with C functions, but this flag also allows rewriting the runtime with empty functions. Its original purpose was for the Go language server (gopls), but it can also be used for my hackery. Issue #39958 talks more about its design. All that's needed to understand it is that an overlay is a JSON file with the following structure:

type Overlay struct {
    Replace map[string]string
}

The Replace field maps from the path of the standard source code to the path of the replaced version. That way I can just pass the JSON file as an argument to the build command and enjoy the modified changes.

$ go build -overlay=overlay.json

So my plan was simple. I would take all of the Go files in go/go1.20.1/src/runtime and replace them with empty.go. Sadly, the hours of excruciatingly tedious work I actually ended up doing crushed that dream into a million pieces. The issue boils down to one teeny tiny totally minuscule problem called

moduledata.

Yeah, didn't expect that, did you? Neither did I. Cuz I didn't know it was a thing. But how did I come to this conclusion? Well, if you try to remove everything all in one sweep, you'll get an ambiguous error message:

$ go build -overlay overlay.json
# command-line-arguments
go:info.uintptr: unreachable sym in relocation: type:uintptr

What? How is the uintptr type not reachable? Why would the linker even care, since there is zero Go code except for my empty main function?

Well, the fun answer is that the Go compiler and linker share quite a lot of information. The talk The Design of the Go Assembler gives a general overview of how integrated the whole Go ecosystem is. The picture below is from that presentation.

This level of integration improves linking speed. However, it also made pulling out my hair a lot more common. The simple solution to this problem is to just include the linker flags from above.

$ go build -overlay ../overlay.json -ldflags="-s -w" main.go

However, I wanted to know which symbols were still in the binary, so I chose to do things the hard way. Normally, one would just use go tool objdump to see what symbols are in the binary, but I ended up using lensm, a graphical interface around it that also shows how the Go code matches the generated instructions. Doing it this way entailed going through each file in the runtime, of which there are 701 Go files alone (I know not all are for macOS), stubbing out each function with nothing or just a zero-value return, and making sure it still compiled after every file. Now, if you're curious about what was causing the linker error, it was these three lines in symtab.go:

modules := new([]*moduledata)
for md := &firstmoduledata; md != nil; md = md.next {
    /* .. */
    *modules = append(*modules, md)
}

If any one of these lines was deleted or modified in any way, I got the message above. The comment above the moduledata struct states:

moduledata records information about the layout of the executable image. It is written by the linker. Any changes here must be matched changes to the code in cmd/link/internal/ld/symtab.go:symtab. module data is stored in statically allocated non-pointer memory; none of the pointers here are visible to the garbage collector.

I guess that explains why removing the symbol information fixed the error: none of that data would be stored by the linker. As for why touching any of the for loop lines borks it, that is anyone's guess. I'm assuming that dead code elimination removes it otherwise.

There were also some other fun errors like dwarf: missing type: type:runtime.imethod, which were solved by removing the DWARF symbol information.

World's Smallest Go Binary

So how small can a Go binary be? At the end of the day, I still needed a small Go runtime. I'm a little sad about this because I truly wanted to have zero Go code in the final executable. Alas, the smallest I could get was this:

package runtime

type moduledata struct {
    _ [505]byte
}

The linker cares about the size of the moduledata type but only if it's smaller than 505 bytes. Is this exploitable? Probably not and even if it was I sure would have no clue how. And don't think about removing the type unless you want to see the linker SIGSEGV.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1005c12a0]

goroutine 1 [running]:
cmd/link/internal/loader.(*Loader).Data(0x100754440?, 0x14000076960?)
        /usr/local/go/src/cmd/link/internal/loader/loader.go:1228 +0xa0
cmd/link/internal/ld.(*Link).symtab(0x14000148000, 0x1400006daa0)
        /usr/local/go/src/cmd/link/internal/ld/symtab.go:819 +0x29f4
cmd/link/internal/ld.Main(_, {0x10, 0x20, 0x1, 0x1f, 0x1e, 0x7c00000, {0x1006cc103, 0x14}, {0x1006cfdf1, ...}, ...})
        /usr/local/go/src/cmd/link/internal/ld/main.go:336 +0x12a8
main.main()
        /usr/local/go/src/cmd/link/main.go:72 +0xc58

I think I can claim that this is the WORLD'S SMALLEST GO BINARY since I am using the go build command. Here is the command in its entirety:

$ go build -trimpath -overlay overlay.json -buildvcs=false -ldflags="-E _start -s -w"

Conclusion

And here are the different sizes depending on platform (all are arm64):

  • Windows - 5,120 bytes

  • macOS - 51,186 bytes

  • Linux - 196,608 bytes

  • FreeBSD - 196,608 bytes

Surprisingly, Windows is the smallest at only 5KB. I expected the PE (Portable Executable) format to have strict padding rules, but it doesn't.

Linux and FreeBSD are so large because they are padded with zeros. I believe this is to ensure that the text segments start at 0x10_000 with an alignment of 0x10_000. These are the default values the linker chooses. However, this could probably be fixed using the -R and -T linker flags to set them to a smaller number.

I didn't end up using them in the final command because macOS wouldn't produce a runnable executable if I had. It seems they don't want to play the smallest possible binary game. 🤷‍♂️
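
The padding theory is easy to sanity-check with a bit of arithmetic. Here is a tiny Go sketch, assuming the 0x10000 alignment mentioned above and using only the sizes from the list. The roundUp helper is just for illustration and does not model how the linker actually lays out segments:

```go
package main

import "fmt"

// roundUp rounds n up to the next multiple of quantum.
func roundUp(n, quantum int) int {
	return (n + quantum - 1) / quantum * quantum
}

func main() {
	const quantum = 0x10000  // the alignment discussed above (65,536 bytes)
	const linuxSize = 196608 // reported Linux and FreeBSD size in bytes

	// Is the reported size an exact multiple of the alignment?
	fmt.Println(linuxSize%quantum == 0) // true
	// How many aligned blocks is that?
	fmt.Println(linuxSize / quantum) // 3
	// Even a few KB of content occupies a full block once rounded up.
	fmt.Println(roundUp(5120, quantum)) // 65536
}
```

The Linux and FreeBSD binaries come out at exactly three 0x10000-sized blocks, which is consistent with most of the file being alignment padding.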

At the end of the day, did I succeed at writing a C compiler using only Go? No, because after finally solving the issue of assembling, I wanted to work on a different project. But I did learn a ton about the internals of the Go runtime.

I hope no one would consider using this for anything important. If you do want to experiment though, I wrote a program to generate the overlay.json for a given Go install. It will also create the empty.go, notempty.go, and empty.s files. These files are needed to replace the runtime with nothing but the bare minimum. Just cd into the directory where you want to generate the files and run this command:

$ go run github.com/totallygamerjet/smallestgo

If you're curious about the other smallest binary challenges here are ELF and PE blogs that I enjoyed reading. Go has a long way to go before it can reach those levels of tiny.
