Emulating Linux MIPS in Perl - Part 1: ELF loader
The real reason why I wanted a blog is to be able to talk about interesting things, and sometimes to publish or write about stuff I did that might be interesting to others, but that wouldn't warrant making a formal release of some software package.
So, today I'll write about a MIPS CPU emulator that emulates just enough of a Linux kernel and enough of an ELF loader to be able to run some statically linked programs.
Uhm, why would one do that, and why in Perl?
To run dash or bash, i.e., some POSIX shell.
Uhm, and why would one want to do that?
Well, to run perl's Configure script of course!
Uhm, and why not use bash directly?
Ah, to run Configure on native windows, to see if staticperl could be made to work with it.
This is quite succinctly the trail of thought that led to this silly project, in reverse order. It was a bit more complicated in proper order:
Staticperl doesn't work natively on windows, because it builds perl using its Configure script, which isn't used on windows. I wondered if it somehow worked if you would just run it through a POSIX shell. Then I wondered how nice it would be to have a POSIX shell written in Perl (there is a longstanding tradition to try to replace unix utilities by programs written in Perl).
I studied the POSIX shell grammar, and quickly found out that the grammar is just there for fun, you can't split lexing and parsing easily, and it would be a painstaking process to parse a shell script, octet by octet, instead of being able to write a nice regex-based grammar.
So this being out, I wondered about maybe compiling dash with LLVM and emulating it's virtual machine code in Perl. Well, LLVM intermediate code is far from trivial, so I wondered, maybe some other GCC backend might result in an easier to emulate machine language. Like, some MIX backend or somesuch.
Well, after quite some research, I decided that MIPS is that backend - MIPS I has very few instructions, is very regular, and compilers should be easy to find, so I wouldn't even have to bother building my own cross compilation environment. The latter was a failure, as nobody uses MIPS I anymore and later architectures are way more complicated, but then, that's what I thought.
Guts
Ok, on to some code (part 4 has download links if you want to see the full version) - overall, I needed three distinct functional parts: the ELF loader, the CPU, and the Linux kernel API.
Let's start with the ELF loader:
sub mips_exec($$;$$) { my ($path, $argv, $envv, $auxv) = @_;
Surprise, it's a sub that takes the path, argument vector, environment vector and the ELF auxiliary vector.
mem_reset;
That calls into the CPU to reset the memory to be all zero.
Next we load the actual binary. It's maybe not obvious, but due to the power of Perl, this can be used to pass in the binary as a string reference, in which case the emulator can be used to have a binary in the DATA
section instead of in an external file.
my $file = do { open my $fh, "<", $path; local $/; <$fh> };
What follows is some pretty boring ELF header parsing, mostly checking endianness and architecture:
# 32 bit, msb, elf version "\x7fELF\x01\x02\x01" eq substr $file, 0, 7 or die "not an elf file, or wrong class, encoding or version"; my ($type, $machine, undef, $entry, $phoff, undef, undef, undef, $phentsize, $phnum) = unpack "nnNNNNNnnn", substr $file, 0x10; $type == 2 or die "file not an executable"; $machine == 8 or die "file not mips r3000 big endian";
For your reference, the ELF Header of my dash binary looks like this:
ELF Header: Magic: 7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00 Class: ELF32 Data: 2's complement, big endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: MIPS R3000 Version: 0x1 Entry point address: 0x4004e0 Start of program headers: 52 (bytes into file) Start of section headers: 213052 (bytes into file) Flags: 0x1007, noreorder, pic, cpic, o32, mips1 Size of this header: 52 (bytes) Size of program headers: 32 (bytes) Number of program headers: 4 Size of section headers: 40 (bytes) Number of section headers: 22 Section header string table index: 21
Having done that, the actual work starts - parsing the program headers (which tell the loader where to load which parts of the file to) and executing them.
ELF is complex enough to do what is needed, but it gracefully degrades into simplicity for simple cases, such as static binaries. For your reference, the program headers of my dash look like this:
Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align REGINFO 0x0000b4 0x004000b4 0x004000b4 0x00018 0x00018 R 0x4 LOAD 0x000000 0x00400000 0x00400000 0x2e76c 0x2e76c R E 0x10000 LOAD 0x02f000 0x0043f000 0x0043f000 0x0126c 0x05c90 RW 0x10000 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4
And the code simply iterates over them, by copying the file data into the emulator memory for all sections of type LOAD
, which is only slightly complicated by the way the emulator represents memory:
for my $i (0 .. $phnum - 1) { my ($type, $offset, $vaddr, $physaddr, $size, $memsz, $flags, $align) = unpack "N*", substr $file, $phoff + $i * $phentsize, 32; $type != 2 or die "dynamic loading is not supported"; next unless $type == 1; next unless $size; for my $o (0 .. $size / 4 - 1) { my $w = unpack "N", substr $file, $offset + $o * 4, 4; my $a = $vaddr + $o * 4; $mem[$a >> ADDR_SHIFT][($a >> 2) & ADDR_MASK] = $w; } }
While a simple array of 32 bit words would suit the emulator quite nicely, this would be very wasteful on RAM, as the memory map is quite sparse, with almost 4GB of unused memory between the heap and the stack.
Therefore, memory is represented by a two-level tree. The top level is the array @mem
which has an entry for every 64KiB memory page. It's either undef
, to signal the memory is not allocated, or a reference to another array with 16384 32 bit values, the actual memory.
So if you have an address in $a
, this is how to get the corresponding 32 bit word:
sub ADDR_SHIFT(){ 16 } sub ADDR_MASK (){ 0x3fff } $mem[$a >> ADDR_SHIFT][($a >> 2) & ADDR_MASK]
The >> 2
is needed because we store 32 bit words, not bytes.
This representation requires "only" 65536 array entries to represent the full 32 bit address space, which is quite manageable, at a medium speed penalty.
The memory map of our virtual machine looks like this, which is a bit arbitrary, but works:
0x00000000 - 0x0fffffff space for ELF segments (256MiB) 0x10000000 - 0xf00effff heap (malloc etc.) 0xf00f0000 - 0xf00fffff stack (64KiB) 0xf0100000 - 0xf010ffff argv, envv, auxv 0xf0110000 - 0xffffffff space for arg/env strings
Next we reset the CPU and tell it to start execution at the program entry point taken from the ELF header:
cpu_reset $entry;
And now comes the real hard part - setting up the stack, which kind of includes the arguments, environment and auxiliary vector:
sub STACK (){ 0xf00f0000 } my $str = STACK + 65536; my $ptr = STACK; my $add_int = sub { memset $ptr, pack "N", $_[0]; $ptr += 4; }; my $add_str = sub { $add_int->($str); memset $str, "$_[0]\x00"; $str += 1 + length $_[0]; }; $add_int->(scalar @$argv); $add_str->($_) for @$argv; $add_int->(0); $add_str->($_) for @$envv; $add_int->(0); # auxv $add_int->($_->[0]), $add_int->($_->[1]) for @$auxv; $add_int->(0); $add_int->(0);
As you can see, it works by having some helper subs to push ints and strings onto the "stack" area, followed by "pushing" the arguments, environment and aux vector. While the code looks simple, it's been harder to research a good layout than the program headers.
Note also that the code cheats a bit: instead of calculating exactly how much space to allocate for strings and so on, it simply puts the strings 256KiB (65536 * 4) after the stack, and the vectors pointing to them directly after the stack.
Having done all that, we are finished - running the actual CPU is left to the caller.
How to
So how is it used? The simple case would be tun run an external file, say, dash:
mips_exec "/tmp/dash", ["./run", @ARGV], [map "$_=$ENV{$_}", keys %ENV];
This loads the file /tmp/dash, pretends to call it as ./run (remember that executable name and first argument vector entry are distinct), because dash really only cares about whether the first character of its name is a -
(to decide whether it is supposed to act as a login shell).
Otherwise, the perl script arguments are passed through, as well as the environment. And I don't bother with an auxiliary vector - the code to maintain it was really just mental masturbation. Or, maybe, you never know when you might need to use it for real.
Here is another example, which runs a simple benchmark script:
mips_exec "/tmp/dash-mipsel", ["sh", "-c", "for d in 0 1; do for a in 0 1; do :;done;done"], [map "$_=$ENV{$_}", keys %ENV];
The other case is to add the actual MIPS binary in the DATA
section of the perl script, which just changes how the executable "path" is passed in (and passes $0
as first argument):
my $file; if (my $data = do { local $/; binmode DATA; <DATA> }) { $file = \$data; unshift @ARGV, $0; } else { $file = $ARGV[0]; } mips_exec $file, \@ARGV, [map "$_=$ENV{$_}", keys %ENV];
And after loading, what's left is to actually emulate the damn CPU:
cpu_run;
And that's all for today, folks! The next part will look at the Linux kernel emulator, followed by the CPU emulator, and finally the part where all is tied together and you can download it!