
Is it possible to break out of 8086 tiny from within?

We don't normally worry about old-school viruses breaking out of emulators, but sometimes we do worry about targeted exploit code breaking out of emulators.

8086tiny is an 8086/80186 CPU emulator. The active development repository seems to be here: https://github.com/ecm-pushbx/8086tiny

Some analysis:

  1. We can corrupt the internal CPU registers ZS and XF, causing incorrect instruction execution.
  2. We cannot corrupt the decoder tables directly because they're copied out of the BIOS image before the first CPU instruction executes and are never recopied.
  3. Trashing the BIOS image in RAM does not lead to an exploit.
  4. We can read or write 1 byte out of bounds with a MOVSW instruction; however, there's a 16-byte buffer to prevent exactly that problem. If the compiler miscompiled the MOVSW code path (ignoring unsigned short wraparound) there would be an exploit here, but that's too much to hope for.
  5. We can mess with the emulated XMS memory and allocate and fill many megabytes of RAM. It appears the stated 16MB maximum is incorrect and we actually have 512MB of RAM to play with. Double free is not possible.
  6. However, we can write to the BIOS image on disk via db 0x0F 0x03 with DL=2. This is certainly not intended, and it lets us corrupt the decoder tables for the next boot. Since we can also write to the BIOS startup routine, we can take advantage of our own trashed decoder tables.

This line of code looks juicy:

    SEGREG(seg_override_en ? seg_override : bios_table_lookup[scratch2_uint + 3][i_rm], bios_table_lookup[scratch2_uint][i_rm], regs16[bios_table_lookup[scratch2_uint + 1][i_rm]] + bios_table_lookup[scratch2_uint + 2][i_rm] * i_data1+)

However, it's not quite so cut and dried as that. There's a cast to unsigned short inside SEGREG that prevents breaking out via the offset. Other bios_table_lookup reads don't turn into pointers. We can add arbitrary values to reg_ip, but that doesn't help because it's an unsigned short. We also have complete control of seg_override at this point, but it's an unsigned char.
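To make the bound concrete, here is a minimal sketch of the address arithmetic. It is not the real macro: `linear_address` and `bytes_out_of_bounds` are hypothetical helpers, and the `RAM_SIZE` value of 0x10FFF0 is an assumption based on my reading of the source, so check it against the repository.

```c
#include <stdint.h>

/* Assumption: size of the guest memory array, per my reading of the source. */
#define RAM_SIZE 0x10FFF0u

/* Simplified stand-in for SEGREG (hypothetical helper, not the real macro):
   16 * segment plus an offset truncated to 16 bits. */
static uint32_t linear_address(uint16_t segment, uint32_t raw_offset)
{
    return 16u * (uint32_t)segment + (uint16_t)raw_offset;
}

/* Number of bytes of a `width`-byte access at `addr` that land past RAM_SIZE. */
static uint32_t bytes_out_of_bounds(uint32_t addr, uint32_t width)
{
    uint32_t end = addr + width;   /* one past the last byte touched */
    return end > RAM_SIZE ? end - RAM_SIZE : 0;
}
```

Under these assumptions, segment 0xFFFF with any raw offset tops out at 0x10FFEF. A byte access there stays in bounds and a word access overhangs by exactly one byte, which is consistent with item 4 above and nowhere near enough to reach anything interesting.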

  7. We can read and write to arbitrary handles using db 0x0F 0x02 and db 0x0F 0x03 with DL=3. This relies on an index overrun on disk, which ends up using the value of scratch_int (which we control) as a handle. This provides access to standard input (which we already have), standard output (which we already have), standard error, the disk handles (which we already have), and the X11 handle(s) (which we were not expecting). However, an lseek operation is always executed first, and neither read nor write is attempted if it fails, so no, we can't access the X11 handle(s). If somebody was silly enough to give this thing another file handle it's ours to play with, but who would do such a thing?

  8. There are no Spectre or Meltdown attack vectors present, as memory bounds limiting is done with casts to unsigned short rather than comparison operators. There might be a Rowhammer pathway, but this doesn't look like the easiest Rowhammer to exploit.
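The lseek gate in the arbitrary-handle item above can be sketched as follows. This assumes the X11 connection is a socket; on POSIX systems sockets, pipes, and FIFOs are not seekable, so lseek fails with ESPIPE. The helper name `seek_then_write` is hypothetical and only models the seek-before-I/O ordering, not the emulator's actual code.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper modelling the pattern described above: seek first,
   and only attempt the write if the seek succeeded. */
static ssize_t seek_then_write(int fd, off_t pos, const void *buf, size_t len)
{
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;              /* non-seekable descriptors stop here */
    return write(fd, buf, len);
}
```

Calling this on the write end of a pipe returns -1 without writing anything, while a regular file descriptor passes the seek and gets written. That is why only seekable handles are reachable this way.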

This is as far as I'm going to be able to get.

I don't know if it's actually possible to break out or not; it looks like this code was not hardened at all (witness some unexpected things found on initial analysis) but ...

Am I missing something vital or is this one a stupidly tough nut to crack?