weird machines, map/territory

(this is the essay version of a talk i am giving/gave at babycastles' wordhack
event. the event focuses on the intersection of language and computing, and this
talk/essay tries to combine the ways i think philosophically with the ways i
think about computation, with an emphasis on language. this is a first draft for
right now, i need to go back and, like, make some of the language less dramatic)

here, i'm going to talk about a concept in computer security, and then through a
series of a few bugs, hopefully broaden the concept until it reaches the
physical nature of computing itself. this is a kind of technical topic, but this
is wordhack, so i am going to try to emphasize the ideas of language within it,
and some of the philosophical properties that this idea reveals. this is based
on the idea of exploitation - intuitively, making computers and software do the
"wrong" thing - and through these explorations, i'll be asking what the "wrong"
thing even is.

this concept might seem somewhat superfluous, but it's hard to overstate its
importance. weaponizing the concept of "weird machines" has a lot of political
consequences: in the operation dubbed 'stuxnet', widely attributed to the united
states and israel, they were used to spin iranian gas centrifuges, used in the
production of nuclear material, so rapidly that they tore themselves apart.

but in short, i hope this will help you think about computers in new and
hopefully fruitful ways.

i. pokemon yellow arbitrary code execution
let's start by talking about a tool-assisted speedrun (TAS) of pokémon
yellow¹. in this specific run, you see the trainer start
playing, take the default name, name their rival something strange, save and
reset, start playing again, then start doing strange things with their apparently
empty pokemon list and garbled item list. afterwards, they walk up to their
television and the game suddenly fades out. they are informed that they've won
the game, but their name is mario, and then the game is suddenly tetris.
especially due to the familiarity of the original game (probably), this feels
like magic: a familiar thing, one we are used to seeing obey specific rules, has
suddenly been completely subverted, now capable of so much more than we
previously imagined. if you don't know the arcane secrets involved, magic is
clearly the only explanation.

when reading through writeups of the bugs used, i found this evocative quote
that captures what we're talking about here:

- "The aim of this run is to show more aspects of what you can do with ACE
[arbitrary code execution]. I feel it is often misunderstood as "you can skip to
the end", or "you can cause crazy effects in the game", the concept of being
able to do literally anything is hard to grasp, perpetuated by the fact that
in most applications what you want to do is fairly limited by the goal you
have set within the game."¹

i'm going to talk about this bug in the most detail, to set up the groundwork
for the further examples that are more relevant.

here's an overview of the hardware of the game boy, for those who might not be
familiar.

inside the game boy is the processor, the actual 'brain' of the game boy. in
this case, this is a z80 processor², which indicates the physical structure
of the chip itself, and also the language that the chip speaks. these chip
languages in general are called ISAs, instruction set architectures. so if this
chip is the brain, then the procession of the mind, the stream of consciousness,
for the chip is the z80 ISA.

when turning on the game boy, this mind wakes up, then starts pulling in
information from the cartridge. the cartridge is what's known as ROM, or
read-only memory, like a book - it is reading from there, following its orders.
much of this involves loading information into the RAM, or random-access memory,
which is more like a whiteboard. unlike the ROM in the cartridge, the RAM can be
erased and re-written arbitrarily. so in it we load the graphics, sound, text,
etc. when the game boy turns on, the RAM is as pristine as can be, but the ROM
never changes.

in the case of pokemon yellow, there are also some read-write areas in the
cartridge. unlike the RAM, this doesn't go away when the game boy turns off.
this is used for saving your progress.

in the worldview of the processor - the phenomenology of the processor, maybe -
the RAM and ROM are not distinct. to it, memory is just a large, continuous
plane. this plane has a few other inhabitants: the buttons (whether they are
pressed or not) live here, as well as the video memory, building what is shown
on the screen. this is important because the processor's IP, instruction
pointer, acting like a finger pointing to a word in a book or on the whiteboard,
can be pointing anywhere in here. the IP tells the brain what its next action
should be - and these instructions can be from the original book, or it can be
something that we have constructed fresh since we've turned the game on.
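
to make the "finger in the book or on the whiteboard" picture concrete, here is a
minimal python sketch of a toy machine with one flat address space. everything
in it is an assumption made for illustration: the three opcodes (`HALT`, `INC`,
`JUMP`), the 16-byte memory, and the `ROM_END` boundary are invented, and none
of it is the real z80 ISA or the real game boy memory map. it also does not
model write protection on the rom half.

```python
# toy machine: one flat address space, invented opcodes (not the real z80 isa).
ROM_END = 8                        # addresses 0..7: the "book"; 8..15: the "whiteboard"
HALT, INC, JUMP = 0x00, 0x01, 0x02

def run(memory, ip=0, max_steps=100):
    """fetch and execute from wherever ip points until HALT; return the accumulator."""
    acc = 0
    for _ in range(max_steps):
        op = memory[ip]
        if op == HALT:
            return acc
        if op == INC:
            acc += 1
            ip += 1
        elif op == JUMP:
            ip = memory[ip + 1]    # the target can be anywhere: rom or ram
        else:
            raise ValueError(f"unknown opcode {op:#x} at address {ip}")
    raise RuntimeError("did not halt")

memory = bytearray(16)
memory[0:3] = bytes([INC, JUMP, ROM_END])               # "rom": count once, jump into ram
memory[ROM_END:ROM_END + 3] = bytes([INC, INC, HALT])   # "ram": bytes written at runtime

print(run(memory))   # 3: one step from the book, two from the whiteboard
```

the loop never asks where an instruction came from. bytes in the "ram" half are
executed exactly like bytes in the "rom" half, which is the property the rest
of this section leans on.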

in this run, the first thing the player does once the game proper starts is
commit the cardinal sin of turning the game boy off while saving. as it turns
out, doing this at precisely the right time mangles the save in such a way that
the game will now believe we have 255 pokémon. the game only expects us to ever
have 6, and we had 0, so what are the consequences of this?

the actual mechanics of the list of pokémon are laid bare. the pokémon list was
never literally shuffling around pokémon, it was shuffling around bytes of RAM.
that this was movement of pokémon is just semiotics laid on top of this raw
mechanism, first by the world of the cartridge, and then by us. the player is
now able to shuffle around parts of the game boy's mind, their hand reaching
out further into the memory plane than it ever could before.

the player swaps a blank pokemon with a hypothetical 10th pokemon: this modifies
the memory so the game now believes we have 255 items. our inventory has
undergone a change much like the pokémon list did - it too is now able to reach
further out into memory, and now, by tossing items, we can start to write
individual bytes rather than merely shuffling around what already exists.
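
a minimal python sketch of that step, under loud assumptions: the layout, the
offsets, the `BLANK` marker, and the names (`PARTY_COUNT`, `ITEM_COUNT`,
`swap_slots`) are all invented for illustration and are not the real pokémon
yellow memory map. it only shows how a corrupted count byte turns a bounded
list into a tool for rearranging whatever memory sits next to it.

```python
# toy model: a flat memory holding a 6-slot "party", then unrelated variables.
# every offset and name here is hypothetical, not the real game's layout.
PARTY_COUNT = 0                 # one byte: how many slots the game believes are used
PARTY_START = 1                 # the slots begin right after the count byte
PARTY_CAPACITY = 6              # the designers only ever planned for six
ITEM_COUNT = PARTY_START + 9    # another variable that happens to sit where a
                                # "10th slot" would be
BLANK = 0xFF                    # assumed marker for an empty slot

def new_memory():
    mem = bytearray(16)
    for slot in range(PARTY_CAPACITY):
        mem[PARTY_START + slot] = BLANK
    mem[ITEM_COUNT] = 3         # the player honestly holds three items
    return mem

def swap_slots(mem, a, b):
    """swap two party slots; the count byte is the only bounds check."""
    count = mem[PARTY_COUNT]
    if a >= count or b >= count:
        raise IndexError("slot is outside the party")
    i, j = PARTY_START + a, PARTY_START + b
    mem[i], mem[j] = mem[j], mem[i]

mem = new_memory()
mem[PARTY_COUNT] = 255          # the mangled save: 255 slots, says the count byte
swap_slots(mem, 0, 9)           # swap a blank slot with the "10th pokémon"
print(mem[ITEM_COUNT])          # 255: the game now believes we hold 255 items
```

with an honest count byte the same call raises an error. nothing about
`swap_slots` changed; only the one byte it trusts did.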

with this, the player begins to write a program into the memory, and this is the
start of the first payload - the small program representing the first rung of
the ladder used to take control of this world. they need to use their two crude
tools in harmony to be able to properly construct in this limited environment,
so they go back to the pokemon list to shuffle chunks of the rival's strange
name into the ad-hoc program. eventually, it is complete, and the game starts to
take on a different function: it is reading the keypresses and writing them
directly to memory, and through this they have yet another crude tool, one more
flexible in some ways, but less in others. now there is far more space to work,
but the tool is finicky: the bytes entered need to be representable by the keys
of the game boy. the keys which once represented walking, moving through menus,
etc are now pencils directly writing the new program. as the brain remains
intensely focused on the program being written, eventually something is written
that, rather than continuing to sustain the process of writing, jumps forward
into what was being built. at this point, the player is in the workshop that
they have set up, no longer limited by the crude tools lying dormant within the
game itself, but now utilizing specialized tools that the player has provided
themselves, and anything is possible.

so in this process, the 'exploit' intuitively is the ability that we have to
make useful modifications through corrupting a save file, and the process that
follows in turning this small weirdness into something we can utilize towards
our own ends. but a formal definition of an exploit can be elusive -
how do we talk about a 'mistake', and how we leverage it, in a strict sense?

the concept of the weird machine is a metaphor, where we look at the intended
usage of the z80 chip itself. this is ostensibly a normal machine - we give it
instructions, and it works through them in an expected way. these instructions
are the shared language between us and the chip, allowing us to communicate
deliberately and effectively. similarly, we can look at our interactions with
the pokémon-game-boy as another normal machine, where we know to push buttons
and accomplish expected tasks.

where the pokémon-game-boy suddenly becomes a weird machine is when we realize
that the language we are using is different, broader than anyone involved
expected - these buttons do not just allow us to walk around and use items, but
to subvert the entire machine that we are in a dialogue with. the idea of weird
machines aims to establish the difference - and possibly, the inherent
irreconcilability - between an idealized program that is always operating in a
space intended by the authors, and the actual, material, created machine as it
exists. the weird machine is exploring the ways that the map is not the
territory. an exploit, then, is anything that knowingly takes advantage of this
difference between map and territory.

in pokémon, the 'language' being used, in the user's perspective, goes from a
transcendental ruleset where buttons contextually indicate moving through
menus, walking, catching pokemon, etc to a full-system, totally immanent
language involving turning the game on and off, writing programs on the fly
with the buttons, interacting directly with the underlying hardware, the
physical system: this was always the true language of pokemon, but we never
knew that we could take advantage of it.

well, we can tame this weird machine. even if we can't stop the user from
turning off the game boy, we can add consistency checks on the saves, or
structure the game boy cartridges to have enough power to always complete
their write operations, maybe. the computer is, for now, just doing what
we ask of it, and we are just not communicating clearly. but systems have
grown far more complicated than the game boy and its z80.
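
as a rough idea of what "consistency checks on the saves" could look like, here
is a small python sketch. it is an assumption-heavy illustration and not how the
actual cartridge stores anything: the save format (one count byte, a payload,
then a crc32), the `MAX_PARTY` bound, and the function names are all made up
for this example.

```python
import zlib

MAX_PARTY = 6   # the bound the designers intended

def pack_save(party_count, payload):
    """serialize a save: count byte + payload, followed by a crc32 of both."""
    body = bytes([party_count]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")

def load_save(blob):
    """return (party_count, payload), or None if the save looks corrupted."""
    if len(blob) < 5:
        return None                      # too short to hold a count and a checksum
    body, stored = blob[:-4], int.from_bytes(blob[-4:], "big")
    if zlib.crc32(body) != stored:
        return None                      # interrupted or mangled write
    if body[0] > MAX_PARTY:
        return None                      # a count the game was never designed for
    return body[0], body[1:]

good = pack_save(2, b"ab")
mangled = bytes([255]) + good[1:]        # count byte corrupted, checksum left stale
print(load_save(good))                   # (2, b'ab')
print(load_save(mangled))                # None
```

the second check matters as much as the first: a checksum only says the bytes
arrived as written, while the bounds check says the bytes describe a state the
rest of the program is prepared for.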

ii. spectre/meltdown
- "A warning regarding explanations about processor internals in this blogpost:
This blogpost contains a lot of speculation about hardware internals based on
observed behavior, which might not necessarily correspond to what processors
are actually doing."³

as processor performance gains from simply packing more transistors into a chip
started to diminish, and for other design reasons, the actual mechanisms that
processors implemented grew in complexity. any part of the processor that was
once lying around idle was put to work: if there was no work to be done, it
was to take its best guess and start doing that. if that guess was right,
there would be a head start on whatever the user might've wanted to do, and if
it was wrong, well, it'd be no worse than doing nothing.

the saying goes that computers are neither smart nor stupid, that they just do
what you tell them, to a fault. in 2018, google project zero disclosed³ two
attacks on modern processors that showed that even a hypothetically
perfectly-worded series of instructions could be subverted. the chip's guessing
behavior - also known as speculative execution - had subtle side effects that
other programs could detect the presence of. possible pathways through any
given program, even invalid ones, collapsed into the world of the actual.
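
a cartoon of that side effect, in python. this is a toy model and nothing more:
there is no real timing and no real cpu here, the "cache" is a python set, and
the names (`victim`, `attacker`, `MEMORY`) are invented. real processors and
the real attacks are far more involved. the only thing the sketch tries to show
is the shape of the leak: the guess is thrown away, but its footprint is not.

```python
# toy model only: no real timing, no real cpu. the "cache" is a python set.
SECRET = b"k"                   # data the victim never intends to reveal
PUBLIC = b"hello"
MEMORY = PUBLIC + SECRET        # the secret happens to sit just past the array

cache = set()                   # which probe lines are currently "warm"

def victim(index):
    """architecturally, an out-of-bounds read returns None. but the simulated
    speculation runs the body before the bounds check has resolved."""
    value = MEMORY[index]       # speculative read: happens either way
    cache.add(value)            # side effect: warms a line chosen by the value
    if index < len(PUBLIC):     # the check finally resolves
        return value
    return None                 # the result is rolled back; the cache is not

def attacker():
    cache.clear()                                # "flush" every probe line
    result = victim(len(PUBLIC))                 # ask for one byte past the end
    assert result is None                        # officially, nothing was read
    warm = [line for line in range(256) if line in cache]  # stand-in for timing
    return bytes(warm)

print(attacker())   # b'k'
```

the victim's code is "perfectly worded" in the sense that it never returns
out-of-bounds data. the leak lives entirely in state that the program's own
language has no words for.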

for our purposes, the point is that we had confirmation that the "normal
machines" we thought we were working with had not existed for the past 20
years, and the immanent language of the weird machines we were using had not
yet been fully understood.

the process of understanding the language of the processors we use has gone
from reading a manual and developing a true understanding, to one more similar
to the natural sciences: theorizing about observed phenomena and their
implications. projects such as sandsifter⁴, which programmatically explore the
potential language of the weird machine, often have startling results:

- "Typically, several million undocumented instructions on your processor will
be found, but these generally fall into a small number of different groups.
After binning the anomalies, the summarize tool attempts to assign each
instruction to an issue category:
• Software bug (for example, a bug in your hypervisor or disassembler),
• Hardware bug (a bug in your CPU), or
• Undocumented instruction (an instruction that exists in the processor,
but is not acknowledged by the manufacturer)"⁴

the immanent language of these machines, it turns out, is intractably large. and
the uncodified language that exists at these boundaries is shifting and
unpredictable.

even the shifts in territory that directly invalidate our map are closely
guarded secrets. intel is, naturally, a monopolising and largely malignant
force: beyond not revealing the actual nature of these machines (to whatever
extent they themselves know it), these aberrations in the territory are erased
from history:

- "Once they stop manufacturing a [processor], they reserve the right to remove
the errata and you won't be able to find out what errata your older stepping
has unless you're important enough to Intel."⁴

of course, processors are physical objects and reversing them is possible. the
noble 6502 processor has been opened, x-rayed, analyzed, and eventually a
perfect transistor-level emulation was produced, bug for bug. this is a process
involving an immense amount of time and resources, though: the thought of
adapting this process from the simple 6502 to something as complicated and fast-
changing as a modern intel processor seems essentially infeasible.

iii. rowhammer(.js)
there is a practice called typosquatting, where someone buys a domain like
dacebook.com to try to trick people who mis-type URLs. an equivalent trick,
called bitsquatting⁵, is to buy something like ggogle.com, which is a single bit
off of google.com. this happens to work because every so often, bits in our
computer's memory will randomly corrupt. in normal conditions, it happens
very infrequently, rare enough that we can count on it not happening. but
due to the sheer number of computers that are online, and trying to visit
google, it happens frequently enough that this tactic will catch you a lot
of victims.
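
to see which names sit one flipped bit away from a given one, here is a short
python sketch. the function names (`bit_flips`, `bit_distance`) are my own, and
the set of allowed characters is a simplifying assumption (lowercase letters,
digits, and hyphen), not a full treatment of what dns permits.

```python
import string

# simplifying assumption: only these characters count as a usable hostname label
ALLOWED = set(string.ascii_lowercase + string.digits + "-")

def bit_flips(label):
    """all labels that differ from `label` by exactly one flipped bit and
    still consist only of allowed characters."""
    results = set()
    for pos, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in ALLOWED:
                results.add(label[:pos] + flipped + label[pos + 1:])
    return sorted(results)

def bit_distance(a, b):
    """number of differing bits between two equal-length ascii strings."""
    return sum(bin(ord(x) ^ ord(y)).count("1") for x, y in zip(a, b))

# 'o' is 0x6f and 'g' is 0x67: they differ only in the bit worth 8
print(bit_distance("o", "g"))             # 1
print("goggle" in bit_flips("google"))    # True
```

each character contributes at most eight candidates, and most of those fall
outside the allowed set, so the list of squattable neighbors for a short name
is small enough to register in full.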

like the game boy, the personal computer's ram clears when it is powered off.
in addition, the nature of the ram our modern computers use - DRAM, for
dynamic random access memory - exists in a state of constant precarity. the
memory cells of the DRAM represent individual bits with simple capacitor-
transistor pairs, needing to be constantly, rapidly refreshed (with no more
than 64ms between refreshes, and often much less) in order to be able to retain
information.

our next exploit, called rowhammer, exploits this precarity. a variant of it,
called rowhammer.js⁶, does this in a browser running
javascript; nothing special, just presenting a regular website. ordinary, albeit
mysterious, javascript is used to map the physical layout of the memory, gain an
understanding of how other programs are behaving (the areas they are
trafficking, how they are using memory), potentially occupy large regions of that
memory to force these other programs to contort themselves in vulnerable ways,
and then stress the memory to induce bit flips that could potentially cause
misbehavior in them.

unlike some of the darker corners of the immanent processor language that tools
like sandsifter seek to locate, here, normal-enough javascript is being used.
the devil lies in the subtle, ironic implications between the lines of the code.
familiar constructs gain sinister meaning in the larger context of their actual,
material execution, the machine falling apart as it physically tries to meet
these strange demands.

an entertaining aside: to try and limit javascript's ability to detect the
material conditions of its own operation - a necessary part of javascript
attacks utilizing rowhammer or spectre - browsers limited javascript's sense of
time, providing it access to a clock no more precise than 100ms in some
instances. what those who are familiar with weird machines discovered, however,
was that they could make a more precise clock out of several worse ones.⁷ these
resulting constructed clocks end up being more accurate than the original ones
were in the first place. these attacks still work, and typical web pages can
still reach through your browser, and even through a virtual machine housing
that browser - attempting to hide the physical behind a curtain can only ever
work so well.
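
one way to picture building a fine clock out of a coarse one is to line up with
the moment the coarse clock changes, then count how many loop iterations fit
before it changes again. the python sketch below simulates this with a fake
clock so that it is deterministic; `CoarseClock`, `wait_for_edge`, `calibrate`,
and `measure` are names invented here, and the whole thing is an illustration
of the general idea under those assumptions rather than a reproduction of any
particular published technique.

```python
class CoarseClock:
    """a simulated clock that only reports time in steps of `resolution`.
    each call to tick() stands in for one iteration of a busy loop."""
    def __init__(self, resolution, step):
        self.resolution = resolution    # e.g. 100 (ms)
        self.step = step                # true time that passes per loop iteration
        self.true_time = 0

    def tick(self):
        self.true_time += self.step

    def now(self):
        return (self.true_time // self.resolution) * self.resolution

def wait_for_edge(clock):
    """spin until the coarse clock changes; return how many iterations it took."""
    start, spins = clock.now(), 0
    while clock.now() == start:
        clock.tick()
        spins += 1
    return spins

def calibrate(clock):
    """count loop iterations in one full coarse tick, edge to edge."""
    wait_for_edge(clock)                # align to an edge first
    return wait_for_edge(clock)

def measure(clock, event, per_tick):
    """time `event` more finely than the clock's own resolution."""
    wait_for_edge(clock)                # start exactly on an edge
    before = clock.now()
    event()
    whole = clock.now() - before        # the part the coarse clock can see
    remaining = wait_for_edge(clock)    # iterations left until the next edge
    return whole + (per_tick - remaining) * clock.resolution / per_tick

clock = CoarseClock(resolution=100, step=1)
per_tick = calibrate(clock)

def event():
    for _ in range(37):                 # an event lasting 37 units of true time
        clock.tick()

print(measure(clock, event, per_tick))  # 37.0, where the coarse clock alone says 0
```

the coarse clock is never asked for more than it offers. the extra precision
comes from the attacker's own loop, which is much harder for a browser to take
away.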

anyway, by and large, computers work. this is in the same sense that, by and
large, newtonian physics works. the "normal", non-weird machine is an ideal
fantasy, but one that accomplishes work, and imagining constructions without it
- much like the simplifications that newtonian physics provides - is bleak.

but the shock that comes when first learning about spectre, rowhammer, and the
enormous body of similar work is striking. it is demonstrable that, despite the
faith we have in the normal machine, the weird machine is the one that
physically exists, the one that inhabits our actual reality. learning the nature
of the weird machines that surround us is fertile ground for works of
technological art, as well as an active arena of modern, international warfare.

--------------------------------------------------------------------------------
thanks to carrie z, Zara Storm, and Elaine Kearny for helping review this essay.

citations, references, further reading:
[1] Submission #5384: MrWint's GBC Pokémon: Yellow Version "Arbitrary Code Execution" in 05:48.28
[2] well sort of. it's actually a kind of mangled version of the z80 processor,
and people usually call it gb-z80, but i'm just going to call it z80 because
it doesn't really matter here.
[3] Reading privileged memory with a side-channel
[4] sandsifter
[5] DEF CON 19 - Artem Dinaburg - Bit-squatting: DNS Hijacking Without Exploitation
[6] Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript
[7] Fantastic Timers and Where to Find Them: High-Resolution Microarchitectural Attacks in JavaScript

CPU Bugs
New CPU Features
Pokemon Yellow Total Control Hack
Weird machines, exploitability, and provable unexploitability
The Page-Fault Weird Machine: Lessons in Instruction-less Computation