4917: Machine Code For Kids
Richard Buckland's 4 bit computer
check out his lectures: Lecture 3: Machine Code - Richard Buckland UNSW, Tom Scott's The Fetch-Execute Cycle: What's Your Computer Actually Doing, or Ben Eater's How do CPUs read machine code.
the computer is named: 4917
Table of Contents:
4917 - The Computer:
This computer has four important components.
* Memory
also called RAM, for Random Access Memory, because we can read and
write into any location. The memory is where we keep our program and
its data.
* Processor
also called CPU, Central Processing Unit, is the piece of the
computer that executes the program from the memory
* Printer
where we can print values
* Beeper
we can use it to beep
Bits, Nibbles and Bytes:
1 bit is the smallest amount of information we could have, it is
either 1 or 0. For example, we can encode the flip of a coin in 1 bit,
it is either Heads or Tails.
In 4 bits we can have the numbers from 0 to 15:
0000 0 0001 1 0010 2 0011 3
0100 4 0101 5 0110 6 0111 7
1000 8 1001 9 1010 10 1011 11
1100 12 1101 13 1110 14 1111 15
4 bits are called a nibble
8 bits is a byte
1024 bytes is a kilobyte
1024 kilobytes is a megabyte
1024 megabytes is a gigabyte
1024 gigabytes is a terabyte
...
CPU (Processor):
The CPU loads instructions from memory and executes them. You can
think of the CPU as an octopus. The octopus can only work with its
arms, for example if one arm has 5 coins and the other has 3, it can
add them and have one arm with 8 coins. Then you can ask the octopus
to put the coins back on the shelf. The CPU just like the octopus
can only work with its arms, we call them 'registers'. Our CPU has 4
registers. Modern CPUs have around 32 registers, some have more than
600.
██████████
████▒▒▒▒▒▒▒▒▒▒████
██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██
██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██
██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██
██▒▒▒▒ ▒▒▒▒▒▒ ▒▒▒▒██
██▒▒ ▒▒ ▒▒██
██▒▒ ████ ▒▒ ████ ▒▒██
██▒▒ ██████████████ ▒▒██
██▒▒ ██▒▒▒▒▒▒██ ▒▒██
██▒▒██▒▒██▒▒██████▒▒██▒▒██▒▒██
██▒▒▒▒██████▒▒██████▒▒██████▒▒▒▒██
██▒▒██R0▒▒████▒▒▒▒▒▒████▒IS▒██▒▒██
████R0▒▒██R1██████IP██▒IS▒████
██R0▒▒████R1▒██▒IP▒████▒IS▒██
██████▒R1▒██ ██IP▒▒██████
████ ████
┌────────┐ ┌────────┐
│ IP: 0 │ │ IS: 0 │
└────────┘ └────────┘
┌────────┐ ┌────────┐
│ R0: 0 │ │ R1: 0 │
└────────┘ └────────┘
Each of them can hold a 4 bit number from 0 to 15. Registers are
actually just memory that the CPU can do stuff with (like the arms of
the octopus), e.g. ADD R0 + R1 or INCREMENT R0. IP and IS are a bit
special, because the CPU uses them to know which instruction to fetch
from memory and what its value is. R0 and R1 are called general
purpose registers because we can use them for whatever we want.
Once you power on the CPU, it fetches an instruction from the memory address
pointed by IP(instruction pointer), and puts it in IS(instruction store) and
executes it and then moves IP to the next instruction, or sets it to a specific
value if you are JUMPing. It keeps doing that forever and ever until you turn it
off.
MEMORY:
This computer has 16 addressable memory locations, each location can
hold a number between 0 and 15.
You can think of it as a list of numbers, the position or index on
each number is its address (a bit like house numbers on the street):
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
^ ^ ^
| | |
position 0 | position 15
position 8
This however is a bit annoying to read, so using a 4x4 grid makes it a
bit easier, so it would look like this:
┌────┬────┬────┬────┐
│ 0 │ 1 │ 2 │ 3 │
├────┼────┼────┼────┤
│ 4 │ 5 │ 6 │ 7 │
├────┼────┼────┼────┤
│ 8 │ 9 │ 10 │ 11 │
├────┼────┼────┼────┤
│ 12 │ 13 │ 14 │ 15 │
└────┴────┴────┴────┘
top left is address 0, bottom right is address 15, so we can reach any
cell by using its address.
For example, if we have the value 5 on address 1 and value 7 on
address 12, the memory looks like this:
┌────┬────┬────┬────┐
│ 0 │ 5 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 7 │ 0 │ 0 │ 0 │
└────┴────┴────┴────┘
The memory is reset every time you turn off your computer, so it starts
zeroed out:
┌────┬────┬────┬────┐
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
└────┴────┴────┴────┘
This is why in real computers we need hard disks, where we persist
information that survives powering off the device, and when it starts
we can load this information into the memory and use it.
In our computer however it is possible however to pre-configure the
memory to start preloaded with our program.
Instructions:
Those are the possible instructions our CPU can do:
0 HALT
1 R0 = R0 + R1 (add R0 and R1 and store the result in R0)
2 R0 = R0 - R1 (subtract R0 and R1 and store the result in R0)
3 R0 = R0 + 1 (increment R0)
4 R1 = R1 + 1 (increment R1)
5 R0 = R0 - 1 (decrement R0)
6 R1 = R1 - 1 (decrement R1)
7 BEEP
8 X PRINT X (print the next memory cell)
9 X R0 = MEMORY[X] (load the value of address X into R0)
10 X R1 = MEMORY[X] (load the value of address X into R1)
11 X MEMORY[X] = R0 (store the value of R0 into address X)
12 X MEMORY[X] = R1 (store the value of R1 into address X)
13 X JUMP X (jump to the value in the next memory cell)
e.g. 13 7 means jump to address 7
14 X JUMP X IF R0 == 0 (jump to X if R0 is equal to 0)
e.g. 14 7 means jump to address 7 if R0 is equal to 0
otherwise proceed with the next instruction
15 X JUMP X IF R0 != 0 (jump to X if R0 is *not* equal to 0)
You see some of the instructions take 1 memory cell, like INCREMENT or
BEEP, but others take two cells like PRINT or JUMP.
You can also see that it cant just go and do addition or subtraction
directly in memory, first it needs to load the values from memory into
R0 and R1 and then do addition and then put it back in memory.
The JUMP instructions are also called BRANCH instructions, BRANCH,
BRANCH IF ZERO, BRANCH IF NOT ZERO or B, BZ, BNZ for short.
How does the computer work:
Lets look at the computer as a whole while it has the following
program preloaded in memory: 3 11 4 8 0.
┌────────┐ ┌────────┐
│ IP: 0 │ │ IS: 0 │
└────────┘ └────────┘
┌────────┐ ┌────────┐
│ R0: 0 │ │ R1: 0 │
└────────┘ └────────┘
┌────┬────┬────┬────┐
│ 3 │ 11 │ 4 │ 8 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
└────┴────┴────┴────┘
Once we boot the computer this is what the CPU will do:
* FETCH the value at address IP into IS
* EXECUTE the instruction from IS
* UPDATE the IP location accordingly
e.g. if the instruction is 13 7 (JUMP TO 7), it will just do IP = 7
or if it is just 3 (INCREMENT R0) it will do IP = IP + 1
* GOTO FETCH, forever and ever
so in our case it will look like this:
IP = 0, IS = 3
increment the value of R0
IP = IP + 1
IP = 1, IS = 11 4
store the value of R0 on address 4
IP = IP + 2
IP = 3, IS = 8 1
PRINT 1, you see when we started on address 4 we had the value 0
but after the previous instruction, we put 1 there, so when
we execute the print instruction it will print 1
IP = 5, IS = 0
the value at address 5 is 0, so the CPU will HALT and not
update the instruction pointer(IP) anymore
Another program we could examine: 7 8 4 13 0
┌────────┐ ┌────────┐
│ IP: 0 │ │ IS: 0 │
└────────┘ └────────┘
┌────────┐ ┌────────┐
│ R0: 0 │ │ R1: 0 │
└────────┘ └────────┘
┌────┬────┬────┬────┐
│ 7 │ 8 │ 4 │ 13 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
└────┴────┴────┴────┘
The execution is as follows:
IP = 0, IS = 7
BEEP
IP = IP + 1
IP = 1, IS = 8 4
PRINT 4
IP = IP + 2
IP = 3, IS = 13 0
JUMP TO ADDRESS 0
IP = 0
IP = 0, IS = 7
BEEP
IP = IP + 1
...
This is an infinite loop, it will beep and print 4 forever.
Overflow and Underflow:
Our computer does not understand negative numbers, so if R0 is 0 and
you decrement it, it will turn into 15 (which is the maximum value)
0 - 1 is 15 (subtract 1 from 0)
0 - 2 is 14 (subtract 2 from 0)
This is called integer underflow
You also can not have numbers bigger than 15, so if R0 is 15 and you
increment it, it will turn into 0.
If R0 is 2 and you add 15 to it, it will turn into 1
1 + 14 is 15
1 + 15 is 0
2 + 15 is 1
3 + 15 is 2
..
5 + 15 is 4
It is the same for IP, it overflows just like R0 and R1
This is called integer overflow
Code is Data:
You can see that the processor absolutely can not distinguish between
code and data, it just fetches wherever IP points to and executes it.
It does not have concepts like variables or numbers or letters or
anything like it. It just sees its registers and the current
instruction and does it blindly.
Because code and data both live in the memory we can make a program
that writes itself, a self modifying program.
┌────────┐ ┌────────┐
│ IP: 0 │ │ IS: 0 │
└────────┘ └────────┘
┌────────┐ ┌────────┐
│ R0: 0 │ │ R1: 0 │
└────────┘ └────────┘
┌────┬────┬────┬────┐
│ 5 │ 11 │ 3 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
├────┼────┼────┼────┤
│ 0 │ 0 │ 0 │ 0 │
└────┴────┴────┴────┘
The execution is as follows:
IP = 0, IS = 5
DECREMENT R0, R0 = 0 - 1
(since R0 is 0 at start, it will underflow to 15)
IP = IP + 1
IP = 1, IS = 11 3
STORE R0 in address 3, MEMORY[3] = 15
(on addr 3 we had HALT before, but now it will be JUMP IF R0 != 0)
IP = IP + 2
IP = 3, IS = 15 0
JUMP TO 0 IF R0 != 0
(since R0 is 15, it is not 0, so we will jump)
IP = 0
IP = 0, IS = 5
DECREMENT R0, R0 = 15 - 1
IP = IP + 1
IP = 1, IS = 11 3
STORE R0 in address 3, MEMORY[3] = 14
(we had JMP IF R0 != 0 before, but now it will be JUMP IF R0 == 0)
IP = IP + 2
IP = 3 IS = 14 0
JUMP TO 0 IF R0 == 0
(since R0 is 14, that means that it is not 0, so we wont jump)
IP = IP + 2
IP = 5 IS = 0
HALT
So this program changes itself twice, first it changes HALT to JUMP IF
NOT ZERO, and then to JUMP IF ZERO.
Game:
There are many possible ways to play the game.
* Random Memory Corruption
roll a dice to tell you which address to corrupt, and roll again to
tell you which value, then you have to explain what the corrupted
program will do.
* Limited Monkey Patching
you can increment the value of one memory address to make the
program do something else.
* Monkey Patching
modify the program to do something else, you have total control
* Break The Register
make the program to do something else by changing R0 and R1's values
before the program starts
* Hacking
requires two people, and the goal is to HALT the other player's
computer without using the HALT instruction, you can modify one
memory address per turn
* Composing
You can interrupt the current program, and load new program into
memory but leaving the values of IP, R0 and R1 from the previous
program.
Ideally you will find what is working with your kids, could be dice, could be
treasure hunt, could be 'whoever guesses the result of the program the fastest
wins'. Or you could try to have 'inverted computer' that reads the instructions
backwards. I am still working on game ideas with the printed cards so if you
have an idea, please send me an email at b0000@fastmail.com.