4917: Machine Code For Kids

Richard Buckland's 4 bit computer

check out his lectures: Lecture 3: Machine Code - Richard Buckland UNSW, Tom Scott's The Fetch-Execute Cycle: What's Your Computer Actually Doing, or Ben Eater's How do CPUs read machine code.
the computer is named: 4917

Table of Contents:

4917 - The Computer:

This computer has four important components. * Memory also called RAM, for Random Access Memory, because we can read and write into any location. The memory is where we keep our program and its data. * Processor also called CPU, Central Processing Unit, is the piece of the computer that executes the program from the memory * Printer where we can print values * Beeper we can use it to beep

Bits, Nibbles and Bytes:

1 bit is the smallest amount of information we could have, it is either 1 or 0. For example, we can encode the flip of a coin in 1 bit, it is either Heads or Tails. In 4 bits we can have the numbers from 0 to 15: 0000 0 0001 1 0010 2 0011 3 0100 4 0101 5 0110 6 0111 7 1000 8 1001 9 1010 10 1011 11 1100 12 1101 13 1110 14 1111 15 4 bits are called a nibble 8 bits is a byte 1024 bytes is a kilobyte 1024 kilobytes is a megabyte 1024 megabytes is a gigabyte 1024 gigabytes is a terabyte ...

CPU (Processor):

The CPU loads instructions from memory and executes them. You can think of the CPU as an octopus. The octopus can only work with its arms, for example if one arm has 5 coins and the other has 3, it can add them and have one arm with 8 coins. Then you can ask the octopus to put the coins back on the shelf. The CPU just like the octopus can only work with its arms, we call them 'registers'. Our CPU has 4 registers. Modern CPUs have around 32 registers, some have more than 600. ██████████ ████▒▒▒▒▒▒▒▒▒▒████ ██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██ ██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██ ██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██ ██▒▒▒▒ ▒▒▒▒▒▒ ▒▒▒▒██ ██▒▒ ▒▒ ▒▒██ ██▒▒ ████ ▒▒ ████ ▒▒██ ██▒▒ ██████████████ ▒▒██ ██▒▒ ██▒▒▒▒▒▒██ ▒▒██ ██▒▒██▒▒██▒▒██████▒▒██▒▒██▒▒██ ██▒▒▒▒██████▒▒██████▒▒██████▒▒▒▒██ ██▒▒██R0▒▒████▒▒▒▒▒▒████▒IS▒██▒▒██ ████R0▒▒██R1██████IP██▒IS▒████ ██R0▒▒████R1▒██▒IP▒████▒IS▒██ ██████▒R1▒██ ██IP▒▒██████ ████ ████ ┌────────┐ ┌────────┐ │ IP: 0 │ │ IS: 0 │ └────────┘ └────────┘ ┌────────┐ ┌────────┐ │ R0: 0 │ │ R1: 0 │ └────────┘ └────────┘ Each of them can hold a 4 bit number from 0 to 15. Registers are actually just memory that the CPU can do stuff with (like the arms of the octopus), e.g. ADD R0 + R1 or INCREMENT R0. IP and IS are a bit special, because the CPU uses them to know which instruction to fetch from memory and what its value is. R0 and R1 are called general purpose registers because we can use them for whatever we want. Once you power on the CPU, it fetches an instruction from the memory address pointed by IP(instruction pointer), and puts it in IS(instruction store) and executes it and then moves IP to the next instruction, or sets it to a specific value if you are JUMPing. It keeps doing that forever and ever until you turn it off.

MEMORY:

This computer has 16 addressable memory locations, each location can hold a number between 0 and 15. You can think of it as a list of numbers, the position or index on each number is its address (a bit like house numbers on the street): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ^ ^ ^ | | | position 0 | position 15 position 8 This however is a bit annoying to read, so using a 4x4 grid makes it a bit easier, so it would look like this: ┌────┬────┬────┬────┐ │ 0 │ 1 │ 2 │ 3 │ ├────┼────┼────┼────┤ │ 4 │ 5 │ 6 │ 7 │ ├────┼────┼────┼────┤ │ 8 │ 9 │ 10 │ 11 │ ├────┼────┼────┼────┤ │ 12 │ 13 │ 14 │ 15 │ └────┴────┴────┴────┘ top left is address 0, bottom right is address 15, so we can reach any cell by using its address. For example, if we have the value 5 on address 1 and value 7 on address 12, the memory looks like this: ┌────┬────┬────┬────┐ │ 0 │ 5 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 7 │ 0 │ 0 │ 0 │ └────┴────┴────┴────┘ The memory is reset every time you turn off your computer, so it starts zeroed out: ┌────┬────┬────┬────┐ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ └────┴────┴────┴────┘ This is why in real computers we need hard disks, where we persist information that survives powering off the device, and when it starts we can load this information into the memory and use it. In our computer however it is possible however to pre-configure the memory to start preloaded with our program.

Instructions:

Those are the possible instructions our CPU can do: 0 HALT 1 R0 = R0 + R1 (add R0 and R1 and store the result in R0) 2 R0 = R0 - R1 (subtract R0 and R1 and store the result in R0) 3 R0 = R0 + 1 (increment R0) 4 R1 = R1 + 1 (increment R1) 5 R0 = R0 - 1 (decrement R0) 6 R1 = R1 - 1 (decrement R1) 7 BEEP 8 X PRINT X (print the next memory cell) 9 X R0 = MEMORY[X] (load the value of address X into R0) 10 X R1 = MEMORY[X] (load the value of address X into R1) 11 X MEMORY[X] = R0 (store the value of R0 into address X) 12 X MEMORY[X] = R1 (store the value of R1 into address X) 13 X JUMP X (jump to the value in the next memory cell) e.g. 13 7 means jump to address 7 14 X JUMP X IF R0 == 0 (jump to X if R0 is equal to 0) e.g. 14 7 means jump to address 7 if R0 is equal to 0 otherwise proceed with the next instruction 15 X JUMP X IF R0 != 0 (jump to X if R0 is *not* equal to 0) You see some of the instructions take 1 memory cell, like INCREMENT or BEEP, but others take two cells like PRINT or JUMP. You can also see that it cant just go and do addition or subtraction directly in memory, first it needs to load the values from memory into R0 and R1 and then do addition and then put it back in memory. The JUMP instructions are also called BRANCH instructions, BRANCH, BRANCH IF ZERO, BRANCH IF NOT ZERO or B, BZ, BNZ for short.

How does the computer work:

Lets look at the computer as a whole while it has the following program preloaded in memory: 3 11 4 8 0. ┌────────┐ ┌────────┐ │ IP: 0 │ │ IS: 0 │ └────────┘ └────────┘ ┌────────┐ ┌────────┐ │ R0: 0 │ │ R1: 0 │ └────────┘ └────────┘ ┌────┬────┬────┬────┐ │ 3 │ 11 │ 4 │ 8 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ └────┴────┴────┴────┘ Once we boot the computer this is what the CPU will do: * FETCH the value at address IP into IS * EXECUTE the instruction from IS * UPDATE the IP location accordingly e.g. if the instruction is 13 7 (JUMP TO 7), it will just do IP = 7 or if it is just 3 (INCREMENT R0) it will do IP = IP + 1 * GOTO FETCH, forever and ever so in our case it will look like this: IP = 0, IS = 3 increment the value of R0 IP = IP + 1 IP = 1, IS = 11 4 store the value of R0 on address 4 IP = IP + 2 IP = 3, IS = 8 1 PRINT 1, you see when we started on address 4 we had the value 0 but after the previous instruction, we put 1 there, so when we execute the print instruction it will print 1 IP = 5, IS = 0 the value at address 5 is 0, so the CPU will HALT and not update the instruction pointer(IP) anymore Another program we could examine: 7 8 4 13 0 ┌────────┐ ┌────────┐ │ IP: 0 │ │ IS: 0 │ └────────┘ └────────┘ ┌────────┐ ┌────────┐ │ R0: 0 │ │ R1: 0 │ └────────┘ └────────┘ ┌────┬────┬────┬────┐ │ 7 │ 8 │ 4 │ 13 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ └────┴────┴────┴────┘ The execution is as follows: IP = 0, IS = 7 BEEP IP = IP + 1 IP = 1, IS = 8 4 PRINT 4 IP = IP + 2 IP = 3, IS = 13 0 JUMP TO ADDRESS 0 IP = 0 IP = 0, IS = 7 BEEP IP = IP + 1 ... This is an infinite loop, it will beep and print 4 forever.

Overflow and Underflow:

Our computer does not understand negative numbers, so if R0 is 0 and you decrement it, it will turn into 15 (which is the maximum value) 0 - 1 is 15 (subtract 1 from 0) 0 - 2 is 14 (subtract 2 from 0) This is called integer underflow You also can not have numbers bigger than 15, so if R0 is 15 and you increment it, it will turn into 0. If R0 is 2 and you add 15 to it, it will turn into 1 1 + 14 is 15 1 + 15 is 0 2 + 15 is 1 3 + 15 is 2 .. 5 + 15 is 4 It is the same for IP, it overflows just like R0 and R1 This is called integer overflow

Code is Data:

You can see that the processor absolutely can not distinguish between code and data, it just fetches wherever IP points to and executes it. It does not have concepts like variables or numbers or letters or anything like it. It just sees its registers and the current instruction and does it blindly. Because code and data both live in the memory we can make a program that writes itself, a self modifying program. ┌────────┐ ┌────────┐ │ IP: 0 │ │ IS: 0 │ └────────┘ └────────┘ ┌────────┐ ┌────────┐ │ R0: 0 │ │ R1: 0 │ └────────┘ └────────┘ ┌────┬────┬────┬────┐ │ 5 │ 11 │ 3 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ ├────┼────┼────┼────┤ │ 0 │ 0 │ 0 │ 0 │ └────┴────┴────┴────┘ The execution is as follows: IP = 0, IS = 5 DECREMENT R0, R0 = 0 - 1 (since R0 is 0 at start, it will underflow to 15) IP = IP + 1 IP = 1, IS = 11 3 STORE R0 in address 3, MEMORY[3] = 15 (on addr 3 we had HALT before, but now it will be JUMP IF R0 != 0) IP = IP + 2 IP = 3, IS = 15 0 JUMP TO 0 IF R0 != 0 (since R0 is 15, it is not 0, so we will jump) IP = 0 IP = 0, IS = 5 DECREMENT R0, R0 = 15 - 1 IP = IP + 1 IP = 1, IS = 11 3 STORE R0 in address 3, MEMORY[3] = 14 (we had JMP IF R0 != 0 before, but now it will be JUMP IF R0 == 0) IP = IP + 2 IP = 3 IS = 14 0 JUMP TO 0 IF R0 == 0 (since R0 is 14, that means that it is not 0, so we wont jump) IP = IP + 2 IP = 5 IS = 0 HALT So this program changes itself twice, first it changes HALT to JUMP IF NOT ZERO, and then to JUMP IF ZERO.

Game:

There are many possible ways to play the game. * Random Memory Corruption roll a dice to tell you which address to corrupt, and roll again to tell you which value, then you have to explain what the corrupted program will do. * Limited Monkey Patching you can increment the value of one memory address to make the program do something else. * Monkey Patching modify the program to do something else, you have total control * Break The Register make the program to do something else by changing R0 and R1's values before the program starts * Hacking requires two people, and the goal is to HALT the other player's computer without using the HALT instruction, you can modify one memory address per turn * Composing You can interrupt the current program, and load new program into memory but leaving the values of IP, R0 and R1 from the previous program. Ideally you will find what is working with your kids, could be dice, could be treasure hunt, could be 'whoever guesses the result of the program the fastest wins'. Or you could try to have 'inverted computer' that reads the instructions backwards. I am still working on game ideas with the printed cards so if you have an idea, please send me an email at b0000@fastmail.com.