The ISA leaves something to be desired for "simplest". Simple, sure, but parameters (and unused ones, at that!)? Memory copy instructions? Multiply and no shifts? Addition _and_ subtraction?
Others have mentioned Subleq (Subtract And Branch If Less Than Or Equal To), but there are more useful designs that meet all the design constraints. They state that "It is also not intended to be as simple and elegant as it could possibly be," but it's called "The Simplest Virtual Computer" - that kind of name is a challenge.
What do you recommend?
For what? The simplest possible ISA? Something like an OISC or a ZISC, probably. The simplest "virtual computer"? Maybe the SK(I) combinator system? Specific improvements to this architecture? My personal preference would be to ditch the three arguments that aren't always used, have two 8-bit instructions per 16-bit word, and use a stack (to eliminate the operands). But that's just one mode of thought for a very simple computer, not the only one. This line of thought is particularly inspired by the F18 Forth chips. They're quite minimal, simple, tight CPU designs: https://www.greenarraychips.com/home/documents/greg/DB001-22...
It took me about 20 minutes to make a basic FPGA-friendly implementation of the CPU core in Verilog, but the 256KB of RAM is pretty expensive (16-bit color seems excessive for such a simple core!).
Ugh, I just noticed it's even worse - the display is double-buffered, so you actually need 384KB of RAM (128KB for system RAM and 2 x 128KB display buffers!).
Out of curiosity - why do you care? Do you want to run this thing on a calculator from the 90s or something like that?
Block RAM in FPGAs is very limited, and adding something like a DDR3 controller is a hassle - the dev boards I have lying around have between 48KB and 200KB of block RAM. The actual CPU logic is probably <5% of those chips. I have a board with a 16-bit, 16MB PSRAM interface, but then the memory interface is shared between the video generation and the CPU. Going with 4-bit color or something seems like it would be more in line with the design philosophy (but it was targeted to be emulated on a modern PC, where memory and color depth are largely free).
After looking up the term "FPGA" I realize that my question was a bit silly. Makes sense, thanks for elaborating
I wondered how long it would take for someone to do this. An amount of time approaching zero apparently. Nice.
Implementing a CPU, if you're not trying to hyper-optimize it, is reasonably straightforward. This is a very simple design. It would take a little bit more work to add proper video output and mouse input, but those blocks are easy to find for VGA and PS/2 mice. I think this would run > 100 MHz very easily on anything made in the last decade.
Have FPGAs been evolving very slowly? Modern CPUs have some 20 billion transistors, but these block-RAM limits don't seem to have grown at the same rate since the '90s, the last time I looked at FPGAs.
FPGAs also have billions of transistors now, but adding more block memory means removing transistors from something else, such as LUTs, registers, DSP blocks etc.
As always it is a tradeoff, and many designs either don't need much block memory or need so much that external memory is a better choice anyway.
Hi. Care to share the source code perhaps?
That instruction set looks really weird, like some sort of obscure and long-forgotten 1960s experimental architecture.
If I understand it correctly, the entire register file except for PC is effectively memory-mapped, it uses 16-bit word addressing and the peripherals (framebuffer + mouse) are accessed through dedicated instructions. Instructions are four words long, the first one only having 16 valid opcodes and the rest referencing either direct memory addresses or raw word values.
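If that reading is right, the fetch stage of an emulator is only a few lines. A hedged Python sketch (the four-word layout is as described above; names are invented for illustration):

```python
def fetch(mem, pc):
    # Each instruction is four 16-bit words: an opcode (only the first
    # 16 values are valid) followed by three operands, which are either
    # memory addresses or raw word values depending on the opcode.
    opcode, a, b, c = mem[pc], mem[pc + 1], mem[pc + 2], mem[pc + 3]
    assert opcode < 16, "invalid opcode"
    return opcode, a, b, c, pc + 4
```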
> forgotten 1960s architecture… the entire register file is memory-mapped
1830s, in fact—the original computer, Babbage’s Analytical Engine!
I like toys like this. I even have made a few of my own, doodled instruction set ideas on many scraps of paper, and so forth.
It's a bit weird that this particular instruction set annoys me so much. Perhaps annoys is the wrong word, maybe irritates is better. The entire thing I can get behind except for the instruction encoding. The instructions are too large for the space available: 128KB of RAM for programs and 128KB for the screen (and workspace area, given sync), but at 8 bytes per instruction it eats up that limited resource too quickly.
I guess that irritation comes from the efficiency hacks that these constrained environments encourage. It revives an age of clever tricks to get the most out of a little. The instruction size pokes at that aesthetic like a lump in a pillow.
I'm not sure what would be a better solution while preserving simplicity. Maybe a single extra dereference to turn

    opcode arg1 arg2 arg3    ;; @(IP) @(IP+1) @(IP+2) @(IP+3) ;; IP+=4

into

    @(@IP) @(@IP+1) @(@IP+2) @(@IP+3)    ;; IP+=1

which would be one system word per instruction, looked up from a dictionary of instructions in RAM (coders would have to be careful not to lose the ability to generate all constant values if they eliminated certain dictionary entries).
Tiny programs would get bigger; larger programs would get smaller when they reuse instructions.
If you wanted to work against the idea of 'simplest' and embrace the notion of weird-features-because-of-technical-debt, you could make an 'upgraded' machine that has a second bank of 128k (64k words) as dedicated program memory, that is initialized at startup with the numbers 0x0000 to 0xffff. This would make the machine run SVC16 programs unmodified. You could then either have it use program ROMs or add a single instruction
    StoreProgMem @progMem(@arg1+@arg3) =(@arg2+@arg3)

which would enable writing the program space.
Egads, I've been nerd sniped!
Why, there is a well-known approach to having constant-size, small opcodes: a data stack. The only non-constant size instruction would be the one that puts an immediate argument onto the stack. Jumps would pop the target address off the stack, too. The stack should be the width of the ALU, say, 32 bit, while the machine codes could be half a byte, like here.
I'm very surprised this smallest machine is not a stack machine.
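For illustration, a tiny data-stack machine along those lines (the opcode numbering and instruction set are made up for the sketch; in a real encoding two 4-bit opcodes would pack into each byte, here each list element is one opcode for clarity):

```python
def run(code):
    # Hypothetical stack machine: only LIT carries an immediate (the
    # following full-width word, pushed onto the stack); every other
    # instruction, including JMP, takes its operands off the stack.
    LIT, ADD, SUB, JMP, HALT = range(5)
    stack, ip = [], 0
    while True:
        op = code[ip]
        ip += 1
        if op == LIT:
            stack.append(code[ip])
            ip += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == SUB:
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == JMP:
            ip = stack.pop()  # jump target popped off the stack
        elif op == HALT:
            return stack
```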
I feel similar about the instruction encoding.
Another way could be to use registers, possibly just one register. Like in old machines that had an accumulator, which was implied. So "Add x" adds x to the accumulator.
In some sense, there is already a register file in the proposed VM, the whole memory.
Another, related way to reduce instruction size is to not allow all operands to be 16 bits, that is, have some registers that are more "special" than others. For example, say the first 16 memory locations are special. Then it's possible to have an

    opcode, arg1, arg2, arg3

encoding using 4 bits each, for 16-bit instructions. The problem is that some instructions will need a full 16-bit argument, for example some load immediate, or goto.
Common ways to deal with that:
1) variable width instructions (16/32 bit) - makes the machine less simple.
2) always use 32-bit instructions, meaning instructions are

    opcode, arg4bit, arg4bit, arg16bit

This is simpler, but still pretty wasteful for instruction length. Could also use

    opcode4bit, arg6bit, arg6bit, arg16bit

which turns the first 64 memory locations into special registers, but uses awkward 6-bit values.
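The 4/4/4/16 variant packs and unpacks with a few shifts (a sketch; the field positions are arbitrary, and the top nibble goes unused since 4+4+4+16 = 28 bits):

```python
def pack(opcode, arg1, arg2, imm16):
    # opcode, arg4bit, arg4bit, arg16bit -> one 32-bit instruction word.
    assert opcode < 16 and arg1 < 16 and arg2 < 16 and imm16 < (1 << 16)
    return (opcode << 24) | (arg1 << 20) | (arg2 << 16) | imm16

def unpack(word):
    # Reverse the packing with shifts and masks.
    return ((word >> 24) & 0xF, (word >> 20) & 0xF,
            (word >> 16) & 0xF, word & 0xFFFF)
```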
3) use 8-bit immediates with hi/lo access (like ARM). So then you have instructions like

    mov_hi dest4, imm8
    mov_lo dest4, imm8

setting the hi/lo byte of the register file (i.e. the first 16 memory locations). This in turn means there will be more instructions, and they will not work directly on memory (e.g. a goto needs 2 loads). So while there is the nice property that instruction length equals word length, it also means more kinds of instructions (perhaps something like jump relative), more complicated instructions, and more instructions needing to be executed to accomplish similar tasks (e.g. 3 instructions for an arbitrary jump).
All these considerations are probably counter to wanting a maximally simple machine.
Do what Woz did and write a Sweet16 interpreter for the machine!
Compare this to the LGP-30, a "small" (desk-sized) vacuum tube computer of the 1950s. It also had just 16 instructions. It nominally had some registers, but these were just locations on the same magnetic drum that stored all the other data.
What a great Wikipedia page! This is a real labour of love, I love the detail and the wry asides ("Every token had to be delimited by an apostrophe, making it hard to read and even harder to prepare tapes").
17ms for a multiplication, wow. Incredible to think that's now the budget for an entire 3D rendered frame.
Like others in this thread I was very skeptical of the SVC16 design, but this real computer feels weirdly unbalanced in the same kind of way. 31-bit instructions, but only 4K words of memory (16KB) in total? 17ms multiplication, but it can also do division? Very strange.
It is like the PDP-8 (tiny instruction set) and the TMS 9900 (no registers) but the 16-bit word size for memory did not catch on for real hardware.
Don't the PDP-11 and competing microcomputers of that era count? Being the PDP-11, it's lavishly flexible (i.e. it can do byte-addressed access as well), but ultimately 16-bit words were conventional.
8086, 80286, 680[01]0 all had a 16-bit word size and could only access RAM at 16-bit granularity (I/O is a different story). Of course they had the ability to extract just a byte as well.
68000 generates a bus error if you try unaligned 16/32-bit access.
But you could have had a system with 64k 16-bit words that gives 128k bytes.
16 bit microcontrollers are a $25 billion market.
Those are usually supporting byte accesses, not limited to word accesses.
I worked with a 16-bit minicomputer as late as 1995 - word-addressable, not byte-addressable.
I worked on embedded code in the early 2000s using 16-bit DSPs. Texas Instruments C54x if I remember right.
Not being byte-addressable was a real pain. There was a C compiler, but off-the-shelf C code hardly ever worked properly because everyone assumes CHAR_BIT is 8.
I give you the world's simplest physical computer:
https://en.wikipedia.org/wiki/CARDboard_Illustrative_Aid_to_...
This is really simple and approachable. Reminds me slightly of Forth or subleq.
https://colorforth.github.io/index.html
Can't not mention a great game about programming a SIC using subleq called SIC-1: https://store.steampowered.com/app/2124440/SIC1/
Second part of the game requires writing self-modifying code which is quite mindboggling!
Here's a Forth that runs on SUBLEQ https://github.com/howerj/subleq, there's also a file system for it https://github.com/howerj/ffs.
I like Nock.
The official specification for Nock, the assembly language used by the Urbit virtual machine, fits on a t-shirt: it is only 200 words long and can be compressed to 371 bytes. Nock is a low-level, homoiconic combinator language, somewhat like JVM code, designed to be simple; it is interpreted by Vere. Hoon is the practical layer of Urbit, and it compiles itself to Nock, the simple layer.
A more concise and less opaque alternative https://github.com/tibru/tibru
Funny, I've been working on a similar project, inspired by uxn. Initially, mine was also going to use 16-bit words, but I've since moved to 32-bit signed words. It'll be a bit heftier than this project in other ways, too, but not by much.
A bit confused by the goto and skip instructions. Why multiply by 4 in one, but not the other? It sounds reasonable for both to keep the target address a valid instruction pointer (aligned to 4 words)?
Adding two words to create an address is a fun variation of segment pointers, but even more wasteful than x86 16-bit segment+offset pointers (not a complaint, just an observation).
> Adding two words to create an address is a fun variation of segment pointers, but even more wasteful than x86 16-bit segment+offset pointers (not a complaint, just an observation)
I'm curious why you think it's wasteful?
It seems similar to the 6809 indexed addressing modes which I liked (a long time ago).
Two 16-bit numbers could be combined to form a 32-bit pointer. Used as segment<<4+offset they just form a 20-bit pointer. Feels like wasting 12 bits to me.
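The arithmetic, spelled out (a quick Python illustration of the comparison):

```python
def real_mode_address(segment, offset):
    # x86 real mode: (segment << 4) + offset, wrapping to 20 bits,
    # so two 16-bit words only reach 1MB of address space.
    return ((segment << 4) + offset) & 0xFFFFF

def full_width_address(hi, lo):
    # The same two 16-bit words could instead form a full 32-bit address.
    return (hi << 16) | lo
```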
I remember coming across a JS repo that had lots of minimum viable implementations.
It has simple processors, simple compilers etc.
I wish I could find it.
Not JS but there's Build Your Own X https://github.com/codecrafters-io/build-your-own-x
This is super cool. Thanks for posting it. I dream of being retired and being able to go through each one.
Wow this looks cool, I’m going to try writing a program. I have recently been learning about computer architecture (specifically older 8 bit CPUs) so when I read “There are no CPU registers” I was very intrigued.
Can someone explain to me the appeal of these virtual console projects? Is it like lock picking or the demoscene, in the sense that you need to figure out how to achieve impressive things under tight constraints?
I enjoy implementing these sorts of things from a spec. It's very rewarding to have other peoples' games boot up in your emulator.
I feel like it is the same form of intellectual stimulation that one gets from crosswords or sudoku but with an added creative dimension. Where crosswords and sudoku present a challenge with a fixed target goal, the demoscene and virtual consoles give you the difficulty with an open-ended goal.
Code golf is somewhere in the middle. Often you have a fixed goal, but there is no one 'right' solution; creativity can find many paths to achieve the desired output while reducing line or character count.
So I do:
Dweets https://beta.dwitter.net/u/lerc/top
My own fantasy console https://k8.fingswotidun.com/static/ide/?gist=ad96329670965dc...
A stack machine image generator limited to 50 characters of program code https://c50.fingswotidun.com/
Why do it? Why play with Lego? It's interesting and you get to see what you make.
The proliferation of fantasy consoles is simply because making a fantasy console appeals to the same aesthetic. See what you can make.
Purpose, my guess: Fun, joy of understanding, tinkering, learning and education.
Such projects can grow, and eventually somebody learning on this builds the next language or QEMU.
I like this. Thanks for sharing.
As for me, the lack of any permanent storage like a disk is a shame. The lack of a keyboard is a big limitation too.
You could memory-map additional device drivers at the end of memory as a way to extend it with arbitrary peripherals.
To keep things simple maybe just handle that using emulator save-states instead?
Worse is better
stack pointer? offset addressing mode?
I thought it was Excel.
Subleq can be done in a few lines of AWK; you could just use busybox instead of a huge Rust toolchain.
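Indeed, the whole machine is a few lines in any language. A minimal Python sketch (halting on a negative instruction pointer, which is one common convention):

```python
def subleq_step(mem, ip):
    # One instruction: subtract mem[a] from mem[b]; branch to c if the
    # result is <= 0, otherwise fall through to the next instruction.
    a, b, c = mem[ip], mem[ip + 1], mem[ip + 2]
    mem[b] -= mem[a]
    return c if mem[b] <= 0 else ip + 3

def subleq_run(mem):
    # Run until the instruction pointer goes negative.
    ip = 0
    while ip >= 0:
        ip = subleq_step(mem, ip)
    return mem
```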
The ISA is a bit complex for the simplest Virtual Computer. One Instruction computers are a thing: https://en.wikipedia.org/wiki/One-instruction_set_computer#I...
The readme mentions that creating the simplest possible computer wasn't the goal.
I had a similar thought. A better name would be SVC16: Simple Virtual Computer.