The ISA leaves something to be desired for "simplest". Simple, sure, but parameters (and unused ones, at that!)? Memory copy instructions? Multiply and no shifts? Addition _and_ subtraction?
Others have mentioned Subleq (Subtract And Branch If Less Than Or Equal To), but there are more useful designs that meet all the design constraints. They state that "It is also not intended to be as simple and elegant as it could possibly be," but it's called "The Simplest Virtual Computer" - that kind of name is a challenge.
What do you recommend?
For what? The simplest possible ISA? Something like an OISC or a ZISC, probably. The simplest "virtual computer"? Maybe the SK(I) combinator system? Specific improvements to this architecture? My personal preference would be to ditch the three arguments that aren't always used, have two 8-bit instructions per 16-bit word, and use a stack (to eliminate the operands). But that's just one mode of thought for a very simple computer, not the only one. This line of thought is particularly inspired by the F18 Forth chips. They're quite minimal, simple, tight CPU designs: https://www.greenarraychips.com/home/documents/greg/DB001-22...
It took me about 20 minutes to make a basic FPGA-friendly implementation of the CPU core in Verilog, but the 256KB of RAM is pretty expensive (16-bit color seems excessive for such a simple core!).
Ugh, I just noticed it's even worse - the display is double-buffered, so you actually need 384KB of RAM (128KB for system RAM and 2 x 128KB display buffers!).
Out of curiosity - why do you care? Do you want to run this thing on a calculator from the 90s or something like that?
Block RAM in FPGAs is very limited, and adding something like a DDR3 controller is a hassle - the dev boards I have lying around have between 48KB and 200KB of block RAM. The actual CPU logic is probably <5% of those chips. I have a board with a 16-bit, 16MB PSRAM interface, but then the memory interface is shared between the video generation and the CPU. Going with 4-bit color or something seems like it would be more in line with the design philosophy (but it was targeted to be emulated on a modern PC, where memory and color depth are largely free).
After looking up the term "FPGA" I realize that my question was a bit silly. Makes sense, thanks for elaborating
I wondered how long it would take for someone to do this. An amount of time approaching zero apparently. Nice.
Implementing a CPU, if you're not trying to hyper-optimize it, is reasonably straightforward. This is a very simple design. It would take a little bit more work to add proper video output and mouse input, but those blocks are easy to find for VGA and PS/2 mice. I think this would run > 100 MHz very easily on anything made in the last decade.
Have FPGAs been evolving very slowly? Modern CPUs have some 20 billion transistors, but these block-RAM limits don't seem to have grown at the same rate since the '90s, the last time I looked at FPGAs.
FPGAs also have billions of transistors now, but adding more block memory means removing transistors from something else, such as LUTs, registers, DSP blocks etc.
As always it is a tradeoff, and many designs either don't need much block memory or need so much that external memory is a better choice anyway.
Hi. Care to share the source code perhaps?
That instruction set looks really weird, like some sort of obscure and long-forgotten 1960s experimental architecture.
If I understand it correctly, the entire register file except for PC is effectively memory-mapped, it uses 16-bit word addressing and the peripherals (framebuffer + mouse) are accessed through dedicated instructions. Instructions are four words long, the first one only having 16 valid opcodes and the rest referencing either direct memory addresses or raw word values.
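If that reading is right, the fetch stage of an emulator is only a few lines. A hedged Python sketch (the four-word layout is as described above; names are invented for illustration):

```python
def fetch(mem, pc):
    # Each instruction is four 16-bit words: an opcode (only the first
    # 16 values are valid) followed by three operands, which are either
    # memory addresses or raw word values depending on the opcode.
    opcode, a, b, c = mem[pc], mem[pc + 1], mem[pc + 2], mem[pc + 3]
    assert opcode < 16, "invalid opcode"
    return opcode, a, b, c, pc + 4
```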
> forgotten 1960s architecture… the entire register file is memory-mapped
1830s, in fact—the original computer, Babbage’s Analytical Engine!
I like toys like this. I even have made a few of my own, doodled instruction set ideas on many scraps of paper, and so forth.
It's a bit weird that this particular instruction set annoys me so much. Perhaps annoys is the wrong word, maybe irritates is better. The entire thing I can get behind except for the instruction encoding. The instructions are too large for the space available: 128KB of RAM for programs and 128KB for the screen (and workspace area, given sync), but at 8 bytes per instruction it eats up that limited resource too quickly.
I guess that irritation comes from the efficiency hacks that these constrained environments encourage. It revives an age of clever tricks to get the most out of a little. The instruction size pokes at that aesthetic like a lump in a pillow.
I'm not sure what would be a better solution while preserving simplicity. Maybe a single extra dereference to turn

    opcode arg1 arg2 arg3    ;; @(IP) @(IP+1) @(IP+2) @(IP+3) ;; IP+=4

into

    @(@IP) @(@IP+1) @(@IP+2) @(@IP+3)    ;; IP+=1

which would be one system word per instruction, looked up from a dictionary of instructions in RAM (coders would have to be careful not to lose the ability to generate all constant values if they eliminated certain dictionary entries).
Tiny programs would get bigger; larger programs would get smaller when they reuse instructions.
If you wanted to work against the idea of 'simplest' and embrace the notion of weird-features-because-of-technical-debt, you could make an 'upgraded' machine that has a second bank of 128k (64k words) as dedicated program memory, that is initialized at startup with the numbers 0x0000 to 0xffff. This would make the machine run SVC16 programs unmodified. You could then either have it use program ROMs or add a single instruction
    StoreProgMem @progMem(@arg1+@arg3) =(@arg2+@arg3)

which would enable writing the program space.
Egads, I've been nerd sniped!
Why, there is a well-known approach to having constant-size, small opcodes: a data stack. The only non-constant size instruction would be the one that puts an immediate argument onto the stack. Jumps would pop the target address off the stack, too. The stack should be the width of the ALU, say, 32 bit, while the machine codes could be half a byte, like here.
I'm very surprised this smallest machine is not a stack machine.
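For illustration, a tiny data-stack machine along those lines (the opcode numbering and instruction set are made up for the sketch; in a real encoding two 4-bit opcodes would pack into each byte, here each list element is one opcode for clarity):

```python
def run(code):
    # Hypothetical stack machine: only LIT carries an immediate (the
    # following full-width word, pushed onto the stack); every other
    # instruction, including JMP, takes its operands off the stack.
    LIT, ADD, SUB, JMP, HALT = range(5)
    stack, ip = [], 0
    while True:
        op = code[ip]
        ip += 1
        if op == LIT:
            stack.append(code[ip])
            ip += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == SUB:
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == JMP:
            ip = stack.pop()  # jump target popped off the stack
        elif op == HALT:
            return stack
```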
I feel similar about the instruction encoding.
Another way could be to use registers, possibly just one register. Like in old machines that had an accumulator, which was implied. So "Add x" adds x to the accumulator.
In some sense, there is already a register file in the proposed VM, the whole memory.
Another, related way to reduce instruction size is to not allow all operands to be 16 bits, that is, have some registers that are more "special" than others. For example, say the first 16 memory locations are special. Then it's possible to have an

    opcode, arg1, arg2, arg3

encoding using 4 bits each, for 16-bit instructions. The problem is that some instructions will need a full 16-bit argument, for example some load immediate, or goto.
Common ways to deal with that:
1) variable width instructions (16/32 bit) - makes the machine less simple.
2) always use 32-bit instructions, meaning instructions are

    opcode, arg4bit, arg4bit, arg16bit

This is simpler, but still pretty wasteful for instruction length. Could also use

    opcode4bit, arg6bit, arg6bit, arg16bit

which turns the first 64 memory locations into special registers, but uses awkward 6-bit values.
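The 4/4/4/16 variant packs and unpacks with a few shifts (a sketch; the field positions are arbitrary, and the top nibble goes unused since 4+4+4+16 = 28 bits):

```python
def pack(opcode, arg1, arg2, imm16):
    # opcode, arg4bit, arg4bit, arg16bit -> one 32-bit instruction word.
    assert opcode < 16 and arg1 < 16 and arg2 < 16 and imm16 < (1 << 16)
    return (opcode << 24) | (arg1 << 20) | (arg2 << 16) | imm16

def unpack(word):
    # Reverse the packing with shifts and masks.
    return ((word >> 24) & 0xF, (word >> 20) & 0xF,
            (word >> 16) & 0xF, word & 0xFFFF)
```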
3) use 8-bit immediates with hi/lo access (like ARM). So then you have instructions like

    mov_hi dest4, imm8
    mov_lo dest4, imm8

setting the hi/lo byte of the register file (i.e. the first 16 memory locations). This in turn means there will be more instructions, and they will not work directly on memory (e.g. a goto needs 2 loads). So while there is the nice property that instruction length equals word length, it also means more kinds of instructions (perhaps something like jump relative), more complicated instructions, and more instructions needing to be executed to accomplish similar tasks (e.g. 3 instructions for an arbitrary jump).
All these considerations are probably counter to wanting a maximally simple machine.
Do what Woz did and write a Sweet16 interpreter for the machine!
Compare this to the LGP-30, a "small" (desk-sized) vacuum tube computer of the 1950s. It also had just 16 instructions. It nominally had some registers, but these were just locations on the same magnetic drum that stored all the other data.
What a great Wikipedia page! This is a real labour of love, I love the detail and the wry asides ("Every token had to be delimited by an apostrophe, making it hard to read and even harder to prepare tapes").
17ms for a multiplication, wow. Incredible to think that's now the budget for an entire 3D rendered frame.
Like others in this thread I was very skeptical of the SVC16 design, but this real computer feels weirdly unbalanced in the same kind of way. 31-bit instructions, but only 4K words of memory (16KB) in total? 17ms multiplication, but it can also do division? Very strange.
It is like the PDP-8 (tiny instruction set) and the TMS 9900 (no registers) but the 16-bit word size for memory did not catch on for real hardware.
Don't the PDP-11 and competing microcomputers of that era count? Being the PDP-11, it's lavishly flexible (i.e. it can do byte-addressed access as well), but ultimately 16-bit words were conventional.
8086, 80286, 680[01]0 all had a 16-bit word size and could only access RAM at 16-bit granularity (I/O is a different story). Of course they had the ability to extract just a byte as well.
68000 generates a bus error if you try unaligned 16/32-bit access.
But you could have had a system with 64k 16-bit words that gives 128k bytes.
16 bit microcontrollers are a $25 billion market.
Those are usually supporting byte accesses, not limited to word accesses.
I worked with a 16-bit minicomputer as late as 1995 - word-addressable, not byte-addressable.
I worked on embedded code in the early 2000s using 16-bit DSPs. Texas Instruments C54x if I remember right.
Not being byte-addressable was a real pain. There was a C compiler, but off-the-shelf C code hardly ever worked properly because everyone assumes CHAR_BIT is 8.
I give you the world's simplest physical computer:
https://en.wikipedia.org/wiki/CARDboard_Illustrative_Aid_to_...
This is really simple and approachable. Reminds me slightly of Forth or subleq.
https://colorforth.github.io/index.html
Can't not mention a great game about programming a SIC using subleq called SIC-1: https://store.steampowered.com/app/2124440/SIC1/
Second part of the game requires writing self-modifying code which is quite mindboggling!
Here's a Forth that runs on SUBLEQ https://github.com/howerj/subleq, there's also a file system for it https://github.com/howerj/ffs.
I like Nock.
The official specification for Nock, the assembly language used by the Urbit virtual machine, fits on a t-shirt: it is only 200 words long and can be compressed to 371 bytes. Nock is a low-level, homoiconic combinator language, somewhat like JVM code, designed to be simple; it is interpreted by Vere. Hoon is the practical layer of Urbit, and it compiles itself to Nock, the simple layer.
A more concise and less opaque alternative https://github.com/tibru/tibru
Funny, I've been working on a similar project, inspired by uxn. Initially, mine was also going to use 16-bit words, but I've since moved to 32-bit signed words. It'll be a bit heftier than this project in other ways, too, but not by much.
A bit confused by the goto and skip instructions. Why multiply by 4 in one, but not the other? It sounds reasonable for both to keep the target address a valid instruction pointer (aligned to 4 words)?
Adding two words to create an address is a fun variation of segment pointers, but even more wasteful than x86 16-bit segment+offset pointers (not a complaint, just an observation).
> Adding two words to create an address is a fun variation of segment pointers, but even more wasteful than x86 16-bit segment+offset pointers (not a complaint, just an observation)
I'm curious why you think it's wasteful?
It seems similar to the 6809 indexed addressing modes which I liked (a long time ago).
Two 16-bit numbers could be combined to form a 32-bit pointer. Used as segment<<4+offset they just form a 20-bit pointer. Feels like wasting 12 bits to me.
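The arithmetic, spelled out (a quick Python illustration of the comparison):

```python
def real_mode_address(segment, offset):
    # x86 real mode: (segment << 4) + offset, wrapping to 20 bits,
    # so two 16-bit words only reach 1MB of address space.
    return ((segment << 4) + offset) & 0xFFFFF

def full_width_address(hi, lo):
    # The same two 16-bit words could instead form a full 32-bit address.
    return (hi << 16) | lo
```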
I remember coming across a JS repo that had lots of minimum viable implementations.
It has simple processors, simple compilers etc.
I wish I could find it.
Not JS but there's Build Your Own X https://github.com/codecrafters-io/build-your-own-x
This is super cool. Thanks for posting it. I dream of being retired and being able to go through each one.
Wow this looks cool, I’m going to try writing a program. I have recently been learning about computer architecture (specifically older 8 bit CPUs) so when I read “There are no CPU registers” I was very intrigued.
Can someone explain to me the appeal of these virtual console projects? Is it like lock picking or the demoscene, in the sense that you need to figure out how to achieve impressive things under tight constraints?
I enjoy implementing these sorts of things from a spec. It's very rewarding to have other peoples' games boot up in your emulator.
I feel like it is the same form of intellectual stimulation that one gets from crosswords or sudoku but with an added creative dimension. Where crosswords and sudoku present a challenge with a fixed target goal, the demoscene and virtual consoles give you the difficulty with an open-ended goal.
Code golf is somewhere in the middle. Often you have a fixed goal, but there is no one 'right' solution; creativity can find many paths to achieve the desired output while reducing line or character count.
So I do:
Dweets https://beta.dwitter.net/u/lerc/top
My own fantasy console https://k8.fingswotidun.com/static/ide/?gist=ad96329670965dc...
A stack machine image generator limited to 50 characters of program code https://c50.fingswotidun.com/
Why do it? Why play with Lego? It's interesting and you get to see what you make.
The proliferation of fantasy consoles is simply because making a fantasy console appeals to the same aesthetic. See what you can make.
Purpose, my guess: Fun, joy of understanding, tinkering, learning and education.
Such projects can grow, and eventually somebody learning on this builds the next language or QEMU.
I like this. Thanks for sharing.
As for me, the lack of any permanent storage like a disk is a shame. The lack of a keyboard is a big limitation too.
You could memory-map additional device drivers at the end of memory as a way to extend it with arbitrary peripherals.
To keep things simple maybe just handle that using emulator save-states instead?
Worse is better
stack pointer? offset addressing mode?
I thought it was Excel.
Subleq can be done in a few lines of AWK; you could just use busybox instead of a huge Rust toolchain.
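Indeed, the whole machine is a few lines in any language. A minimal Python sketch (halting on a negative instruction pointer, which is one common convention):

```python
def subleq_step(mem, ip):
    # One instruction: subtract mem[a] from mem[b]; branch to c if the
    # result is <= 0, otherwise fall through to the next instruction.
    a, b, c = mem[ip], mem[ip + 1], mem[ip + 2]
    mem[b] -= mem[a]
    return c if mem[b] <= 0 else ip + 3

def subleq_run(mem):
    # Run until the instruction pointer goes negative.
    ip = 0
    while ip >= 0:
        ip = subleq_step(mem, ip)
    return mem
```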
The ISA is a bit complex for the simplest Virtual Computer. One Instruction computers are a thing: https://en.wikipedia.org/wiki/One-instruction_set_computer#I...
The readme mentions that creating the simplest possible computer wasn't the goal.
I had a similar thought. A better name would be SVC16: Simple Virtual Computer.