Pure Silicon Demo Coding: No CPU, No Memory, Just 4k Gates (a1k0n.net)
412 points by a1k0n a day ago | 70 comments
  • yoan9224 12 hours ago

    This is absolutely wild. Rendering graphics with just combinational logic and no frame buffer is the kind of constraint that breeds creativity.

    The HAKMEM sine/cosine generator is such an elegant choice - it's numerically stable in fixed-point and requires only adds and bit-shifts. Perfect for hardware. I used a similar approach once for generating test patterns in an FPGA.

    The fact that you can iterate on this in simulation, then deploy to actual silicon via Tiny Tapeout for $150 is honestly mind-blowing. We're living in the future.

    • tails4e 10 hours ago | parent

      How does this compare to CORDIC for sin/cos generation? Which is more accurate, etc.?

      • yoan9224 9 hours ago | parent

        Good question! CORDIC and HAKMEM Item 149 are both hardware-friendly, but have different trade-offs:

        CORDIC:
        - Iterative algorithm (needs multiple clock cycles)
        - Accuracy improves with more iterations
        - Generates both magnitude and phase
        - Typical hardware implementation: 12-16 iterations for decent precision

        HAKMEM (Item 149):
        - Single-cycle computation (just two adds per step)
        - Uses the recurrence: x' = x - εy, y' = y + εx' (the second update uses the new x)
        - Accuracy depends on word width and epsilon choice
        - Numerically stable in exact arithmetic if ε² < 2
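
        A rough Python sketch of the Item 149 recurrence, in case anyone wants to play with the stability claim (illustrative word size and a power-of-two epsilon, not necessarily what the chip does):

          import math

          SHIFT = 4          # epsilon = 1/16, so "multiply by eps" is just a right shift
          AMPL  = 1 << 10    # start the point at (1024, 0) in integer fixed point

          def hakmem_149(n_steps):
              x, y = AMPL, 0
              for _ in range(n_steps):
                  x -= y >> SHIFT    # x' = x - eps*y
                  y += x >> SHIFT    # y' = y + eps*x'  (note: uses the updated x)
                  yield x, y

          # The two shears have determinant 1, so the orbit never spirals in or
          # out; the radius stays close to the starting amplitude.
          radii = [math.hypot(x, y) for x, y in hakmem_149(100_000)]
          print(min(radii), max(radii))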

      • a1k0n 9 hours ago | parent

        CORDIC is more accurate, but takes as many iterations as you have bits of precision in your angle. Another demo called Warp in this contest used pipelined CORDIC to do atan2 on every pixel to create a tunnel, which is super impressive.

        https://www.youtube.com/watch?v=K9mu3getxhU&t=780s
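
        For anyone curious what that looks like, here's a toy vectoring-mode CORDIC atan2 in Python (floats and a hypothetical 16 iterations standing in for the shifts and pipeline stages) showing why iteration count tracks bits of angle precision:

          import math

          # atan(2^-i) for each micro-rotation; in hardware this is a tiny ROM.
          ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(16)]

          def cordic_atan2(y, x, iterations=16):
              """Vectoring mode: rotate (x, y) onto the +x axis with shift-and-add
              micro-rotations, accumulating the angle. Assumes x > 0; the quadrant
              fix-up is left out for brevity."""
              angle = 0.0
              for i in range(iterations):
                  d = 1 if y < 0 else -1       # always rotate toward y == 0
                  x, y = x - d * (y / (1 << i)), y + d * (x / (1 << i))
                  angle -= d * ATAN_TABLE[i]   # each pass adds roughly one bit
              return angle                     # the CORDIC gain only scales x and y

          print(cordic_atan2(3.0, 4.0), math.atan2(3.0, 4.0))  # ~0.6435 for both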

    • zahlman 9 hours ago | parent

      > The fact that you can iterate on this in simulation, then deploy to actual silicon via Tiny Tapeout for $150 is honestly mind-blowing. We're living in the future.

      It's really cool but it doesn't seem practical at all. They aren't setting up print runs, just one-offs (https://tinytapeout.com/faq/#how-many-chips-will-i-receive-c...) and $150 could get you... many orders of magnitude more power than that.

      ... For that matter, apparently the microcontroller in the dev kit is a https://en.wikipedia.org/wiki/RP2040 , which seems like a beast in comparison. And it's still available for less than $1 USD on PiShop.

      • immibis 7 hours ago | parent

        Tiny Tapeout's schtick is that for $150 you can get your chip design made at all. It's not a mass production run.

        Remind me to participate in the next one!

  • xphos a day ago

    As a computer science guy who interlopes in computer engineering, I really want to find time to build something cool like this and tape it out. The retro architectures for rendering are simple but fun! I love the project.

    • Neywiny a day ago | parent

      I recommend getting started like the author did: simulation first, then FPGA. Honestly FPGA will take you very far. I always get a kick out of being able to design my own SoC. "Hmmm I need 9 separate I2C ports... Ok, copy block, paste paste paste..." Or if you have an operation in software that's taking forever you can write an accelerator for it

      • checker659 13 hours ago | parent

        Do you know if there are any tutorials that use bounded model checking tools from the very get-go? For Verilog or VHDL.

      • 8f2ab37a-ed6c a day ago | parent

        What are the best modern tools to get started with in simulation for those who have never dabbled before?

        • Neywiny a day ago | parent

          I do the vast majority of my work on Xilinx and it's easiest to just use the built-in simulator. It's free and supports both VHDL and Verilog; most free simulators support just one. For Lattice and Microchip work I use whatever the tool provides, which is usually a cut-down ModelSim or something.

        • sehugg 17 hours ago | parent

          Try https://8bitworkshop.com/verilog to get started with dabbling

        • y1n0 21 hours ago | parent

          start here: https://github.com/YosysHQ/oss-cad-suite-build

        • sweetjuly a day ago | parent

          The other commenter mentioned Verilator (which is indispensable in larger designs) but you may also want to grab Icarus Verilog too. It's a FOSS simulator and, unlike Verilator, it models four-state logic, so it handles X ("unknown") and Z ("high impedance") signals. It's ridiculously slow compared to Verilator but the greater fidelity can be valuable depending on what you're trying to do.

        • vanjoe a day ago | parent

          Verilator is very good. It's faster than anything else, and it is free. The downsides are that it won't simulate encrypted IP blocks, and it doesn't do mixed-language sim, so VHDL is no bueno.

      • alfiedotwtf 15 hours ago | parent

        Are there any open, or at least standard, FPGAs that the open source community flocks to? Last time I looked into FPGAs, it was mostly closed architectures and proprietary tools.

        • Neywiny 14 hours ago | parent

          Not for anything mid to higher range, but I believe there's open source tooling for some of the older Lattice and Xilinx parts. I would say for me it's not as big a deal as on the software side, because each vendor's hardware tends to be pretty different from each other anyway.

          • alfiedotwtf 8 hours ago | parent

            Dang, sounds like there's still a bit of lock-in. That's a shame.

            • Neywiny 7 hours ago | parent

              I think there will always be vendor lock-in. The same way there have been architectural differences between Intel and AMD's x86, or even stuff like one specific chip/family tanking performance because one instruction was implemented differently, you won't be able to guarantee efficient utilization across different vendors/families.

              For example, I've taken code optimized for Xilinx, ran it for another vendor, and resource count ballooned because stuff that was built-in/free on one wasn't on the other. It's a lot of work to truly make generic code and usually just means switching out modules per vendor.

    • oofbey a day ago | parent

      It’s amazing and wonderful to see the Internet support these tiny cliques of interest. Having everybody connected leads to homogenization of culture in some ways, but it also supports these couple dozen (?) people around the world finding each other for this amazing little competition.

      • anonymous908213 a day ago | parent

           Having everybody connected leads to homogenization of culture in some ways
        
        The internet may hypothetically homogenize culture relative to a society that does not have any kind of mass communication at all, but relative to the world it was actually introduced into, the internet has completely balkanised the culture. Prior to the internet, we had television, cinema, literature, radio, and newspapers, which were all centralised and controlled enough that they created a shared monoculture in nations. A significant portion of a country's population would watch, read, and listen to the same media. The internet bucked that trend, allowing all kinds of new subcultures to pop up and to more easily cross national boundaries.

        • adrianN 20 hours ago | parent

          Then algorithms optimized content for addictiveness and we’re in a world where a large part of the world looks at the same set of „influencers“.

        • therein a day ago | parent

          Yeah, back in the day you would go to school the morning after a show that everyone watched released its new episode in the prime-time slot on the primary TV channel, and you'd discuss what happened in that episode, or pick up some references or new jokes. It created a common culture.

          • amarant a day ago | parent

            I remember those days. As the only kid in school who didn't watch Lost, those days sucked

  • RossBencina a day ago

    I was curious about the long-term stability of the cited HAKMEM sin/cos generator. I found an overview here: https://news.ycombinator.com/item?id=3111501 (EDIT: I'm still not sure about stability, apparently it is stable in exact arithmetic under certain conditions.) Coincidentally it is related to the Verlet integration video I posted last week: https://news.ycombinator.com/item?id=46253592

    • a1k0n a day ago | parent

      Yeah, it is exact in this specific circumstance. But yes, it's exactly the same trick; I also enjoyed that video in my Youtube recommender feed last week!

  • intalentive a day ago

    I like how the grid pulses with the kick drum. Nice touch.

  • xecaz 17 hours ago

    Wow, nice work!! Coming from demo/intro coding where you have memory and a driver for audio (x86), this is very impressive.

  • glimshe a day ago

    Reminds me of college: "Hardware and Software are logically equivalent"

    • amelius a day ago | parent

      Writing hardware is like writing software except parallelism is way cheaper, but mistakes are way more expensive.

      • lucyjojo a day ago | parent

        that doesn't seem like a good tradeoff...

        • Joel_Mckay 14 hours ago | parent

          Hardware takes 20 years to learn how to build properly.

          Software takes 1 year under someone smart in a production environment.

          People who conflate the two... longer, or more likely never... =3

          • jacquesm 12 hours ago | parent

            > Software takes 1 year under someone smart in a production environment.

            That's very funny.

            • Joel_Mckay 4 minutes ago | parent

              Be honest, most software people find utility in artifacts that are mysterious black boxes behind an emulated abstraction.

              For most of their careers they have no idea "why" chips were designed and built a certain way, nor do they need that information to work within abstract domains.

              In many ways, vibe-coders are the absurd optimization of a naive trajectory toward zero workmanship standards. =3

              https://en.wikipedia.org/wiki/Five_stages_of_grief

    • Joel_Mckay 14 hours ago | parent

      Whoever said that was mistaken... or worked at Intel. lol =3

      https://en.wikipedia.org/wiki/Metastability

      https://en.wikipedia.org/wiki/Clock_domain_crossing

  • Archit3ch a day ago

    I'm tempted to put together an FPAA with Tiny Tapeout, but it likely won't fit in the allocated area.

    • Taniwha a day ago | parent

      TT allows you to pay more and build multi-block designs

    • Joel_Mckay 14 hours ago | parent

      Check the switching speed specification, and shared i/o bank configuration.

      The project has a narrow scope of use-cases. =3

      • Archit3ch 13 hours ago | parent

        Switching speed: should be good enough for audio in the kHz range, even for off-chip control.

        Analog i/o pins: definitely limited, even if you purchase the highest option available (6).

  • datameta 21 hours ago

    Very impressive stuff. I used to frequent the JS demoscene, mostly dwitter - but this is on a whole other level.

    Oh shit, this prompted me to check, and it turns out TinyTapeout has come back to life! https://tinytapeout.com/

  • openinfrared a day ago

    Really cool!

  • idiotsecant a day ago

    "No x, no y, just z" is a pattern so often used by ChatGPT that it has started to bleed into common usage by people who maybe aren't even using an LLM.

    • layer8 a day ago | parent

      Or maybe ChatGPT picked it up from common usage.

      • idiotsecant a day ago | parent

        It was used occasionally before ChatGPT, but it has exploded since then.

        • immibis 7 hours ago | parent

          Apparently ChatGPT speaks like lower-class Kenyans. You can guess why.

    • anthomtb 9 hours ago | parent

      For everyone who is as dumb as I am, the comment pertains to the title.

      x = CPU, y = Memory, z = 4k gates

    • peddling-brink a day ago | parent

      Language is fluid. This is ok.

      There are many bad things about LLMs, but a benign shift in popular language usage isn't one of them.

      • mschuster91 a day ago | parent

        > There are many bad things about LLMs, but a benign shift in popular language usage isn't one of them.

        Organic shifts in language are fine. What is not fine is Big Money (which most forms of AI are) manipulating society at large - and that's not just the AI companies' doing. Think of TikTok leading people to say "unalive" instead of the various plain words used before (e.g. kill, murder, executed, run over by car, mauled to death by animal).

      • idiotsecant a day ago | parent

        I disagree. It's a sign of what is essentially cultural contamination by an LLM. There is something vaguely gross about it, like when people start repeating advertising slogans. It's a sign that someone spent enough money that they directly rewired our brains.

        • peddling-brink a day ago | parent

          Do you get grossed out when you step on a linoleum floor? Or ride an escalator? Or drink out of a thermos?

          Culture contaminates.

        • Marazan a day ago | parent

          > like when people start repeating advertising slogans

          but without the craft of a good advertising slogan. So worse!

          • attila-lendvai a day ago | parent

            ...and way more centralized and powerful.

    • fsckboy a day ago | parent

      When I was running for 5th grade class president a number of decades ago, my campaign sign slogan was a "no x, no y, just z" snowclone.

  • BoredPositron a day ago

    Reminds me of the time we repaired old pinball machines in trade school. Good times.

  • startupsfail a day ago

    Wow, I'm looking at the current "Open Shuttles": a license to use 4KB of SRAM in a project is $2500. But it comes with a Wishbone bus interface!

    > 1024x32 Commercial SRAM
    > CF_SRAM_1024x32
    > Commercial SRAM: 1024 words x 32 bits (4KB) with Wishbone Bus interface
    > Area: 0.17mm²
    > GPIOs: 0
    > License: Commercial - $2500 per project

  • Dwedit a day ago

    If you have registers, it's not "no memory".

    • hackernudes a day ago | parent

      If you have flip flops, it's not "no memory".

      If you have a ROM, it's not "no memory".

      Needlessly pedantic!

        I thought this was pretty cool but the first video didn't play. All this write-up and I really just want to see the damn demo in action first! (Edit: reloaded the page and it worked. I'd still like to see it on real hardware!)

      • a1k0n a day ago | parent

        Ah that's what I get for self hosting. What browser?

        https://youtu.be/7xPS-0nydms

        • a1k0n a day ago | parent

          And this thread shows all of them on real hardware: https://x.com/i/status/1992802154370011595

      • jayd16 a day ago | parent

        I don't know. Analog signal processing is clearly less "memory" than a register, no? So a line exists somewhere, and I think it's well before "no RAM".

        • RossBencina a day ago | parent

          > Analog signal processing is clearly less memory than a register, no?

          You are going to have a hard time doing analog signal processing with memoryless elements. In the linear domain all you can do is apply gain and mix signals together. If you work with memoryless nonlinearities you can do waveshaping, which is generally only useful when applied to special signals (e.g. sine waves).

          Any time you want to do frequency-dependent behavior (filtering, oscillation) you need energy-storing elements, usually capacitors, sometimes inductors. A capacitor is just like a register: it stores charge; similarly, an inductor stores energy in its magnetic field. Needless to say these devices are not memoryless. In fact, since the quantity that they remember is a continuous variable, they store a lot of information.
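
          To make that concrete, here is a minimal discrete-time sketch (my own toy numbers, nothing from the article): a one-pole RC low-pass where the single state variable plays the role of the capacitor voltage. Delete that one remembered value and all you can express is memoryless gain and mixing.

            def rc_lowpass(samples, alpha=0.1):
                """One-pole smoother; alpha = dt / (R*C + dt) for some illustrative
                R, C and sample period dt."""
                v_cap = 0.0                          # the circuit's one word of "memory"
                out = []
                for v_in in samples:
                    v_cap += alpha * (v_in - v_cap)  # capacitor charges toward the input
                    out.append(v_cap)
                return out

            step = [0.0] * 4 + [1.0] * 12
            print(rc_lowpass(step))                  # exponential rise, not an instant step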

        • ErroneousBosh a day ago | parent

          > Analog signal processing is clearly less memory than a register, no?

          Bucket-brigade delay lines?

          • jayd16 a day ago | parent

            I'm not saying every analog signal processor is surely memory free, simply that you can imagine one that is.

            But I'm not really familiar with what that is.

            • ErroneousBosh a day ago | parent

              They're a kind of analogue dynamic memory. I'd hesitate to call them RAM because the Access is not Random, but they are a kind of shift register and early computers used those for RAM.

              Imagine a pair of MOSFETs connected to a pair of capacitors, and a bunch of those joined together in a chain. The gates of the first MOSFET in every pair are tied together, and likewise the second ones, giving you a "left" and a "right" clock input.

              When you put a signal in and pulse the "left" and "right" inputs alternately, it'll store the signal voltage in one capacitor, then pass it off to the next capacitor in turn, like old-timey firefighters handing buckets of water down a line of people.

              They used to use this for delaying audio signals before digital memory and analogue-to-digital conversion were cheap enough to use.
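
              A quick behavioural sketch of that bucket brigade in Python (leakage, clock feedthrough and all the real analogue messiness ignored): each list element is one capacitor's stored voltage, and every tick of the two-phase clock the chain hands its bucket one stage along.

                def bucket_brigade(signal, stages=4):
                    buckets = [0.0] * stages               # one capacitor voltage per stage
                    out = []
                    for sample in signal:
                        out.append(buckets[-1])            # the last stage drives the output
                        buckets = [sample] + buckets[:-1]  # everything shifts one stage over
                    return out

                # The input reappears `stages` clock ticks later: a pure analogue delay.
                print(bucket_brigade([1, 2, 3, 4, 5, 6, 7, 8]))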

              • fsckboy a day ago | parent

                Bucket brigades were also used to read out large-scale sensors like CCD cameras. They are more efficient in their use of die space because you need fewer data paths; they don't need to be digital either, since each bucket can be analog for "grey" scale.

      • fsckboy a day ago | parent

        >Needlessly pedantic!

        if you have pedantry, it's also not "no memory"

    • jonathrg a day ago | parent

      And I better not see any capacitors on there remembering any charge!

    • layer8 a day ago | parent

      Even simple wires can be memory: https://en.wikipedia.org/wiki/Delay-line_memory#Electric_del...

  • fsckboy a day ago

    >Pure Silicon Demo Coding: No CPU, No Memory, Just 4k Gates

    ok, but silicon is doped so it's slightly impure, and CPUs are also silicon and memory is also silicon.

    you actually meant "4K gates, no clock, no synchronization, no timing" and maybe a little "not exactly sure when the output is rea... is rea... is ready"

    • chrisjj 12 hours ago | parent

      There is sync and there is timing. Else there'd be no meaningful image.