Pure Silicon Demo Coding: No CPU, No Memory, Just 4k Gates (a1k0n.net)
412 points by a1k0n a day ago | 70 comments
  • yoan9224 12 hours ago

    This is absolutely wild. Rendering graphics with just combinational logic and no frame buffer is the kind of constraint that breeds creativity.

    The HAKMEM sine/cosine generator is such an elegant choice - it's numerically stable in fixed-point and requires only adds and bit-shifts. Perfect for hardware. I used a similar approach once for generating test patterns in an FPGA.

    The fact that you can iterate on this in simulation, then deploy to actual silicon via Tiny Tapeout for $150 is honestly mind-blowing. We're living in the future.

    • tails4e 10 hours ago | parent

      How does this compare to CORDIC for sin/cos generation? Which is more accurate, etc.?

      • yoan9224 9 hours ago | parent

        Good question! CORDIC and HAKMEM Item 149 are both hardware-friendly, but have different trade-offs:

        CORDIC:
        - Iterative algorithm (needs multiple clock cycles)
        - Accuracy improves with more iterations
        - Generates both magnitude and phase
        - Typical hardware implementation: 12-16 iterations for decent precision

        HAKMEM (Item 149):
        - Single-cycle computation (just two adds per step)
        - Uses the recurrence: x' = x - εy, y' = y + εx' (the second update uses the new x)
        - Accuracy depends on word width and epsilon choice
        - Numerically stable in exact arithmetic if ε² < 2
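
        A rough Python sketch of the Item 149 recurrence, in case anyone wants to play with the stability claim (illustrative word size and a power-of-two epsilon, not necessarily what the chip does):

          import math

          SHIFT = 4          # epsilon = 1/16, so "multiply by eps" is just a right shift
          AMPL  = 1 << 10    # start the point at (1024, 0) in integer fixed point

          def hakmem_149(n_steps):
              x, y = AMPL, 0
              for _ in range(n_steps):
                  x -= y >> SHIFT    # x' = x - eps*y
                  y += x >> SHIFT    # y' = y + eps*x'  (note: uses the updated x)
                  yield x, y

          # The two shears have determinant 1, so the orbit never spirals in or
          # out; the radius stays close to the starting amplitude.
          radii = [math.hypot(x, y) for x, y in hakmem_149(100_000)]
          print(min(radii), max(radii))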

      • a1k0n 9 hours ago | parent

        CORDIC is more accurate, but takes as many iterations as you have bits of precision in your angle. Another demo called Warp in this contest used pipelined CORDIC to do atan2 on every pixel to create a tunnel, which is super impressive.

        https://www.youtube.com/watch?v=K9mu3getxhU&t=780s
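
        For anyone curious what that looks like, here's a toy vectoring-mode CORDIC atan2 in Python (floats and a hypothetical 16 iterations standing in for the shifts and pipeline stages) showing why iteration count tracks bits of angle precision:

          import math

          # atan(2^-i) for each micro-rotation; in hardware this is a tiny ROM.
          ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(16)]

          def cordic_atan2(y, x, iterations=16):
              """Vectoring mode: rotate (x, y) onto the +x axis with shift-and-add
              micro-rotations, accumulating the angle. Assumes x > 0; the quadrant
              fix-up is left out for brevity."""
              angle = 0.0
              for i in range(iterations):
                  d = 1 if y < 0 else -1       # always rotate toward y == 0
                  x, y = x - d * (y / (1 << i)), y + d * (x / (1 << i))
                  angle -= d * ATAN_TABLE[i]   # each pass adds roughly one bit
              return angle                     # the CORDIC gain only scales x and y

          print(cordic_atan2(3.0, 4.0), math.atan2(3.0, 4.0))  # ~0.6435 for both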

    • zahlman 9 hours ago | parent

      > The fact that you can iterate on this in simulation, then deploy to actual silicon via Tiny Tapeout for $150 is honestly mind-blowing. We're living in the future.

      It's really cool but it doesn't seem practical at all. They aren't setting up print runs, just one-offs (https://tinytapeout.com/faq/#how-many-chips-will-i-receive-c...) and $150 could get you... many orders of magnitude more power than that.

      ... For that matter, apparently the microcontroller in the dev kit is a https://en.wikipedia.org/wiki/RP2040 , which seems like a beast in comparison. And it's still available for less than $1 USD on PiShop.

      • immibis 7 hours ago | parent

        Tiny Tapeout's schtick is that for $150 you can get your chip design made at all. It's not a mass production run.

        Remind me to participate in the next one!

  • xphos a day ago

    As a computer science guy who interlopes in computer engineering, I really want to find time to build something cool like this and tape it out. The retro architectures for rendering are simple but fun! I love the project.

    • Neywiny a day ago | parent

      I recommend getting started like the author did: simulation first, then FPGA. Honestly FPGA will take you very far. I always get a kick out of being able to design my own SoC. "Hmmm I need 9 separate I2C ports... Ok, copy block, paste paste paste..." Or if you have an operation in software that's taking forever you can write an accelerator for it

      • checker659 13 hours ago | parent

        Do you know if there are any tutorials that use bounded model checking tools from the very get-go? For Verilog or VHDL.

      • 8f2ab37a-ed6c a day ago | parent

        What are the best modern tools to get started with in simulation for those who have never dabbled before?

        • Neywiny a day ago | parent

          I do the vast majority of my work on Xilinx and it's easiest to just use the built-in simulator. It's free and supports both VHDL and Verilog; most free simulators support just one. For Lattice and Microchip work I use whatever the tool provides, which is usually a cut-down ModelSim or something.

        • sehugg 17 hours ago | parent

          Try https://8bitworkshop.com/verilog to get started with dabbling

        • y1n0 21 hours ago | parent

          start here: https://github.com/YosysHQ/oss-cad-suite-build

        • sweetjuly a day ago | parent

          The other commenter mentioned Verilator (which is indispensable in larger designs) but you may also want to grab Icarus Verilog too. It's a FOSS simulator and, unlike Verilator, it models four-state logic, so it handles X ("unknown") and Z ("high impedance") signals. It's ridiculously slow compared to Verilator but the greater fidelity can be valuable depending on what you're trying to do.

        • vanjoe a day ago | parent

          Verilator is very good. It's faster than anything else, and it is free. The downsides are that it won't simulate encrypted IP blocks, and it doesn't do mixed-language sim, so VHDL is no bueno.

      • alfiedotwtf 15 hours ago | parent

        Are there any open, or at least standard, FPGAs that the open source community flocks to? Last time I looked into FPGAs, it was mostly closed architectures and proprietary tools.

        • Neywiny 14 hours ago | parent

          Not for anything mid to higher range, but I believe there's open source tooling for some of the older Lattice and Xilinx parts. I would say for me it's not as big a deal as on the software side, because each vendor's hardware tends to be pretty different from each other anyway.

          • alfiedotwtf 8 hours ago | parent

            Dang, sounds like there's still a bit of lock-in. That's a shame.

            • Neywiny 7 hours ago | parent

              I think there will always be vendor lock-in. The same way there have been architectural differences between Intel and AMD's x86, or even stuff like one specific chip/family tanking performance because one instruction was implemented differently, you won't be able to guarantee efficient utilization across different vendors/families.

              For example, I've taken code optimized for Xilinx, ran it for another vendor, and resource count ballooned because stuff that was built-in/free on one wasn't on the other. It's a lot of work to truly make generic code and usually just means switching out modules per vendor.

    • oofbey a day ago | parent

      It’s amazing and wonderful to see the Internet support these tiny cliques of interest. Having everybody connected leads to homogenization of culture in some ways, but it also supports these couple dozen (?) people around the world finding each other for this amazing little competition.

      • anonymous908213 a day ago | parent

           Having everybody connected leads to homogenization of culture in some ways
        
        The internet may hypothetically homogenize culture relative to a society that does not have any kind of mass communication at all, but relative to the world it was actually introduced into, the internet has completely balkanised the culture. Prior to the internet, we had television, cinema, literature, radio, and newspapers, which were all centralised and controlled enough that they created a shared monoculture in nations. A significant portion of a country's population would watch, read, and listen to the same media. The internet bucked that trend, allowing all kinds of new subcultures to pop up and to more easily cross national boundaries.

        • adrianN 20 hours ago | parent

          Then algorithms optimized content for addictiveness and we’re in a world where a large part of the world looks at the same set of „influencers“.

        • therein a day ago | parent

          Yeah, back in the day you would go to school the morning after a show that everyone watched released its new episode in the prime-time slot on the primary TV channel, and you'd discuss what happened in that episode, or pick up some references or new jokes. It created a common culture.

          • amarant a day ago | parent

            I remember those days. As the only kid in school who didn't watch Lost, those days sucked

  • RossBencina a day ago

    I was curious about the long-term stability of the cited HAKMEM sin/cos generator. I found an overview here: https://news.ycombinator.com/item?id=3111501 (EDIT: I'm still not sure about stability, apparently it is stable in exact arithmetic under certain conditions.) Coincidentally it is related to the Verlet integration video I posted last week: https://news.ycombinator.com/item?id=46253592

    • a1k0n a day ago | parent

      Yeah, it is exact in this specific circumstance. But yes, it's exactly the same trick; I also enjoyed that video in my Youtube recommender feed last week!

  • intalentive a day ago

    I like how the grid pulses with the kick drum. Nice touch.

  • xecaz 17 hours ago

    Wow, nice work!! Coming from demo/intro coding where you have memory and a driver for audio (x86), this is very impressive.

  • glimshe a day ago

    Reminds me of college: "Hardware and Software are logically equivalent"

    • amelius a day ago | parent

      Writing hardware is like writing software except parallelism is way cheaper, but mistakes are way more expensive.

      • lucyjojo a day ago | parent

        that doesn't seem like a good tradeoff...

        • Joel_Mckay 14 hours ago | parent

          Hardware takes 20 years to learn how to build properly.

          Software takes 1 year under someone smart in a production environment.

          People who conflate the two... longer, or more likely never... =3

          • jacquesm 12 hours ago | parent

            > Software takes 1 year under someone smart in a production environment.

            That's very funny.

            • Joel_Mckay 4 minutes ago | parent

              Be honest, most software people find utility in artifacts that are mysterious black boxes behind an emulated abstraction.

              For most of their careers they have no idea "why" chips were designed and built a certain way, nor do they need that information to work within abstract domains.

              In many ways, vibe-coders are the absurd optimization of a naive trajectory toward zero workmanship standards. =3

              https://en.wikipedia.org/wiki/Five_stages_of_grief

    • Joel_Mckay 14 hours ago | parent

      Whoever said that was mistaken... or worked at Intel. lol =3

      https://en.wikipedia.org/wiki/Metastability

      https://en.wikipedia.org/wiki/Clock_domain_crossing

  • Archit3ch a day ago

    I'm tempted to put together an FPAA with Tiny Tapeout, but it likely won't fit in the allocated area.

    • Taniwha a day ago | parent

      TT allows you to pay more and build multi-block designs

    • Joel_Mckay 14 hours ago | parent

      Check the switching speed specification, and shared i/o bank configuration.

      The project has a narrow scope of use-cases. =3

      • Archit3ch 13 hours ago | parent

        Switching speed: should be good enough for audio in the kHz range, even for off-chip control.

        Analog i/o pins: definitely limited, even if you purchase the highest option available (6).

  • datameta 21 hours ago

    Very impressive stuff. I used to frequent the JS demoscene, mostly dwitter - but this is on a whole other level.

    Oh shit, this prompted me to check, and it turns out TinyTapeout has come back to life! https://tinytapeout.com/

  • openinfrared a day ago

    Really cool!

  • idiotsecant a day ago

    "No x, no y, just z" is a pattern so often used by ChatGPT that it has started to bleed into common usage by people who maybe aren't even using an LLM.

    • layer8 a day ago | parent

      Or maybe ChatGPT picked it up from common usage.

      • idiotsecant a day ago | parent

        It was used occasionally before ChatGPT, but it has exploded since then.

        • immibis 7 hours ago | parent

          Apparently ChatGPT speaks like lower-class Kenyans. You can guess why.

    • anthomtb 9 hours ago | parent

      For everyone who is as dumb as I am, the comment pertains to the title.

      x = CPU, y = Memory, z = 4k gates

    • peddling-brink a day ago | parent

      Language is fluid. This is ok.

      There are many bad things about LLMs, but a benign shift in popular language usage isn't one of them.

      • mschuster91 a day ago | parent

        > There are many bad things about LLMs, but a benign shift in popular language usage isn't one of them.

        Organic shifts in language are fine. What is not fine is Big Money (which most forms of AI are) manipulating society at large - and that's not just the AI companies' doing. Think of TikTok leading people to say "unalive" instead of the various plain words used before (e.g. kill, murder, executed, run over by car, mauled to death by animal).

      • idiotsecant a day ago | parent

        I disagree. It's a sign of what is essentially cultural contamination by an LLM. There is something vaguely gross about it, like when people start repeating advertising slogans. It's a sign that someone spent enough money that they directly rewired our brains.

        • peddling-brink a day ago | parent

          Do you get grossed out when you step on a linoleum floor? Or ride an escalator? Or drink out of a thermos?

          Culture contaminates.

        • Marazan a day ago | parent

          > like when people start repeating advertising slogans

          but without the craft of a good advertising slogan. So worse!

          • attila-lendvai a day ago | parent

            ...and way more centralized and powerful.

    • fsckboy a day ago | parent

      When I was running for 5th grade class president a number of decades ago, my campaign sign slogan was a "no x, no y, just z" snowclone.

  • BoredPositron a day ago

    Reminds me of the time we repaired old pinball machines in trade school. Good times.

  • startupsfail a day ago

    Wow, I'm looking at the current "Open Shuttles": a license to use 4KB of SRAM in a project is $2500. But it comes with a Wishbone bus interface!

    > 1024x32 Commercial SRAM
    > CF_SRAM_1024x32
    > Commercial SRAM: 1024 words x 32 bits (4KB) with Wishbone Bus interface
    > Area: 0.17mm²
    > GPIOs: 0
    > License: Commercial - $2500 per project

  • Dwedit a day ago

    If you have registers, it's not "no memory".

    • hackernudes a day ago | parent

      If you have flip flops, it's not "no memory".

      If you have a ROM, it's not "no memory".

      Needlessly pedantic!

        I thought this was pretty cool but the first video didn't play. All this write-up and I really just want to see the damn demo in action first! (Edit: reloaded the page and it worked. I'd still like to see it on real hardware!)

      • a1k0n a day ago | parent

        Ah that's what I get for self hosting. What browser?

        https://youtu.be/7xPS-0nydms

        • a1k0n a day ago | parent

          And this thread shows all of them on real hardware: https://x.com/i/status/1992802154370011595

      • jayd16 a day ago | parent

        I don't know. Analog signal processing is clearly less "memory" than a register, no? So a line exists somewhere, and I think it's well before "no RAM".

        • RossBencina a day ago | parent

          > Analog signal processing is clearly less memory than a register, no?

          You are going to have a hard time doing analog signal processing with memoryless elements. In the linear domain all you can do is apply gain and mix signals together. If you work with memoryless nonlinearities you can do waveshaping, which is generally only useful when applied to special signals (e.g. sine waves).

          Any time you want to do frequency-dependent behavior (filtering, oscillation) you need energy-storing elements, usually capacitors, sometimes inductors. A capacitor is just like a register: it stores charge; similarly, an inductor stores energy in its magnetic field. Needless to say these devices are not memoryless. In fact, since the quantity that they remember is a continuous variable, they store a lot of information.
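
          To make that concrete, here is a minimal discrete-time sketch (my own toy numbers, nothing from the article): a one-pole RC low-pass where the single state variable plays the role of the capacitor voltage. Delete that one remembered value and all you can express is memoryless gain and mixing.

            def rc_lowpass(samples, alpha=0.1):
                """One-pole smoother; alpha = dt / (R*C + dt) for some illustrative
                R, C and sample period dt."""
                v_cap = 0.0                          # the circuit's one word of "memory"
                out = []
                for v_in in samples:
                    v_cap += alpha * (v_in - v_cap)  # capacitor charges toward the input
                    out.append(v_cap)
                return out

            step = [0.0] * 4 + [1.0] * 12
            print(rc_lowpass(step))                  # exponential rise, not an instant step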

        • ErroneousBosh a day ago | parent

          > Analog signal processing is clearly less memory than a register, no?

          Bucket-brigade delay lines?

          • jayd16 a day ago | parent

            I'm not saying every analog signal processor is surely memory free, simply that you can imagine one that is.

            But I'm not really familiar with what that is.

            • ErroneousBosh a day ago | parent

              They're a kind of analogue dynamic memory. I'd hesitate to call them RAM because the Access is not Random, but they are a kind of shift register and early computers used those for RAM.

              Imagine a pair of MOSFETs connected to a pair of capacitors, and a bunch of those joined together in a chain. The gates of the first MOSFET in every pair are tied together, and likewise the second ones, giving you a "left" and a "right" clock input.

              When you put a signal in and pulse the "left" and "right" inputs alternately, it'll store the signal voltage in one capacitor, then pass it off to the next capacitor in turn, like old-timey firefighters handing buckets of water down a line of people.

              They used to use this for delaying audio signals before digital memory and analogue-to-digital conversion were cheap enough to use.
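
              A quick behavioural sketch of that bucket brigade in Python (leakage, clock feedthrough and all the real analogue messiness ignored): each list element is one capacitor's stored voltage, and every tick of the two-phase clock the chain hands its bucket one stage along.

                def bucket_brigade(signal, stages=4):
                    buckets = [0.0] * stages               # one capacitor voltage per stage
                    out = []
                    for sample in signal:
                        out.append(buckets[-1])            # the last stage drives the output
                        buckets = [sample] + buckets[:-1]  # everything shifts one stage over
                    return out

                # The input reappears `stages` clock ticks later: a pure analogue delay.
                print(bucket_brigade([1, 2, 3, 4, 5, 6, 7, 8]))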

              • fsckboy a day ago | parent

                Bucket brigades were also used to read out large-scale sensors like CCD cameras. They are more efficient in their use of die space because you need fewer data paths; they don't need to be digital either, since each bucket can be analog for "grey" scale.

      • fsckboy a day ago | parent

        >Needlessly pedantic!

        if you have pedantry, it's also not "no memory"

    • jonathrg a day ago | parent

      And I better not see any capacitors on there remembering any charge!

    • layer8 a day ago | parent

      Even simple wires can be memory: https://en.wikipedia.org/wiki/Delay-line_memory#Electric_del...

  • fsckboy a day ago

    >Pure Silicon Demo Coding: No CPU, No Memory, Just 4k Gates

    ok, but silicon is doped so it's slightly impure, and CPUs are also silicon and memory is also silicon.

    you actually meant "4K gates, no clock, no synchronization, no timing" and maybe a little "not exactly sure when the output is rea... is rea... is ready"

    • chrisjj 12 hours ago | parent

      There is sync and there is timing. Else there'd be no meaningful image.