Arborium: Tree-sitter code highlighting with Native and WASM targets (arborium.bearcove.eu)
216 points by zdw 21 hours ago | 42 comments
  • joshka18 hours ago

    https://fasterthanli.me/articles/my-gift-to-the-rust-docs-te... is a better link (the author's article about this rather than the artifact produced)

    • brabel17 hours ago |parent

      What a genius this dev is. Just a few weeks ago I tried to do something very similar for my blog, but quickly gave up as it’s not easy to do! Kudos to the author, they did an awesome job and went beyond by actually fixing up even the grammars and highlighting queries so it all works perfectly!

      • solarkraft14 hours ago |parent

        I agree! Not only do they write engaging blog posts, they also produce engaging videos!

        https://youtube.com/@fasterthanlime

        • falcor8411 hours ago |parent

          I suppose this isn't the case here, but found it funny how these 3 successive positive comments read almost exactly like those bot comment chains that I see on other platforms trying to pull people into crypto scams.

          • tombh11 hours ago |parent

            Some people just genuinely deserve praise.

          • brabel7 hours ago |parent

            Haha, I confess I did sound like a bot... but sorry to disappoint, I am human!

            • psychoslave5 hours ago |parent

              That's what a bot would typically say.

              On this side, I'm not a bot, and fortunately even in 2025 nobody on the internet knows that you are a dog.

  • ComputerGuru6 hours ago

    Completely appalled to learn that docs.rs lets you inject any html/css/js you want into the live site (on pages documenting your crate). I love the flexibility but shudder at the security hole the size of, oh, I don’t know, the Grand Canyon.

    It’s not a new discovery, I just didn’t know docs.rs (intentionally) wasn’t blocking this. Cf https://docs.rs/pwnies/0.0.13/pwnies/

    (This all makes more sense if you read the blog post instead of the direct arborium link: https://fasterthanli.me/articles/my-gift-to-the-rust-docs-te...)

  • catapart7 hours ago

    Daaang! On the one hand, as someone developing a code-example custom element[0] that includes highlighting, this is kind of a "so close, but not quite" situation where I really wish this was something I could use for that, but it's probably too heavy to ship around for something so "simple" (as far as users' default expectation; not implying that highlighting text is simple).

    But then, on the other hand, I had given up on a scratch code editor for a game development project I'm working on, and just loosely wrapped up the monaco editor which I'm guessing is going to be pretty bare when I actually get around to trying to code with it, in browser (I'm aware that it is robust, but from what I gather, a lot of its features come from third-party dev as a way to keep the core functionality pure). Given that I want to be able to script in C# (aside from just js/ts), I was sure I was going to have to figure something complicated out.

    But, honestly, I think this solves all of my most concerning issues! What a sweet little library!

    [0] https://magnitce-code-example-e81613.gitlab.io/ (please excuse the unfinished-ness; I'm working on a JSDoc-to-documentation library that automates the documentation for me so there are minor issues, like the install text not changing on selection)

  • mg18 hours ago

    I'm currently building an online 3D-Editor that supports OpenSCAD and Python as the input language.

    The ease of use to highlight static text via Arborium seems nice:

        <script src="arborium.iife.js"></script>
        <pre><code class="language-python">
            def hello(name):
                print("Hello " + name);
        </code></pre>
    
    But does it support editing highlighted text? If not, one would have to do some trickery by hiding a textarea and updating the <code> element on each keypress, I guess. Which probably has a thousand corner cases one would have to deal with.

    And how would one add SCAD support?

    • debazel14 hours ago |parent

      The example on their website is editable and it looks like they overlay the highlighted output on top of the textarea with `pointer-events: none` like you mentioned.

      The code isn't minified so you can see how they do it by looking at the `doHighlight()` function here https://arborium.bearcove.eu/pkg/app.generated.js
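
      For anyone curious what the overlay approach boils down to, here is a minimal sketch. It is not Arborium's actual `doHighlight()` code: the `Token` shape and the `renderOverlay` name are made up for illustration, and it assumes some tokenizer has already produced sorted, non-overlapping tokens.

```typescript
// Hypothetical token shape (an assumption, not Arborium's API):
// a half-open [start, end) range in the source plus a CSS class name.
interface Token {
  start: number;
  end: number;
  cls: string;
}

// Escape the characters that are significant in HTML text content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the markup for the overlay layer. Tokens are assumed to be sorted
// and non-overlapping; text between tokens is emitted unstyled.
function renderOverlay(source: string, tokens: Token[]): string {
  let html = "";
  let pos = 0;
  for (const t of tokens) {
    html += escapeHtml(source.slice(pos, t.start));
    html += `<span class="${t.cls}">${escapeHtml(source.slice(t.start, t.end))}</span>`;
    pos = t.end;
  }
  return html + escapeHtml(source.slice(pos));
}
```

      In a browser you would presumably call this from the textarea's `input` handler and assign the result to the overlay element, which sits on top of the textarea with `pointer-events: none` so clicks and typing still reach the textarea.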

      • mg10 hours ago |parent

        Oh, you are right!

        Hmm .. and the approach already shows its weaknesses when I play with it: When I search for something on the page, it gives me twice as many hits as there are. And jumps around two times to each hit when I use the "next" button.

        I wonder if that is fixable.

        • debazel7 hours ago |parent

          There is a neat `inert` html attribute you can use to disable all interactions as well as hide the text from ctrl+f searches. (Sadly Safari is the weird one out, and does not exclude the content from searches.)

          https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...

        • knallfrosch7 hours ago |parent

          One simply needs the Highlight API. I held back, but now even Firefox ESR supports it.

          https://developer.mozilla.org/en-US/docs/Web/API/Highlight

          All the trickery vanishes and you get first-class CSS support.

          • bakkoting3 hours ago |parent

            And there's an open issue for that already: https://github.com/bearcove/arborium/issues/62

        • fsfod5 hours ago |parent

          GitHub had to solve the same problem when speeding up their code viewer.

          https://github.blog/engineering/architecture-optimization/cr...

    • metmac12 hours ago |parent

      I’m now just curious about your project

      • mg10 hours ago |parent

        Give me a few more weeks and I will probably have something online. You can find me on social media or feel free to connect via email.

  • aarol15 hours ago

    I've been toying around with tree-sitter and have seen the potential for a proper, non-regex based highlighter. It's really powerful because it actually parses the text into an AST. With the AST it's possible for example to make variables the same colour everywhere. A function passed as a parameter could be highlighted as a function even in the parameter list.

    I'm happy to see that tree sitter highlighting on the web is finally a thing. This seems really solid although the bundle size is a lot.
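
    To illustrate the "same colour everywhere" idea: once a parser tells you which spans are identifiers, the colour can be derived from the name alone. This is only a rough sketch under that assumption. The palette, `hashName`, and `colourFor` are invented for the example and have nothing to do with tree-sitter's or Arborium's APIs.

```typescript
// A small illustrative palette; the colour values are arbitrary.
const PALETTE: string[] = ["#e06c75", "#98c379", "#61afef", "#c678dd", "#d19a66"];

// Deterministic FNV-1a-style hash over UTF-16 code units, so a given
// identifier name always maps to the same unsigned 32-bit number.
function hashName(name: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < name.length; i++) {
    h ^= name.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Pick a palette slot from the hash. Every occurrence of the same name
// gets the same colour, wherever it appears in the file.
function colourFor(name: string): string {
  return PALETTE[hashName(name) % PALETTE.length];
}
```

    A regex highlighter can't reliably do this because it doesn't know which words are identifiers; with a syntax tree you would call something like `colourFor` on each identifier node's text.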

  • danielvaughn10 hours ago

    If you haven't tried building a grammar with tree-sitter, I highly recommend you do so. It's incredibly fun once you get into a flow state. The docs call it a zen-like experience, and that's a perfect way to describe it. It's so, so good.

  • jasonjmcghee18 hours ago

    The sponsorships achieved by the author are admirable - they really make a lot of valuable oss.

    https://github.com/sponsors/fasterthanlime

  • pornel7 hours ago

    There's also a pure-Rust implementation of a syntax highlighter, which uses TextMate/SublimeText grammars: https://lib.rs/syntect

    • zeon2565 hours ago |parent

      Tree-sitter produces more accurate highlighting tho. We used syntect for our web editing cos it’s faster (and lighter in terms of size) and tree-sitter for rendered pages in our company and the difference is stark

  • pseudo_meta16 hours ago

    Treesitter is fantastic. It has builtin support in nvim, and there are a lot of plugins that make use of it.

    My favorite is nvim-treesitter-textobjects which gives you dozens of new targets for vim motions, such as a function call or the condition of a loop.

  • Tepix16 hours ago

    Tree-sitter:

    Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.

  • divyeshio10 hours ago

    This is cool! How does it compare with Shiki or Highlighter.js in terms of performance?

  • f311a10 hours ago

    I had to wait for about 10 seconds for it to load on my crappy mobile connection.

    They also load 1 mb of fonts. In total, this page is close to 3 mb.

    Also, when you select a language, the grammar file gets downloaded twice.

  • mintflow15 hours ago

    Great project, I really love tree-sitter. Recently I added support for an INI-variant config profile to my app: I just used Gemini to write a grammar and combined it with another great project called Runestone to highlight the config profile. The whole process was quite smooth.

  • teo_zero19 hours ago

    Sorry, but I can't understand what this actually is. A library, a stand-alone tool, a Rust crate? What users does it target? Text editors, website creators?

    • GolDDranks16 hours ago |parent

      It's a Rust library (comprised of a bunch of crates) that wraps a high-performance, high-accuracy syntax highlighter (called Tree-sitter) with vetted support for almost 100 programming/markup languages.

      You can use it as a normal Rust library, or you can use the JavaScript/WASM wrapper to highlight source code on a web page.

      • oersted15 hours ago |parent

        > high-accuracy syntax highlighter (called Tree-sitter)

        Just wanted to note that tree-sitter is lower-level and more general: it's an incremental parser that is specialised for gracefully and efficiently parsing partially-correct code snippets or code being edited live.

        It's an important building block of the highlighter, but it needs more on top to complete the package. It can be used for anything that requires awareness of code structure in an editor.

        • debugnik9 hours ago |parent

          If only it were usable for really-correct parsing. In my experience error recovery is so aggressive it will accept broken ASTs without marking any node as an error. Plus, you can't really solve some ambiguities without C-based lexer hacks.

          I wonder if targeting the Tree-sitter ABI directly could be a viable way to write more accurate parsers in an actual programming language while piggybacking on the ecosystem. Could tree-sitter's runtime ABI be adapted for GLL parsers instead of GLR? I haven't looked deep into it yet.

          • conartist67 hours ago |parent

            Now you're in my wheelhouse! Come check out https://github.com/bablr-lang/. I'm gearing up for a big release announcement here once I fix these bugs and ship all the latest code.

        • GolDDranks15 hours ago |parent

          Thanks for the correction!

      • gorjusborg8 hours ago |parent

        Is the 'regex hater club' subtitle related to using regex to 'parse' rather than using something like tree-sitter that actually parses?

        I also had a hard time understanding the context given just the link.

    • joshka18 hours ago |parent

      See https://fasterthanli.me/articles/my-gift-to-the-rust-docs-te...

    • jasonjmcghee18 hours ago |parent

      The GitHub repo is a bit more helpful, but: users or builders of text editors that use tree-sitter.

      Or... website text editors which historically have had imperfect syntax highlighting.

      Notice the Zed sponsorship.

      https://github.com/bearcove/arborium

      • tombh11 hours ago |parent

        > Batteries-included tree-sitter grammar collection with HTML rendering and WASM support.

        That's the best one sentence description there is and it's at the top of the Github README. I think that would fit nice at the top of https://arborium.bearcove.eu too

    • discord918 hours ago |parent

      I think this gives some context: https://fasterthanli.me/articles/my-gift-to-the-rust-docs-te... TL;DR: it's for syntax highlighting in Rust docs.

    • Rodmine18 hours ago |parent

      Not for you then. You don't need to understand everything.

  • unrealhoang19 hours ago

    The get started section seems to be broken or missing content.

  • virajk_3120 hours ago

    This is cool, was looking for something similar