I've been building prototypes of new AI learning tools for months, but I recently learned that 3blue1brown open-sourced his incredible math animation library, Manim, and that LLMs can generate code for it without any fine-tuning.
So I made a tool that automatically generates animated math/science explanations in the style of 3blue1brown using Manim from any text prompt.
Try it yourself at https://TMA.live (no signup required)
or see the demo video here: https://x.com/i/status/1874948287759081608
The UX is pretty simple right now: you just write a text prompt and then start watching the video as it's generated. Once it's done generating, you can download it.
I built this because I kept finding myself spending 30+ minutes in AI chats trying to understand very specific concepts that would have clicked instantly if there were a visual explanation on YouTube.
Technical Implementation:
- LLM + prompt to use Manim well; right now this uses Gemini with grounding to ensure some level of factuality, but it works equally well with Claude (a rough sketch of the core loop follows this list)
- Manim for animation generation
- OpenAI TTS for the voiceovers
- Fly.io for hosting the web app
- Modal.com for fast serverless GPUs to render the videos
- HLS protocol for streaming the videos as they are rendered
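To make that concrete, here's a minimal sketch of the core loop. The prompt text, the llm_complete helper, and the exact CLI flags are illustrative stand-ins, not my production code:

    import pathlib
    import subprocess

    # Illustrative system prompt; the real one handles many more edge cases.
    SYSTEM_PROMPT = (
        "You are a Manim expert. Return one complete Python file that "
        "defines a single Scene subclass animating the user's topic."
    )

    def make_video(llm_complete, topic):
        # llm_complete stands in for the Gemini/Claude API call.
        code = llm_complete(system=SYSTEM_PROMPT, user=topic)
        script = pathlib.Path("scene.py")
        script.write_text(code)
        # Manim Community CLI: -ql renders at low quality for fast turnaround.
        subprocess.run(["manim", "-ql", str(script)], check=True)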
Note: This is focused on STEM education and visualization, and it is particularly good for math, but get creative and try it with anything! I used it recently to teach my partner's parents a new board game in Mandarin (which I don't speak!)
I'll be around to answer questions. Happy learning!
As usual with these things: it is impressive that stuff like this can be generated so quickly at all, but the content is very superficial and often wrong or at least misleading. It's unusable for learning, but great for spamming platforms, just like NotebookLM for instance.
As an example, I asked about the Cantor function. It generated a 1:24 video, which is laughably short, explained correctly how the Cantor set is defined but showed a flawed visual representation, then skipped over how the Cantor function is constructed and simply stated its basic properties. Sorry, but this is garbage content.
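For the record, the construction it skipped is short: write x in the Cantor set in ternary using only the digits 0 and 2, i.e. x = sum_n a_n 3^(-n) with a_n in {0, 2}; then c(x) = sum_n (a_n/2) 2^(-n), and c extends to all of [0, 1] by being constant on each removed middle-third interval.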
It will happily explain non-existent concepts as well: https://tma.live/video/26c9fec7-3bb7-42e5-837f-617fd5659f70
It didn't explain soul capsules at all though; it barely mentioned them
Incredible
Those videos are apex-quality. You might as well ask ChatGPT for Nobel-prize-quality literary essays.
You can probably imitate the structure/scaffolding of a 3b1b video in a cargo-cult way, but you lose domain-expert-level verification of quality (which is why AI fails, because it's not a domain expert).
So here's how I'm hearing your question, and it answers itself: "how do I get domain-expert quality from a non-domain-expert AI?"....
I once heard an interesting definition of what an expert is:
"An expert is a person who is far away from home and gives advice"
More modern version: "A person from out of town with a briefcase".
What.
It's a Bible reference. Jesus supposedly said that "a prophet is honored everywhere except in his own hometown and among his own family". When you've grown up with someone, everything is colored by your past experience with them. You remember all the grandiose, stupid things they said in the past. If they're saying actually wise things now, you won't believe them as much as you should. On the other hand, if they're still saying stupid things, other people have not yet learned caution.
I'm failing to see how this Bible story is related to the aforementioned truism, which seems to actually _contradict_ the story.
I guess the connection is that when you're far away from home you can project an aura of confidence/respectability that you may not have at home where people know you since the days you were not yet an expert.
Of course people can't be born experts and for every expert there must be a prior step in their personal growth when they were less expert. But that doesn't prevent people from using an imperfect heuristic for judging whether you can trust someone's expertise.
Perhaps it's not a reference to the Bible in the first place?
It's possible. I don't know. Sometimes people just come up with the same ideas over and over.
A classic example is the pyramid structures around the world. The fact that people found it easy to pile rocks that way to build tall, stable structures doesn't mean there was a single culture that spanned the entire globe.
Whose voice are you 'borrowing' for this?
Compared to https://www.youtube.com/watch?v=24GfgNtnjXc, this video is absurdly limited: https://tma.live/video/9c8e725e-ec21-41a7-984a-317d84216497
Mildly off topic, but for what I think is the best material on rainbows, check out this lecture (and the entire 8.01 + 8.02 courses by Walter Lewin).
In his last lecture he mentions that the best part of teaching is when a student sends him a photo of a rainbow 10 years after graduating, saying that they still remember the whole physics behind it.
Same for me. I think of the cones of light being formed when I see the rainbow and that amazing lecture.
The voice is just OpenAI's default TTS voice. I agree that the Veritasium video is an incredible work and the AI version is absurd by comparison! This is mostly a proof of concept that this is possible at all, and as LLMs get smarter it'll be interesting to see if the quality automatically improves. For now, the tool is really only useful for very specific or personal questions that wouldn't already be answered on YouTube.
Great but wrong answers. I asked how big TREE(3) was; after some helpful explanation about exponentials, it told me it's 27. This is... an underestimate.
I'm not really a bear on LLMs, but using them for explanations this way needs a lot of caution. It's very easy to give an argument that "seems" right at first glance, e.g. https://xkcd.com/803/
Totally fair! I like the XKCD comic as well because it hints at a potential solution: even if you can't always be correct, how you respond to critical questions can really help. I'm working on a feature for users to ask follow-up questions, and I'm definitely going to consider how to make it most honest and curious.
Well, I'm blown away. "Show me how information propagates through a neural net."
I feel like this is the one thing that's been missing from all the LLMs: actual visual explainers, whether image or video. Python plots only get you so far, and all the diffusion stuff is nonsensical. This is amazing.
I have to give you props for not requiring me to sign up. I've seen many Show HN posts lately that unnecessarily require me to create an account, which always prompts me to close the tab immediately.
MathGPT also has this (in exactly the same 3blue1brown style, so I guess they also use Manim), and in my experience it actually works better and tries to explain the math and write out the equations.
I think they use some extremely cheap model for writing the code, probably 4o-mini or similar.
I tried a few times and this is my experience:
1. Doesn't work at all on Firefox 133.0.3 (64-bit)
2. Works on Chrome 131.0.6778.205 (Official Build) (64-bit)
3. No existing links do anything but show a sub-second "Generating" message which quickly disappears
4. Does not work in Incognito on Chrome 131.0.6778.205 (Official Build) (64-bit)
My prompt kind of worked but ended at 48 seconds
Prompt: "Describe a honeybee learning which flower to land on in a field using Markov Decision Process with Bellman updating":
https://tma.live/video/88f535b5-0e5f-41ca-9bd8-e35e7aa8a95a
I ran it a second time and got a longer video of 1:55, but it primarily just created text. It also didn't explain Bellman's equation and wrote it incorrectly:
https://tma.live/video/88f535b5-0e5f-41ca-9bd8-e35e7aa8a95a
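For reference, the standard Bellman optimality update it should have shown is V(s) <- max_a sum_{s'} P(s'|s,a) [R(s,a,s') + gamma V(s')].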
The second prompt kind of worked, but the video ends at 47 seconds and then loops the final 4 seconds forever.
Prompt: "Describe how the OODA Loop, MDP and SPA learning approaches are the same"
https://tma.live/video/ee7b5048-3fde-4f1a-8ec1-c8bb48883c75
Overall this worked as described. It's more than fast enough, but fails to deliver on consistency and graphics.
A few more iterations and some fine-tuning and you'll have a solid beta. I can see this being very useful after another year or so of use and tuning.
Great work and congrats on shipping.
These are amazing examples! Thanks for all the feedback, detailed info, and persistence in trying! The HN hug of death means I'm running into Gemini rate limits, unfortunately :( Will definitely make that clearer in the UI when it happens and try to find some workarounds.
The other issues are bugs in my streaming logic when it retries clips that failed to generate. LLMs aren't yet perfect at writing Manim, so to keep things smooth I try to skip clips which fail to render properly (rough sketch below). I still have layout issues too, which are hard to detect automatically.
I expect with a few more generations of LLM updates, prompt iterating, and better streaming/retrying logic on my end this will become more reliable
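The skip logic is roughly this shape; render is a hypothetical stand-in for the Modal rendering call, and real failure detection is messier (layout problems don't raise exceptions):

    def render_clips(segments, render, max_retries=1):
        # Render each Manim segment, retrying a failed clip and then
        # dropping it entirely so the stream never stalls.
        clips = []
        for code in segments:
            for _ in range(max_retries + 1):
                try:
                    clips.append(render(code))
                    break
                except Exception:
                    continue  # retry; if every attempt fails, skip the clip
        return clips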
You didn't think to add a queue (with trackable status links)?
There is a job queue on the backend with statuses; it's just not worth breaking the streaming experience to ask the LLM to rewrite broken Manim segments out of order.
Most hilarious is https://tma.live/video/8eb2d318-3217-4c09-a8aa-3fc7e8bb7cca
I asked a history question - "tell me about Reddy kings rule" - and it made up a physics rule and started talking about electrons.
The Manim output is not great (words overlapping, etc.), the page itself needs a lot of work ("generating" and then nothing happens), and in a lot of cases your backend workflows seem to have issues, with explanations ending after, say, 30s when they need to run at least another 1-2 minutes.
Thank you for sharing. This is a wonderful experiment and I think this concept has a lot of potential.
Whether I click an existing example or type in my own it doesn't seem to work. A dialog pops up for a second saying 'generating video' and then disappears.
Sad, looks like I already hit the Gemini rate limit :( Switching to Claude!
Pure Garbage: https://tma.live/video/5fe506ca-3831-4ba3-b9c6-ff899c571bf1
Although it is pretty impressive for what an LLM can generate these days.
Well, it wasn't wrong at least, it just... stopped about 1% into the explanation.
Let's see your take on it ...
Haha that's funny, the beehive one lays out the hexagons as if they were squares - so they overlap and leave empty space lol! But still, it's a promising concept.
Btw, for some reason on iOS I had to download the video to view it.
Wow, this is awesome! Thanks for building. I didn't realize there was a protocol for streaming while rendering, though I noticed sumo.ai doing something similar for audio. Gemini with grounding is new to me also, very nice!
Thanks! Streaming was actually pretty hard to get working, but the pipeline goes roughly like this:
- The LLM is prompted to generate an explainer video as a sequence of small Manim scene segments with corresponding voiceovers
- The LLM streams its response token-by-token as Server-Sent Events
- Whenever a complete Manim segment is finished, it's sent to Modal to start rendering
- The partial video files are streamed via HLS as Manim renders them
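In sketch form it looks something like this; the segment delimiter, submit_render, and playlist are illustrative stand-ins for the real prompt convention, the Modal call, and the HLS packager:

    import re

    # Hypothetical delimiter the prompt asks the LLM to emit between segments.
    SEGMENT_END = re.compile(r"### END SEGMENT ###")

    def stream_pipeline(sse_chunks, submit_render, playlist):
        # sse_chunks yields LLM text deltas; submit_render returns a
        # rendered clip; playlist appends clips and rewrites the .m3u8
        # manifest so the viewer's player picks them up mid-generation.
        buffer = ""
        for chunk in sse_chunks:
            buffer += chunk
            while (m := SEGMENT_END.search(buffer)):
                segment, buffer = buffer[:m.start()], buffer[m.end():]
                clip = submit_render(segment)  # GPU render on Modal
                playlist.append(clip)          # available over HLS in seconds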
Pretty incredible how fast those videos are being generated. Excellent work! You can now spam YT shorts all day.
This is so cool! Streaming videos while the Manim code is still generating is super impressive - I'm sure that took a lot of hacking.
Can I ask out of curiosity how long it took you to program and design this entire project?
Hmm, the initial version of the app only took me about a day to get something working, but that version took minutes to generate a single video, and even then only worked a third of the time. It took a solid two weeks from there to add all the edge cases to the prompt to increase reliability, add GPU rendering and streaming to improve performance/latency, and shore up the infra for scaling.
Thank you!
Not working
I was running into some scaling issues, but should be all working now!
Awesome!