This is a great presentation.
Where I work we don't use any process-level parallelism at the Ruby level. We hoist that up to the Kubernetes level and use signals (CPU load, job queue sizes, etc.) to increase/decrease capacity. Workloads (replica sets) are segmented across multiple dimensions (different types of API traffic, worker queues) and are tuned for memory, CPU and thread counts according to their needs. Some heavy IO workloads can exceed a single CPU ever so slightly because the DB adapter isn't bound by the GVL, but practically speaking a pod/Ruby process can only utilize 1 CPU, regardless of thread count.
One downside of this approach though is it takes a long time for our app to boot, and this, along with the time to provision new nodes, can cause pod autoscalers to flap/overprovision if we don't periodically tune our workloads.
In a perfect world we would be able to spawn processes/pods that are already warmed up/preloaded (similar to forking, but at the k8s level, with the processes detached from the root process). They would not be constrained by the CPU capacity of whatever underlying k8s node they run on, and would instead draw on what is basically an infinite pool of CPUs where we only pay for what we use. Obviously serverless sort of offers this kind of solution if you squint, but it is not a good fit for our architecture.
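To illustrate the point above about a Ruby process using roughly one CPU regardless of thread count, here is a minimal sketch, assuming CRuby. `sleep` is only a stand-in for a blocking DB or HTTP call, and `elapsed` and `timed_threads` are hypothetical helper names. Blocking waits release the GVL and overlap across threads, while pure-Ruby CPU work generally does not run in parallel.

```ruby
# Measure wall-clock time of a block using the monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Run the same piece of work in `count` threads and time the whole batch.
def timed_threads(count, &work)
  elapsed { Array.new(count) { Thread.new(&work) }.each(&:join) }
end

# Stand-in for blocking IO: sleeping releases the GVL, so the waits overlap.
io_elapsed = timed_threads(4) { sleep 0.2 }

# Pure-Ruby CPU work holds the GVL, so threads take turns on CRuby.
cpu_work = -> { 2_000_000.times { |i| i * i } }
cpu_serial = elapsed { 4.times { cpu_work.call } }
cpu_threaded = timed_threads(4, &cpu_work)

puts format("4 sleeping threads:    %.2fs (serial would be about 0.80s)", io_elapsed)
puts format("CPU work, serial:      %.2fs", cpu_serial)
puts format("CPU work, 4 threads:   %.2fs", cpu_threaded)
```

On CRuby the two CPU timings are usually close to each other, which is why extra threads mostly help IO-heavy workloads.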
> One downside of this approach though is it takes a long time for our app to boot
Another is that you're leaving a lot of memory savings on the table by not benefiting from Copy on Write.
In my past experience with a large Rails monolith, memory usage was always the limiting factor. Just booting the app had significant memory overhead. Using in-process concurrency would have led to massive infrastructure savings, since a lot of that overhead could be shared across threads. Probably 2-3x the density compared to single-threaded.
In the end, we never got quite there, due to thread safety issues. We did use a post-boot forking solution to achieve some memory savings thanks to copy-on-write memory, which also led to significant savings, but was a bit more complex.
All that to say, the naive "just let kubernetes scale it for you" is probably quite expensive.
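The post-boot forking approach mentioned above can be sketched roughly like this, assuming a Unix-like system where `fork` is available. `APP_DATA` and `spawn_workers` are hypothetical stand-ins for an expensive app boot and a worker launcher. The parent loads everything once, and each forked child shares those memory pages until it writes to them.

```ruby
# Hypothetical stand-in for an expensive boot: loaded once, in the parent.
APP_DATA = Array.new(100_000) { |i| "row-#{i}" }.freeze

# Fork `count` workers after boot. Each child reports back over a pipe.
def spawn_workers(count)
  Array.new(count) do |n|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      # The child reads the preloaded data without copying it up front.
      writer.puts "worker #{n} sees #{APP_DATA.size} preloaded rows"
      writer.close
      exit!(0) # skip at_exit hooks inherited from the parent
    end
    writer.close
    [pid, reader]
  end
end

results = spawn_workers(2).map do |pid, reader|
  line = reader.read
  reader.close
  Process.wait(pid)
  line.strip
end

puts results
```

How much memory stays shared in practice depends on how much the children write to preloaded objects, so the savings vary by app.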
Very interesting talk. I remember trying to set up Falcon for a simple app once, but the docs were lacking and I didn't get it to work. That was years ago, though; it seems to have matured a lot since then.
I also remember the Parallel gem being higher performing than Ractors, is that still the case? Does anyone know?
There is also iodine. I kinda wish it got covered; I am very curious about it. https://github.com/boazsegev/iodine?tab=readme-ov-file#iodin...
I wish it was easy to use puma for almost everything, and then iodine/falcon for web sockets. Haven’t figured out a sensible solution yet.
This was a great talk.
The Falcon web server scares me. I wish Puma had better support for web sockets/fibers.
Why does the Falcon server scare you?
It has a pre-release version number and a bus factor of approximately 1. It's tempting though.
Yeah, Puma just seems more stable, and works out of the box with Kamal.
And I am not aware of any heavy production use of Falcon with Rails. But in any case it already proves the async potential.
I would love to use a fiber based app server for Rails to gain parallelism with db queries and remote service requests. I've been burned so badly by thread safety issues that I do not want to deal with threads.
To my naive mind, YJIT is like a native numba for Ruby. Very cool.
I am not much of a video guy. Any kind of tl;dr?
Ruby video has an AI generated transcript and summarization as well: https://www.rubyvideo.dev/talks/in-depth-ruby-concurrency-na...
It just listed the ways you can handle concurrency and parallelism in Ruby 3 and what they each do.
Namely: Threads, which are backed by OS threads but are limited by the GIL when it comes to parallelism. Fibers, which are green threads, aka windows 3 style concurrency. Forking processes. And Ractors, which are backed by a thread pool and offer real parallelism.
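As a minimal sketch of the cooperative style described above for fibers, using only core Ruby and no scheduler: each fiber runs until it explicitly yields, and nothing preempts it. The names `ping`, `pong`, and `log` are just for illustration.

```ruby
log = []

# Each fiber does one step of work, then hands control back voluntarily.
ping = Fiber.new do
  3.times do |i|
    log << "ping #{i}"
    Fiber.yield # nothing preempts a fiber; it must give up control itself
  end
end

pong = Fiber.new do
  3.times do |i|
    log << "pong #{i}"
    Fiber.yield
  end
end

# The caller acts as a tiny round-robin scheduler.
3.times do
  ping.resume
  pong.resume
end

puts log.join(", ")
```

With a fiber scheduler installed, the switch can happen automatically on blocking I/O instead of at explicit `Fiber.yield` calls.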
> windows 3 style concurrency
I would say that fibers, when used with Samuel’s ecosystem of gems, act like Node’s event loop (which I am guessing more people on here would be familiar with). They can auto-switch on I/O like Ruby’s threads if you’re using a new enough version of Ruby.
Cool, thanks! Always curious to hear if there's something new out there.
I have always enjoyed using the BEAM ecosystem for concurrency, but Ruby has so much code and is great for "getting stuff done", so it's mostly what I keep using.