Hey HN! Roughly 4 years ago I started building a lazy functional iteration library for JS/TS. I had a few goals for the library:
- Supporting sync, sequential async, and concurrent async iteration
- Limiting it to a small number of orthogonal concepts that compose beautifully to solve problems
- Making it fully tree-shakeable
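For readers unfamiliar with the term, here is a minimal sketch of what "lazy" iteration means in plain TypeScript. The helpers `lazyMap`, `lazyFilter`, `take`, and `naturals` are hypothetical names written for this illustration; they are not lfi's API.

```typescript
// Hypothetical helpers for illustration only; these are not lfi's exports.
function* lazyMap<T, U>(iterable: Iterable<T>, fn: (value: T) => U): Generator<U> {
  for (const value of iterable) {
    yield fn(value);
  }
}

function* lazyFilter<T>(
  iterable: Iterable<T>,
  predicate: (value: T) => boolean,
): Generator<T> {
  for (const value of iterable) {
    if (predicate(value)) {
      yield value;
    }
  }
}

function* take<T>(iterable: Iterable<T>, count: number): Generator<T> {
  if (count <= 0) return;
  let taken = 0;
  for (const value of iterable) {
    yield value;
    if (++taken >= count) return;
  }
}

// An infinite source: only usable because nothing downstream is eager.
function* naturals(): Generator<number> {
  for (let n = 1; ; n++) yield n;
}

let squaresComputed = 0;
const firstThreeEvenSquares = [
  ...take(
    lazyFilter(
      lazyMap(naturals(), n => {
        squaresComputed++;
        return n * n;
      }),
      square => square % 2 === 0,
    ),
    3,
  ),
];
// firstThreeEvenSquares is [4, 16, 36], and only 6 squares were computed.
```

Each value is pulled through the whole chain on demand, so no intermediate arrays are built and the infinite source is never exhausted.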
I built it for myself and have (mostly) been its only user as I refined it. I've used it in lots of personal projects and really enjoyed it.
I recently decided it would be nice to spread that enjoyment so I created a documentation website complete with a playground where you can try out the library.
I hope you enjoy using it as much as I do! Looking forward to hearing your thoughts :)
This is nicely done.
One function I've written that I frequently use is a generic iterate <T> function which (in JS/TS land) allows you to loop over a T[], Array<T>, ArrayLike<T>, Iterable<T>, AsyncIterable<T>, including generators and async generators.
It is just easier to always be able to write: "for await (const item of iterate(iterable))" in most places, without worrying about the type of the item I am looping over.
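A rough sketch of what such an `iterate` helper might look like. The commenter's actual implementation is not shown in the thread, so the type alias `AnyIterable`, the two type guards, and the `collect` helper below are assumptions made for illustration.

```typescript
// Hypothetical sketch; the commenter's real `iterate` is not shown in the thread.
type AnyIterable<T> = ArrayLike<T> | Iterable<T> | AsyncIterable<T>;

function isAsyncIterable<T>(source: AnyIterable<T>): source is AsyncIterable<T> {
  return typeof (source as AsyncIterable<T>)[Symbol.asyncIterator] === "function";
}

function isIterable<T>(source: AnyIterable<T>): source is Iterable<T> {
  return typeof (source as Iterable<T>)[Symbol.iterator] === "function";
}

// Normalizes every supported input into an async generator.
async function* iterate<T>(source: AnyIterable<T>): AsyncGenerator<T> {
  if (isAsyncIterable(source)) {
    yield* source; // async iterables and async generators
  } else if (isIterable(source)) {
    yield* source; // arrays, sets, sync generators, ...
  } else {
    yield* Array.from(source); // plain array-likes: { length, 0: ..., 1: ... }
  }
}

// Usage: the same loop works regardless of the input's kind.
async function collect<T>(source: AnyIterable<T>): Promise<T[]> {
  const items: T[] = [];
  for await (const item of iterate(source)) {
    items.push(item);
  }
  return items;
}
```

The trade-off is that even purely synchronous inputs now go through `for await`, which adds an asynchronous hop per item.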
I like the reducers and collectors in your library! Going to try it out.
Also, something new I discovered is the Array.fromAsync() method in JS, which is like Array.from() but for async values. I don't think it is available in all browsers/runtimes yet though.
Glad you like it!
I actually have a TODO for what you're describing haha
https://github.com/TomerAberbach/lfi/blob/69cdca0b2ee2bd078f...
I'm not sure I know the use case for all of this logical complexity. Is there a specific use-case you had in mind?
Split the array of elements to process. Set up a Promise.all against a function that handles sync/async function calls against each chunk and accumulates an array of results. This isn't the end of the world for any of the practical use cases I can think of. The efficiency loss and memory consumption are minuscule even across hundreds of thousands of records. This is as concurrent as anything else - the event loop is still a mandatory part of the processing loop. You can even defer every 1000 records to the event loop. JavaScript is synchronous so there is no concurrency without threading. I can't think of an IO or non-IO function that would create the issues described of not yielding. We used to have chunks of 1000000+ records that would freeze the event loop. We solved this with structured yields to the event loop to allow other processes to continue without dropping TCP packets etc.
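The approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the names `processInChunks` and `yieldToEventLoop` are hypothetical, and `setTimeout(..., 0)` is just one way to hand control back to the event loop.

```typescript
// Hypothetical helper names; the chunk size of 1000 mirrors the comment above.
const yieldToEventLoop = (): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, 0));

async function processInChunks<T, R>(
  items: readonly T[],
  handler: (item: T) => R | Promise<R>,
  chunkSize = 1000,
): Promise<Awaited<R>[]> {
  const size = Math.max(1, Math.floor(chunkSize)); // guard against a zero step
  const results: Awaited<R>[] = [];

  for (let start = 0; start < items.length; start += size) {
    const chunk = items.slice(start, start + size);
    // Promise.all accepts both plain values and promises, so sync and async
    // handlers are treated the same way.
    const settled = await Promise.all(chunk.map(item => handler(item)));
    results.push(...settled);
    // Give timers, I/O callbacks, and rendering a chance to run.
    await yieldToEventLoop();
  }

  return results;
}
```

For example, `await processInChunks(records, transform)` would process `records` 1000 at a time, pausing between chunks.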
If the results are objects, you're not even bloating memory too much. 800KB for 100,000 records seems fine for any practical use case.
I like the idea of an async generator to prevent over-pre-calculating the result set before it's needed but I couldn't justify pivoting to a set of new functions called pipe, map, etc that require a few too many logical leaps for your typical ECMAScript developers to understand.
What's the win here? It must be for some abuse of Javascript that should be ported to WASM. Or some other use-case that I've somehow not seen yet.
To clarify, the point of it isn't _just_ performance.
It's a combination for writing the iteration code in a functional style, which some people like/prefer, while retaining certain performance characteristics that you'd get from writing more imperative code.
For example, the "naive" approach to functional programming + concurrency results in some unintentional bottlenecks (see the concurrent iteration example on the home page).
Their first example is basically pretending basic promise concurrency on maps doesn't exist when it's been around before Promises were an official API (Search: bluebird promise map concurrency)
And the third examples are much easier to maintain with a simple function I drafted up for sake of argument `processArrayWithStages(array: T[], limit: number, stages: Array<(item: T) => Promise<T>>)`
In my experience, unless you're doing math in promises which I would recommend shifting to WASM, you're not going to feel the overhead of these intermediary arrays at scales that you shouldn't be in WASM already (meaning over several millions of entries).
The amazing people and teams working on V8 have done well to optimize these flows in the past four years because they are extremely common in popular applications. Strings aren't even copied in the sense you might think, you have a pointer to a chunk that can be extended and overwritten which incurs a penalty but saves tons of RAM.
processArrayWithStages: https://gist.github.com/DevBrent/0ee8d6bbd0517223ac1f95d952b...
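For readers who don't want to open the gist, here is one way a function with that signature could be written. This is an independent sketch, not the gist's code: it assumes `limit` caps how many items are in progress at once and that each item runs through every stage in order.

```typescript
// Independent sketch matching the signature quoted above; NOT the gist's code.
async function processArrayWithStages<T>(
  array: T[],
  limit: number,
  stages: Array<(item: T) => Promise<T>>,
): Promise<T[]> {
  const results: T[] = new Array<T>(array.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index and pushes that
  // item through all stages before claiming another one.
  const worker = async (): Promise<void> => {
    while (nextIndex < array.length) {
      const index = nextIndex++;
      let item = array[index];
      for (const stage of stages) {
        item = await stage(item);
      }
      results[index] = item;
    }
  };

  const workerCount = Math.max(1, Math.min(limit, array.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Results keep the input order because each worker writes to the index it claimed, regardless of which item finishes first.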
pipe and map are too complex for "ECMAScript devs?" give me a break!
> should be ported to WASM
valid.
Node.js pipe and map are standard implementations most developers understand. I do not know what these strange implementations are from this library. They do not make sense to me. That's what I mean. Why learn 3 different meanings of pipe? I already have to learn too many meanings of pipe across multiple environments and languages. I do not need more pipe and map meanings within javascript.
Ohhh, my mistake I misread you and revealed my ignorance. Thanks for clarifying.
Edit: I'll further reveal my ignorance. What's the function piping API in node? Is it in the browser? Does it work like the pipe operator |> in R, Elixir, and F#?
great library! I just tried it out and compared it with iter-tools
I like Lfi better, great documentation!
Here is the example I use to compare: https://stackblitz.com/edit/stackblitz-starters-ia9ujg6m?fil...
Love to see `iter-tools` come up, though obviously less so for getting beat.
Still, I love seeing innovation in this field as much as anyone! And it's true that I've been dragging my feet on building concurrency support for iter-tools...
That said, iter-tools still has some killer APIs like peekerate that I would find it quite hard to live without, and for iter-tools 8 I have some major tricks planned like making the sync/async divide go away completely.
"lfi is a lazy functional sync, async, and concurrent iteration library for JavaScript and TypeScript"
Looks good (I understand all the words and even the whole sentence).
But what is it good for? ("What's in it for me?")
What are the use cases? How is it different / how does it improve on / when should I use it instead of / ... lodash, ramda, functional.js, RxJs, etc.?
> But what is it good for? ("What's in it for me?")
> What are the use cases?
I think the examples on the home page and in the "getting started" answer these questions to be honest. Do you feel it's unclear or something is missing?
> How is it different / how does it improve on / when should I use it instead of / ... lodash, ramda, functional.js, RxJs, etc.?
I think some of that info is implicitly there in the docs, but I think this is good feedback. I should probably have a page comparing lfi vs other libraries.
To answer here though, for most of the libraries you mentioned, they don't provide support for _concurrent_ async iteration (and some of them don't support even sequential async iteration).
RxJS is the exception there I think. Although, I think lfi has a bit of a simpler API (that also matches the sync versions of the APIs) that's primarily designed for iteration rather than generalized observables. Plus, I don't _think_ RxJS has the same performance characteristics around tree-shakeability.
Looks useful and powerful. Nice work!
Thank you!
I don't understand (but am open to be persuaded), isn't iteration in js now trivially easy and actually quite powerful? Especially since we can now put 'await' everywhere? What are the killer use cases for this lib?
I think there are two things the library makes easier than what you're suggesting:
- Some people like/prefer writing iteration in a functional style, which this library enables
- It's pretty easy to create unintentional async/concurrency bottlenecks with the simple `Promise.all` approach (see the home page example)
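To make the second point concrete, here is a small sketch of the kind of bottleneck a stage-by-stage `Promise.all` can introduce. The stages `fetchItem` and `saveItem` and their delays are made up for illustration; this is not the home page example itself.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical stages: "fetching" item 1 is fast, item 2 is slow.
async function fetchItem(id: number, log: string[]): Promise<number> {
  await sleep(id === 1 ? 10 : 60);
  log.push(`fetched ${id}`);
  return id;
}

async function saveItem(id: number, log: string[]): Promise<void> {
  await sleep(10);
  log.push(`saved ${id}`);
}

// Stage by stage: nothing is saved until EVERY fetch has finished.
async function stageByStage(ids: number[]): Promise<string[]> {
  const log: string[] = [];
  const fetched = await Promise.all(ids.map(id => fetchItem(id, log)));
  await Promise.all(fetched.map(id => saveItem(id, log)));
  return log;
}

// Per item: each item moves on to saving as soon as its own fetch is done.
async function perItem(ids: number[]): Promise<string[]> {
  const log: string[] = [];
  await Promise.all(ids.map(async id => saveItem(await fetchItem(id, log), log)));
  return log;
}
```

With `[1, 2]` as input, `stageByStage` logs `fetched 1, fetched 2, saved 1, saved 2`, while `perItem` logs `fetched 1, saved 1, fetched 2, saved 2`: the fast item no longer waits for the slow one.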
for the record sprinkling await everywhere is not something you should take lightly. await is still probably the single most high-level high-overhead construct in JS, and putting it inside tight loops is a recipe for perf disaster
Depends what you are waiting for.
Maybe an in memory cache hit---that you explicitly code---causes a sync resolve to happen, woohoo!
But if it goes to IO and you are waiting for the event loop to schedule you back in you could be in for a shock even if the IO op is tiny. Especially doing it a lot.
Only matters probably on a high CPU usage server which hopefully you avoid by scaling up on cpu and using all the cores.
On the other hand... you are being cooperative :)
But tight loop IO is an antipattern ... as is doing a lot---millions---of IO ops from node (per request or job) that you need to worry about this (use a different language for your DB server!).
How does this compare to Effect, or Fluture? There have been so many attempts at improving these things in JS that it's hard to keep track.
easy differences: Effect does everything and is huge, this does iteration only and is tiny
Yeah, that's true.
I think it just has completely different goals. Read through the home page examples and getting started and you'll see it's pretty different.
Were you aware of IxJS [0]? I've used that to good success over the years. Its relationship to RxJS keeps it familiar.
Any thoughts on what Lfi does better/different than IxJS?
I was not aware of it!
As far as I can tell, it has some similar ideas (and seems pretty nice!), but I noticed the following differences:
- I don't _think_ it supports "concurrent" iteration [1]
- It doesn't have a "reducer" concept with composition and what not [2]
- It doesn't have a concept of "optionals" [3]
[1] https://lfi.dev/docs/concepts/concurrent-iterable
> I don't _think_ it supports "concurrent" iteration
I believe what you are calling a "concurrent" iteration is one of the use cases for Observables, so would be on the RxJS side (the "parent" project).
> It doesn't have a "reducer" concept with composition and what not
Interesting. I'll read further on what you are doing there. Seems possibly similar to Lenses/Prisms in FP, with different rules.
> It doesn't have a concept of "optionals"
Good point. So far I've not found a JS approach to optionals I'm entirely happy with. I've had some suggest I should give Effect [0] a deeper look, but given I'm mostly happy with RxJS the bulk of Effect doesn't appeal to me. The dual of an Option being an iterator with 0 or 1 values is something I've seen before and something to keep in mind.
> I believe what you are calling a "concurrent" iteration is one of the use cases for Observables, so would be on the RxJS side (the "parent" project).
Ah, got it. I don't think I quite understood the relationship between this project and RxJS. I did know about RxJS's observables, yes. I do sort of hint at the relationship between concurrent iterables and the observables [1], but I tried to make concurrent iterables a bit more specialized to what you'd normally use an iterable for.
> Interesting. I'll read further on what you are doing there. Seems possibly similar to Lenses/Prisms in FP, with different rules.
I think I've heard of lenses (but not prisms), but don't think I ever actually learned what they are. Would be curious if you find a connection between those and my reducers :)
> Good point. So far I've not found a JS approach to optionals I'm entirely happy with. I've had some suggest I should give Effect [0] a deeper look, but given I'm mostly happy with RxJS the bulk of Effect doesn't appeal to me. The dual of an Option being an iterator with 0 or 1 values is something I've seen before and something to keep in mind.
Yeah, I liked representing optionals as iterables because you get a lot of functionality for "free" [2]
[1] https://lfi.dev/docs/concepts/concurrent-iterable#:~:text=A%....
[2] https://lfi.dev/docs/concepts/optional#why-use-iterables-to-...
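As a rough illustration of the "free functionality" idea, here is a sketch that models an optional as an iterable with zero or one values. All names below (`Optional`, `findFirst`, `mapIterable`, `getOrElse`) are hypothetical and are not claimed to match lfi's API.

```typescript
// Hypothetical sketch: an optional is just an iterable of zero or one values.
type Optional<T> = Iterable<T>;

// Returns the first matching value wrapped as an optional.
function findFirst<T>(iterable: Iterable<T>, predicate: (value: T) => boolean): Optional<T> {
  for (const value of iterable) {
    if (predicate(value)) return [value];
  }
  return [];
}

// A generic iterable helper; it knows nothing about optionals.
function* mapIterable<T, U>(iterable: Iterable<T>, fn: (value: T) => U): Generator<U> {
  for (const value of iterable) {
    yield fn(value);
  }
}

// Unwraps an optional, falling back when it is empty.
function getOrElse<T>(optional: Optional<T>, fallback: T): T {
  for (const value of optional) return value;
  return fallback;
}

const numbers = [1, 2, 3];
const found = getOrElse(mapIterable(findFirst(numbers, n => n > 1), n => n * 10), 0);
const missing = getOrElse(mapIterable(findFirst(numbers, n => n > 5), n => n * 10), 0);
// found is 20, missing is 0
```

Because the optional is an ordinary iterable, `mapIterable` works on it unchanged and the empty case needs no special handling.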
Prisms are Lenses through Options. They compose very similarly to Lenses and it is easy to convert a Lens to a Prism to compose Prisms from existing Lenses. Every Lens is a very simple "degenerate" Reducer (a Lens is just a getter and a setter from one "state" to another), but Prisms feel even more like "real" Reducers because of the optional states that can be involved. So Prisms and Lenses are very common composable Reducers in some styles of FP.
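A small sketch of the description above, for anyone who hasn't met these terms. It follows the comment's informal framing (a lens is a getter plus a setter, a prism is the same thing with a focus that may be absent); optics libraries often define prisms somewhat differently, so treat the `Prism` interface here as an assumption.

```typescript
// A lens: a getter and an immutable setter from a whole state to one part of it.
interface Lens<S, A> {
  get: (state: S) => A;
  set: (state: S, value: A) => S;
}

// Assumption: a "prism" in the comment's sense, where the focus may be missing.
interface Prism<S, A> {
  get: (state: S) => A | undefined;
  set: (state: S, value: A) => S;
}

// Composition short-circuits when the outer focus is absent.
function composePrism<S, A, B>(outer: Prism<S, A>, inner: Prism<A, B>): Prism<S, B> {
  return {
    get: state => {
      const middle = outer.get(state);
      return middle === undefined ? undefined : inner.get(middle);
    },
    set: (state, value) => {
      const middle = outer.get(state);
      return middle === undefined ? state : outer.set(state, inner.set(middle, value));
    },
  };
}

type Address = { city: string };
type User = { name: string; address?: Address };

const addressPrism: Prism<User, Address> = {
  get: user => user.address,
  set: (user, address) => ({ ...user, address }),
};

const cityLens: Lens<Address, string> = {
  get: address => address.city,
  set: (address, city) => ({ ...address, city }),
};

// Every Lens is structurally usable as a Prism, so no conversion is needed here.
const userCity = composePrism(addressPrism, cityLens);

const ada: User = { name: "Ada", address: { city: "London" } };
const bob: User = { name: "Bob" };

const adaCity = userCity.get(ada); // "London"
const bobCity = userCity.get(bob); // undefined
const adaMoved = userCity.set(ada, "Paris"); // address.city becomes "Paris"
const bobMoved = userCity.set(bob, "Paris"); // unchanged: there is no address to update
```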
Just so I understand: Could this library be used for iterating large collections without blocking the main thread? Basically as a replacement for a worker with synchronous iteration?
Yes, but I'd add a caveat.
It will be very efficient when the async operations are few and slow. It will not be very efficient when the async operations are many and fast.
That's because the await keyword itself blocks the main thread every time a function call is made. It has to because the `await` keyword is defined to return to the event loop and resume processing only during the next "tick" and after other queued ticks have run.
For a large collection the overhead on the event loop can be calculated as: COLLECTION_SIZE * FUNCTION_CALLS_PER_ELEMENT. Since function calls per element goes up as you wrap layers of helper functions, even with this library you would face a strong perverse incentive to avoid using (async) function calls or nested iterators (if at all possible) when handling large collections, especially with high desired throughput, especially in situations where you want the event loop to stay unclogged so that you can, for example, redraw the UI.
There's a rather famous blog post that cuts to the heart of all this: https://blog.izs.me/2013/08/designing-apis-for-asynchrony/
In the most ridiculous scenario you try to process two high-throughput async data streams concurrently and you end up with this nightmarish "bouncing" where each "thread" of processing can only process until it sees an await keyword before being forced to cede control back to the other thread. At this point it would be reasonable to expect that the amount of overhead would exceed the amount of real work being done, and in such a way that the text of the ECMAScript spec makes impossible to optimize or fix.
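The deferral being described can be seen in a tiny sketch: even when the awaited value is already available, the rest of the async function is scheduled as a microtask and runs only after the caller's current synchronous code finishes. The names below are made up for illustration, and the snippet shows ordering only; it does not measure the cost.

```typescript
const order: string[] = [];
const cachedValue = 42; // a plain value, not a promise

async function awaitsPlainValue(): Promise<void> {
  order.push("before await");
  await cachedValue; // nothing to wait for, yet the continuation is still deferred
  order.push("after await");
}

const pending = awaitsPlainValue();
order.push("caller continues");

// Once `pending` settles, `order` is:
// ["before await", "caller continues", "after await"]
```

Multiply that extra hop by every element and every wrapped async call, and you get the per-element overhead the comment is describing.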
Thank you for taking the time to explain! While the details seem to be well over my head, I think I'll stick to a worker then, as the specific thing I was talking about was taxonomy faceting of a pretty large dataset. Thank you again!
Having had a quick read of the concept, I believe it's concurrent in the sense that Promise.all() is compared to awaiting in a for loop?
In other words, put on the stack at the same time but not a different thread.
Do correct me if I misunderstood though.
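To make that distinction concrete, here is a sketch that counts how many tasks are in flight at once under each approach. The helper names are hypothetical, and the 5 ms timer merely stands in for real I/O.

```typescript
// Hypothetical helper: creates a task plus a way to read the peak number of
// tasks that were pending at the same time.
function makeTracker(): { task: () => Promise<void>; peak: () => number } {
  let inFlight = 0;
  let peak = 0;
  const task = async (): Promise<void> => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await new Promise<void>(resolve => setTimeout(resolve, 5)); // simulated I/O
    inFlight--;
  };
  return { task, peak: () => peak };
}

// Awaiting inside a for loop: the next task starts only after the previous ends.
async function sequentialPeak(count: number): Promise<number> {
  const { task, peak } = makeTracker();
  for (let i = 0; i < count; i++) {
    await task();
  }
  return peak();
}

// Promise.all: every task is started before any of them is awaited.
async function concurrentPeak(count: number): Promise<number> {
  const { task, peak } = makeTracker();
  await Promise.all(Array.from({ length: count }, () => task()));
  return peak();
}
```

`sequentialPeak(3)` resolves to 1 and `concurrentPeak(3)` resolves to 3. Everything still runs on one thread; the overlap is in the waiting, not in the computation.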
Interesting. How is the concurrency achieved? I don’t see any info about concurrency implementation on the website
Some info on this page: https://lfi.dev/docs/concepts/concurrent-iterable
Recommend reading it all, but this part is probably most useful: https://lfi.dev/docs/concepts/concurrent-iterable#how-does-i...
Ok. So it’s all single threaded, right? Then it only looks like it runs concurrently from the consumer side. There’s no actual work being done at the same time.
Another fp-ts
I'm somewhat familiar with fp-ts, and I think lfi is pretty different.
fp-ts seems more focused on "pure" functional programming in the style of languages like Haskell. It's much more opinionated on how you should write your code, including the data structures you should use. Plus, it's not really concerned with concurrency in the way that lfi is.
I think lfi is a lot less invasive/opinionated on how you write your code.
Thanks for this explanation. fp-ts is not specifically concerned with concurrency, but it does run async computations concurrently (in parallel) by default where possible. You can opt out by using the seq (sequential) versions of functions.
Call me an idiot, but nowhere on the home page does it say it's for JavaScript. The code samples looked like JS, but I don't know every language, so I wasn't sure. It's not until I got into the docs and saw "npm" was I sure we were talking about JS...
I came to comment the same thing, having clicked through from the link and not seen the OP’s text. In my case, I searched for and found the GitHub link at the bottom of the landing page and saw the language in the repo details.
Please add “for JavaScript” to the title!
> nowhere on the home page does it say it's for JavaScript
Yeah, and it is not mentioned in the post title. It's like JS is considered the default programming language.
should probably say "JS" in the title too - there are all sorts of devs on HN after all
it says "lfi is a lazy functional sync, async, and concurrent iteration library for JavaScript and TypeScript" when you click on the link though
has it been added since?
Yup, I added it in response to the feedback :)
That's fair! Agreed it would be good to mention on the home page :)
I did mention in the HN post description, but probably should also say so in the title
Actually, looks like HN doesn't allow me to update the HN title at this point. Sorry about that
Was gonna post some snark about assumed languages, countries, etc, now i think there could be an interesting post in that..
Should be renamed to lfi.js