In an afternoon I was able to implement a pretty decent, completely async DNS resolver client in Swift for my app. DNS clients are simple enough to build that rolling your own async one is not a big deal anymore.
Yes, there is separate work to determine which DNS server the system is currently using: on macOS this requires a call to an undocumented function in libSystem, one that both Chromium and Tailscale use!
A lot of folks think this, but did you also implement EDNS0?
The Go team also thought DNS clients were simple, and it led to almost ten years of difficult-to-debug panics in Docker, Mesos, Terraform, Consul, Heroku, Weave and countless other services and CLI tools written in Go. (Search "cannot unmarshal DNS message" and marvel at the thousands of forum threads and GitHub issues that all bottom out at Go implementing the original DNS spec and not following later updates.)
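For anyone wondering what the EDNS0 part looks like on the wire, here is a minimal Python sketch of a query that carries an OPT pseudo-record (RFC 6891) in the additional section. The function names, the fixed query ID and the 1232-byte advertised payload size are my own assumptions for illustration; it only builds the packet and does not deal with truncation, TCP fallback or parsing replies, which is where most of the real work is.

```python
import struct


def encode_name(name):
    """Encode a hostname as length-prefixed DNS labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1..63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, query_id=0x1234, udp_payload=1232):
    """Build a recursive query with one question and one EDNS0 OPT record.

    query_id is fixed here only to keep the sketch deterministic; a real
    client should pick an unpredictable ID per query.
    """
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    # Question: name, QTYPE (1 = A), QCLASS (1 = IN)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    # OPT pseudo-RR: root name, TYPE=41, CLASS carries the UDP payload size,
    # TTL carries extended RCODE/version/flags (all zero here), RDLENGTH=0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload, 0, 0)
    return header + question + opt
```

Without the OPT record the server has to assume the original 512-byte UDP limit, so anything larger comes back truncated and the client needs to handle that path too.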
The first linked article was recently discussed here: RIP pthread_cancel (https://news.ycombinator.com/item?id=45233713)
In that discussion, most of the points in this article were already covered, including some async DNS alternatives.
See also the discussion here: https://github.com/crystal-lang/crystal/issues/13619
I am always amused when folks rediscover the bad idea that is `pthread_cancel()` — it’s amazing that it was ever part of the standard.
We knew it was a bad idea at the time it was standardized in the 1990s, but politics — and the inevitable allure of a very convenient sounding (but very bad) idea — meant that the bad idea won.
Funnily enough, while Java has deprecated its version of thread cancellation for the same reasons, Haskell still has its own. When you’re writing code in IO, you have to be prepared for async cancellation anywhere, at any time.
This leads to common bugs in the standard library that you really wouldn’t expect from a language like Haskell; e.g. https://github.com/haskell/process/issues/183 (withCreateProcess async exception safety)
What's crazy is that it's almost good. All they had to do was make the next syscall return ECANCELED (already a defined error code!) rather than terminating the thread.
Musl has an undocumented extension that does exactly this: PTHREAD_CANCEL_MASKED passed to pthread_setcancelstate.
It's great and it should be standardized.
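To make the idea concrete, here is a toy Python model of "cancellation shows up as an error from the next blocking call". This is not the musl extension or any pthread API: `CancelToken`, `blocking_wait` and `worker` are hypothetical names, and the blocking call is just a stand-in for a syscall. The point is only that the thread sees an ordinary ECANCELED failure, so its normal error handling and cleanup run.

```python
import errno
import os
import threading


class CancelToken:
    """Hypothetical per-thread cancellation flag (not a real pthread or musl API)."""

    def __init__(self):
        self._event = threading.Event()

    def cancel(self):
        self._event.set()

    def blocking_wait(self, seconds):
        """Stand-in for a blocking syscall: fails with ECANCELED once cancelled."""
        if self._event.wait(seconds):
            raise OSError(errno.ECANCELED, os.strerror(errno.ECANCELED))


def worker(token, log):
    log.append("acquired")              # pretend we took a lock or opened a file
    try:
        while True:
            token.blocking_wait(0.05)   # the "next syscall"
    except OSError as exc:
        if exc.errno != errno.ECANCELED:
            raise
        log.append("cancelled")         # cancellation handled like any other error
    finally:
        log.append("released")          # the ordinary cleanup path still runs


def run_demo():
    log = []
    token = CancelToken()
    thread = threading.Thread(target=worker, args=(token, log))
    thread.start()
    token.cancel()
    thread.join()
    return log
```

Calling `run_demo()` gives the acquire, cancel, release sequence in order, with the thread unwinding through its own code.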
It’s extremely easy to write application code in Haskell that handles async cancellation correctly without even thinking about it. The async library provides high-level abstractions. However, your point is still valid: if you write library code at a low level of abstraction (as the standard library must), I do think it is just as error-prone as in Java or C.
IO can fail at any point though, so that’s not particularly bad.
For those using it in Python, Gevent provides a pluggable set of DNS resolvers that monkey-patch the standard library's functions for async/cooperative use, including one built on c-ares: https://www.gevent.org/dns.html
gevent. Man that's a blast from the past
Just curious how you approached performance bottlenecks — anything surprising you discovered while testing?
Who can fix getaddrinfo?
There are steps that three different parties can take, which do not depend on other parties to cooperate:
POSIX can specify a new version of DNS resolution.
libcs can add extensions, allowing applications to detect when they are targeting those systems and use them.
Applications on Linux and Windows can bypass libc.
What about macOS?
It's weird to me that event-based DNS using epoll or similar doesn't have a battle-tested implementation. I know it's harder to do in C than in Rust but I'm pretty sure that's what Hickory does internally.
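For what it's worth, the event-driven part itself is small. Below is a rough Python sketch using the standard `selectors` module (which picks epoll, kqueue or similar where available) to send one UDP query and wait for the matching reply without blocking in `recv`. `exchange_udp` is a made-up name, the query bytes are assumed to be a prebuilt DNS message, and there are no retries, no TCP fallback and no checks beyond the ID match, so this is nowhere near battle-tested.

```python
import selectors
import socket
import time


def exchange_udp(query, server, timeout=2.0):
    """Send one DNS query over UDP and return the first reply with the same ID.

    `server` is an (ip, port) tuple. Raises TimeoutError if nothing matching
    arrives before the deadline.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    sel = selectors.DefaultSelector()
    try:
        sock.connect(server)            # UDP connect: drop datagrams from other peers
        sock.send(query)
        sel.register(sock, selectors.EVENT_READ)
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("no DNS reply before deadline")
            # Sleep in the selector, not in recv, until the socket is readable
            for _key, _events in sel.select(remaining):
                try:
                    reply = sock.recv(65535)
                except BlockingIOError:
                    continue            # spurious wakeup, go back to waiting
                # The first two bytes of a DNS message are the query ID
                if reply[:2] == query[:2]:
                    return reply
    finally:
        sel.close()
        sock.close()
```

A real resolver would register many sockets on one selector and drive them all from the same loop; this version uses one socket only to keep the sketch short.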
It’s a weird problem, in that (1) DNS is hard, and (2) you really need the upstream vendor to solve the problem, because correct applications want to use the system resolver.
If you don’t use the system resolver, you have to glue into the system’s configuration mechanism for resolvers somehow … which isn’t simple — for example, there’s a lot of complex logic on macOS around handling which resolver to use based on what connections, VPNs, etc, are present.
And then there’s nsswitch and other plugin systems that are meant to let globally configured hooks plug into the name resolution path.
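As a small illustration of the "glue into the system's configuration" step, here is a Python sketch that pulls nameservers and search domains out of resolv.conf-style text. `parse_resolv_conf` is a hypothetical helper, and it only covers the easy case: it ignores `options`, knows nothing about nsswitch, and on macOS the file is not the whole story for the reasons given above.

```python
def parse_resolv_conf(text):
    """Extract nameservers and search domains from resolv.conf-style text.

    Only the `nameserver` and `search` keywords are handled; every other
    line is ignored.
    """
    nameservers = []
    search = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines starting with '#' or ';'
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "nameserver":
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]         # a later search line replaces earlier ones
    return {"nameservers": nameservers, "search": search}
```

On a typical Linux box you could feed it the contents of `/etc/resolv.conf`, but the result may just be a local stub such as 127.0.0.53, which is part of why bypassing the system resolver is harder than it looks.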
(1) DNS is hard
It's really not.
Just because some systems took something fundamentally simple and wrapped a bunch of unnecessary complexity around it does not make it hard.
At its core, it's an elegant, minimal protocol.
Another related article: https://ziglang.org/devlog/2025/#2025-10-15