Somehow torvalds/linux is in Fronterra, next to JS projects, awesome-X lists, and frontend checklists.
Either kernel hackers unexpectedly love frontend, or, more likely, the people who write the code don't overlap much with the people who star GitHub projects!
Jaccard similarity is not particularly good for "celebrity" projects.
They are similar because they are popular, not because there is a semantic relationship.
It's the same problem I faced with the map of reddit (https://anvaka.github.io/map-of-reddit/ ) - all popular subreddits are just "similar" to each other.
Still works great for smaller, non-celebrity projects :D
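For anyone who wants to poke at the "celebrity" effect, here is a minimal sketch of Jaccard similarity over stargazer sets. The repo names and user IDs are made up for illustration, and this is not the map's actual pipeline, just the textbook formula.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical stargazer IDs (assumption: users are plain integers).
kernel = set(range(0, 600))          # a very popular repo
frontend_list = set(range(300, 900)) # another very popular, unrelated repo
small_lib = set(range(900, 920))     # a niche library
small_plugin = set(range(910, 930))  # a plugin for that library

print(jaccard(kernel, frontend_list))   # 300 shared / 900 total, about 0.33
print(jaccard(small_lib, small_plugin)) # 10 shared / 30 total, about 0.33
print(jaccard(kernel, small_lib))       # no shared stargazers
```

The score only sees who starred what. In this toy data the two unrelated popular repos score the same as the library and its plugin, purely because the same crowd starred both.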
I wonder if code embeddings might have been a better way to organize the projects, although probably infeasible given the amount of resources required to download and compute embeddings for each file.
Perhaps for the same reason heat maps are often really just the underlying population map: https://xkcd.com/1138/
Because of React?
Live link: https://anvaka.github.io/map-of-github/
Yeah, this should be the link - not the repo.
"Sussex" as the name of the Among Us section had me laughing
The funniest one I saw was "Lispaña"
Surprised at how small Rustland is. Barely a province in Clouderra.
Also, interesting how both Bevy and Veloren are in Rustland. Probably, the stars come more from the Rust community than the game dev community. Which I guess makes sense: the Rust ecosystem is still relatively small and feels like a lot of people doing X but in Rust.
I'm also shocked how small "nodelandia" is and that it's not even its own continent. I guess we all overestimate the size of our bubbles.
I can see many osdev Rust projects in "PlusPlus Nation" near other kernels, which means that "X but in Rust" might be in "X" instead of "RustLand".
Not that surprised. Rust is known for being evangelized by a very loud minority.
The data is from March 2023 according to OP, so a lot of the more recent Rust projects just won't be included yet.
Happy to see bevy between them though! :)
Also, lol at Zig being a suburb of Rust
Very fun to be able to find my own project there (mapbox-gl-utils):
https://anvaka.github.io/map-of-github/#12/24.78947/18.85186
A fun minigame is trying to find a particular project using the map only, without the search feature :-)
or start with one project and find your way to another, you can imagine there are shipping lines :)
love it =)!
As a fan of Julia, surprised to see how julialang/julia has so few links. It's a niche language; how isolated it is on this map is maybe not so unrepresentative of the user or developer experience.
There's a JuliaLand to the west of the island where julialang/julia is.
The fact that julialang/julia ended up near tensorflow and opencv, and actual Julia packages ended up elsewhere, probably reflects a difference between aspirational users and real users: a lot of people who starred the Julia project itself were numeric Python users who were looking for a new Python, but then mostly stuck to Python itself, so their other stars are in the numeric Python land. Those who starred the JuliaLand packages are the actual Julia users who aptly enough ended up near Moleculandia and AstroSpace and Quantumia.
Very neat and creative approach, but I'm honestly conflicted about whether the country/map metaphor is the best choice. In many cases the names are not that clear, so one has to zoom in to understand what they represent. It would perhaps be more interesting to do hierarchical clustering and show something like the average connectivity between the (super)clusters with lines, possibly with more descriptive/faithful LLM-generated labels for each cluster.
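A rough sketch of what that could look like: average-linkage agglomerative clustering over pairwise similarities, then the mean similarity between the resulting superclusters as the "line" weight. The repo names come from this thread, but the similarity values are invented for illustration, and this is only one of several possible linkage choices.

```python
from itertools import combinations

def average_linkage(c1, c2, sim):
    """Mean pairwise similarity between members of two clusters."""
    total = sum(sim.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerate(items, sim, target):
    """Repeatedly merge the two most similar clusters until `target` remain."""
    clusters = [frozenset([i]) for i in items]
    while len(clusters) > target:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], sim))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters

# Hypothetical similarities; missing pairs are treated as 0.0.
sim = {
    frozenset(("bevy", "veloren")): 0.6,
    frozenset(("htmx", "django")): 0.5,
    frozenset(("bevy", "htmx")): 0.1,
}
items = ["bevy", "veloren", "htmx", "django"]

clusters = agglomerate(items, sim, target=2)
print(clusters)
# Average connectivity between the two superclusters (the edge to draw).
print(average_linkage(clusters[0], clusters[1], sim))
```

Keeping the merge order would give the full hierarchy, so the same structure could drive zoom levels instead of hand-drawn borders.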
Quitlessia and NeoQuitlessia... These names are evil. (Doom Emacs lies in NeoQuitlessia instead of Emacsia, which surprisingly makes sense. :)
How are connections between repos determined? I checked some of my repos and don't see any references in either direction for some of the connections.
I'm not sure why BinanceLand is in AILandia though, please don't encourage them XD
It's good to see all those "why is X in Y?" type comments.
Remember that feeling when deploying algorithms, especially when those affect people (which hopefully is not the case with this nice project).
A mechanism to explain how specific results came about is as much part of the project as the more technical machine learning choices involved.
The author of this also made other outstanding visualisations.
A while back ngraph blew my mind. I built a taxonomy biz off ngraph:
Wikimedia is right next to GPT Nation. I think an invasion is imminent.
Very interesting that HTMX (bigskysoftware/htmx), which is backend-agnostic, lives in Pythonia->Djangonia and not in e.g. Fronterra.
Does this mean that HTMX is mostly used by Django devs?
Django is in the middle of Pythonia, and not in Djangonia. Weird!
Lispaña is a really excellent name for a lisp country :)
I do have the theory that the more untyped the language is, the larger the islands are: Fronterra (JavaScript), Cloudderra (YAML), AILandia (Python) are way bigger than Java, Swift, DotNet, etc., even though the common prejudice is that the problem of software engineering is stale old enterprise code in Java/DotNet.
That might be the case, but the libraries seem to be more reusable!
Cool visualisation!
It was somewhat amusing that MicroPython isn't in MicroPythonia but Arduinoria...and CircuitPython is in PicoPythonia. :)
Kudos to the author for the amazing idea!
The only problem I see is that projects don't fit so nicely in the division between languages (Pythonia, Javaland, Clojuria, etc) and applications (Gamedonia, AILandia, etc). There's a lot of intersection between them.
But the visualization is super-cool nonetheless. :)
But stargazers are absolutely meaningless, since most of them are bots that give stars for payment and like random stuff to throw off detection.
And as usual, important libraries don't get as much attention as flashy little leaf projects.
this looks really great!
I tried something similar a few weeks ago, using the embedding vectors of the GitHub project descriptions.
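In case it helps anyone compare the two approaches, here is a toy version of the description-based idea. Big caveat: `embed` below is a bag-of-words stand-in, not a real embedding model, and the descriptions are invented. Only the cosine-similarity step would carry over to real vectors.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in 'embedding': bag-of-words counts (assumption, not a learned model)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

# Hypothetical project descriptions.
engine = embed("a game engine in rust")
voxel = embed("a voxel game in rust")
web = embed("a web framework in python")

print(cosine(engine, voxel))  # 4 shared words out of 5 each, about 0.8
print(cosine(engine, web))    # 2 shared words out of 5 each, about 0.4
```

Unlike star overlap, this groups projects by what they say they are, so a popular repo is not pulled toward other popular repos just for being popular.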
"Stop the war" looks like a very small territory, you don't even need to think what kind of message they send. It's so small in the grand scheme
Interesting that azureland is under l33t nation and not clouderia
> In the second phase I computed exact Jaccard Similarity between each repository.
Using what inputs? The repo seems to have only the frontend code.
Why was Jaccard similarity preferred here? I would love to learn more about the choice process. Fantastic work though, love it.
I've been thinking something similar for identifying ownership areas within an organization would be cool.
ZH.Pyscrapia had an island of its own.
docker-minecraft is under Adulttopia. I wonder what made it make that connection
Nice, but kind of weird to find piku/Piku in Fronterra.
This is truly a work of art! Great job!
yay anvaka reaches front page!
fun times from reddit map
> Homelabia
Definitely some unique naming choices there lol
Very well done, loads quickly and is usable even from mobile.
I love this sort of concept map and I am typically disappointed by the execution.
"The GitHub Archipelago"
This is phenomenal!
couldn't find any of my stuff so that means i gotta do more lol
Cool
Interesting how one fork of Magisk lands in "AndroModLand" and another in some gaming space.
FORTRAN and COBOL Programming is a part of the AI island, lol.
Looks like AI is already trashing the place, lol.
Some might say that PHP is dead (and I’d be one of them too), but there is a PHP kingdom on the map! :) I think we might have all been mistaken.