My most recent example of this is mentoring young, ambitious, but inexperienced interns.
Not only did they produce in a day about the same amount of code they used to produce in a week (or two), but several other things also made my work harder than before:
- During review, they hadn't thought as deeply about their code so my comments seemed to often go over their heads. Instead of a discussion I'd get something like "good catch, I'll fix that" (also reminiscent of an LLM).
- The time spent on trivial issues went down a lot, to almost zero, but the remaining issues were much more subtle and time-consuming to find and describe.
- Many bugs were of a new kind (to me): the code would look like it does the right thing but actually not work at all, or just be much more broken than code with that level of "polish" would normally be. This breakdown of pattern-matching, compared to "organic" code, made the overhead much higher. Spending decades reviewing code and answering Stack Overflow questions often makes it possible to pinpoint not just a bug but how the author got there in the first place, and how to help them avoid similar things in the future.
- A simple but bad (inefficient, wrong, illegal, ugly, ...) solution is a nice thing to discuss, but the LLM-assisted junior dev often cooks up something much more complex, which can be bad in many ways at once. The culture of slowly growing a PR from a little bit broken, thinking about design and other considerations, until it's high quality and ready for a final review doesn't work the same way.
- Instead of fixing the things in the original PR, I'd often get a completely different approach as the response to my first review. Again, often broken in new and subtle ways.
This led to a kind of effort inversion, where senior devs spent much more time on these PRs than the junior authors themselves. The junior dev would feel (I assume) much more productive and competent, but the response to their work would eventually lack most of the usual enthusiasm or encouragement from senior devs.
How do people work with these issues? One thing that worked well for me initially was to always require a lot of (passing) tests, but eventually these tests would suffer from many of the same problems.
> - Many bugs were of a new kind (to me): the code would look like it does the right thing but actually not work at all, or just be much more broken than code with that level of "polish" would normally be.
This reminded me of a quarter-million-dollar software project one of my employers had contracted to a team in a different country. On the face of it - especially if you go and check by the spec sheet - everything was there, but the thing was not a cohesive whole. They did not spend one second beyond the spec sheet, and none of the common-sense things that "follow" from the spec were there. The whole thing was scrapped immediately.
With LLMs this kind of work now basically becomes free to do and automatic.
I'm expecting to see so much more poor quality software being made. We're going to be swimming in an ocean of bad software.
Good experienced devs will be able to make better software, but so many inexperienced devs will be regurgitating so much more lousy software, at a pace never seen before, that it's going to be overwhelming. Or as the original commenter described, they're already being overwhelmed.
> Good experienced devs will be able to make better software
I lowkey disagree. I think good experienced devs will be pressured to write worse software or be bottlenecked by having to deal with bad software. Depends on the company and culture of course. But consider that you as an experienced dev now have to explain things that go completely over the head of the junior devs, and most likely the manager/PO, so you become the bottleneck, and all pressure will come down on you. You will hear all kinds of stuff like "80% there is enough" and "don't let perfect be the enemy of good" and "you're blocking the team, we have a deadline", and that will become even worse. Unless you're lucky enough to work in a place with an actually good engineering culture.
I think the recent post about the Cloudflare engineer who built an OAuth implementation, https://news.ycombinator.com/item?id=44159166, shows otherwise (note the Cloudflare engineer, kentonv, comments a bunch in the discussion). The author, who is clearly an expert, said it took him days to complete what would have taken him weeks or months to write manually.
I love that thread because it clearly shows both the benefits and pitfalls of AI codegen. It saved this expert a ton of time, but the AI also created a bunch of "game over" bugs that a more junior engineer probably would have checked in without a second thought.
There was also a review of that code about a week later [0] which highlights the problems with LLM-generated code.
Even looking strictly at coding, the hard thing about programming is not writing the code. It is understanding the problem and figuring out an elegant and correct solution, and LLMs can't replace that process. They can help with ideas though.
> There was also a review of that code about a week later [0] which highlights the problems with LLM-generated code.
Not really. This "review" was stretching to find things to criticize in the code, and exaggerated the issues he found. I responded to some of it: https://news.ycombinator.com/item?id=44217254
Unfortunately I think a lot of people commenting on this topic come in with a conclusion they want to reach. It's hard to find people who are objectively looking at the evidence and drawing conclusions with an open mind.
Thanks for responding. I read that dude's review, and it kind of pissed me off in an "akshually I am very smart" sort of way.
Like his first argument was that you didn't have a test case covering every single MUST and MUST NOT in the spec?? I would like to introduce him to the real world - but more to the point, there was nothing in his comments that specifically dinged the AI, and it was just a couple pages of unwarranted shade that was mostly opinion with 0 actual examples of "this part is broken".
> Unfortunately I think a lot of people commenting on this topic come in with a conclusion they want to reach. It's hard to find people who are objectively looking at the evidence and drawing conclusions with an open mind.
Couldn't agree more, which is why I really appreciated the fact that you went to the trouble to document all of the prompts and make them publicly available.
Thank you for answering, I hadn't seen your rebuttal before. It does seem that any issues, if there even were any (your arguments about the CORS headers sound convincing to me, but I'm not an expert on the subject - I study them every time I need to deal with this), were not a result of using an LLM but a conscious decision. So either way, the LLM has helped you achieve this result without introducing any bugs that you missed and that Mr. Madden found in his review, which sounds impressive.
I won't say that you have converted me, but maybe I'll give LLMs a shot and judge for myself if they can be useful to me. Thanks, and good luck!
To be fair, there was a pretty dumb CVE (which had already been found and fixed by the time the project made the rounds on HN):
https://github.com/cloudflare/workers-oauth-provider/securit...
You can certainly make the argument that this demonstrates risks of AI.
But I kind of feel like the same bug could very easily have been made by a human coder too, and this is why we have code reviews and security reviews. This exact bug was actually on my list of things to check for in review; I even feel like I remember checking for it, and yet, evidently, I did not, which is pretty embarrassing for me.
You touch on exactly the point that I try to make to the AI-will-replace-XXX-profession crowd: You have to already be an expert in XXX to get the most out of AI. Cf. Gell-Mann Amnesia.
I'm showing my age, but this is almost exactly analogous to the rise of Visual Basic in the late nineties.
The promise then was similar: "non-programmers" could use a drag-and-drop, WYSIWYG editor to build applications. And, IMO, VB was actually a good product. The problem is that it attracted "developers" who were poor/inexperienced, and so VB apps developed a reputation for being incredibly janky and bad quality.
The same thing is basically happening with AI now, except it's not constrained to a single platform, but instead it's infecting the entire software ecosystem.
We turned our back on VB. Do we have the collective will to turn our back on AI? If so I suspect it’ll take a catalyzing event for it to begin. My hunch tells me no, no we don’t have the will.
We didn't turn our back on VB. Microsoft killed it when it became a citizen of the .NET ecosystem; pairing it with C# concepts, requiring extensive code changes, and an IDE that was read-only during debugging (yeah, you couldn't edit the code while debugging) killed the product.
Greed (wanting an enterprise alternative to Java and C++ Builder) killed VB, not the community.
Fwiw I honestly think it was a mistake to turn our back on VB.
Yes, there were a lot of crappy, barely functioning programs made in it. But they were programs that wouldn’t have existed otherwise. E.g. for small businesses automating things, VB was amazing, and even if the program was barely functional it was better than nothing.
When the Derecho hit Iowa and large parts of my area were without power for over a week we got to discover just how many of our very large enterprise processes were dependent to some degree on "toy" apps built in "toy" technologies running on PCs under people's desks. Some of it clever but all of it fragile. It's easy to be a strong technical person and scoff at their efforts. Look how easily it failed! But it also ran for years with so few issues it never rose to IT's attention before a major event literally took the entire regional company offices offline. It caused us some pain as we had to relocate PCs to buildings with sufficient backup power. But overall the effort was far smaller than building all of those apps with the "proper" tools and processes in the first place.
Large companies can be a red tape nightmare for getting anything built. The process overload will kill simple non-strategic initiatives. I can understand and appreciate less technical people who grab whatever tool they can to solve their own problems when they run into blockers like that. Even if they don't solve it in the best way possible according to experts in the field. That feels like the hacker spirit to me.
Please don’t stop at building “toy” prototypes; it’s a great start, but take some time to iterate, rebuild, bring it to production standards, and make it resilient and scalable.
You’d be surprised how little effort it is compared to having to deal with a massive outage. E.g. you did eventually have to think about backup power.
Came here looking for this comment!
I think we will need to find a way to communicate “this code is the result of serious engineering work and all tradeoffs have been thought about extensively” and “this code has been vibecoded and no one really cares”. Both sides of that spectrum have their place and absolutely will exist. But it’s dangerous to confuse the two
There's a simple way to communicate it. Just leave in the emoticons added in comments by the LLM.
Wrote it initially as a joke, but maybe it's not that dumb? I already do it on LinkedIn. I'm job hunting and post slop from time to time to game LinkedIn's algorithms to get better positioning among other potential candidates. And so as not to waste anybody's time, I leave in the emotes at the beginning of sentences just so people in the know realize it's just slop.
Interesting thought. Yeah, the whole LLM-generated thing might end up being a boon. It is (reasonably) distinctive, at least for now. And rightly or wrongly, it triggers defensive reflexes.
Drag and drop GUI builders were awesome. Responsive layouts ruined GUI programming for me. It made it too much of a fuss to make anything "professional".
> Do we have the collective will to turn our back on AI?
Why do you believe we should "turn our back on AI"? Have you used it enough to realize what a useful tool it can be?
Wouldn't it make more sense to learn to turn our backs on unhelpful uses of AI?
It's the exact same thing every time a technical bar is lowered and more people can participate in something. From having to manually produce your own film to having film processing readily available on demand to not needing to process film at all and everyone has a camera in their pocket. The number of people taking photos has absolutely exploded. The average quality of photos has to have fallen through the floor. But you've also got a ton of people who couldn't participate previously for one reason or another who go on to do great things with their new found capabilities.
Software is a very different beast though because this crappy technical debt lives on, it often grows "tentacles" with poorly defined boundaries, people and companies come to depend on it, and then the mess must eventually be cleaned up.
Take your photos example. Sure, the number of photos taken has exploded, but who cares if there are now reams and reams of crappy vacation photos - it's not like anyone is really forced to look at them.
With AI-generated code, I think it's actually awesome for small, individual projects. And in capable hands, they can be a fantastic productivity enhancer in the enterprise. But my heart bleeds for the poor sap who is going to eventually have to debug and clean up the mountains of AI code being checked in by folks with a few months/years of experience.
> ...people and companies come to depend on it, and then the mess must eventually be cleaned up.
I have found time and again that enough technological advancement makes previously difficult things easy, so that when it's time to clean up the old stuff, it's not such a huge issue. Especially so if you do not need to keep a history of everything and can start fresh. This probably would not fly in a huge corp but it's fine for small/medium businesses. After all, whole companies disappear and somehow we live on.
There are ways to fight it though. Look at the Linux kernel for instance - they have been overwhelmed with poor contributions since long before LLMs. The answer is to maintain standards that put as much burden on the contributor as possible, and to normalize an unapologetic "no" from reviewers.
Does that work as well with non-strangers who are your coworker? I'm not sure.
Also if you're organizationally changing the culture to force people to put more effort in writing the code, why are you even organizationally using LLMs...?
> Does that work as well with non-strangers who are your coworker? I'm not sure.
Simply hire people who score high on the Conscientiousness, but low on the Agreeableness personality trait. :-)
> Does that work as well with non-strangers who are your coworker?
Yeah, OK, I guess you have to be a bit less unapologetic than Linux kernel maintainers in this case, but you can still shift the culture towards more careful PRs I think.
> why are you even organizationally using LLMs
Many people believe LLMs make coders more productive, and given the rapid progress of gen AI it's probably not wise to just dismiss this view. But there need to be guardrails to ensure the productivity is real and not just creating liability. We could live with weaker guardrails if we can trust that the code was in a trusted colleague's head before appearing in the repo. But if we can't, I guess stronger guardrails are the only way, aren't they?
I don’t want to just dismiss the productivity increase. I feel 100% more productive on throw away POCs and maybe 20% more productive on large important code bases.
But when I actually sit down and think it through, I’ve wasted multiple days chasing down subtle bugs that I never would have introduced myself. It could very well be that there’s no productivity gain for me at all. I wouldn’t be at all surprised if the numbers showed that was the case.
But let’s say I am actually getting 20%. If this technology dramatically increases the output of juniors and mid level technical tornadoes that’s going to easily erase that 20% gain.
I’ve seen codebases that were dominated by mid-level technical tornadoes and juniors; no amount of guardrails could ever fix them.
Until we are at the point where no human has to interact with code (and I’m skeptical we will ever get there short of AGI) we need automated objective guardrails for “this code is readable and maintainable”, and I’m 99.999% certain that is just impossible.
My point in that second question was: Is the human challenge of getting a lot of inexperienced engineers to fully understand the LLM output actually worth the time, effort and money to solve vs sticking to solving the technical problems that you're trying to make the LLM solve?
Usually organizational changes are massive efforts. But I guess hype is a hell of an inertia buster.
The change is already happening. People graduating now are largely "AI-first", and it's going to be even worse if you listen to what teachers say. And management often welcomes it too. So you need to deal with it one way or another.
> Does that work as well with non-strangers who are your coworker? I'm not sure.
I imagine if you have a say in their performance review, you might be able to set "writes code more thoughtfully" as a PIP?
No, because that's not measurable
It's measurable in the number of times you have to spend >x minutes to help them go through something they should have written up by themselves. You can count the number of times you have to look at something and tell them "do it again, but without LLM this time". At some point you fire them.
That’s not measurable either. Your opinion on someone is not data.
My opinion on someone is how I decide whether I want to work with them and help them grow or fire them/wait for them to fail on their own merit (if somebody else is in charge of hiring/firing).
Yes, but some of us have seen this coming for a long time now.
I will have my say in the matter before all is said and done. While everyone is busy pivoting to AI, I keep my head down and build the tools that will be needed to clean up the mess...
Any hints on what kind of tools you're creating for the inevitable mess?
https://github.com/bablr-lang/
I'm building a universal DOM for code, so that we'll see an explosion in code whose purpose is to help clean up other code.
If you want to write code that makes changes to a tree of HTML nodes, you can pretty much write that code once and it will run in any web browser.
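A minimal sketch of what I mean, written against the standard DOM API (the selector and attributes here are just an illustration, not anything from the projects above):

```ts
// A transform over a tree of HTML nodes, written once against the standard DOM API.
// Because the DOM is a shared standard, the same code runs in any web browser.
function openExternalLinksInNewTab(root: Document): void {
  for (const a of root.querySelectorAll<HTMLAnchorElement>('a[href^="http"]')) {
    a.setAttribute('target', '_blank');
    a.setAttribute('rel', 'noopener noreferrer');
  }
}

openExternalLinksInNewTab(document);
```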
If you want to write code that makes a new program by changing a tree of syntax nodes, there are an incredible number of different and wholly incompatible environments for that code to run in. Transform authors are likely forced to pick one or two engines to support, and anyone who needs to run a lot of codemods will probably need to install 5-10 different execution engines.
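For contrast, here's roughly what a codemod looks like today, tied to one particular engine (this is a jscodeshift-style transform; the rename itself is a made-up example):

```ts
// A codemod written against one specific engine (jscodeshift). The same logic
// would have to be rewritten to run on any other codemod runner.
import type { FileInfo, API } from 'jscodeshift';

export default function transform(file: FileInfo, api: API): string {
  const j = api.jscodeshift;
  return j(file.source)
    .find(j.Identifier, { name: 'oldHelper' }) // every identifier named oldHelper
    .forEach((path) => {
      path.node.name = 'newHelper'; // rename it in place
    })
    .toSource(); // print the modified tree back to source text
}
```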
Most people seem not to notice or care about this situation, or realize that their tools are vastly underserving their potential, just because we can't come up with the basic standards necessary to enable universal execution of codemod code. That also means there are drastically lower incentives to write custom codemods and lint rules than there could/should be.
Who is the consumer for the JSX noise that is happening here? https://github.com/bablr-lang/language-en-ruby/blob/550ad6fd...
As two nits, https://docs.bablr.org/reference/cstml and https://bablr.org/languages/universe/ruby are both 404, but I suspect the latter is just falling into the same trap many namespaces do: using a URL when they meant it as a URN.
We're cleaning up the broken links as time goes on, but it is probably obvious to you from browsing around that some parts of the site are still very much under construction.
The JSX noise is CSTML, a data format for encoding/storing parse trees. It's our main product. E.g. a simple document might look something like `<*BooleanLiteral> 'true' </>`. It's both the concrete syntax and the semantic metadata offered as a single data stream.
The easiest way to consume a CSTML document is to print the code stored in it, e.g. `printSource(parseCSTML(document))`, which would get you `true` for my example doc. Since we store all the concrete syntax, printing the tree is guaranteed to get you the exact same input program the parser saw. This means you can use this to rearrange trees of source code and then print them over the original, allowing you to implement linters, pretty-printers, or codemod engines.
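To make that round trip concrete, a minimal sketch; `parseCSTML` and `printSource` are just the helper names used above, declared here as stand-ins since the actual package entry points and signatures may differ:

```ts
// Stand-in declarations for the helpers named above; the real BABLR API may differ.
declare function parseCSTML(doc: string): unknown;
declare function printSource(tree: unknown): string;

// A tiny CSTML document: concrete syntax and semantic metadata in one stream.
const doc = "<*BooleanLiteral> 'true' </>";

// Round-trip: parse the document, then print the source stored inside it.
// Because all concrete syntax is preserved, this prints exactly: true
const tree = parseCSTML(doc);
console.log(printSource(tree));
```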
These CSTML documents also contain all the information necessary to do rich presentation of the code document stored within (syntax highlighting). I'm going to release our native syntax highlighter later today hopefully!
Where does this "universal DOM for code" sit in relation to CSTs and ASTs?
It's an immutable btree-based format for syntax trees which contain information both abstract and concrete. Our markup language for serializing the trees is Concrete Syntax Tree Markup Language, or CSTML.
A faster command to recursively unlink files.
> We're going to be swimming in an ocean of bad software
I think we already are. We're about to be drowning in a cesspit. The support for the broken software is going to be replaced by broken LLM agents.
> I'm expecting to see so much more poor quality software being made. We're going to be swimming in an ocean of bad software.
That's my expectation as well.
The logical outcome of this is that the general public will eventually get fed up, and there will be an industry-wide crash, just like in 1983 and 2000. I suppose this is a requirement for any overly hyped technology to reach the Plateau of Productivity.
> Good experienced devs will be able to make better software,
No, they won't. It's a race to the bottom.
I can take extra time to produce something that won't fall over on the first feature addition, that won't need to be rewritten with a new approach when the models get upgraded/changed/whatever and will reliably work for years with careful addition of new code.
I will get underbid by a viber who produced a turd in an afternoon, and has already spent the money from the project before the end of the week.
Please, somebody make the "Is MongoDB web scale?" video for LLMs...
And for extra credit, create it using an LLM.
I'm waiting for someone to use an LLM to handle all AWS deployment, without review, with eventual bankruptcy as the result.
Even better if the accountants are using LLMs.
Or even better, hardware prototyping using LLMs with EEs barely knowing what they are doing.
So far, most software dumbassery with LLMs can at least be fixed. Fixing board layouts, or chip designs, not as easy.
AWS itself is currently polluting their online documentation with GenAI-generated snippets... I can only imagine what horrors lurk in their internal code base. In a move similar to the movie WarGames, maybe humans are now out of the loop, and before a final commit the LLMs are deciding...
Honestly, I expect LLMs, or the combination of algorithms that make them usable (Claude Code), to get better fast enough that we’ll never reach that phase. All the good devs know what the current problems with LLM-assisted coding are, and a lot of them are working to mitigate and/or fix those problems.
Did anyone say React in the Windows Start menu?
Folks, we already have bad software. Everywhere.
And nobody cares.
Windows usage share is slowly and surely falling. People care, they're just slow to realise.
https://gs.statcounter.com/os-market-share/desktop/worldwide...
If you want to sell high quality software, then you must be patient. Several decades worth of patient.
People care, it's just that they're not the ones shipping as often.