I tend to be sceptical when it comes to LLM-based coding tools, but many people seem to be raving about huge productivity gains, which I wouldn’t mind either.
However, trying cc left me very disappointed. For context, I’m working on a relatively greenfield Rust project and gave it tasks that I would consider appropriate for a junior-level colleague, like:
- change the return type of a trait and all its impls (sketched below)
- refactor duplicate code into a helper function
- replace some of our code with an external crate
It didn’t get any of them correct, and it took a very long time. Am I using the tool wrong?
How are you using cc or other agentic tools?
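For the first task, this is roughly the scale of change I mean (hypothetical trait, just for illustration):

    use std::io;

    // Before: the trait returned a bare value.
    //
    // trait Storage {
    //     fn load(&self, key: &str) -> String;
    // }

    // After: the return type changes, so every impl needs the same update.
    trait Storage {
        fn load(&self, key: &str) -> Result<String, io::Error>;
    }

    struct InMemory;

    impl Storage for InMemory {
        fn load(&self, _key: &str) -> Result<String, io::Error> {
            Ok(String::new())
        }
    }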
The more open-source code the LLM has read (i.e. code that is in its training data), the better it will be. Rust is not a very popular language compared to others, so Claude might not be able to perform as well as it does with, say, Python or JavaScript. Also, the more recent models might be better than the older ones.
I've been having fun with Claude Code and VSCode's agent. Any reasonably experienced engineer should be able to use it for a subset of languages without too many issues, but they definitely need to hydrate the context (e.g. using Claude.md) and have a sensible set of system prompts set up. Good, well-written and broken-down-into-steps user prompts are non-negotiable.
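For example, a bare-bones Claude.md might look something like this (the project details are made up, just to show the shape):

    # Project notes for Claude
    <!-- example only; adapt to your own project -->

    ## What this is
    A greenfield Rust service. Code lives in src/, integration tests in tests/.

    ## Conventions
    - Use `thiserror` for error types; no `unwrap()` outside tests.
    - Keep modules small, one responsibility per file.
    - Don't add new dependencies without asking first.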
In my limited experience, it’s better to create a full plan for what you want first, then ask your agent to build pieces of it at a time. Larger plans tend to fail because of the complexity. I’ll ask it to work on part a) in some small amount of detail while I write the instructions for part b). I then review the result of a) before letting it continue with b), and so on. In my experience this has worked well, but of course you should always review the results very carefully.
I recently used Claude Code and Go to build a metric-receiving service for a weather station project. I had it build all the code for this. It created the SQL statements, the handlers, the deployment scripts, the systemd service files, the tests, and a utility to generate API keys.
I think you are using the wrong language, to be honest. LLMs are best at languages like Python, JavaScript and Go: relatively simple structures and huge amounts of reference code. Rust is a less common language which is much harder to write.
Did you give Claude Code tests and the ability to compile in a loop? It's pretty good, in Go at least, at debugging and fixing issues when allowed to loop.
How do you give cc the ability to compile in a loop?
Simply tell it what command it should call to compile, or better yet, add that info to Claude.md.
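Something like this in Claude.md tends to work (assuming a Rust project like OP's; adjust the commands to your setup):

    ## Build and test
    - Type-check: `cargo check`
    - Run tests: `cargo test`
    - After every change, run `cargo check` and fix any errors before moving on.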
I can't live without it now. I think its ability to develop a new project (I use it to build websites) from zero to one is very strong, but for refactoring old projects it might need to get more familiar with the context first. You can open multiple terminals simultaneously to execute tasks in parallel. Also, I recommend using the ultrathink keyword - this is very useful when Claude Code needs to understand complex problems.
It's very good at refactoring, creating boilerplate, making big changes with moderate levels of precision.
Current LLMs still, a reasonable percentage of the time, get stuck on race conditions and bugs that aren't obvious via static analysis. If you can explain the exact source of a bug to an LLM it can get it, but if there's a seemingly obvious solution that isn't the correct one, it will try to fix things the wrong way.
It's best to use AI in areas where a lack of specificity or precision isn't a major hindrance, and where every abstraction is a closed loop that won't come back to hurt you later just because you don't know how it works.
Totally get where you’re coming from. I had a similar first impression: it felt slow, and the output wasn’t great unless I hand-held it through every step.
What helped me was shifting how I use it. I don’t treat it like a junior dev anymore; I treat it more like a second brain. For example:
I use Claude Code to explore options before I commit to a design. I’ll ask it “what are 3 ways to abstract this logic?” and sometimes that alone gives me a better direction.
It’s pretty good at turning rough notes or comments into starter code or test cases. That saves time on boilerplate.
If I feed it a clean, self-contained chunk of code and ask for a targeted change (e.g., “convert this to async”), it often nails it. But yeah, across a codebase, not so much.
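As an illustration, a targeted change like that might look like this (made-up Rust example, assuming tokio as the runtime):

    // Before: a small, self-contained blocking function.
    //
    // fn read_config(path: &str) -> std::io::Result<String> {
    //     std::fs::read_to_string(path)
    // }

    // After asking it to "convert this to async":
    async fn read_config(path: &str) -> std::io::Result<String> {
        tokio::fs::read_to_string(path).await
    }

    #[tokio::main]
    async fn main() -> std::io::Result<()> {
        let cfg = read_config("app.toml").await?; // example path
        println!("read {} bytes of config", cfg.len());
        Ok(())
    }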
I had a great experience with it refactoring a project to use more modern patterns. A Go project moving from a deprecated and unwieldy query builder to using SQLC and protobuf for UI types etc. It did a great job just working its way through the entire system.
Had less luck on generating new features. It's great for prototyping UI but I routinely end up writing it myself.
It's also quick to forget how I like to do things or what libraries and packages it should use. So I either have to keep reminding it or fix up the work myself. While I'm unsure whether it still ends up being quicker, that's really immaterial for me because it absolutely kills the enjoyment of the work.
With Cursor for the last 6 months, and Claude Code for the last 2 weeks, I went from building one or two apps/websites a year for me and my friends to launching a new MVP in a few days.
I now have 5-10 small services running; whatever "thing" I think I need, I create it and self-host it.
It's such a revolution.
I used Cursor (with ChatGPT) for the first time last week to build two web apps and it worked well. Working results on the first go, then a few hours of tweaking and adding features. These were simple CRUD apps using PHP and SQLite on Apache. I had to clean up the front-end code a bit, but it all worked. I’m not strong on SQL, so it was neat to have all the database code working on the first pass; that helped me keep up the momentum and finish them instead of getting bogged down and frustrated. I definitely plan to keep using Cursor and to try other AI tools.
Also feeling let down by it.
Have been using it to build a DSL in JS. Greenfield. I’ve followed the commonly touted “plan, act, evaluate” approach; I’ve got it to generate a clear project vision, scope, and feature checklist. Then told it to refer to that for context. I’ve been descriptive and explicit in my prompting, way more so than previously.
It has gotten the broad strokes right, I’ve got an exceptionally barebones DSL, made up of 5 entities, working…just.
It has now started to spin its wheels on small issues and can’t fix them without breaking something else. The codebase isn’t even big (~8 main functions across a few files). Troubleshooting the code is difficult because it’s convoluted and I lack the same intuition for it I would have had I written it myself. I’ve decided to rewrite everything with less control ceded to the LLM.
When it works, it feels great. When it doesn’t, which is often, the spell is broken and I feel I’ve wasted a bunch of time and have not much to show for it.
Maybe someone has better insights but what I have seen is that Claude Code is not amazing on greenfield. I mean it will generate something, which will probably work, but the solution can often be over-engineered, or hacky.
I think we have to build up enough code for it to start appearing like brownfield, before Claude knows how to engineer correctly. Which kind of makes sense if we view Claude Code as a junior engineer with infinite stamina.
I also actually like to spin up Claude Code and Gemini in parallel to see what each one comes up with. Gemini will often take the simpler approach, but it's not often fully featured; I usually end up taking the two solutions and refining them in Cursor to arrive at the final one.
It greatly depends on what I am doing. If I am using common libraries, especially with JS, it is particularly handy, and similarly with Python. When you branch out into the more obscure, its accuracy and knowledge end up being unsurprisingly limited.
Same, tried Claude Code over the past month. Let it do different tasks.
It’s useful as a built-in quick docs/search tool that can spit out small code fragments.
Every time I gave it more space, the results were disappointing.
I've been using Claude Code to decent results. It's not 10x-ing my productivity, but it can do tasks like that. I haven't done much to configure it; I just regularly clear the context and @reference the relevant files.
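For example, roughly (file names made up):

    /clear
    > update the retry logic in @src/client.rs and add a test for it in @tests/client.rs

so only the files that matter end up in context.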
What language are you using?
The original post says Rust.
> I’m working on a relatively greenfield rust project
I haven't had good luck using LLMs with Rust, but it may just be me.