I'm trying to figure out the right and effective way to use Claude Code. So far, my experience hasn't matched what many in the community seem to rave about.
Often, it either generates convoluted implementations when simpler ones clearly exist, or it produces code that's riddled with bugs — despite confidently claiming it's correct. I'm wondering if I'm just not using it properly.
Here's my current workflow:
- I first talk to Gemini to gradually clarify and refine my requirements and design.
- I ask Gemini to summarize everything. Then I review and revise that summary.
- I paste the final version into Claude Code, use plan mode, and ask it to generate an implementation plan.
- I review the plan, make adjustments, and then let Claude Code execute it.
- Long wait…
- Review Claude’s output and clean up the mess.
For refactors and bugfixes, I usually write some tests in advance. But for new features, I often don’t.
It often feels like opening a loot box — 50% of the time it does a decent job, the other 50% is pretty bad. I really want to understand how to use it properly to achieve the kind of magical experience people describe.
Also, I’m on the Pro plan, and I rarely hit the rate limit — mainly because there’s a lot of prep work and post-processing I need to do manually. I’m curious about those who do hit rate limits quickly: are you running lots of tasks in parallel? Machines can easily parallelize, sure — but I don’t know how to make myself work in parallel like that.
You’re definitely not alone; “AI code loot box” is a great description! I’ve been experimenting with Claude Code (and the other major models) since late last year, and my success rate seems to track yours unless I’m deliberate about “prompt engineering” and workflow. Here are a few things that have helped me get better, more reliable results:
1. Be uncomfortably explicit in prompts: Claude Code in particular is very sensitive to ambiguity. When I write a prompt, I’ll often:
- Specify coding style, performance constraints, and even “avoid X library” if needed.
- Give sample input/output (even hand-written).
- Explicitly state: “Prefer simplicity and readability over cleverness.”
2. Break down problems more than feels necessary: If I give Claude a 5-step plan and ask for code for the whole thing, it often stumbles. But if I ask for one function at a time, or have it generate stub functions first, then fill in each one, the output is much more solid.
3. Always get it to generate unit tests (and run them immediately): I now habitually ask: "Write code that does X. Then, write at least 3 edge-case unit tests." Even if the code needs cleanup, the tests usually expose the gaps (see the sketch after this list).
4. Plan mode can work, but human-tighten the plan first: I’ve found Claude’s “plan” sometimes overestimates its own reasoning ability. After it makes a plan, I’ll review and adjust before asking for code generation. Shorter, concrete steps help.
5. Use “summarize” and “explain” after code generation: If I get a weird/hard-to-read output, I’ll paste it back and ask “Explain this block, step by step.” That helps catch misunderstandings early.
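To make tip 3 concrete, here's roughly the shape I aim for. The function and the test cases below are hypothetical, just to illustrate that a handful of small edge-case tests surface gaps faster than reading the generated code does:

```python
# Hypothetical function Claude was asked to write: turn a comma-separated
# string of tags into a cleaned-up list.
def parse_tags(raw: str) -> list[str]:
    return [tag.strip() for tag in raw.split(",") if tag.strip()]

# The kind of edge-case tests I ask for alongside it (runnable with pytest):
def test_empty_string_returns_empty_list():
    assert parse_tags("") == []

def test_whitespace_only_entries_are_dropped():
    assert parse_tags("  ,  , python ,  ") == ["python"]

def test_repeated_separators_do_not_create_empty_tags():
    assert parse_tags("a,,b") == ["a", "b"]
```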
Re: Parallelization and rate limits: I suspect most rate-limit hitters are power-users running multiple agents/tools at once, or scripting API calls. I’m in the same boat as you — the limiting factor is usually review/rework time, not the API.
Last tip: I keep a running doc of prompts that work well and bad habits to avoid. When I start to see spurious/overly complex output, it’s nearly always because I gave unclear requirements or tried to do too much in one message.
Make sure your project has a CLAUDE.md or CLAUDE.local.md. You can use Claude to help you come up with this. Use Claude to maintain a list of common, simple workflows and reference them in CLAUDE.md. It’s not great yet for large-scale work or refactors, but it’s getting better month by month. It may never be able to do big work in one shot.
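For what it's worth, mine is roughly this shape. This is purely illustrative, not an official template; the conventions and commands are placeholders for whatever applies to your repo:

```markdown
# CLAUDE.md (illustrative sketch)

## Conventions
- Python 3.11; prefer built-in generics (dict, list) over typing.Dict / typing.List
- Prefer simple, readable implementations over clever ones

## Common workflows
- Run the tests: `pytest -q`
- Lint before declaring a task done: `ruff check .`

## Don't
- Add new dependencies without asking first
- Rewrite files outside the scope of the current task
```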
I run tasks in parallel and definitely hit the rate limits.
I might be misunderstanding how it works, but from what I’ve seen, CLAUDE.md doesn’t seem to be automatically pulled into context. For example, I’ve explicitly written in CLAUDE.md to avoid using typing.Dict and prefer dict instead — but Claude still occasionally uses typing.Dict.
Do I need to explicitly tell Claude to read CLAUDE.md at the start of every session for it to consistently follow those preferences?
No, Claude Code will automatically read CLAUDE.md. LLMs are still hit or miss at following specific instructions. If you have a linter, you can put a rule there and tell Claude to use the linter.
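Assuming your linter is ruff (any linter with an equivalent rule works), the typing.Dict preference can be enforced mechanically, so the CLAUDE.md instruction becomes "run the linter and fix what it reports" rather than a style rule Claude has to remember:

```toml
# pyproject.toml sketch, assuming ruff
[tool.ruff.lint]
# UP006 flags typing.Dict / typing.List in favor of the built-in dict / list generics
extend-select = ["UP006"]
```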
If you posted some samples, that would help us quite a bit.
Claude successfully makes code edits for me 90% of the time. My two biggest pieces of advice, without having seen your code, are:
1. Break your task down into smaller chunks: 30 minutes' worth of human coding, max.
2. On larger code bases, give hints on which files to edit.
The number one most important thing: ask it to write the tests first, then the code, and instruct it not to overmock or change the code just to make the tests pass.
Besides this, I've had great results in combination with https://github.com/BeehiveInnovations/zen-mcp-server, but YMMV of course. It also requires o3 and Gemini API keys, but the token usage is really low and the workflow works really well when used properly.
Thanks a lot! I’ll definitely give that a try.
Is it possible to develop an intuition for where it will do a decent job and use it only for those types of tasks, or is the result always random?
I'd say yes. I'm at a stage where I know some stuff will get screwed up regardless. I still get surprised from time to time when I give it a really simple task and it fails miserably. I usually try to find the sweet spot where it can flow and let it run.
That's a great point. I think there is some pattern to when it works well or not, but I’m not sure if that’s universal or just tied to how I use it. Different prompting styles or workflows might lead to very different outcomes.
This sounds like a decent workflow. What makes you think it's ineffective?
That’s the problem — about 50% of the time, the result is so messy that cleaning it up takes more time than just writing it. So I wonder: is there a better way to prompt or structure things so that I consistently get clean, usable code?
> That’s the problem — about 50% of the time, the result is so messy that cleaning it up takes more time than just writing it.
Are you using git? As in "git checkout ." or "git checkout -b claude-trial-run"?
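If not, a throwaway branch makes the bad runs cheap to discard; a minimal flow might look like:

```bash
# One way to make the 50% bad runs cheap: give Claude a scratch branch
git checkout -b claude-trial-run   # isolate its changes
# ... let Claude Code run ...
git diff main                      # review what actually changed
git checkout main                  # good run: merge it; bad run: just delete the branch
git branch -D claude-trial-run
```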
This is my experience as well, and that of many people I’ve talked to. The people who breathlessly state how awesome it is all seem to be business people rather than engineers. It keeps throwing me into doubt.
I’ve seen so many people praise Claude Code so highly that my first instinct was to assume I must be using it wrong. I’ve tried quite a few different workflows and prompting styles — but still haven’t been able to get results anywhere near as good as what those people describe.