HNNewShowAskJobs
Built with Tanstack Start
AI documentation you can talk to, for every repo (deepwiki.com)
165 points by jicea 4 days ago | 126 comments
  • blopker4 days ago

    I took a look at a project I maintain[0], and wow. It's so wrong in every section I saw. The generated diagrams make no sense. The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

    I hope actual users never see this. I dread the thought of having to go around to various LLM-generated sites, correcting documentation I never approved, just to stop confusing users who are tricked into reading it.

    [0]: https://deepwiki.com/blopker/codebook

    • andybak4 days ago |parent

      I just tried it on several of my repos and I was rather impressed.

      This is another one of those bizarre situations that keep happening in AI-coding-related matters, where people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar, and I've never experienced anything like it in my career until recently.

      • DrewADesign4 days ago |parent

        > at the same thing

        But you’re not looking at the same thing — you’re looking at two completely different sets of output.

        Perhaps their project uses a more obscure language, has a more complex architecture, or resembles another project that's tripping up the interpretation. You can have excellent results without it being perfect for everything. Nothing is perfect, and it's important for the people making these things to know where it falls short, right?

        In my career I’ve never seen such aggressive dismissal of people’s negative experiences without even knowing if their use case is significantly different.

      • statusfailed4 days ago |parent

        Which repos worked well? I've had the same experience as OP: unhelpful diagrams and bad information hierarchy. But I'm curious to see examples of where it's produced good output!

      • esperent4 days ago |parent

        > people can look at the same thing and reach diametrically opposed conclusions. It's very peculiar and I've never experienced anything like it in my career until recently

        React vs other frameworks (or no framework). Object oriented vs functional. There's loads of examples of this that predate AI.

        • alansammarone4 days ago |parent

          I don't think it's quite the same. The cases you mention are more like two alternative but roughly functionally equivalent things. People still argue and use both, but the argument is different. Even if people don't explicitly acknowledge it, at some level they understand it's a difference in taste.

          This feels to me more like the horses vs cars thing, computers vs... something (no computers?), crypto vs "dollar-pegged" money, etc. It's deeper. I'm not saying the AI people are the "car" people, just that... only one of these opinions will still exist in 5-20 years, and the other will be gone. Which one... we'll see.

          • esperent4 days ago |parent

            > People still argue and use both, but the argument is different

            React vs no framework is at least in the same ballpark as AI vs no AI. Some people are determined to prove to the world that React/AI/functional programming solves everything. Some people are determined to prove the opposite. Most people just quietly use them without feeling like they need to prove anything.

        • Xss3 3 days ago |parent

          This is such an apples to oranges comparison that it makes me suspicious of your motives here.

          Bad documentation full of obvious errors and nonsense is very different to having an opinion on OO vs Functional programming.

          Even that sentence sounds insane because who would ever compare the two?!

      • oblio4 days ago |parent

        You could link your docs so we can compare them to OP's docs.

        No need to guess.

        • reconnecting3 days ago |parent

          Same here.

          Original: https://docs.tirreno.com/

          Deepwiki: https://deepwiki.com/tirrenotechnologies/tirreno

          Github: https://github.com/tirrenotechnologies/tirreno

      • 4 days ago |parent
        [deleted]
      • naveen99 4 days ago |parent

        [flagged]

    • frumiousirc4 days ago |parent

      I have a fairly large code base that has been developed over a decade that deepwiki has indexed. The results are mixed but how they are mixed gives me some insight into deepwiki's usefulness.

      The code base has a lot of documentation in the form of many individual text files. Each describes some isolated aspect of the code in dense, info-rich detail that is not entirely easy for humans to consume. As numerous as these docs are, the code has many more aspects that lack explicit documentation. And there is a general lack of high-level documentation tying each isolated doc into a cohesive whole.

      I formed a few conclusions about the deepwiki-generated content. First, it is really good where it regurgitates information from the code docs, and rather bad (or simply missing) for aspects not covered by those docs. Second, deepwiki is so-so at providing a higher layer of documentation that ties things together. Third, its sense of which aspects are important is heavily biased by how much doc coverage they have.

      The lessons I take from this are: deepwiki does better ingesting narrative than code. I can spend less effort polishing individual documentation (not worrying about how easy it is for humans to absorb). I should instead spend that effort filling in gaps, both in the details and in higher-level layers of narrative that unify the detailed documentation. I don't need to spend effort making that unification explicit via sectioning, linking, ordering, etc., as one might expect for a "manual" with a table of contents.

      In short, I can interpret deepwiki's failings as identifying gaps that need filling by humans while leaning on deepwiki (or similar) to provide polish and some gap putty.

      • Xss3 3 days ago |parent

        If you document the why rather than the how, you often end up tying high-level concepts together.

        E.g. if you describe how the user service works, you won't necessarily capture where it is used.

        If you document why the user service exists, you will often mention who or what needs it to exist: the thing that gives it a purpose. Do this throughout, and everything ends up tied together at a higher level, as in the sketch below.
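
        Something like this, as a minimal sketch (a hypothetical UserService; all names are made up):

            // "Why"-first doc comments: the "how" is already visible in the code.
            interface User { id: string; email: string }
            interface UserStore { fetch(id: string): Promise<User> }

            /**
             * Why this exists: billing and notifications both need a single
             * source of truth for account state. Naming those consumers here
             * ties this service to the things that give it a purpose.
             */
            export class UserService {
              constructor(private readonly store: UserStore) {}

              getUser(id: string): Promise<User> {
                return this.store.fetch(id);
              }
            }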

    • NewsaHackO4 days ago |parent

      > The text sections take implementation details that don't matter and present them to the user like they need to know them. It's also outdated.

      The point of the wiki is to help people learn the codebase so they can possibly contribute to the project; it's not for end users. It absolutely should explain implementation details. I do agree that it goes overboard with the diagrams. I'm curious: I've seen owners of other moderately sized repos rave about how well DeepWiki explained implementation details. What specifically was it getting wrong about your code? Is it just that it's outdated?

      • blopker4 days ago |parent

        I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There are just too many factual errors to list.

        • NewsaHackO4 days ago |parent

          >I dunno, it seems to be real excited about a VS Code extension that doesn't exist and isn't mentioned in the actual documentation. There are just too many factual errors to list.

          There is a folder for a VS Code extension here[0]. It seems to have a README with installation instructions. There is also an extension.ts file, which seems to me to be at least the initial prototype for the extension. Did you forget that you started implementing this?

          [0] https://github.com/blopker/codebook/blob/c141f349a10ba170424...

          • curl-up4 days ago |parent

            This thread should end up in the hall of fame, right next to the Dropbox one.

            From a fellow LLM-powered app builder, I wish you best of luck!

            • raincole3 days ago |parent

              Yeah, this is a thread worth saving. Even just as an example of multiple people who can't read as well as an LLM.

            • oblio4 days ago |parent

              Plot twist, OP has a doc mentioning it as unreleased.

          • sceptic123 4 days ago |parent

            In that folder is a CHANGELOG.md[0] indicating that this is unreleased. I'd say that including installation instructions for an unreleased version of the extension is exactly the issue being flagged.

            [0] https://github.com/blopker/codebook/blob/main/vscode-extensi...

            • NewsaHackO4 days ago |parent

              You are going to want to reread the file you are quoting, buddy. That changelog indicates that the extension has been released. The Unreleased section seems to list features that are not yet included in the released version of the VS Code extension, and the future plans are features that have not been developed yet.

          • throwaway290 4 days ago |parent

            here the maintainer says it doesn't exist. there's basically no way another interpretation is "more correct". files that are present can be not intended for use, deprecated, internal, WIP, etc. this is why we need maintainers.

            • NewsaHackO4 days ago |parent

              Maintainers are not gods, and don't get to rewrite plainly true facts. The changelog actually says "Initial release of Codebook VS Code extension".

              • throwaway290 a day ago |parent

                compared to an llm they are an authoritative source...

          • blopker3 days ago |parent

            I brought up this issue because I thought it illustrated my previous points nicely.

            Yes, there is a VS Code folder in that repo. However, it doesn't exist as an actual extension. It's an experiment that does not remotely work.

            The LLM-generated docs have confidently decided that not only does it exist, but that it is the primary installation method.

            This is wrong.

            Edit: I've now had to go into the README of this extension to add a note explicitly telling LLMs not to recommend it to users. I hate this.

            • ninininino3 days ago |parent

              Is it possible that a random person who discovered your repo from Google search would make the same mistake the LLM did and assume it works and not realize it was an unfinished experiment?

              • rng-concern3 days ago |parent

                Yes, and so the value of that person's opinions on the repo is low. Far lower than real documentation written by someone who knows more, who would not have made that mistake.

                The value proposition here is that these LLM docs would be useful; in this case, however, they were not.

                • NewsaHackO3 days ago |parent

                  >Far lower than real documentation written by someone who knows more, who would not have made that mistake.

                  But his own documentation did say that there was a VS Code extension, with installation instructions, a README, changelog, etc. From what he said, it doesn't even compile or remotely work. It would be extremely aggravating to attempt to build the project with the maintainer's own documentation, spend an hour trying to figure out what's wrong, and then contact the maintainer for him to say, "oh yeah, that documentation's not correct, that doesn't even compile, even though I said it did 2 months ago lol." It is extremely ironic that he is so gung-ho about DeepWiki getting this wrong.

                  • ninininino3 days ago |parent

                    Yes, this is my point. It seems a little bit lazy of the creator to write such a full-fledged README.md with so much polish but entirely neglect to mention that the whole thing is broken and unfinished.

                    That seems about as annoying as a random wiki mis-explaining your system.

                    That being said, I am still biased towards empathizing with the library author, since contributing to open source is already a great service in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private.

                    • blopker3 days ago |parent

                      This.

                      The WIP code was committed with the expectation that very few people would see it, because it was not linked anywhere in the main README. It was a calculated risk, taken so the code wouldn't get out of date with main. The risk changed when their LLM (wrongly) decided to elevate it to users before it was ready.

                      It's clear DeepWiki is just a sales funnel for Devin, so all of this is being done in bad faith anyway. I don't expect them to care much.

                    • NewsaHackO3 days ago |parent

                      >That being said, I am still biased towards empathizing with the library author, since contributing to open source is already a great service in and of itself, and I'd default to avoiding casting blame at an author for not doing things "perfectly" or whatever when they are already doing volunteer work/sharing code they could just keep private

                      This is true, and my pushback was more about his dismissive view of DeepWiki than a criticism of the project itself or of the author as a programmer. LLMs hallucinate all the time, but there is usually a method to how they do so. In particular, claiming a repo has a VS Code extension with nothing in the repo pointing to one would not be typical at all for a tool like DeepWiki.

          • Phelinofist4 days ago |parent

            What a plot twist

            • NewsaHackO4 days ago |parent

              It’s funny, I accidentally linked the commit instead of the current repo file because I was investigating whether he committed it himself or had recently taken over the project without realizing the previous owner had started one. But he is the one who actually committed the code. I guess LLMs are so good now that they’re stopping developers from hallucinating about code they themselves wrote.

          • raincole4 days ago |parent

            Wow. Better advertisement for LLMs in three comments than anything OpenAI could come up with.

            • lionkor4 days ago |parent

              It might be internal, unfinished, a prototype, in testing and not yet for public use. It might exist but do something else.

              This is not an ad for LLMs. If you think this is good, you should probably not ever touch code that humans interact with.

    • rmnclmnt4 days ago |parent

      I fear the consequences will be even darker:

      - Users are confused by autogenerated docs and don’t even want to try using a project because of it

      - Real curated project documentation is no longer corrected by user feedback (because users never reach it)

      - LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)

      • vissi4 days ago |parent

        > LLMs are trained on wrong autogenerated documentation: a downward spiral for hallucinations! (Maybe this one could then force users to go look for the official docs? But not sure at this point…)

        On this point, I think we should have some kind of AI-generated meta-tag, like this: https://github.com/whatwg/html/issues/9479
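
        Hypothetical markup, just to illustrate the idea (nothing is standardized yet; "content-origin" is a made-up name, though "generator" is a real standard meta name):

            <meta name="generator" content="deepwiki">
            <meta name="content-origin" content="ai-generated">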

        • bt1a4 days ago |parent

          I wonder what incentives there would be to actually adhere to this meta-tag. For example, imagine I send you my digital resume and it has an AI-generated footer tag on display. Maybe a bad example: I like the idea in general, but my mind wanders to the fact that large entities completely ignored the wishes of robots.txt when collecting the internet's text for their training corpora.

          • mrdevlar4 days ago |parent

            Large entities aside, I would use this to mark my own generated content. It would be even more helpful if you could get the LLM to recognise it, which would allow you to prevent ouroboros situations.

            Also, no one is reading your resume anymore and big corps cannot be trusted with any rule as half of them think the next-word-machine is going to create God.

    • onion2k4 days ago |parent

      I went to the lodash docs and asked how I'd use the 'pipeline' operator (which doesn't exist), and it correctly pointed out that pipeline isn't a thing, suggesting chain() for normal code and flow() for lodash fp instead. That's pretty much spot on. If I were guessing, I'd say the base model has a lot more lodash code examples in the training data, which probably makes a big difference to the quality of the output.
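
      Roughly the two idioms it suggested, as a quick sketch (standard lodash APIs; the toy data is mine):

          import _ from 'lodash';
          import { flow, map, filter } from 'lodash/fp';

          const nums = [1, 2, 3, 4];

          // chain(): wrap the value, transform, unwrap with .value()
          const viaChain = _.chain(nums).map(n => n * 2).filter(n => n > 4).value(); // [6, 8]

          // flow(): point-free composition with lodash/fp's curried, iteratee-first functions
          const viaFlow = flow(map((n: number) => n * 2), filter((n: number) => n > 4));
          console.log(viaChain, viaFlow(nums)); // [6, 8] [6, 8]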

      • billyp-rva4 days ago |parent

        The lack of a pipeline operator in JS (and JS libraries like lodash) has also been discussed online a lot.

        • onion2k4 days ago |parent

          Exactly the point. If there's a lot of data in the training set the results will be better.

          • billyp-rva4 days ago |parent

            I guess I'm trying to emphasize the distinction between information in the repo (code) vs. information elsewhere (discussions) that the model looks at.

    • skissane4 days ago |parent

      > It's so wrong in every section I saw.

      Not talking about this tool, but in general, incorrect LLM-generated documentation can have some value. The developer knows they should write some docs, but they're staring at a blank screen, not sure what to write, so they don't. Then the developer runs an LLM, gets a screenful of LLM-generated docs, notices it is full of mistakes, and starts correcting them. Suddenly: a screenful of half-decent docs.

      For this to actually work, you need to keep the quantity of generated docs a trickle rather than a flood: too many and the developer's eyes glaze over and they miss stuff or just can't be bothered. But a small trickle of errors to correct could actually be a decent motivator to build up better documentation over time.

      • aswegs8 4 days ago |parent

        At some point it will be less wrong (TM) and it'll be helpful. Feels generally like a good bet.

        • Xss3 3 days ago |parent

          Will it though?

          Fundamentally this is an alignment problem.

          There isn't a single AI out there that won't lie to your face, reinterpret your prompt, or just decide to ignore your prompt.

          When they try to write a doc based off code, there is nothing you can do to prevent them from making up a load of nonsense and pretending it is thoroughly validated.

          Do we have any reason to believe alignment will be solved any time soon?

          • aswegs8 2 days ago |parent

            Why should this be an issue? We are producing more and more correct training data, and at some point the quality will be sufficient. To me it's not clear what speaks against this.

            • Xss3 a day ago |parent

              Look up AI safety and THE alignment problem.

              This isn't a matter of training data quality.

              • skissane a day ago |parent

                We don’t expect 100% reliability from humans. Humans will slack off, steal, defraud, harass each other, sell your source code to a foreign intelligence service, turn your business behind your back into a front for international drug cartels. Some of that is very low probability, but never zero probability. So is it really a problem if we can’t reduce the probability to literally zero for AIs either?

    • rwmj4 days ago |parent

      I tried it on a big OCaml project (https://deepwiki.com/libguestfs/virt-v2v) and it seems correct, albeit very superficial. It helps that the project is extensively documented and the code well commented, because my feeling is that it's digesting those code comments along with the documentation to produce the diagrams. It seems decent as a starting point for understanding the shape of the project if I'd never seen it before. This is the sort of thing you could do yourself, but it might take an hour or more, so having it done for you is a productivity gain.

    • vultour4 days ago |parent

      > I hope actual users never see this

      I have bad news for you, this website has been appearing near the top of the search results for some time now. I consciously avoid clicking on it every time.

    • bulbar3 days ago |parent

      Please don't correct the AI documentation. Just let those projects die as they deserve.

    • NicoJuicy4 days ago |parent

      What model did you use?

    • ewoodrich4 days ago |parent

      > The text sections take implementation details that don't matter and present them to the user like they need to know them.

      Yeah this seems to be a recurring issue on each of the repos I've tried. Some occasionally useful tables or diagrams buried in pages of distracting irrelevant slop.

    • blibble3 days ago |parent

      they will

      it's the first result on google for just about anything technical I search for

    • bn-l4 days ago |parent

      This is made by “Devin” I believe.

  • jasonjmcghee4 days ago

    This gets posted pretty frequently.

    231 points | 77 days ago | 53 comments

    https://news.ycombinator.com/item?id=45002092

    • cuuupid4 days ago |parent

      YMMV, my experience with DeepWiki is that it’s decent but the DX of the documentation is horrible and the diagrams are often just incorrect.

      Worth mentioning this is a Cognition / Devin on-ramp and has been posted on HN a few times in just a couple of months; feels a little sales-y to me.

      • jorvi4 days ago |parent

        From the title I assumed it would generate docs to put in the repo.

        But it's docs outside the dev's purview on a deepwiki url, used to shepherd people into Devin. Wow. Talk about slimy.

        • 63stack 4 days ago |parent

          Just another parasitic way of extracting value out of open source

      • oblio4 days ago |parent

        These comments should be pushed to the top of the pile.

  • 63stack 4 days ago

    This is one of those sites I filtered out of my Kagi search results. Too often I stumble onto it when I'm looking for something, and it's never ever useful. There is never a time I want to look at flowcharts when I'm looking for documentation, a solution to an error message I'm facing, or the syntax for something.

    • eloisius3 days ago |parent

      Same. This is just the next iteration of all those spam sites back in the 2010s that used to mirror GitHub issues, but wowee it uses AI and you can chat with it! Forever grateful to Kagi for the ability to block sites from my results.

  • WhyNotHugo4 days ago

    I tried a few different repositories (both my own and various other people’s projects). They all yield the same result:

        No repositories found
    
        No repositories matching "https://git.sr.ht/~whynothugo/ImapGoose" were found.
    
    Probably broken/down right now?
    • frumiousirc4 days ago |parent

      deepwiki doesn't spider. Repos are indexed upon request. The request dialog accepts a non-github URL.

      • moffkalast4 days ago |parent

        Now I'm wondering who requested my repo lmao.

    • h4ck_th3_pl4n3t 4 days ago |parent

      Maybe they only support github?

      • Vinnl4 days ago |parent

        Yeah, I wanted to try it on my (GitLab) repo as well, but it also said "No repositories found". Clicking "Index any public repo" pops up a dialog that says "Search for a GitHub repository" and "or Enter the URL of a public GitHub repository".

        So looks like it's not actually any repository.

        • WhyNotHugo3 days ago |parent

          Yeah, the site is pretty ambiguous. It says “any repository”, but they require a GitHub URL.

          That explains why none of the projects I tried worked.

          I wonder why they’d use a decentralised protocol but then only support a single host.

    • dataviz10004 days ago |parent

      I've looked at mine and it took 10 to 15 minutes to process.

  • ofalkaed4 days ago

    I am quite impressed. Even if it was not completely right and produced a few rather humorous charts, it is close enough, assuming you are following along with the code. A great improvement over the alternatives for getting oriented in unfamiliar code; it should save me a great deal of time.

    The only issues I have with it are that the layout is not great on small screens (a poor experience on my 13" laptop) and that I really wish you could hide the "Ask Devin" dialog. The experience is pretty good on my tablet, though; I would prefer to use the tablet for reading/annotating the code and have deepwiki on the laptop, but it's not that big of a deal.

  • ytreister4 days ago

    I tried it with my repo and it is actually really nice. I kind of want to link to this so that anyone wanting to make contributions to my repo can learn about the code structure.

    My repo has a plugin structure (https://github.com/ytreister/gibr), and I love how it added a section about adding a new plugin: https://deepwiki.com/ytreister/gibr/7.4-adding-a-new-issue-t...

  • tylerrecall4 days ago

    Interesting approach. The challenge I keep hitting with AI-generated documentation is that it lacks the persistent context of how the codebase actually evolved - the decisions, the "why we didn't do X" knowledge, the patterns that emerged over time.

    I'm working on RecallBricks (memory infrastructure for AI coding tools) and seeing similar problems: AI tools are great at answering questions about code right now, but they don't remember the conversation you had last week about why you chose this architecture over that one.

    For documentation specifically, have you thought about combining the AI-generated docs with a memory layer that captures decision history? Like "this API endpoint exists because of issue #247 where users needed X functionality." That context makes docs way more useful than just describing what the code does.

    Curious how you're handling the "outdated docs" problem mentioned above - do you have triggers to regenerate when code changes significantly?

  • shevy-java4 days ago

    How many errors does that contain? Does anyone know stats for that?

    I see "AI summaries" on github all the time. It's like a wall of text and seems to be designed to be super-verbose but without seemingly being very informative.

    • portaouflop4 days ago |parent

      It’s very bad. So bad that you need to filter it out of search results. But it’s being pushed hard on HN; I wonder if there is some concerted bot action influencing this.

    • sceptic123 4 days ago |parent

      "If I had more time I would have written a shorter letter"

  • aDyslecticCrow4 days ago

    This is a nice idea in theory. But you need excellent docs in the first place for it to work.

    And if a human spent painstaking effort writing excellent docs, the least bit of respect I can give them is to read them.

    • andybak4 days ago |parent

      > But you need excellent docs in the first place for it to work.

      Are you sure? I just tried it on projects of mine that have almost zero documentation and it did a fairly good job.

      • aDyslecticCrow3 days ago |parent

        Really? How large is your project?

        There is a very clear point in codebase size where LLMs tend to falter without very clear, written-down overview descriptions of the system structure. I have a hard time seeing how this system would be immune to that.

        I have encountered LLMs seemingly knowing more about a system than they should, because there are many similar ones in their training set; but that just leads me to be extra sceptical when one pulls up functions that don't exist. (I've fought LLMs about JSON libraries quite a bit.)

  • afro884 days ago

    Do we need this, when we have tools like Claude Code, Codex etc that you can talk to about the codebase they are started in?

    • ijustlurk4 days ago |parent

      Agreed, nice idea in theory. But as a codebase owner I’d rather build tailored markdown files with a CLI agent to publish as my docs. And as a codebase consumer I probably only care about a codebase if I’m modifying or running it, which means a CLI agent makes the most sense and I can ask questions/generate .md files as we go.

    • anuramat4 days ago |parent

      > codebase they are started in

      what about the dependencies? you could just clone them as well (which is what I do occasionally), but deepwiki is faster (for indexed repos) and free

  • dvt4 days ago

    As always, these kinds of things are good for "simple" stuff (e.g. stuff you don't really need AI for) but totally suck for "complicated" or "weird" things. For example, I curiously ran it on one of my OSS projects: https://github.com/dvx/lofi

    It's a cute little Electron-based mini Spotify player that gets maybe like 200 users a day and has 1.3k stars on GitHub. Code quality is pretty high and it's more or less "feature-complete." There's a lot of simple/typical React stuff in there, but there's also some weird stuff I had to do. For example, native volume capture is weird. But even weirder is having to mess with the Electron internal window boundaries (so people can move their Lofi window where-ever they want to).

    We're essentially suppressing window rect constraints using some funky Objective-C black magic[1]. The code isn't complicated[1], but it's weird and probably very specific to this use case. When I ask what "constraints" does, DeepWiki totally breaks, telling me it doesn't even have access to those source files[2] (which it does).

    Visualizations were also actually disabled on macOS a few versions ago (because of the janky way you need to hook into the audio driver), but, again, DeepWiki doesn't really notice[3]. There have been issues/patch notes about this, so I feel those should be getting crawled.

    [1] https://github.com/dvx/lofi/blob/master/src/native/black-mag...

    [2] https://deepwiki.com/search/what-is-constraints_cc5c0478-e45...

    [3] https://deepwiki.com/search/how-do-macos-visualizations-wo_d...

  • cyberax4 days ago

    I insta-banned this site in Kagi. The trigger for me: utter disrespect for the user, with an unhideable glassy floating chatbox at the bottom of the page.

    And WTF is with these floating boxes popping up everywhere?!? They are tailor-made to trigger anxiety in people with OCD. They look like a notification that keeps grabbing your attention as you scroll the text. Example: https://aws.amazon.com/blogs/aws/secure-eks-clusters-with-th...

    • supriyo-biswas4 days ago |parent

      Help yourself with https://secure.fanboy.co.nz/fanboy-annoyance.txt or one of its variants.

    • walterbell4 days ago |parent

      > floating boxes

      Will need boxblock.

  • CyberShadow4 days ago

    Looks like it's impossible for me to use this service - when I try to submit the form, I get a reCAPTCHA challenge. By the time I complete it (Google requires me to make several attempts, each one being several pages), the page errors out in the background with "reCAPTCHA execution timeout".

    • lionkor4 days ago |parent

      Try solving it slowly, some captchas love that.

  • billyp-rva4 days ago

    From a year ago: AI can't diagram codebases, and it isn't even close [0]

    [0] https://www.ilograph.com/blog/posts/diagrams-ai-can-and-cann...

  • theletterf4 days ago

    You need great pre-existing docs for something like this to work properly.

    AI must RTFM. https://passo.uno/from-tech-writers-to-ai-context-curators/

    • alansammarone4 days ago |parent

      It certainly helps, but in my experience you get 60-80% of the benefit with just the code (except in legacy or otherwise terrible code, for example with misleading/outdated comments everywhere, bad variable/function names, etc.; in that case it's more like 40%).

      • theletterf3 days ago |parent

        Why stop at 60%?

  • 4 days ago
    [deleted]
  • rckt4 days ago

    This doesn't work. It's better to prompt an agent with specific questions per subject. This kind of general AI interpretation of a doc can be amazingly misleading. Nice idea, but unfortunately absolutely useless and even time-wasting at the moment.

  • dkersten4 days ago

    I don’t want to talk to my documentation. I just want the facts searchable and easily readable.

    • input_sh4 days ago |parent

      I agree wholeheartedly. At best I want a "smarter" search bar where I don't have to guess the exact wording of what I'm looking for, but the reply should still be a verbatim quote from the docs, not something regurgitated into a less accurate form.

  • mingodad4 days ago

    I asked it to index my project https://github.com/mingodad/parsertl-playground and the result https://deepwiki.com/mingodad/parsertl-playground seems reasonably good (still going through it in more detail, but overall impressive).

  • juliangmp4 days ago

    I mean no offense to the people who created this, but this is a domain I blocked in DuckDuckGo's search results a while ago.

    I really don't like how AI summaries creep up in SEO rankings and make it harder for me to find the actual, official documentation.

  • deevus4 days ago

    Works pretty well for gdzig

    https://deepwiki.com/gdzig/gdzig/1-overview

    • roflcopter69 4 days ago |parent

      Hi! Cool to see you commenting here, great work on gdzig btw :)

      • deevus4 days ago |parent

        Thanks! Who knew I would be known for that. Not me!

  • 4 days ago
    [deleted]
  • df0b9f169d54 4 days ago

    I wanted to try the tool with a repo I know. After a few attempts to select cars, buses, and crosswalks, I got a "captcha timeout error".

  • Ultimatt4 days ago

    This worked well for me for some things I've recently been learning/working on. One improvement I'd suggest: the citations showing where information came from aren't hyperlinks; it would be very useful if they were!

  • alansammarone4 days ago

    This is an interesting thread. There are many instances of "this is bad, doesn't work, don't like it", and many instances of "it works reasonably well here, look: <url>".

    Seems like a consistent pattern.

    • portaouflop4 days ago |parent

      It’s a propaganda and psyop operation on HN if you ask me. This stuff is laughably bad and I wonder who would actually use it for real work beyond a “huh this is cool” at first glance.

      HN is super susceptible to propaganda in the AI age unfortunately; I think at this point a lot of the comments and posts on here are from bots as well

    • internet_points4 days ago |parent

      There was some article here on how LLMs are like gambling: sometimes you get great payouts and oftentimes not, and as Psych 101 taught us, that kind of intermittent reward is addictive.

      • alansammarone4 days ago |parent

        Interesting point, never thought of it like that, and I think there is some truth to that view. On the other hand, IIRC, this works best in instances where it's pure chance (you have no control over the likelihood of reward) and the probability is within some range (optimal is not 50%, I think, could be wrong).

        I don't think either of these is true of LLMs. You obviously can improve their results with the right prompt + context + model choice, to a pretty large degree. The probability... hard to quantify, so I won't try. Let's just say that you wouldn't say you are addicted to your car because you have a 1% chance of being stuck in the middle of nowhere if it breaks down and a 99% chance of a reward. Where the threshold is, I'm not sure.

  • 1317 4 days ago

    I tried it with my repo, and it was impressive at first.

    But then as I kept going it just got tiring; it kept calling everything sophisticated even when it wasn't.

    It's the same as all the other AI slop: really impressive the first time you see it,

    and then you keep seeing it, get tired of its patterns of speech, and realize it's just making up nonsense.

    And now the AI slop "documentation" is up on the public internet for all to see, with no way for me to remove it :)

  • vijaybritto4 days ago

    The diagrams generated are arbitrary and make no sense. This needs improvement.

  • typpilol4 days ago

    I find it's better than context7, but that's not saying much

    • bn-l4 days ago |parent

      Context7 uses the real documentation, if I’m not mistaken, and just provides you a RAG MCP.

      • typpilol4 days ago |parent

        I don't think so, as it generates stuff even for projects without any documentation.

  • ramon1564 days ago

    I've seen this idea from before Claude Code, Gemini CLI, etc. were a thing. It's not relevant anymore (unless you surpass these tools).

    Cool idea, bad timing.

    • alansammarone4 days ago |parent

      I don't know the specifics of this particular tool; I assume it's at most doing a couple of passes with some frontier model plus a specific system prompt and custom tools (for example, code-specific RAG plus some form of "summarize"). By "at most" I mean it probably isn't doing anything crazier than that.

      But it seems to be producing docs that are better than what I tend to get from basic "summarize this repo for me"-style prompts, which is what I usually use on a first pass.

  • virajk_31 4 days ago

    Is the documentation generated using LLMs? Anyway, this would only work if the documentation were truly top-notch and completely accurate.

  • bittermandel4 days ago

    I use this heavily to navigate the neondatabase/neon repo and it has been invaluable

  • twp4 days ago

    deepwiki.com is untrustworthy AI slop. A true cancer.

    deepwiki.com's generated page on my project contains several glaring errors. I hate to think of the extra support burden I will have to bear because of deepwiki.com publishing wrong information.

    I asked the authors of the site (Andrew Gao) to remove their page on my project, but they ignored my request.

  • ufko_org4 days ago

    I'm very curious how this will turn out, and especially when :)

    https://github.com/cameyo42/newLISP-Code

    • ufko_org4 days ago |parent

      Looks really promising:

      https://deepwiki.com/cameyo42/newLISP-Code/3.1-newlisp-99-pr...

  • killerstorm4 days ago

    Tangentially related: AI assistants (ChatGPT, Claude, and Gemini) are unable to get public code from GitHub. (I.e. specifically the assistants you use via the website, not Codex, Claude Code, etc.)

    Again, this might be hard to believe, as the situation is rather insane: flagship AI assistants cannot get publicly available code. They can get bits and pieces from the README, but that can degrade response quality, as it's often based on guesswork, etc.

    Example via GPT-5 Thinking, my request:

    ``` Can you read code from https://github.com/killerstorm/auto-ml-runner/blob/master/ru... ?

    If yes, show me some port of code and how you got it. ```

    (That's normal URL user can access from the browser, you also get same result if you post top-level repo URL https://github.com/killerstorm/auto-ml-runner/).

    Thinking: 2 minutes. (IT WAS THINKING FOR TWO MINUTES JUST TO ACCESS ONE FILE!)

    ``` Short answer: yes...

    Why I’m not pasting a snippet right this second:

    In this environment, GitHub’s code pages are loading their chrome but not returning the file body ```

    So, actually, no, it cannot read it, but it believes it can. That's rather problematic.

    Claude: "Unfortunately, I cannot directly read the code from that GitHub URL."

    Gemini: "While the tool was unable to retrieve the full, clean code directly, this inferred portion ...". I.e. it just imagined the code. The snippet has nothing to do with code in repo.

    This is a rather fucktacular situation as agents are not sure if they read the code, and they might hallucinate subtly wrong code trying to be helpful.

    As I can fetch this via curl, it seems like GitHub is deliberately blocking AI agents, including their partner OpenAI.
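
    For reference, the curl fetch looks something like this (the exact file path is elided above, so a placeholder here):

    ``` curl -s https://raw.githubusercontent.com/killerstorm/auto-ml-runner/master/<path-to-file> ```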

  • marginalia_nu4 days ago

    So I gave it a spin on two of my repos.

    One is the extremely sprawling MarginaliaSearch repo[M1].

    Here it did a decent job of capturing the architecture, though it is to be fair well documented in the repo itself. It successfully identifies the most important components, which is also good.

    But when describing the components, it only really succeeds where the components themselves are very self-contained and easy to grok. It did a decent job with e.g. the buffer pool[M2], but even then fails to define some concepts that would have made it easier to follow, e.g. what is a pin count in buffer management? This is standard terminology and something the model should know.

    I get the impression it lifts a lot of its facts from the comments and documentation that already exist, which may lead it to propagate outdated falsehoods about the code.

    [M1] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch

    [M2] https://deepwiki.com/MarginaliaSearch/MarginaliaSearch/5.2-b...

    The other is the SlopData[S1] repo, which contains a small library for columnar data serialization.

    This one I wasn't very impressed with. It produced more documentation than was necessary, mostly amending what was already there with incorrect statements it seems to have pulled out of its posterior[S2][S3].

    The library is very low-abstraction, and there simply isn't a lot of architecture to diagram, but the model seems to insist that there must be a lot of architecture and then produces excessive diagrams as a result.

    [S1] https://deepwiki.com/MarginaliaSearch/SlopData

    [S2] https://deepwiki.com/MarginaliaSearch/SlopData#storage-types (performance numbers are completely invented, in practice reading compressed data is typically faster than plain data)

    [S3] https://deepwiki.com/MarginaliaSearch/SlopData/6.3-zip-packa... (the overview section is false, all these tables are immutable).

    So overall it gives me a bit of a broken clock vibe. When it's right, it's great. When it isn't, it's not very useful. Good at the stuff that is already easy, borderline useless for the stuff that isn't.

  • darthvaden3 days ago

    This is nothing

  • voodooEntity4 days ago

    So I just tried this on 2 repositories.

    1. On (https://github.com/voodooEntity/gits) -> https://deepwiki.com/voodooEntity/gits

    This is a long-term Golang project I work on, and it already has very, very detailed documentation.

    While going through the AI docs on deepwiki, I could see how it profited from my existing documentation; most of it is just different words, same content. What I liked were the visualisations (even if some of them are, well, "special"): they show some insight into workflows that I have in my mind, which might benefit others who aren't the author.

    While trying out the search/chat, I have to admit it gave better answers than I expected.

    Since I know very well how to do things efficiently with the lib, I tested the chat by asking for the most efficient way to achieve XYZ. While it listed all the possibilities (all of them correct), it also correctly pointed out the most "efficient" way.

    I also gave it some questions that, I know from experience, confuse people trying the lib for the first time. They were resolved correctly.

    Overall, a pleasant result.

    2. On (github.com/electronicarts/CnC_Renegade/) -> https://deepwiki.com/electronicarts/CnC_Renegade/

    For those who don't know, CnC Renegade is a very old game (~2000) coded by the original Westwood. It's mainly C++ (some C), and through-and-through plain code. There is no real documentation in the repo other than some basic info on dependencies, etc.

    First of all, I saw that the resulting documentation, well... lacked documentation, I guess? Across multiple pages it just explained what's in the main README (which is not really a lot). So from the "docs generating" perspective, no gain here.

    Then I tried to chat with it about the code, and it seemed to have a basic understanding of it. The results are harder for me to validate (tbh I only read over the code once, out of curiosity, when it was released), but it seemed like no total loss.

    Conclusion: it seems that to get very good basic documentation out of it, the repo must already have good basic documentation. Apart from the graphics it added, I didn't really see a gain compared to the already existing documentation.

    Based on the chat results, I'd say it might be decent and helpful if you're digging into a new codebase, especially a more complex one, and searching for a specific thing across thousands of LOC in multiple files.

    Would I use it in the future? I'll maybe try, but only the chat feature; for the generated docs, as elaborated, I don't see any use.

  • esafak4 days ago

    It works! I love using it for open source repos.