Prepend tomcp.org/ to any URL to instantly turn it into an MCP server.
You can either chat directly with the page or add the config to Cursor/Claude to pipe the website/docs straight into your context.
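To give a feel for what happens under the hood, here is roughly what the client side looks like with the MCP TypeScript SDK. This is a sketch: docs.example.com is a placeholder, and the exact transport and URL shape are in the repo README.

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

// Placeholder endpoint -- see the README for the real transport/URL shape.
const transport = new SSEClientTransport(
  new URL("https://tomcp.org/docs.example.com")
);
const client = new Client({ name: "tomcp-demo", version: "0.1.0" });

await client.connect(transport);

// The proxy exposes the cleaned page as a Resource; list it and read it into context.
const { resources } = await client.listResources();
const page = await client.readResource({ uri: resources[0].uri });
console.log(page.contents[0]);
```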
Why MCP? Compared to raw scraping or copy-pasting, the proxy hands the model clean Markdown. This helps the AI follow the page structure and uses significantly fewer tokens.
How it works: It is a proxy that fetches the URL, removes ads and navigation, and exposes the clean content as a standard MCP Resource.
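A simplified sketch of that pipeline (not the production code; jsdom, Readability, and Turndown stand in for the actual cleanup step, and docs.example.com is a placeholder):

```ts
import { JSDOM } from "jsdom";
import { Readability } from "@mozilla/readability";
import TurndownService from "turndown";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

// Fetch a page, strip boilerplate (nav, ads, scripts) with Readability,
// and convert what's left to Markdown.
async function pageToMarkdown(url: string): Promise<string> {
  const html = await (await fetch(url)).text();
  const dom = new JSDOM(html, { url });
  const article = new Readability(dom.window.document).parse();
  return new TurndownService().turndown(article?.content ?? html);
}

// Expose the cleaned page as a standard MCP Resource.
const server = new McpServer({ name: "tomcp-sketch", version: "0.1.0" });
server.resource("page", "page://docs.example.com", async (uri) => ({
  contents: [
    {
      uri: uri.href,
      mimeType: "text/markdown",
      text: await pageToMarkdown("https://docs.example.com"),
    },
  ],
}));

await server.connect(new StdioServerTransport());
```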
Repo: https://github.com/Ami3466/tomcp (Inspired by GitMCP, but for the general web)
I’m a bit confused because I don’t clearly understand the value this tool adds. Could you help me understand it?
From what I can see, if the content I want to enrich is static, the web fetch tool seems sufficient. Is this tool capable of extracting information from dynamic websites or sites behind login walls, or is it essentially the same as a web fetch tool that only works with static pages?
I am not quite clear why this adds value over a simple web fetch tool which does not require configuration per site.
I think this is a good idea in general, but perhaps a bit too simple. It looks like this only works for static sites, right? It does a JS fetch to pull in the HTML and then converts it (in a quick-and-dirty manner) to Markdown.
I know this is pointing to the GH repo, but I’d love to know more about why the author chose to build it this way. I suspect it keeps costs low/free. But why CF workers? How much processing can you get done for free here?
I’m not sure how you could do much more in a CF worker, but this might be too simple to be useful on many sites.
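Something like this is about the ceiling I’d expect on the free tier (a sketch, not the repo’s code; the built-in HTMLRewriter streams through the HTML, and the selectors and URL scheme here are just illustrative):

```ts
// Hypothetical Cloudflare Worker: strip obvious chrome without a DOM or a headless browser.
export default {
  async fetch(request: Request): Promise<Response> {
    // e.g. https://my-worker.example/docs.example.com/guide -> https://docs.example.com/guide
    const target = new URL(request.url).pathname.slice(1);
    const upstream = await fetch(`https://${target}`);
    return new HTMLRewriter()
      .on("nav, footer, aside, script, style", {
        element(el) {
          el.remove(); // drop navigation, sidebars, and scripts
        },
      })
      .transform(upstream);
  },
};
```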
Example: I had to pull in a docs site that was built for a project I’m working on. We wanted an LLM to be able to use the docs in its responses. However, the site was based on VitePress and I didn’t have access to the source Markdown files, so I wrote an MCP fetcher that uses a Dockerized headless Chrome instance to load the page and then pulls the innerHTML directly from the rendered DOM. It’s probably overkill, but it’s an example of when this tool might not work.
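Roughly this shape, with Puppeteer standing in for the Dockerized Chrome bit (illustrative, not my actual code):

```ts
import puppeteer from "puppeteer";

// Render the page client-side, then grab the hydrated DOM.
async function fetchRenderedHtml(url: string): Promise<string> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Wait until the client-side framework (e.g. VitePress) has finished rendering.
    await page.goto(url, { waitUntil: "networkidle0" });
    return await page.evaluate(() => document.body.innerHTML);
  } finally {
    await browser.close();
  }
}
```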
But — if you have a static site, this tool could be a very simple way to configure MCP access. It’s a nice idea!
Who is Tom and why is he copying?
I thought this was what the web_fetch tools already did? Tools are configured through MCP too, right? So why am I prepending a URL instead of just using the web_fetch tool that already works?
Does this skirt robots.txt, by chance? Not being able to fetch any web page is really bugging me, and I'm hoping to use a better web_fetch that isn't censored. I'm just going to copy/paste the content anyway.
I think the idea here is that the web_fetch is restricted to the target site. I might want to include my documentation in an MCP server (from docs.example.com), but that doesn’t mean I want the full web available.
This is a clever solution to a real problem. I could use this for a quick turnaround from a web-page knowledge base to an MCP server. Thanks for sharing.
Fun idea, although I thought the industry was leaning towards llms.txt.
Isn’t that for scraping? I think this is for injecting content into context (or making that possible) by adding an MCP front end to a site.
Different use cases, I think.
Cool :)