This is a nice tool. A game I liked to play announced end of service back in 2023. They gave enough notice to let me capture some logs from their coordinator service.
I captured them in mitmproxy and ran those through this to help me identify all the endpoints and their general structure. (A few things were misleading, like the examples suggesting certain values could be floats when they could only be integers.)
I was able to get a team together and we were able to stand up private servers as a result.
Amazing! What game was this for? I was involved in the RE efforts around UO way back in the day.
Gundam Evolution, going by comment history.
Different plot/game mechanics, but Armored Core 6 is great if you like mecha.
Gundam Evolution, as someone else noted from my comment history.
did i miss something or why are there TWO (2) "magically reverse engineer REST APIs" projects on the HN front page right now? is there some offline beef going on?
(screenshot in case this goes away https://x.com/swyx/status/1874762725383188502)
Presumably, because the closed source one got some traction, so people are pointing out the open source alternative.
Likely because of this comment[1] in the other thread which made people submit this link, and when multiple independent people submit the same link in a short period of time you're very likely to end up on the front page (this exact situation happened to me once)
Yeah, that's where I got the link from.
Offtopic and meta, but, you share a screenshot using Twitter/X? That's really bizarre to me. That is all, just had to say that.
how is it worse than photobucket or imgur
Again, this is the very easy part of the reverse engineering API process that most tools can do, similar to API Parrot and the rest of them. This is not hard to do.
The hard part is that inevitably, all these internal APIs will just add aggressive CAPTCHAs, Device Check, fingerprinting, etc to prevent common drive by re'ing. Easy to add these on the defence side, and extremely difficult to bypass on the other side.
I can imagine all developer teams now upping their security with the combination of the above mentioned to prevent this.
Depends on the age of the tool. We work with a lot of legacy systems that actually want us to integrate with them but don’t have the dev resources to build a proper API surface. As a result, we end up doing a lot of painful reverse engineering. These tools look promising for purposes like this.
I'm curious as to why people would have a public API to begin with if they wanted to protect it from people using it. Then again, why would anyone have a public undocumented API in 2024 when an LLM can give you a CLI tool to auto-generate 90% of the OpenAPI spec in a couple of hours? The last question isn't serious; I've worked in enterprise for decades and almost none of the tools organisations end up buying have good documentation for their APIs. Not that those are publicly available, but still.
I think you have a misunderstanding here.
The API needs to be "public" because the app uses the internet to communicate back to the home server.
The API is not "public" in the sense that the app developers want anybody to use it; they just want their app to use this API. So they don't write publicly accessible documentation about it because they don't want to encourage its use.
A tool like MitmProxy2Swagger lets you run the app and record all of its API calls so that you can use this unadvertised API.
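To make "record the calls, then recover the API" a bit more concrete, here is a minimal sketch of the grouping step, assuming you already have a list of captured (method, path) pairs. The sample data and the `templatize` / `group_endpoints` names are made up for illustration, and this is not how MitmProxy2Swagger itself is implemented; it only treats purely numeric path segments as IDs.

```python
from collections import defaultdict

# Hypothetical captured (method, path) pairs, e.g. pulled from a proxy dump.
CAPTURED = [
    ("GET", "/api/users/42"),
    ("GET", "/api/users/1337"),
    ("POST", "/api/users"),
    ("GET", "/api/users/42/items/7"),
]


def templatize(path: str) -> str:
    """Replace purely numeric path segments with an {id} placeholder."""
    return "/".join("{id}" if seg.isdigit() else seg for seg in path.split("/"))


def group_endpoints(calls):
    """Map each path template to the sorted list of HTTP methods seen on it."""
    grouped = defaultdict(set)
    for method, path in calls:
        grouped[templatize(path)].add(method)
    return {template: sorted(methods) for template, methods in grouped.items()}


if __name__ == "__main__":
    for template, methods in group_endpoints(CAPTURED).items():
        print(methods, template)
```

Real traffic needs smarter heuristics (UUIDs, slugs, hashes), which is where most of the manual cleanup goes.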
Why wouldn’t you add authentication to an API you don’t want others to use?
The web app probably authenticates using an API as well, in which case it's trivial to add that to your shadow client as long as you have the credentials.
Laziness / skill issue.
How many apps have you seen only do client-side protection?
Making a mitmproxy dump from a manual browsing session is more or less unblockable, barring some TPM or similar fuckery.
OTOH, actually using the API, even with the protocol known, can quite easily be made really hard.
There are many cases where users are behind a forward proxy for security/compliance reasons. Most applications need to support these types of users.
I looked through this earlier today when I saw it mentioned in that thread about the closed source tool for the same purpose.
Having done a good bit of this type of reverse engineering the hard way over the years, it's a very exciting find. I had been talking with my partner about building something similar for the past six months. How exciting to learn that it's already out there and open source too!
I've used this tool in the past with success. Not perfect but it accelerates the work greatly if you can launch a mitm proxy quickly and are familiar with the tool.
I've been fighting lately with an API, though. It's not very, let's say, RESTy. It has only one endpoint, and the different "sections" of the API are defined in parameters, so MitmProxy2Swagger doesn't detect them properly :(
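One workaround for this kind of API is to pre-process the captured URLs yourself and treat the parameter that selects the "section" as part of the endpoint name. A minimal sketch, assuming the discriminating parameter is called `section` (a made-up name, substitute whatever the API actually uses):

```python
from urllib.parse import parse_qs, urlsplit


def virtual_endpoint(url: str, discriminator: str = "section") -> str:
    """Build a pseudo-endpoint from the path plus one discriminating query parameter."""
    parts = urlsplit(url)
    # parse_qs returns a list of values per key; take the first one if present.
    values = parse_qs(parts.query).get(discriminator, ["<none>"])
    return f"{parts.path}?{discriminator}={values[0]}"


if __name__ == "__main__":
    # example.invalid is a placeholder host, not a real API.
    print(virtual_endpoint("https://example.invalid/api.php?section=users&id=3"))
    print(virtual_endpoint("https://example.invalid/api.php?id=3"))
```

Grouping requests by that pseudo-endpoint gives you one bucket per section, which you can then document by hand.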
> It's not very, let's say, RESTy. It has only one endpoint,
To be fair, from what I understand an actual(tm) REST API would only have a single defined endpoint[1]: the entry point. With every other endpoint being discovered from the responses. And also from your message I'm guessing a URI still uniquely identifies a resource (specifically through the "query" part of the URI, instead of the more common "path").
So, technically, assuming there's nothing too weird with that API, it seems like MitmProxy2Swagger is failing to detect a REST API.
[1]: Corollary: If an API is RESTful, it should be possible to rename any endpoint (except the entry point) at any moment in time without prior notice, and clients would not break as long as the response types/schemas are still supported by the clients. In-flight requests might fail with a 4xx, but after a retry they should go to the correct endpoint without any code change required.
This is HATEOAS, basically the core feature of REST that very few people actually use. Most of what the industry calls REST or RESTful is just structured and inefficient RPC.
True, I almost never see the endpoint discovery thing, I almost forgot about it...
I don't think anyone has ever used REST in the way you are using it - the sibling comment points out that HATEOAS is probably what you mean - this generally embeds links to all resources, full data navigation, next/prev links, and so on. It is true that a proper HATEOAS client should be able to navigate an endpoint completely with just a starting address.
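For anyone who hasn't seen it in practice, a rough sketch of what "navigate with just a starting address" means. The `_links` / `href` shape below is an assumption (loosely HAL-style), and `FAKE_SERVER` stands in for real HTTP calls so the example is self-contained:

```python
# Hypothetical responses keyed by URL; a real client would fetch these over HTTP.
FAKE_SERVER = {
    "/": {"_links": {"orders": {"href": "/v2/orders"}}},
    "/v2/orders": {"items": [1, 2], "_links": {"next": {"href": "/v2/orders?page=2"}}},
    "/v2/orders?page=2": {"items": [3], "_links": {}},
}


def follow(entry: str, rels: list[str], fetch=FAKE_SERVER.__getitem__) -> dict:
    """Start at the entry point and follow link relations by name, never by URL."""
    doc = fetch(entry)
    for rel in rels:
        doc = fetch(doc["_links"][rel]["href"])
    return doc


if __name__ == "__main__":
    # The client only knows "/" and the relation names "orders" and "next".
    print(follow("/", ["orders", "next"])["items"])
```

Because the client hardcodes only the entry point and the relation names, the server could rename `/v2/orders` without breaking it, which is the corollary mentioned upthread.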
Yeah unfortunately despite it being part of the REST definition, nowadays "REST" has become a term that means "REST but without HATEOAS". Similar to how "API" now means specifically "HTTP API that returns JSON", or "AI" now means "Generative AI specifically".
Nothing is RESTy
I was wondering how it would take in GraphQL endpoints and convert them to Swagger, since it's just a single POST API with a change in params. But that's more of a Swagger issue than the tool's. Has anyone dealt with this? Would be really helpful if you could share your ideas too :)
Why would you tho?
If you're working against a GraphQL-based API, you should be able to pull a schema file, and use that to implement your own API.
All you would get from mitmproxy is example queries and mutations, with the additional complexity of extra tooling to stitch together the schema file.
Pulling the schema file can be, and often is, disabled server side. And GraphQL APIs can, and often do, decline to serve anything other than persisted queries, and those can't really be inferred even with a known schema.
So I am working with a new company that has a ton of graphql queries. What I wanted to do was write an integration test for them in the fastest and easiest way possible.
I don't want to sit and read each query to identify where it is in the user flow. So I was thinking if I run this in the background and go through a happy flow, I can get the APIs in order and write an integration test.
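A minimal sketch of the "get the APIs in order" part, assuming you have the captured POST bodies as strings and that the client sends the conventional `operationName` field (many do, but it is optional, so anonymous operations are possible). The function name is made up:

```python
import json


def operation_sequence(bodies):
    """Return GraphQL operation names in capture order, collapsing consecutive repeats."""
    seq = []
    for raw in bodies:
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not a JSON body, so not a GraphQL request we can read
        # Batched requests arrive as a JSON list of operations.
        operations = payload if isinstance(payload, list) else [payload]
        for op in operations:
            if not isinstance(op, dict):
                continue
            name = op.get("operationName") or "<anonymous>"
            if not seq or seq[-1] != name:
                seq.append(name)
    return seq


if __name__ == "__main__":
    captured = [
        '{"operationName": "Login", "query": "..."}',
        '{"operationName": "Cart", "query": "..."}',
        '{"operationName": "Cart", "query": "..."}',
    ]
    print(operation_sequence(captured))
```

The resulting list is a reasonable skeleton for an integration test: one step per operation, in the order the happy flow triggered them.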
If only someone could automate[1] the clicking and navigating part by writing in plaintext something like "Open airbnb and explore all the features as much as possible" :)
1. https://github.com/BandarLabs/clickclickclick - It does that and I am one of the authors.
perhaps a n00b question, but would this work, or is there something similar for apps, specifically android apps?
I've used this specific tool to help me reverse engineer the private API of an Android App.
The thing is, depending on how hardened the app is, you'll have to play with Android to allow this interception, mostly because of certificate pinning. Also I remember something about apps not using the system wide trusted certificates you install (IIRC).
I remember using a rooted device with LineageOS, and downloading the APK and modifying it with a tool so the self signed certificate for the mitm proxy works with it.
The mitm proxy docs have some links to tools that can do that [0] and you could also use an Android emulator if you don't have an extra phone to mess with it [1]
0: https://docs.mitmproxy.org/stable/concepts-certificates/ 1: https://docs.mitmproxy.org/stable/howto-install-system-trusted-ca-android/
A MITM proxy isn't specific to any app; it's a forward proxy for your outgoing network connection. In the case of an Android app, you'd need to run mitmproxy on a machine in your network and set up the connection as a proxy in your Android device's network settings. Then you'd need to follow http://mitm.it to install mitmproxy's root certificate on the Android device (to trust the connection with TLS) and off you go.
EDIT: or rather follow the docs[0]
[0]: https://docs.mitmproxy.org/stable/howto-install-system-trust...
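Once traffic is flowing through the proxy, a small addon script is enough to see which endpoints the app hits. A minimal sketch, assuming mitmproxy's addon convention of a class with a `response` hook and a module-level `addons` list; the class name and the `seen` attribute are made up:

```python
class EndpointLogger:
    """Collects (method, URL, status) for every completed HTTP exchange."""

    def __init__(self):
        self.seen = []

    def response(self, flow):
        # Called by mitmproxy once a full response has been received.
        entry = (
            flow.request.method,
            flow.request.pretty_url,
            flow.response.status_code,
        )
        self.seen.append(entry)
        print(*entry)


# mitmproxy picks up addon instances from this module-level list.
addons = [EndpointLogger()]
```

Saved as, say, log_endpoints.py (the filename is arbitrary), it would be loaded with `mitmdump -s log_endpoints.py`.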
Depends on the app. If it uses some online functionality, probably yes. You could also try decompilation; it's decent on Java apps like Android's.
I use burp suite combined with Frida (which can remove root check and override ssl pinning).
Yes, this. The Frida tools method to remove cert pinning is the only method that has worked for me. The mitmproxy docs for android (as referred to by another commenter) didn't work for any apps I tried.
This is so cool. Thanks for sharing!
Obvious question: How to protect against this?
Build your API assuming anything public facing will be known. This includes anything downloaded to a device.
Your first line of defence should be a secure API where an attacker doesn't gain anything by knowing it.
You can add obfuscation, but ultimately if the client is shipped to the user you must assume an attacker can reverse engineer it.
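A minimal sketch of what "an attacker doesn't gain anything by knowing it" looks like in code, assuming the server has already authenticated the caller by some means not shown here. The `ORDERS` data, `Forbidden`, and `get_order` are all hypothetical:

```python
# Hypothetical in-memory data store standing in for a real database.
ORDERS = {
    101: {"owner": "alice", "total": 30},
    102: {"owner": "bob", "total": 99},
}


class Forbidden(Exception):
    """Raised when the authenticated user may not access a resource."""


def get_order(authenticated_user: str, order_id: int) -> dict:
    """Return an order only if it belongs to the caller.

    The check runs on the server on every request, so it holds no matter
    which client (official app, script, replayed capture) sent the call.
    """
    order = ORDERS[order_id]
    if order["owner"] != authenticated_user:
        raise Forbidden(f"order {order_id} does not belong to {authenticated_user}")
    return order


if __name__ == "__main__":
    print(get_order("alice", 101))
```

Knowing the endpoint and its parameters gets a third-party client exactly what the official client would get, and nothing more.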
What specifically do you want to protect?
For me, we can't 100% protect against this type of usage, but we can minimize it with good observability and monitoring tools that always check whether the user is running this in a verified way (signed app, web, etc.) or RE'ing the API.
Because guess what? We are the creators of the system, so it's easy to detect bots and similar cases when you have good analytical data, because this kind of usage does not leave the usual "traces".
I find this confusing because the point of an API is to be known, yes? Otherwise who's accessing it?
It's a valid desire, but you have to be really dedicated to the effort to block it, in practice.
You might intend your API to be consumed only by your own clients. E.g. your published mobile apps.
A well-designed API won't allow a third-party client to do anything that your own client wouldn't allow of course. Permissions are always enforced on the back end.
But there are many cases where a user might want a custom/different client:
If your mobile apps are not awesome, or if they deprioritize a specific use case, or if they serve ads ... or even if your users want to automate some action in your service...
If your service is popular enough (or you attract a certain kind of user), you will have some people building their own clients.
Those sound like bad use cases for a client-server model with public endpoints, then? I mean, you could cert-pin yourself in the client app, I guess.
Not sure what you mean here. All endpoints are equally public.
Not necessarily. A common pattern is to build a 'private API' intended to be used by one's own front-end applications. For example: most client-rendered applications, like the Airbnb example on this page.
Modern APIs are actually, most of the time, a poor man's RPC; they don't need to exist, much less be known.
Yeah - does this get nullabilities right?
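Any inference from captured traffic can only be as good as the samples it sees, which is why nullability (and the integer vs float issue mentioned upthread) tends to come out wrong. A rough sketch of the general idea, not of what this tool actually does; `infer_field` is a made-up name:

```python
def infer_field(samples):
    """Infer a JSON-Schema-like type and nullability from observed values only."""
    non_null = [v for v in samples if v is not None]
    kinds = set()
    for value in non_null:
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            kinds.add("boolean")
        elif isinstance(value, int):
            kinds.add("integer")
        elif isinstance(value, float):
            kinds.add("number")
        elif isinstance(value, str):
            kinds.add("string")
        else:
            kinds.add("other")
    # A mix of ints and floats can only be described as a general number.
    if kinds == {"integer", "number"}:
        kinds = {"number"}
    return {"type": sorted(kinds), "nullable": len(non_null) < len(samples)}


if __name__ == "__main__":
    print(infer_field([1, 2, 3]))
    print(infer_field([1, 2.5, None]))
```

If a field happens never to be null in the capture, it gets reported as non-nullable, so the generated spec still needs a human pass.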
Coool!
This is something that would be easy to do an ordinary job of, missing lots of edge cases and not making something thorough and complete.
A really professional and thorough job would be extremely time consuming and hard.
I do this a lot for my work. A tool like this that can help get me to a nice starting point is huge. Instead of developing a mental model of the API in my head by manually looking through API requests/responses in ProxyMan, this can start me off much more quickly. From there, the edge cases can be worked out.