I use DO's load balancers in a couple of projects, and they don't list Cloudflare as an upstream dependency anywhere that I've seen. It's so frustrating to think you're clear of a service and then find out you're actually in its blast radius too, through no fault of your own.
It is mentioned in their list of subprocessors: https://www.digitalocean.com/trust/subprocessors
I find stuff like this all the time. Railway.com recently launched an object storage service, but it's simply a wrapper for Wasabi buckets under the hood, and they don't mention this anywhere... not even on the subprocessors page: https://railway.com/legal/subprocessors. Customers have no idea they're using Wasabi storage buckets unless they dig around in the DNS records, so I have to do all this research to find the upstream dependencies, subscribe to status.wasabi.com alerts, and so on.
dig b1.eu-central-1.storage.railway.app +short
s3.eu-central-1.wasabisys.com.
eu-central-1.wasabisys.com.
Hey, I'm the person who was responsible for adding object storage to Railway. It was my onboarding project: basically a project I got to choose myself and implement in 3 weeks, in my 3rd month after joining Railway.
Object Storage is currently in Priority Boarding, our beta program. We can and definitely will do better: document this and add it to the subprocessor list. I'm really sorry about the current lack of it. There was another important project I had to do between the beta release of Buckets and now. I'm on call this week, but I will continue bringing Buckets to GA next week. So, just to give this context: there's no intentional malevolence or shadiness going on, it's simply that there's one engineer (me) working on it, and there's a lot of stuff to prioritize and do.
It's also super important to get user feedback as early as possible. That's why it's a beta release right now, and the beta release is a bit "rushed". The earlier I can get user feedback, the better the GA version will be.
On the "simply a wrapper for wasabi buckets" - yes, we're currently using wasabi under the hood. I can't add physical Object Storage within 3 weeks to all our server locations :D But that's something we'll work towards. I wouldn't say it's "simply" a wrapper, because we're adding substantial value when you use Buckets on Railway: automatic bucket creation for new environments, variable references, credentials as automatic variables, included in your usage limits and alerts, and so on.
I'll do right by you, and by all users.
Slightly off topic: I used DO LBs for a little while but found myself moving away from them toward a small droplet with an haproxy or nginx setup. Worked much better for me personally!
The point of an LB for these projects is to get away from a single point of failure, and I find configuring HA and setting up the networking and everything to be a pain point.
These are all low-traffic projects so it's more cost effective to just throw on the smallest LB than spend the time setting it up myself.
If they are small projects, why are they behind a load balancer to begin with?
Usually because of SSL termination. It's generally "easier" to just let DO manage getting the cert installed. Of course, there are tradeoffs.
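For anyone weighing the DIY side of that tradeoff, the self-managed version on a single droplet is roughly this (just a sketch, assuming an Ubuntu/Debian box running nginx, with example.com as a placeholder domain):

# install certbot and its nginx plugin
sudo apt install certbot python3-certbot-nginx
# get a Let's Encrypt cert and have certbot rewrite the nginx server block to terminate TLS
sudo certbot --nginx -d example.com
# certbot sets up automatic renewal; confirm it works
sudo certbot renew --dry-run

Not hard, but it's one more thing to own per box, and the managed LB makes it somebody else's problem.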
I use the LBs for high availability rather than because I need load balancing. The LB + 2 web back-ends + managed DB means a project is resilient to a single server failing, for relatively low devops effort and around $75/mo.
Are both servers deployed from the exact same repo/scripts? Or are they meaningfully different, and/or balanced across multiple data centers?
Did your high availability system survive this outage?
I have a couple of instances of this same pattern for various things that have been running for 5+ years, none of them have suffered downtime caused by the infrastructure. I use ansible scripts for the web servers, and the DO API or dashboard to provision the Load Balancer and Database. You can get it all hooked up in a half hour, and it really doesn't take any maintenance other than setting up good practices for rotating the web servers out for updates.
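If it helps anyone: the LB provisioning itself is a single API call, something like the below (a sketch against DO's /v2/load_balancers endpoint; the field names are from memory, so double-check the current API docs):

# $DO_TOKEN, the certificate ID, and the droplet IDs are placeholders to fill in
curl -X POST "https://api.digitalocean.com/v2/load_balancers" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "web-lb",
    "region": "nyc3",
    "forwarding_rules": [
      {"entry_protocol": "https", "entry_port": 443,
       "target_protocol": "http", "target_port": 80,
       "certificate_id": "your-cert-id"}
    ],
    "health_check": {"protocol": "http", "port": 80, "path": "/healthz"},
    "droplet_ids": [111111, 222222]
  }'

Keep that call (or its equivalent) in source control next to the ansible scripts and the whole setup is reproducible.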
They wouldn't survive DO losing a DC, but they're not so mission critical that it's worth the extra complexity to handle that, and I don't recall DO losing a DC in the past 10 years or so.
They did stay up during this outage, which was apparently mostly concentrated on a different product called the 'global load balancer' - which, ironically, is exactly the extra complexity I mentioned for surviving a DC outage in theory.
Keep in mind these are "important" in the sense that they justify $100/mo on infra and monitoring, but not "life critical" in the sense that an outage is going to kill somebody or cost millions of bucks an hour. Once your traffic gets past a certain threshold, DO's costs don't scale that well and you're better off with a large distributed self-managed setup on Hetzner or buying into a stack like AWS.
To me their LB and DB products hit a real sweet spot -- better reliability than one box, and meaningfully less work than setting up a cluster with floating IP and heartbeats and all that for a very minimal price difference.
Regional LBs do not have Cloudflare as an upstream dependency.
They don't name names but it's probably due to the ongoing Cloudflare explosion. I know the DigitalOcean Spaces CDN is just Cloudflare under the hood.
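Easy enough to verify the same way as the Wasabi example upthread: resolve a Spaces CDN endpoint and see whose network the CNAME chain points at. Assuming the usual <bucket>.<region>.cdn.digitaloceanspaces.com endpoint format, with a made-up bucket name:

dig example-bucket.nyc3.cdn.digitaloceanspaces.com +short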
Just the Spaces CDN, not Spaces itself - you'd think they'd just turn the CDN off for a bit.
You can't just "turn off the CDN" on the modern internet. You'd instantly DDoS your customers' origins. They're not provisioned to handle the traffic, and even if they were, the size of the pipe going to them isn't. The modern internet is built around the expectation that everything is distributed via CDN. Some more "traditional" websites would probably be fine.
Might be just me, but I can think of many origins under my control which could live without a (non-functional) CDN for a while.
A CDN is great for peak load, latency reduction, and cost - but not all sites depend on it for scale 24/7.
If you are DO you could, you just decided not to bother. They control the origins; it's Spaces (S3), so they could absolutely spin up further gateways or a cache layer and then turn the CDN off.
Either you are wrong and they do not have the capacity to do that, or they have decided it is acceptable to be down because a major provider is down
I imagine a cache layer cannot be that easy to spin up - otherwise why would they outsource it?
You outsource it because Cloudflare have more locations than you, so they offer lower latency and can offer it at a cost that's cheaper than or the same as doing it yourself.
Which suggests it's expensive enough that it's unlikely they just have the capacity lying around to spin up.
To the contrary, CDN pricing will usually beat cloud provider egress fees.
Common example: you can absolutely serve static content from an S3 bucket worldwide without using a CDN. It will usually scale OK under load. However, you're going to pay more for egress and give your customers a worse experience. Capacity isn't the problem you're engineering around.
For a site serving content at scale, a CDN is purpose-built to get content around the world efficiently. This is usually cheaper and faster than trying to do it yourself.
That is not what I said. I said DO will not have the spare capacity because it's too expensive. Can you please tell me who DO pay egress fees to?
They will be doing a mix of peering, both across free PNIs and very low-cost IXP ports, with the remainder going down transit like Colt or Cogent. Probably an average cost of the order of $1 per 20 TB of egress in the European and NA markets.
The thing with edge capacity is that you massively overbuild, on the basis that:
It's generally a long-ish lead time to scale capacity (days, not minutes or hours)
Transit contracts are usually 12-60 months
Traffic is variable, and not all ports cover all destinations
Redundancy
So if you are doing, say, 100 Gbps at the 95th percentile out of, say, London, then you will probably have at least 6+ 100 Gb ports, so you do have quite a bit of latent capacity if you just need it for a few hours.
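To put rough numbers on that, using the $1 per 20 TB figure above and pretending the 95th percentile is a flat 100 Gbps (which overstates real volume):

# 100 Gbps is about 12.5 GB/s; a 30-day month is ~2,592,000 seconds
echo "12.5 * 2592000 / 1000" | bc -l   # ~32,400 TB (~32 PB) if the pipe ran flat out
echo "32400 / 20" | bc -l              # ~1,620, i.e. roughly $1,600/mo of transit at that rate

Variable traffic and 95th-percentile billing pull the real number down, but it gives a sense of the scale the spare ports sit on top of.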
nit: that's more DoS (from a handful of DO LBs) than DDoS.
Yes, all sites are showing the Cloudflare error due to the massive outage. It seems their outages are getting more frequent and taking down the internet in new ways each time.
Man, it really seems like the cloud providers are having some tough times lately. Azure, AWS, and Cloudflare! Is everything just secretly AWS?
I have two projects on DO using droplets and they are still running fine.
Droplets are fine.
> This incident affects: API, App Platform (Global), Load Balancers (Global), and Spaces (Global).
It seems to be mostly a Cloudflare-related issue.
My DOs are working fine as well.
Are you using their "reserved IPs"? I was thinking of starting to use them, but now I wonder if they're part of the load balancing stack under the hood.
So yesterday Azure got hit hard, and today CF and DO are down. Bad week, or something else?
The Azure DDoS event happened in October. The blog post about the attack was published yesterday and was quickly picked up by news sites.
A DDoS, but I don't really understand why, in particular.
Having known people like this, it's either flexing about who has the more powerful botnet or advertising who can do what.
NATO testing internal infra, or Russian hackers stepping it up after aggressive sabotage efforts in Eastern Europe?
I would also like to know people’s opinion on this.
The year-end promotion cycle is the worst time for end users and the best time for engineers greedy for promotions.
Don't blame individual engineers for wanting to do what will be rewarded; blame the company performance policies that reward this type of behavior.
Shoot, there are also the end-of-year layoffs and reorgs to pump up those year-end numbers.
What engineers, mate? They're AI now.
and they're doing just spectacular
I knew it: the DigitalOcean CDN is using Cloudflare behind the scenes. Why, DO?
Cloudflare outage.
Who is next?
My guess would be to look at who has a FedRAMP-capable service first.
Maybe also GCP, Hetzner, or Akamai.
Dominos falling into dominos falling into dominos…