For me, what is more common is the likelihood of doing something in the wrong environment (e.g. lab, dev, stage, prod). To help make things a bit more clear, our images now override the PS1 with a `(environment)` at the beginning, which is a different color, lab=green, dev=purple, prod=red. If it saves me once, it was worth it.
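A rough sketch of what such a prompt prefix could look like, not the poster's actual image setup. `DEPLOY_ENV` is a hypothetical variable assumed to be set per environment, the colors are the basic ANSI codes (magenta standing in for purple), and the yellow fallback for anything else, such as stage, is my own assumption:

```sh
# Hypothetical: DEPLOY_ENV is assumed to be set in each image (lab, dev, stage, prod).
env_color() {
  case "$1" in
    lab)  printf '32' ;;  # green
    dev)  printf '35' ;;  # magenta, the closest basic ANSI color to purple
    prod) printf '31' ;;  # red
    *)    printf '33' ;;  # yellow for anything else (an assumption, not from the post)
  esac
}

env_prefix() {
  # Emit the literal text \[\033[NNm\](env)\[\033[0m\] for bash to interpret in PS1.
  # The \[ \] pairs tell bash not to count the color codes towards the prompt width.
  printf '\\[\\033[%sm\\](%s)\\[\\033[0m\\] ' "$(env_color "$1")" "$1"
}

# Prepend the colored "(environment)" marker to whatever prompt is already set.
PS1="$(env_prefix "${DEPLOY_ENV:-unknown}")${PS1-}"
```

Sourced from a bashrc, this puts a colored `(prod)` or `(dev)` in front of the existing prompt.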
That's really cool!
I alias cd to always show the first few files in a directory, so I have a better chance of noticing if I'm in the wrong place, and I always make internal IoT controls show the hostname in the top nav bar.
If I have made a mistake once, or imagined a mistake that could be possible, I pretty much always start thinking of technical countermeasures.
I’ve tried using iTerm2’s automatic profile switching feature to adjust the theme depending on the connection, but I’ve never been able to get it to work reliably.
I think such a thing can be achieved easily with the starship prompt!
I've used this successfully in the past for Terminator: https://github.com/GratefulTony/TerminatorHostWatch
I too use iTerm's profiles to change background colors, fonts etc. to indicate where I am.
Reduces number of fat fingering disasters.
Double deep environment protection. True prod servers can't be deployed to except from special terminal which only appears visible to the ProdGlasses(TM). If you are wearing ProdGlass(TM) you receive mild brain wave stimulation which is collected for "product control purposes"(TM) future product development(patent pending).
I have iTerm set up to change the background colour of a shell based on whether it's local, docker, staging or production, and on top of that prompt colours change depending on how privileged the user is. Without all this, I'm sure I'd be making terrible mistakes on a regular basis.
I'm always running in tmux panes, so iTerm and those settings often seem to conflict. It would be great to go a step further and include a background color change; I will have to give it another shot. Thanks for the tip.
What I really want is a way to get a read only root shell. I do a lot of work on a "traditional" multi-user unix host, where hundreds of scientists share a powerful computer. I often need to become root to look at files. I want to be able to do that without the ability to screw anything up.
If it's Linux, you can bind-mount the rootfs read-only and chroot into it:
    mahler ~ # mkdir /roroot
    mahler ~ # mount -o bind,ro / /roroot
    mahler ~ # chroot /roroot
    mahler / # passwd
    Enter new password:
    Re-type new password:
    passwd: Authentication token lock busy
    passwd: password unchanged
Oh, that's a good idea.
I've used this command to achieve that in the past:

    sudo systemd-nspawn --directory=/ --read-only --ephemeral --volatile=yes

[systemd-nspawn](https://man7.org/linux/man-pages/man1/systemd-nspawn.1.html) would normally be used to run a command inside of a container (a directory), but in some modes you can specify the system root as the "container" path.
I believe this specific set of options relies on (BTRFS) file system snapshots for performance. It's possible that you can get it to work on non-BTRFS systems by providing another combination of command line variables, but the default setting is to copy the file system tree to a temporary path.
You can also pass parameters like --volatile=state (so you can write to /var) and --volatile=overlay (so you can "write" to state, but all changes are discarded after the container exits). --volatile=state is useful for extracting data from a temporary read-only system, --volatile=overlay is useful for running tools that crash if they run on a read-only filesystem.
> I believe this specific set of options relies on (BTRFS) file system snapshots for performance. It's possible that you can get it to work on non-BTRFS systems by providing another combination of command line variables, but the default setting is to copy the file system tree to a temporary path.
You can use overlayfs + chroot.
--read-only would be a nice option for `run0`
I've used seccomp in the past to create a read-only root.
I created a seccomp DSL to make this kind of stuff easier [0] (an example of dropping network access is at [1])
[0] https://chiselapp.com/user/rkeene/repository/bash-drop-netwo...
[1] https://chiselapp.com/user/rkeene/repository/bash-drop-netwo...
Can you create a group and set a sticky bit for ACLs for your root for the filesystem with read permissions for that group? That way all files created under the filesystem root will have read permissions for that user, but you cannot modify anything.
Not really, there are tons of different filesystems mounted from all over the place. It's a mess.
Can you create a second read-only mount of root?
Maybe this would help:
https://serverfault.com/questions/136515/read-only-bind-moun...
But the point is I want my 'root' session to have no ability to do anything destructive.
Though as someone just suggested in another thread, chrooting into such a thing isn't a terrible idea. It certainly limits the damage I can do.
MicroOS has something in that spirit, I think? Maybe I'm wrong though; I don't even know if MicroOS is usable as a host.
Why should my user have the right to shut down a production box?
Having a tool to warn me that I'm about to do something I shouldn't be allowed to do is okay. Having proper access control would be better.
(Fun: 'boulette' is French for 'dumb mistake')
> Having proper access control would be better.
Some of us live on the "granted" side of that access control. I'm an SRE; I have access to a fair number of things. I worry about making dumb mistakes like that in the article due to sleep deprivation. The warning here might very well be a warning that you're about to do something that you are allowed to do.
In the specific example … sometimes rebooting a production machine is what needs doing. Ideally, there's a second VM; ideally, the services on the machine would gracefully drain; ideally, it wouldn't matter. Ideally. But the reality is that often they don't, whether because of developer laziness, incompetence, some ugly confluence of bullshit bugs that conspire to prevent sanity, etc. More often than not I want to be careful even when the process/theory would like to say "this is a fully automated & completely safe operation", and half-asleep is not careful, and accidentally running allegedly safe commands while half asleep is where Murphy likes to rear his head.
I've definitely used techniques like this. Red prompts that scream "You're SSH'd into PROD dummy" also come to mind. I keep my kubectl context fixed to "local laptop" so that I am forced to always type out --context the_production_cluster when I want that.
Temet nosce.
The idea is to protect you from yourself. You already have root access to the production server, because you are the admin and you need it to do your job.
Fun followup: the image is from the French movie "Le dîner de cons", where the character depicted says "Oh la boulette", after making a gaffe.
When you work at a bank and have multiple production boxes, this is pretty typical
I mean, you're probably going to need to give the reboot rights to someone. For instance, the people in charge of applying kernel updates. And those people will, at this point, be prone to making a boulette.
We used to have special users for such tasks, with long names and passwords that made clear what you were about to do. For example shutdown/youwillshutdownproductionXXXXXXX (with the Xs being a “real” password). Because the random part of the password, and sometimes the “fixed” part, was changed, there was no way that muscle memory would take over.
Isn't this a little over-engineered? You can accomplish the same thing with a 5-line bash script. Put protect.sh somewhere in your path:

    #!/bin/bash
    # Only ask for confirmation when running inside an SSH session.
    if [[ $SSH_TTY ]]; then
      read -r -p 'You are in SSH. Are you sure (enter hostname for yes)? '
      # Abort unless the typed name matches this machine's hostname.
      [[ $REPLY == "$(hostname)" ]] || exit 2
    fi
    # Run the wrapped command with its original arguments.
    exec "$@"

Then in your bashrc or zshrc:

    alias shutdown='protect.sh shutdown'
    alias reboot='protect.sh reboot'
    alias sudo='sudo ' # Don't allow sudo to bypass the protection. Can do the same with doas
What if you alias ssh on the local machine to open a tmux with the top pane being a red warning, and then the whole thing could automatically close when ssh ends so the experience is just like normal ssh?
I've never actually used tmux, but maybe I will try it out!
It's mentioned at the bottom of the page, but I'd like to highlight that "molly-guard" [0] has provided very similar functionality for at least 10 years.
Since it's so old, it is packaged for Debian-based systems: "sudo apt install molly-guard" on your server, and your shutdown, reboot, etc. are all protected, with no need for 3rd party tools.
I use that in combination with base16-shell [0] to set different color profiles as a protection.
I like the idea, but my immediate comment here, in a spirit of thinking about what would get a tool like this wider adoption, is that "boulette" is a clunky thing to type on a terminal. You can be more assertive about the importance of your tool, and give it a shorter, punchier (command-line) name; it might see more use if you do.
You're not really supposed to type it every time. Just declare an alias prefixed with boulette replacing the actual risky commands.
If you're in a mental state where you can shut down a prod server while thinking you're on a dev one, you certainly won't think to type "wtf command", no matter what "wtf" actually is.
For what it’s worth, the name is funny and easy to remember to French-speaking people.
Loving the name "boulette" here, which means "blunder" or "slip-up".
Why does it take the command as a single string argument? Seems like it would make more sense to take it as an argument list, like other "wrapper" commands like sudo, nohup, etc., do.
Probably to avoid mistakes such as `boulette some command || some other command`, which would not happen with `boulette "some command || some other command"`.
It just didn't cross my mind! You've got a point here! I'll rethink the argument parsing.
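For what it's worth, here is a minimal sketch of the argument-list style, in the manner of sudo or nohup. `guard` is a made-up name for illustration and is not boulette's actual interface; it uses `uname -n` for the hostname so it stays plain POSIX sh:

```sh
# Hypothetical wrapper; "guard" is an illustrative name, not a real tool.
guard() {
  # Ask for the hostname before running anything.
  printf 'Type the hostname to run "%s": ' "$*" >&2
  read -r answer
  if [ "$answer" != "$(uname -n)" ]; then
    echo 'Hostname mismatch, aborting.' >&2
    return 2
  fi
  # "$@" passes every argument through exactly as typed, spaces included.
  "$@"
}
```

With this shape, `guard touch "my file"` just works. The case from the reply above can still be covered by handing the whole line to a shell explicitly: `guard sh -c 'some command || some other command'`.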
I guess the name and Pignon's picture only resonate with a French audience, hence the many serious comments here.
Well, it made me smile!
I've basically stopped these problems by coloring the servers. My production shells have a red PS1 background. Datagrip gives a red color to my prod databases. HTML admin consoles got some red css if configurable.
I've a shell alias for this, requiring valid hostname for reboot, halt etc...
Running "shutdown" on remote hosts isn't something I have ever needed to do routinely.
Is this for ephemeral dev boxes? Does shutdown suspend billing on AWS/cloud type hosts?
Boulette isn't only for safeguarding "shutdown"; it is for safeguarding whatever needs it. The point is to protect yourself from some harmful reflexes.
In my case, I use atuin a lot and automatically retype long commands, sometimes as root.
The regularity of these actions tends to lower my attention, but it is no less true that many users depend on what I do on big machines with lots of privileges.
Danger comes from getting used to it.
This is for the case where you think you are typing it locally but are actually typing it remotely. But I’d personally never type this locally, so this use case is not convincing at all for me.
This! Another use case: I use it on nixos-rebuild to update the system, when I don't want to rebuild on tiny hosts but rather on a powerful machine and then send the binaries, because it would take too much time to compile otherwise.
I have a laptop that just runs a bare ubuntu terminal, no DE, and I've accidentally shut down a home server before when I tried to shut down the laptop. Quite annoying, since I was away from home at the time
Why do you shut your laptop down instead of just closing the lid?
Many laptops have terrible power usage during sleep, so if you just close the laptop and throw it in your bag, you'll open it tomorrow with 30% less battery.
Because the battery is absolutely knackered
ZFS snapshots with bootloader support (other CoW filesystem should also work).
For the specific purpose of shutdown, there has been a solution for decades: molly-guard.
Seems like just aliasing `shutdown` to something with a confirmation on servers and not your local would suffice.
I've worked as a sysadmin for most of my professional career, around 25 years. Sure I've screwed up, but I've learned each time. Covering everything in bubble wrap to shield you from consequences of mistakes is counter-productive. All that happens is that you learn that there are safety nets and not to think twice about what you're doing. Also these tools are non-standard, so you'll expect them everywhere. Don't do this.
Just remove `/usr/sbin/shutdown` and `/usr/sbin/poweroff`
Not enough, because on systemd systems those are just symlinks to systemctl.
Sure, but it would be enough to stop them being run by mistake
I would like to register a prediction of futility here, just a little more detailed than the trope of "when you try to idiot proof something the universe always makes a better idiot."
The number of software systems I've seen designed to allow the clueless to bumble their way through operations is much higher than the number of companies I've seen that attempt to train better operators.
Maybe it's the experience of working on tractors, where the PTO has fun side effects like "degloving", where, sure, there's a guard, but more importantly there's attention not to put your hand, hair or clothing near the spinning torque monster of doom. There's no option to make that purely safe economically. Same goes for machine tools and other heavy equipment. The people who work around such things have been, in my experience, more capable at accomplishing difficult, diverse tasks.
Where was the last company that had anything equivalent to the lowly "forklift certified" for prod? It's a very rare shop I've seen invest in any sort of across-the-board training for command line skills, outages, or prerequisites for delicate operations. We don't invest in people being better, being more capable. I think because we have internalized an owner/management point of view that workers are fungible and training is a waste, while software system guards are investments.
As a worker, I don't agree with that. I don't agree with building systems to be powered by the lowest educated, lowest paid meat popsicles, yet I think that's the strategy behind this.
Next time you go to build a system like this, consider who's logged into production and can they be trained to be more capable, more attentive operators. In the long run, I think it will end up with a better industry overall.
To be fair, the specific use case mentioned on the github is typing "shutdown -h now" into the wrong terminal window, and the way it's solved is quite clever: it asks you to enter the hostname of the system you're actually trying to shutdown.
This is something that could conceivably happen to people who are properly trained. This also means it doesn't fall prey to the usual "do you really want to do this? yes - no" prompt, where you just get used to automatically hitting "yes." Even if you habitually enter the hostname, it'll be wrong if you execute the command in the wrong terminal.
I have no doubt this would stop some instances of (say) accidentally rebooting a production server.
But as an idiot, I’m here to tell you it isn’t idiot proof. In a moment of suboptimal attention, I could easily ask myself “What host am I on?” rather than the more appropriate question “What host am I trying to shutdown?”
In my case it may be quite a bit worse because I have hostnames in PS1 specifically to avoid running any command on the wrong host. With the hostname right in front of me I could easily accidentally habitualize just typing the hostname I see.
Over time, I imagine this would become more likely rather than less, so although I think it probably helps, I am sympathetic to GP’s view of futility.
Maybe have it prompt: what host do you NOT want to shut down?
/s
Framing it as an "idiots vs not idiots" is the wrong way to think about it.
When you are working with the PTO you know you are working with the PTO. You are standing next to the tractor. The PTO should always be handled with the understanding that it can be dangerous. Here people are doing something mundane, everyday and not dangerous (shutting down their own laptop) and suddenly it becomes dangerous because they mixed up the terminal they typed their command in.
It is as if sometimes your alarm clock would be replaced with a PTO. Your alarm beeps, you reach out to turn it off as always and bam your hand is gone. That is the situation we are talking about here.
> I think because we have internalized an owner/management point of view that workers are fungible and training is a waste
This is not something you can train out of people. Your best trained, and most skillful operator can do this mixup. This is not that someone doesn't know what "sudo shutdown -h now" does. They do very much know it. Like the palm of their hand. What they don't notice is that they are typing it into the wrong terminal.
Weirdly enough, I have thought about this and to a degree, this can be interpreted as normalization of deviancy and normalization of danger.
Like, yes, I am a console jockey and prefer working in shells with a tiling window manager, keyboard only control and such. I will however always shutdown my workstation with the mouse (or trackball rather) through some UI of the desktop manager, or a desktop manager specific way.
Otherwise, I am normalizing the use of SLA-dangerous commands like "shutdown". It takes that little bit of fear and respect out of those commands if you use them daily for no good reason or if better choices exist. Like, don't turn your alarm off by cutting its wire and re-soldering it later.
Similarly, if I need to reboot production systems, I'd much rather reach for some control interface of the virtualization layer, or use something like ansible to dry-run these dangerous tasks first.
Or it terrifies me how care-free some people are with "sudo rm -rf". I've caused myself so much pain with rm. "sudo -u app-user rm -rf" is right there, or even better, "sudo -u app-user find -name foo -print > stuff; cat stuff" and later some "xargs -i rm < stuff" and "xargs -v rm < stuff". Yes it takes a minute more to do, but it prevents ... accidents.
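The list-first, delete-later idea can be sketched roughly like this. The helper names are made up for illustration, and I've swapped xargs for a plain `while read` loop so it stays portable; it assumes no file names contain newlines:

```sh
# Hypothetical helpers: write the matches to a file first, delete only after review.
list_matches() {   # usage: list_matches DIR NAME LISTFILE
  find "$1" -name "$2" -print > "$3"
}

delete_listed() {  # usage: delete_listed LISTFILE
  # One path per line; this breaks on file names containing newlines.
  while IFS= read -r path; do
    # "--" stops rm from treating a path that starts with "-" as an option.
    rm -rf -- "$path"
  done < "$1"
}
```

Typical use would be `list_matches /srv/app foo stuff`, then `cat stuff` to see exactly what is about to go, and only then `delete_listed stuff`, ideally wrapped in `sudo -u app-user` rather than run as root.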
It’s the “Do you want to share XXX” popup all over again. Once you get used to see it, it will be a part of the shutdown command (to take the example in the readme).
Hahaha, yes, I thought of it while writing it! This is a crazy annoying terminal popup! But it can maybe be useful if used correctly and sparingly.
As a counterpoint, aircraft designers have learnt through bitter experience that interface design is extremely important, as in stressful situations, or even due to a momentary lapse in attention, even a very highly trained, careful individual can make a fatal mistake. Quite frankly the only reason tractors don't get the same treatment is that generally the operator only kills themselves, not a few hundred other people.
(This isn't to say that training is useless, just that more training isn't the best and only solution to all problems, nor will a lack of careful interface design magically create more capable operators)
I think that's a great counterpoint, but it is leading to the same issue. It's allowing a class of pilots to fly who know less and less, and are more reliant on the automation. With deadly results.
Separately the only reason tractors don't get the same treatment is because society doesn't care about rural men in the same way they don't care about soldiers. In comparison cars have been largely regulated for safety, because the people who die in car crashes come from a wider swath of society.
> It's allowing a class of pilots to fly who know less and less, and are more reliant on the automation. With deadly results.
This is a plausible hypothesis, but is it reflected in the data? Flying has gotten safer and safer over the years, but of course that's got a multitude of effects contributing to it, not just the skill of the pilots. Reliance on automation is effectively a requirement for modern aircraft given the number of control systems which are critical to the pilot having any control of the aircraft. I've seen Boeing criticized for their approach here: while Airbus's interface is more or less "you are directing a series of control systems, not flying the plane directly", Boeing has essentially tried to concoct an elaborate illusion that a gigantic airliner is a Cessna, which is a leaky abstraction even if it makes the pilot feel like they are "closer to the metal". (I could draw a comparison to C programmers who feel the same thing despite the great honking illusion of an optimizing compiler in between).
> The number of software systems I've seen designed to allow the clueless to bumble their way through operations is much higher than the number of companies I've seen that attempt to train better operators.
I agree, and that's a problem. Now with LLMs, we're training everybody to not know how to find information by themselves (we call that "prompt engineering" so it cannot be bad, right? /s).
I've been using molly-guard (https://salsa.debian.org/debian/molly-guard, mentioned on Boulette's GitHub as inspiration, btw) for years for this, ever since I remotely shut down our file server thinking I was in another tmux pane. It has saved me once or twice since. Btw, molly-guard doesn't require setting up aliases.
For anyone wondering how this works, molly-guard uses package diversions to actually rename the shutdown/reboot/... binaries and replace them with molly-guard: https://git.launchpad.net/ubuntu/+source/molly-guard/tree/de...
This is a cool ability of dpkg that is not well known. You can use it without making a package as well.
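A hedged sketch of what a local diversion might look like outside a package. This is not molly-guard's actual setup: `./my-guard` is a hypothetical wrapper script and the `.real` suffix is an arbitrary choice. The function only prints the commands, since actually running them needs root on a Debian-based system:

```sh
# Prints the steps instead of running them; they need root and a dpkg-based system.
# "./my-guard" is a hypothetical wrapper script, not part of any package.
show_divert_steps() {
  target=$1   # e.g. /sbin/reboot
  cat <<EOF
dpkg-divert --local --rename --divert ${target}.real --add ${target}
install -m 0755 ./my-guard ${target}
dpkg-divert --list ${target}
EOF
}

show_divert_steps /sbin/reboot
```

The first command moves the original binary aside and records the diversion so that package upgrades keep writing to the diverted path, the second drops the wrapper in its place, and the third lists the diversion to confirm it was registered.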
Didn't know what specific technique it used to do that, thanks for mentioning this!
Btw for those who wonder where the name comes from: https://news.ycombinator.com/item?id=26633320
Yes! Molly-guard is genius! Unfortunately I couldn't get it to work on NixOS, which is not FHS-compliant!
This says it's inspired by molly-guard (which I love and install everywhere and has saved my bacon countless times) but I don't see what's different about it? Molly-guard is a single `apt install` away with no further config needed.
Also, the problem with a Y/N question is that when you are bored and/or in a hurry, you only skim the question and muscle memory takes over, and you hit Y and then you realise a few seconds later that you rebooted the wrong machine. This is why molly-guard makes you enter the hostname of the host you want to shut down.
You can set the challenge type to hostname as well, among others. The only problem with molly-guard is that I can't get it working properly on my system.
A combination of bash profiles and aliases can be used to achieve this without installing anything external. You can prevent commands like rm, chmod, cat, etc. even for the root user. You can also prevent root users from accessing directories where vault, database, etc. data is written.
Ps. At Adaptive (http://adaptive.live), we have kind of productized something like this.
I don’t understand: you say one can do this without an external thing, and then promote your own external thing?
It is not external things, but rather a centralized place to manage the policies.
i don't need anything reminding me that i'm in a remote shell session: i know because it lags. i also know which one from the delay's length.
> i don't need anything reminding me that i'm in a remote shell session: i know because it lags.
It doesn’t if you use a tool like Mosh [1] (I’m not affiliated)
[1]: https://mosh.org/
Replacing SSH with a custom protocol sounds quite scary
Is it as secure as regular SSH when configured properly?
https://news.ycombinator.com/item?id=10220080
Here's a conversation on HN about it. Mosh uses an initial SSH connection to establish a session, and I found elsewhere that the communication thereafter is handled via AES-128 encrypted UDP traffic. The server process itself seems to live only for the life of the session, and doesn't require escalated permissions.
I can't imagine enterprise or government adding it to their stacks, but for connecting to personal stuff it doesn't really seem like a big risk.
You still authenticate and kick off the Mosh session via SSH, so it shouldn't be any worse than plain SSH.
> Mosh doesn't listen on network ports or authenticate users. The mosh client logs in to the server via SSH, and users present the same credentials (e.g., password, public key) as before. Then Mosh runs the mosh-server remotely and connects to it over UDP.
> Mosh doesn't listen on network ports
...
> ... Then Mosh runs the mosh-server remotely and connects to it over UDP.
Are UDP sockets immune from port scanning? Regardless of the answer to that, the sentence should be rewritten because it sounds like nonsense as is