Reading List
The most recent articles from a list of feeds I subscribe to.
What happens when you press a key in your terminal?
I’ve been confused about what’s going on with terminals for a long time.
But this past week I was using xterm.js to display an
interactive terminal in a browser and I finally thought to ask a pretty basic
question: when you press a key on your keyboard in a terminal (like Delete, or Escape, or a), which
bytes get sent?
As usual we’ll answer that question by doing some experiments and seeing what happens :)
remote terminals are very old technology
First, I want to say that displaying a terminal in the browser with xterm.js
might seem like a New Thing, but it’s really not. In the 70s, computers were
expensive. So many employees at an institution would share a single computer,
and each person could have their own “terminal” to that computer.
For example, here’s a photo of a VT100 terminal from the 70s or 80s. This looks like it could be a computer (it’s kind of big!), but it’s not – it just displays whatever information the actual computer sends it.
Of course, in the 70s they didn’t use websockets for this, but the information being sent back and forth is more or less the same as it was then.
(the terminal in that photo is from the Living Computer Museum in Seattle, which I got to visit once – I wrote FizzBuzz in ed on a very old Unix system there – so it’s possible that I’ve actually used that machine or one of its siblings! I really hope the Living Computer Museum opens again, it’s very cool to get to play with old computers.)
what information gets sent?
It’s obvious that if you want to connect to a remote computer (with ssh or
using xterm.js and a websocket, or anything else), then some information
needs to be sent between the client and the server.
Specifically:
- the client needs to send the keystrokes that the user typed in (like ls -l)
- the server needs to tell the client what to display on the screen
Let’s look at a real program that’s running a remote terminal in a browser and see what information gets sent back and forth!
we’ll use goterm to experiment
I found this tiny program on GitHub called
goterm that runs a Go server that lets you
interact with a terminal in the browser using xterm.js. This program is very insecure but it’s simple and great for learning.
I forked it to make it work with the latest xterm.js, since it was last updated 6 years ago. Then I added some logging statements to print out every time bytes are sent/received over the websocket.
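The logging itself was nothing fancy – here’s a minimal sketch of the idea (the variable names are made up, not goterm’s actual code):

// in the websocket read/write loops, log every chunk of bytes.
// Go's %q verb prints unprintable bytes as escapes like \x1b,
// which is the format you'll see in the logs below.
log.Printf("sent: %q", string(keyboardBytes)) // browser -> shell
log.Printf("recv: %q", string(shellOutput))   // shell -> browser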
Let’s look at what gets sent and received during a few different terminal interactions!
example: ls
First, let’s run ls. Here’s what I see on the xterm.js terminal:
bork@kiwi:/play$ ls
file
bork@kiwi:/play$
and here’s what gets sent and received: (in my code, I log sent: [bytes] every time the client sends bytes and recv: [bytes] every time it receives bytes from the server)
sent: "l"
recv: "l"
sent: "s"
recv: "s"
sent: "\r"
recv: "\r\n\x1b[?2004l\r"
recv: "file\r\n"
recv: "\x1b[?2004hbork@kiwi:/play$ "
I noticed 3 things in this output:
- Echoing: The client sends l and then immediately receives an l sent back. I guess the idea here is that the client is really dumb – it doesn’t know that when I type an l, I want an l to be echoed back to the screen. It has to be told explicitly by the server process to display it.
- The newline: when I press enter, it sends a \r (carriage return) symbol and not a \n (newline)
- Escape sequences: \x1b is the ASCII escape character, so \x1b[?2004h is an instruction to the terminal (I believe this particular one turns on “bracketed paste mode”). We’ll talk a little more about escape sequences later.
Okay, now let’s do something slightly more complicated.
example: Ctrl+C
Next, let’s see what happens when we interrupt a process with Ctrl+C. Here’s what I see in my terminal:
bork@kiwi:/play$ cat
^C
bork@kiwi:/play$
And here’s what the client sends and receives.
sent: "c"
recv: "c"
sent: "a"
recv: "a"
sent: "t"
recv: "t"
sent: "\r"
recv: "\r\n\x1b[?2004l\r"
sent: "\x03"
recv: "^C"
recv: "\r\n"
recv: "\x1b[?2004h"
recv: "bork@kiwi:/play$ "
When I press Ctrl+C, the client sends \x03. If I look up an ASCII table,
\x03 is “End of Text”, which seems reasonable. I thought this was really cool
because I’ve always been a bit confused about how Ctrl+C works – it’s good to
know that it’s just sending an \x03 character.
I believe the reason cat gets interrupted when we press Ctrl+C is that the
Linux kernel on the server side receives this \x03 character, recognizes that
it means “interrupt”, and then sends a SIGINT to the process that owns the
pseudoterminal’s process group. So it’s handled in the kernel and not in
userspace.
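If you want to check for yourself that by the time it reaches a program, Ctrl+C is “just a signal”, here’s a tiny Go program (my own demo, not goterm code) that catches the SIGINT the kernel generates from that \x03 byte:

package main

import (
    "fmt"
    "os"
    "os/signal"
    "syscall"
)

func main() {
    sigs := make(chan os.Signal, 1)
    // ask to be notified instead of killed when SIGINT arrives
    signal.Notify(sigs, syscall.SIGINT)
    fmt.Println("press Ctrl+C...")
    <-sigs
    fmt.Println("got SIGINT! the terminal driver turned \\x03 into a signal")
}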
example: Ctrl+D
Let’s try the exact same thing, except with Ctrl+D. Here’s what I see in my terminal:
bork@kiwi:/play$ cat
bork@kiwi:/play$
And here’s what gets sent and received:
sent: "c"
recv: "c"
sent: "a"
recv: "a"
sent: "t"
recv: "t"
sent: "\r"
recv: "\r\n\x1b[?2004l\r"
sent: "\x04"
recv: "\x1b[?2004h"
recv: "bork@kiwi:/play$ "
It’s very similar to Ctrl+C, except that \x04 gets sent instead of \x03.
Cool! \x04 corresponds to ASCII “End of Transmission”.
what about Ctrl + another letter?
Next I got curious – if I press Ctrl+e, which byte gets sent?
It turns out that it’s literally just the number of that letter in the alphabet, like this:
- Ctrl+a => 1
- Ctrl+b => 2
- Ctrl+c => 3
- Ctrl+d => 4
- …
- Ctrl+z => 26
Also, Ctrl+Shift+b does the exact same thing as Ctrl+b (it writes 0x2).
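In ASCII terms, the rule is: take the letter’s code and mask off the top three bits. That’s also why Ctrl+Shift+b is the same as Ctrl+b – ‘b’ is 0x62 and ‘B’ is 0x42, and both become 0x02. Here’s a quick Go sketch of the rule (mine, just for illustration):

package main

import "fmt"

// ctrlByte returns the byte a terminal sends for Ctrl+<letter>:
// the letter's ASCII code with the top three bits masked off
func ctrlByte(letter byte) byte {
    return letter & 0x1f
}

func main() {
    fmt.Println(ctrlByte('a')) // 1
    fmt.Println(ctrlByte('z')) // 26
    fmt.Println(ctrlByte('B')) // 2, same as Ctrl+b
}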
What about other keys on the keyboard? Here’s what they map to:
- Tab -> 0x9 (same as Ctrl+I, since I is the 9th letter)
- Escape -> \x1b
- Backspace -> \x7f
- Home -> \x1b[H
- End -> \x1b[F
- Print Screen -> \x1b\x5b\x31\x3b\x35\x41
- Insert -> \x1b\x5b\x32\x7e
- Delete -> \x1b\x5b\x33\x7e
- My Meta key does nothing at all
What about Alt? From my experimenting (and some Googling), it seems like Alt
is literally the same as “Escape”, except that pressing Alt by itself doesn’t
send any characters to the terminal and pressing Escape by itself does. So:
- Alt + d => \x1bd (and the same for every other letter)
- Alt + Shift + d => \x1bD (and the same for every other letter)
- etcetera
Let’s look at one more example!
example: nano
Here’s what gets sent and received when I run the text editor nano:
recv: "\r\x1b[Kbork@kiwi:/play$ "
sent: "n" [[]byte{0x6e}]
recv: "n"
sent: "a" [[]byte{0x61}]
recv: "a"
sent: "n" [[]byte{0x6e}]
recv: "n"
sent: "o" [[]byte{0x6f}]
recv: "o"
sent: "\r" [[]byte{0xd}]
recv: "\r\n\x1b[?2004l\r"
recv: "\x1b[?2004h"
recv: "\x1b[?1049h\x1b[22;0;0t\x1b[1;16r\x1b(B\x1b[m\x1b[4l\x1b[?7h\x1b[39;49m\x1b[?1h\x1b=\x1b[?1h\x1b=\x1b[?25l"
recv: "\x1b[39;49m\x1b(B\x1b[m\x1b[H\x1b[2J"
recv: "\x1b(B\x1b[0;7m GNU nano 6.2 \x1b[44bNew Buffer \x1b[53b \x1b[1;123H\x1b(B\x1b[m\x1b[14;38H\x1b(B\x1b[0;7m[ Welcome to nano. For basic help, type Ctrl+G. ]\x1b(B\x1b[m\r\x1b[15d\x1b(B\x1b[0;7m^G\x1b(B\x1b[m Help\x1b[15;16H\x1b(B\x1b[0;7m^O\x1b(B\x1b[m Write Out \x1b(B\x1b[0;7m^W\x1b(B\x1b[m Where Is \x1b(B\x1b[0;7m^K\x1b(B\x1b[m Cut\x1b[15;61H"
You can see some text from the UI in there like “GNU nano 6.2”, and these
\x1b[27m things are escape sequences. Let’s talk about escape sequences a bit!
ANSI escape sequences
These \x1b[ things above that nano is sending the client are called “escape sequences” or “escape codes”. This is because they all start with \x1b, the “escape” character. They change the cursor’s position, make text bold or underlined, change colours, etc. Wikipedia has some history if you’re interested.
As a simple example: if you run
echo -e '\e[0;31mhi\e[0m there'
in your terminal, it’ll print out “hi there” where “hi” is in red and “there” is in black. This page has some nice examples of escape codes for colors and formatting.
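Just to show there’s nothing shell-specific about escape codes, here’s the same thing as a little Go program (my own example):

package main

import "fmt"

func main() {
    // \x1b[0;31m switches the foreground colour to red,
    // and \x1b[0m resets back to the terminal's default
    fmt.Println("\x1b[0;31mhi\x1b[0m there")
}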
I think there are a few different standards for escape codes, but my understanding is that the most common set of escape codes that people use on Unix comes from the VT100 (that old terminal in the picture at the top of the blog post), and hasn’t really changed much in the last 40 years.
Escape codes are why your terminal can get messed up if you cat a bunch of binary to
your screen – usually you’ll end up accidentally printing a bunch of random
escape codes which will mess up your terminal – there’s bound to be a 0x1b
byte in there somewhere if you cat enough binary to your terminal.
can you type in escape sequences manually?
A few sections back, we talked about how the Home key maps to \x1b[H. Those 3 bytes are Escape + [ + H (because Escape is \x1b).
And if I manually type Escape, then [, then H in the
xterm.js terminal, I end up at the beginning of the line, exactly the same as if I’d pressed Home.
I noticed that this didn’t work in fish on my computer though – if I typed
Escape and then [, it just printed out [ instead of letting me continue the
escape sequence. I asked my friend Jesse who has written a bunch of Rust
terminal code about this and Jesse told me
that a lot of programs implement a timeout for escape codes – if you don’t
press another key within some short amount of time, it’ll decide that it’s
actually not an escape code.
Apparently this is configurable in fish with fish_escape_delay_ms, so I ran
set fish_escape_delay_ms 1000 and then I was able to type in escape codes by
hand. Cool!
terminal encoding is kind of weird
I want to pause here for a minute and say that the way the keys you press get mapped to bytes is pretty weird. Like, if we were designing the way keys are encoded from scratch today, we would probably not set it up so that:
- Ctrl + a does the exact same thing as Ctrl + Shift + a
- Alt is the same as Escape
- control sequences (like colours / moving the cursor around) use the same byte as the Escape key, so that you need to rely on timing to determine whether it was a control sequence or the user just meant to press Escape
But all of this was designed in the 70s or 80s or something and then needed to stay the same forever for backwards compatibility, so that’s what we get :)
changing window size
Not everything you can do in a terminal happens via sending bytes back and forth. For example, when the terminal gets resized, we have to tell Linux that the window size has changed in a different way.
Here’s what the Go code in goterm that does this looks like:
syscall.Syscall(
syscall.SYS_IOCTL,
tty.Fd(),
syscall.TIOCSWINSZ,
uintptr(unsafe.Pointer(&resizeMessage)),
)
This is using the ioctl system call. My understanding of ioctl is that it’s
a system call for a bunch of random stuff that isn’t covered by other system
calls, generally related to IO I guess.
syscall.TIOCSWINSZ is an integer constant which tells ioctl which
particular thing we want it to do in this case (change the window size of a
terminal).
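I haven’t copied goterm’s actual definition of resizeMessage here, but it has to match the kernel’s struct winsize – a sketch (the Go field names are mine):

// mirrors the C `struct winsize` that TIOCSWINSZ expects
type winsize struct {
    Rows uint16 // terminal height, in character cells
    Cols uint16 // terminal width, in character cells
    X    uint16 // width in pixels (often just left as 0)
    Y    uint16 // height in pixels (often just left as 0)
}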
this is also how xterm works
In this post we’ve been talking about remote terminals, where the client and
the server are on different computers. But actually if you use a terminal
emulator like xterm, all of this works the exact same way, it’s just harder
to notice because the bytes aren’t being sent over a network connection.
that’s all for now!
There’s definitely a lot more to know about terminals (we could talk more about colours, or raw vs cooked mode, or unicode support, or the Linux pseudoterminal interface) but I’ll stop here because it’s 10pm, this is getting kind of long, and I think my brain cannot handle more new information about terminals today.
Thanks to Jesse Luehrs for answering a billion of my questions about terminals, all the mistakes are mine :)
Monitoring tiny web services
Hello! I’ve started to run a few more servers recently (nginx playground, mess with dns, dns lookup), so I’ve been thinking about monitoring.
It wasn’t initially totally obvious to me how to monitor these websites, so I wanted to quickly write up how I did it.
I’m not going to talk about how to monitor Big Serious Mission Critical websites at all, only tiny unimportant websites.
goal: spend approximately 0 time on operations
I want the sites to mostly work, but I also want to spend approximately 0% of my time on the ongoing operations.
I was initially very wary of running servers at all because at my last job I was on a 24/7 oncall rotation for some critical services, and in my mind “being responsible for servers” meant “get woken up at 2am to fix the servers” and “have lots of complicated dashboards”.
So for a while I only made static websites so that I wouldn’t have to think about servers.
But eventually I realized that any server I was going to write was going to be very low stakes – if it occasionally goes down for 2 hours it’s no big deal – and I could just set up some very simple monitoring to help keep it running.
not having monitoring sucks
At first I didn’t set up any monitoring for my servers at all. This had the extremely predictable outcome of – sometimes the site broke, and I didn’t find out about it until somebody told me!
step 1: an uptime checker
The first step was to set up an uptime checker. There are tons of these out there, the ones I’m using right now are updown.io and uptime robot. I like updown’s user interface and pricing structure more (it’s per request instead of a monthly fee), but uptime robot has a more generous free tier.
These
- check that the site is up
- if it goes down, it emails me
I find that email notifications are a good level for me, I’ll find out pretty quickly if the site goes down but it doesn’t wake me up or anything.
step 2: an end-to-end healthcheck
Next, let’s talk about what “check that the site is up” actually means.
At first I just made one of my healthcheck endpoints a function that returned
200 OK no matter what.
This is kind of useful – it told me that the server was on!
But unsurprisingly I ran into problems because it wasn’t checking that the API was actually working – sometimes the healthcheck succeeded even though the rest of the service had actually gotten into a bad state.
So I updated it to actually make a real API request and make sure it succeeded.
All of my services do very few things (the nginx playground has just 1 endpoint), so it’s pretty easy to set up a healthcheck that actually runs through most of the actions the service is supposed to do.
Here’s what the end-to-end healthcheck handler for the nginx playground looks like. It’s very basic: it just makes another POST request (to itself) and checks if that request succeeds or fails.
func healthHandler(w http.ResponseWriter, r *http.Request) {
    // make a request to localhost:8080 with `healthcheckJSON` as the body
    // if it works, return 200
    // if it doesn't, return 500
    client := http.Client{}
    resp, err := client.Post("http://localhost:8080/", "application/json", strings.NewReader(healthcheckJSON))
    if err != nil {
        log.Println(err)
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    // close the response body so we don't leak connections
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        log.Println(resp.StatusCode)
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    w.WriteHeader(http.StatusOK)
}
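The handler then just gets registered like any other route – a sketch (the path and port here are my assumptions, not necessarily what the nginx playground uses):

// register the healthcheck endpoint and start the server
http.HandleFunc("/healthcheck", healthHandler)
log.Fatal(http.ListenAndServe(":8080", nil))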
healthcheck frequency: hourly
Right now I’m running most of my healthchecks every hour, and some every 30 minutes.
I run them hourly because updown.io’s pricing is per healthcheck, I’m monitoring 18 different URLs, and I wanted to keep my healthcheck budget pretty minimal at $5/year.
Taking an hour to find out that one of these websites has gone down seems ok to me – if there is a problem there’s no guarantee I’ll get to fixing it all that quickly anyway.
If it were free to run them more often I’d probably run them every 5-10 minutes instead.
step 3: automatically restart if the healthcheck fails
Some of my websites are on fly.io, and fly has a pretty standard feature where I can configure an HTTP healthcheck for a service and restart the service if the healthcheck starts failing.
“Restart a lot” is a very useful strategy to paper over bugs that I haven’t
gotten around to fixing yet – for a while the nginx playground had a process
leak where nginx processes weren’t getting terminated, so the server kept
running out of RAM.
With the healthcheck, the result of this was that every day or so, this would happen:
- the server ran out of RAM
- the healthcheck started failing
- it got restarted
- everything was fine again
- repeat the whole saga again some number of hours later
Eventually I got around to actually fixing the process leak, but it was nice to have a workaround in place that could keep things running while I was procrastinating fixing the bug.
These healthchecks to decide whether to restart the service run more often: every 5 minutes or so.
this is not the best way to monitor Big Services
This is probably obvious and I said this already at the beginning, but “write one HTTP healthcheck” is not the best approach for monitoring a large complex service. But I won’t go into that because that’s not what this post is about.
it’s been working well so far!
I originally wrote this post 3 months ago in April, but I waited until now to publish it to make sure that the whole setup was working.
It’s made a pretty big difference – before I was having some very silly downtime problems, and now for the last few months the sites have been up 99.95% of the time!
Notes on running containers with bubblewrap
Hello! About a year ago I got mad about Docker container startup time. This was because I was building an nginx playground where I was starting a new “container” on every HTTP request, and so for it to feel reasonably snappy, nginx needed to start quickly.
Also, I was running this project on a pretty small cloud machine (256MB of RAM and a small CPU), so I really wanted to avoid unnecessary overhead.
I’ve been looking for a way to run containers faster since then, but I couldn’t find one until last week when I discovered bubblewrap!! It’s very fast and I think it’s super cool, but I also ran into a bunch of fun problems that I wanted to write down for my future self.
some disclaimers
- I’m not sure that the way I’m using bubblewrap in this post is how it’s intended to be used
- there are a lot of sharp edges when using bubblewrap in this way, you need to think a lot about Linux namespaces and how containers work
- bubblewrap is a security tool but I am not a security person and I am only doing this for weird tiny projects. you should definitely not take security advice from me.
Okay, all of that said, let’s talk about how I’m trying to use bubblewrap to run containers quickly and in a relatively secure way :)
Docker containers take ~300ms to start on my machine
I ran a quick benchmark to see how long a Docker container takes to run a
simple command (ls). For both Docker and Podman, it’s about 300ms.
$ time docker run --network none -it ubuntu:20.04 ls / > /dev/null
Executed in 378.42 millis
$ time podman run --network none -it ubuntu:20.04 ls / > /dev/null
Executed in 279.27 millis
Almost all of this time is overhead from docker and podman – just running ls
by itself takes about 3ms:
$ time ls / > /dev/null
Executed in 2.96 millis
I want to stress that, while I’m not sure exactly what the slowest part of Docker and podman startup time is (I spent 5 minutes trying to profile them and gave up), I’m 100% sure it’s something important.
The way we’re going to run containers faster with bubblewrap has a lot of limitations and it’s a lower level interface which is a lot trickier to use.
goal 1: containers that start quickly
I felt like it should be possible to have containers that start essentially instantly or at least in less than 5ms. My thought process:
- creating a new namespace with unshare is basically instant (see the sketch below)
- containers are basically just a bunch of namespaces
- what’s the problem?
container startup time is (usually) not that important
Most of the time when people are using containers, they’re running some long-running process inside the container like a webserver, so it doesn’t really matter if it takes 300ms to start.
So it makes sense to me that there aren’t a lot of container tools that optimize for startup time. But I still wanted to optimize for startup time :)
goal 2: run the containers as an unprivileged user
Another goal I had was to be able to run my containers as an unprivileged user instead of root.
I was surprised the first time I learned that Docker actually runs containers
as root – even though I run docker run ubuntu:20.04 as an unprivileged user (bork), that
message is actually sent to a daemon running as root, and the Docker container
process itself also runs as root (albeit a root that’s stripped of all its
capabilities).
That’s fine for Docker (they have lots of very smart people making sure that they get it right!), but if I’m going to do container stuff without using Docker (for the speed reasons mentioned above), I’d rather not do it as root to keep everything a bit more secure.
podman can run containers as a non-root user
Before we start talking about how to do weird stuff with bubblewrap, I want to quickly talk about a much more normal tool to run containers: podman!
Podman, unlike Docker, can run containers as an unprivileged user!
If I run this from my normal user:
$ podman run -it ubuntu:20.04 ls
it doesn’t secretly run as root behind the scenes! It just starts the container as my normal user, and then uses something called “user namespaces” so that inside the container I appear to be root.
The other cool thing about podman is that it has exactly the same interface as
Docker, so you can just take a Docker command and replace docker with
podman and it’ll Just Work. I’ve found that sometimes I need to do some extra
work to get podman to work in practice, but it’s still pretty nice that it has
the same command line interface.
This “run containers as a non-root user” feature is normally called “rootless containers”. (I find that name kind of counterintuitive, but that’s what people call it)
failed attempt 1: write my own tool using runc
I knew that Docker and podman use
runc (or maybe crun? I can’t keep track honestly) under the hood, so I thought –
well, maybe I can just use runc directly to make my own tool that starts
containers faster than Docker does!
I tried to do this 6 months ago and I don’t remember most of the details, but basically I spent 8 hours working on it, got frustrated because I couldn’t get anything to work, and gave up.
One specific detail I remember struggling with was setting up a working /dev
for my programs to use.
enter bubblewrap
Okay, that was a very long preamble so let’s get to the point! Last week, I
discovered a tool called bubblewrap that was basically exactly the thing I
was trying to build with runc in my failed attempt, except that it actually
works and has many more features and it’s built by people who know things about
security! Hooray!
The interface to bubblewrap is pretty different than the interface to Docker – it’s much lower level. There’s no concept of a container image – instead you map a bunch of directories on your host to directories in the container.
For example, here’s how to run a container with the same root directory as your
host operating system, but with only read access to that root directory, and only write access to /tmp.
bwrap \
--ro-bind / / \
--bind /tmp /tmp \
--proc /proc --dev /dev \
--unshare-pid \
--unshare-net \
bash
For example, you could imagine running some untrusted process under bubblewrap
this way and then putting all the files you want the process to be able to access in /tmp.
bubblewrap runs containers as an unprivileged (non-root) user
Like podman, bubblewrap runs containers as a non-root user, using user namespaces. It can also run containers as root, but in this post we’re just going to be talking about using it as an unprivileged user.
bubblewrap is fast
Let’s see how long it takes to run ls in a bubblewrap container!
$ time bwrap --ro-bind / / --proc /proc --dev /dev --unshare-pid ls /
Executed in 8.04 millis
That’s a big difference! 8ms is a lot faster than 279ms.
Of course, like we said before, the reason bubblewrap is faster is that it does a lot less. So let’s talk about some things bubblewrap doesn’t do.
some things bubblewrap doesn’t do
Here are some things that Docker/podman do that bubblewrap doesn’t do:
- set up overlayfs mounts for you, so that your changes to the filesystem don’t affect the base image
- set up networking bridges so that you can connect to a webserver inside the container
- probably a bunch more stuff that I’m not thinking of
In general, bubblewrap is a much lower level tool than something like Docker.
Also, bubblewrap seems to have pretty different goals than Docker – the README seems to say that it’s intended as a tool for sandboxing desktop software (I think it comes from flatpak).
running a container image with bubblewrap
I couldn’t find instructions for running a Docker container image with bubblewrap, so here they are. Basically I just use Docker to download the container image, unpack it into a directory, and then run it with bwrap. (There’s also a tool called bwrap-oci which looks cool, but I couldn’t get it to compile.) Here’s the recipe:
mkdir rootfs
docker export $(docker create frapsoft/fish) | tar -C rootfs -xf -
bwrap \
--bind $PWD/rootfs / \
--proc /proc --dev /dev \
--uid 0 \
--unshare-pid \
--unshare-net \
fish
One important thing to note is that this doesn’t create a temporary overlay filesystem for the container’s file writes, so it’ll let the container edit files in the image.
I wrote a post about overlay filesystems if you want to see how you could do that yourself though.
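For a rough idea of what doing that yourself involves, here’s a sketch in Go (the paths are made up, the directories have to exist already, and you’d need to run this as root or inside a mount + user namespace):

package main

import (
    "log"
    "syscall"
)

func main() {
    // layer a writable directory (upperdir) over the read-only image
    // (lowerdir); writes land in upperdir and the image stays untouched
    opts := "lowerdir=/tmp/rootfs,upperdir=/tmp/upper,workdir=/tmp/work"
    if err := syscall.Mount("overlay", "/tmp/merged", "overlay", 0, opts); err != nil {
        log.Fatal(err)
    }
    log.Println("overlay mounted at /tmp/merged")
}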
running “containers” with bubblewrap isn’t the same as with podman
I just gave an example of how to “run a container” with bubblewrap, and you might think “cool, this is just like podman but faster!”. It is not, and it’s actually unlike using podman in even more ways than I expected.
I put “container” in scare quotes because there are two ways to define “container”:
- something that implements the OCI runtime specification
- any way of running a process that’s somehow isolated from the host system
bubblewrap is a “container” tool in the second sense. It definitely provides isolation, and it does that using the same features – Linux namespaces – as Docker.
But it’s not a container tool in the first sense. And it’s a lower level tool, so you can get into a bunch of weird states and you really need to think about all the weird details of how containers work while using it.
For the rest of the post I’m going to talk about some weird things that can happen with bubblewrap that would not happen with podman/Docker.
weird thing 1: processes that don’t exist
Here’s an example of a weird situation I got into with bubblewrap that confused me for a minute:
$ bwrap --ro-bind / / --unshare-all bash
$ ps aux
... some processes
root 390073 0.0 0.0 2848 124 pts/9 S 14:28 0:00 bwrap --ro-bind / / --unshare-all --uid 0 bash
... some other processes
$ kill 390073
bash: kill: (390073) - No such process
$ ps aux | grep 390073
root 390073 0.0 0.0 2848 124 pts/9 S 14:28 0:00 bwrap --ro-bind / / --unshare-all --uid 0 bash
Here’s what happened:
- I started a bash shell inside bubblewrap
- I ran ps aux, and saw a process with PID 390073
- I tried to kill the process. It failed with the error “no such process”. What?
- I ran ps aux again, and still saw the process with PID 390073
What’s going on? Why doesn’t the process 390073 exist, even though ps says it does? Isn’t that impossible?
Well, the problem is that ps doesn’t actually list all the processes in your
current PID namespace. Instead, it iterates through all the entries in /proc
and prints those out. Usually, what’s in /proc is actually the same as the processes on your system.
But with Linux containers these things can get out of sync. What’s happening in
this example is that we have the /proc from the host PID namespace, but those
aren’t actually the processes that we have access to in our PID namespace.
Passing --proc /proc to bwrap fixes the issue – ps then actually lists the correct processes.
$ bwrap --ro-bind / / --unshare-all --dev /dev --proc /proc ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
bork 1 0.0 0.0 3644 136 ? S+ 16:21 0:00 bwrap --ro-bind / / --unshare-all --dev /dev --proc /proc ps au
bork 2 0.0 0.0 21324 1552 ? R+ 16:21 0:00 ps aux
Just 2 processes! Everything is normal!
weird thing 2: trying to listen on port 80
Passing --uid 0 to bubblewrap makes the user inside the container root. You
might think that this means that the root user has administrative privileges
inside the container, but that’s not true!
For example, let’s try to listen on port 80:
$ bwrap --ro-bind / / --unshare-all --uid 0 nc -l 80
nc: Permission denied
What’s going on here is that the new root user actually doesn’t have the capabilities it needs to listen on port 80. (you need special permissions to listen on ports less than 1024, and 80 is less than 1024)
There’s actually a capability specifically for listening on privileged ports
called CAP_NET_BIND_SERVICE.
So to fix this all we need to do is to tell bubblewrap to give our user that capability.
$ bwrap --ro-bind / / --unshare-all --uid 0 --cap-add cap_net_bind_service nc -l 80
(no output, success!!!)
This works! Hooray!
finding the right capabilities is pretty annoying
bubblewrap doesn’t give out any capabilities by default, and I find that figuring out all the right capabilities and adding them manually is kind of annoying. Basically my process is:
- run the thing
- see what fails
- read man capabilities to figure out what capabilities I’m missing
- add the capability with --cap-add
- repeat until everything is running
But that’s the price I pay for wanting things to be fast I guess :)
weird thing 2b: --dev /dev makes listening on privileged ports not work
One other strange thing is that if I take the exact same command above (which
worked!) and add --dev /dev (to set up the /dev/ directory), it causes it to not work again:
$ bwrap --ro-bind / / --dev /dev --unshare-all --uid 0 --cap-add cap_net_bind_service nc -l 80
nc: Permission denied
I think this might be a bug in bubblewrap, but I haven’t mustered the courage to dive into the bubblewrap code and start investigating yet. Or maybe there’s something obvious I’m missing!
weird thing 3: UID mappings
Another slightly weird thing was – I tried to run apt-get update inside a bubblewrap Ubuntu container and everything went very poorly.
Here’s how I ran apt-get update inside the Ubuntu container:
mkdir rootfs
docker export $(docker create ubuntu:20.04) | tar -C rootfs -xf -
bwrap \
--bind $PWD/rootfs / \
--proc /proc \
--uid 0 \
--unshare-pid \
apt-get update
And here are the error messages:
E: setgroups 65534 failed - setgroups (1: Operation not permitted)
E: setegid 65534 failed - setegid (22: Invalid argument)
E: seteuid 100 failed - seteuid (22: Invalid argument)
E: setgroups 0 failed - setgroups (1: Operation not permitted)
.... lots more similar errors
At first I thought “ok, this is a capabilities problem, I need to set
CAP_SETGID or something to give the container permission to change groups”. But I did that and it didn’t help at all!
I think what’s going on here is a problem with UID maps. What are UID maps? Well, every time you run a container using “user namespaces” (which podman is doing), it creates a mapping of UIDs inside the container to UIDs on the host.
Let’s look at the UID maps! Here’s how to do that:
root@kiwi:/# cat /proc/self/uid_map
0 1000 1
root@kiwi:/# cat /proc/self/gid_map
1000 1000 1
This is saying that user 0 in the container is mapped to user 1000 on the host, and group 1000 is mapped to group 1000. (My normal user’s UID/GID is 1000, so this makes sense.) The format of each line is: the starting ID inside the namespace, the starting ID outside it, and the length of the range being mapped. You can find out about this uid_map file in man user_namespaces.
All other users/groups that haven’t been mapped show up as user 65534 by default, according
to man user_namespaces.
what’s going on: non-mapped users can’t be used
The only users and groups that have been mapped are 0 and 1000. But man user_namespaces says:
After the uid_map and gid_map files have been written, only the mapped values may be used in system calls that change user and group IDs.
apt is trying to use users 100 and 65534. Those aren’t on the list of mapped
users! So they can’t be used!
This works fine in podman, because podman sets up its UID and GID mappings differently:
$ podman run -it ubuntu:20.04 bash
root@793d03a4d773:/# cat /proc/self/uid_map
0 1000 1
1 100000 65536
root@793d03a4d773:/# cat /proc/self/gid_map
0 1000 1
1 100000 65536
This way all the users inside the container get mapped, not just user 0.
I don’t quite know how to fix this, but I think it’s probably possible in bubblewrap to set up the uid mappings the same way as podman does – there’s an issue about it here that links to a workaround.
But this wasn’t an actual problem I was trying to solve so I didn’t dig further into it.
a quick note on Firecracker
Someone asked “would Firecracker work here?” (I wrote about Firecracker last year).
My experience with Firecracker VMs is that they use kind of a lot of RAM (like 50MB?), which makes sense because they’re VMs. And when I tried Firecracker on a tiny machine (with ~256MB of RAM / a tiny CPU), the startup times were 2-3 seconds.
I’m sure it’s possible to optimize Firecracker to be a bit faster, but at the end of the day I think it’s a VM and it’s not going to be anywhere near as low overhead as a process – there’s a whole operating system to start!
So Firecracker would add a lot more overhead than I want in this case.
bubblewrap works pretty great!
I’ve talked about a bunch of issues, but the things I’ve been trying to do in bubblewrap
have been very constrained and it’s actually been pretty simple. For example, I
was working on a git project where I really just wanted to run git inside a
container and map a git repository from the host.
That’s very simple to get to work with bubblewrap! There were basically no weird problems! It’s really fast!
So I’m pretty excited about this tool and I might use it for more stuff in the future.
sqlite-utils: a nice way to import data into SQLite for analysis
Hello! This is a quick post about a nice tool I found recently called sqlite-utils.
Recently I wanted to do some basic data analysis using data from my Shopify store. So I figured I’d query the Shopify API and import my data into SQLite, and then I could make queries to get the graphs I want.
But this seemed like a lot of boring work, like I’d have to write a
schema and write a Python program. So I hunted around for a solution, and I
found sqlite-utils, a tool designed to make it easy to import arbitrary data
into SQLite so you can analyze it.
sqlite-utils automatically generates a schema
The Shopify data has about a billion fields and I really did not want to type
out a schema for it. sqlite-utils solves this problem: if I have an array of
JSON orders, I can create a new SQLite table with that data in it like this:
import sqlite_utils
orders = ... # (some code to get the `orders` array here)
db = sqlite_utils.Database('orders.db')
db['shopify_orders'].insert_all(orders)
you can alter the schema if there are new fields (with alter)
Next, I ran into a problem where on the 5th page of downloads, the JSON contained a new field that I hadn’t seen before.
Luckily, sqlite-utils thought of that: there’s an alter flag which will
update the table’s schema to include the new fields. Here’s what the code for that looks like:
db['shopify_orders'].insert_all(orders, alter=True)
you can deduplicate existing rows (with upsert)
Next I ran into a problem where sometimes when doing a sync, I’d download data from the API where some of it was new and some wasn’t.
So I wanted to do an “upsert” where it only created new rows if the item didn’t
already exist. sqlite-utils also thought of this, and there’s an upsert
method.
For this to work you have to specify the primary key. For me that was
pk="id". Here’s what my final code looks like:
db['shopify_orders'].upsert_all(
orders,
pk="id",
alter=True
)
there’s also a command line tool
I’ve talked about using sqlite-utils as a library so far, but there’s also a
command line tool which is really useful.
For example, this inserts the data from plants.csv into a plants table in the plants.db database:
sqlite-utils insert plants.db plants plants.csv --csv
format conversions
I haven’t tried this yet, but here’s a cool example from the help docs of how you can do format conversions, like converting a string to a float:
sqlite-utils insert plants.db plants plants.csv --csv --convert '
return {
"name": row["name"].upper(),
"latitude": float(row["latitude"]),
"longitude": float(row["longitude"]),
}'
This seems really useful for CSVs, where by default it’ll often interpret numeric data as strings if you don’t do these conversions.
metabase seems nice too
Once I had all the data in SQLite, I needed a way to draw graphs with it. I wanted some dashboards, so I ended up using Metabase, an open source business intelligence tool. I found it very straightforward and it seems like a really easy way to turn SQL queries into graphs.
This whole setup (sqlite-utils + metabase + SQL) feels a lot easier to use than my previous setup, where I had a custom Flask website that used plotly and pandas to draw graphs.
that’s all!
I was really delighted by sqlite-utils, it was super easy to use and it did
everything I wanted.
Pages that didn't make it into "How DNS Works"
Hello! A couple weeks ago I released a new zine called How DNS Works.
When I started writing that zine (in, uh, January 2021), I originally had in mind a broader zine on “everything you need to know to own a domain”. So it had a bunch of pages on domain registration, TLS, and email.
At the time I thought “I can just explain DNS in like 5 pages, it’s not that complicated, there will be lots of space for other topics about domains”. I was extremely wrong about that and it turned out I needed all 28 pages to explain DNS. So I ended up deciding to just focus the zine on DNS and all those other topics didn’t make it into the final zine.
This morning it occurred to me that instead of letting all of the old draft pages languish in purgatory on my hard drive, I could post those extra pages here all together on my blog. So here they are!
disclaimer: not super cohesive
I will say (as a disclaimer) that these pages aren’t as cohesive as I usually like my zines to be and they definitely do not tell you everything you need to know to own a domain.
domain registration
[zine pages: domain registration]
email
[zine pages: email]
I should say that these 2 pages don’t really do email justice – email security is a HUGE topic that honestly I don’t know a lot about.
TLS
[zine pages: TLS]
These two pages also don’t remotely cover TLS, it’s possible I’ll write more in depth about TLS at some point. Who knows!
that’s all!
though I will say: if you liked these, you might be interested in buying How DNS Works :)