Open · opened 3 months ago by nervuri · 23 comments
Molly Brown has no notion of timeouts. If a client opens a connection and sends nothing through it, the connection is kept open indefinitely. This could make DoS attacks easier.
Please add configurable timeouts. See, for example, nginx's client_header_timeout and send_timeout, which both default to 60 seconds (although I would not set the default to more than 30).
As for the implementation, see Go's Conn.SetReadDeadline and Conn.SetWriteDeadline: https://pkg.go.dev/net#Conn You would call these right after listener.Accept().
Thanks for pointing this out, I'll try to address it soon.
Be careful about setting deadlines. As the docs mention, a deadline is an absolute time after which I/O operations fail instead of blocking; an idle timeout has to be implemented by repeatedly extending the deadline after successful Read or Write calls. If you just set the deadline one time, something like a non-malicious long download that takes over 20 secs (using the deadline from the example) would fail.
Thanks for the followup, @makeworld!
It seems that one half of this problem is fairly straightforward. Setting only a read deadline immediately after listener.Accept() and then clearing it after readRequest() has returned should stop attackers exhausting socket descriptors by simply connecting and leaving the connection idle (like in @nervuri's examples above) or by doing a "Slow Loris" glacial request, without interfering with long, slow, legitimate downloads.
An attacker can get past that by connecting and sending a request quickly, but then consuming the response very slowly. That seems a little trickier to solve. We send files with io.Copy(), which doesn't provide any way to hook in an extension of the deadline after each successful Write(). I guess we could set a single write deadline before calling io.Copy() with a duration calculated as a linear function of filesize, but that feels kind of ugly and still leaves servers vulnerable if they serve large files.
I guess we could replace our single call to io.Copy() with a loop over calls to io.CopyN(), refreshing the deadline each time. I have no idea, though, what a sensible value of n would be. Too low and we probably take a performance hit (though maybe not one worth caring about). Too high and we are likely to get false positives when people with slow connections try to download large files.
What should Molly even do in response to a write timeout? If the response header has already been sent, there is no longer any way to convey an error to the client. Does the standard library provide a way to close the connection without sending a TLS close notify? That's our only option for unambiguously failing half-way through a download.
@makeworld: Thanks for catching that! Another problematic case is that of CGI scripts keeping a connection open for streaming output - the deadline needs to be updated after each write.
@solderpunk: You can close the connection without sending close_notify by calling Close() on the underlying TCP connection instead of the TLS connection:
This won't tell the user much (if anything), but it seems to be the only solution we have.
To correct my initial suggestion, the read deadline should be set right before reading and the write deadline right before writing (not right after listener.Accept(), as I previously wrote).
Reading is easy, as you said; a single call to SetReadDeadline will suffice (you don't even need to clear the deadline after the request is read).
As for writing a large file, the two solutions you described both rely on defining a minimum transfer rate. Anything under that we would consider dubiously slow, resulting in a dropped connection. So, the timeout for io.Copy() would be ceiling(fileSize / minTransferRate), whereas the chunk size for io.CopyN() would be writeTimeout * minTransferRate (we could use 1 instead of a custom writeTimeout, but the download is faster if we fetch, for instance, 30 KB at a time instead of 1 KB).
So, here's an example using CopyN:
However, this is quite slow for low values of minTransferRate and writeTimeout. Ideally we'd want to only extend the write deadline once it's hit, because that would allow us to set a larger chunkSize, to speed the thing up. It might even allow us to just call Copy() repeatedly until the download is done (the file reader should pick up where it left off, I think). But only the read deadline can be extended after it passes, not the write deadline - not sure if it's a bug or if I'm doing something wrong.
Also, it might be a good idea to extend the deadline only a limited number of times, until an absolute timeout is reached - just as a sanity check.
It turns out that the write deadline can be extended after it passes, but only for plain TCP connections, not TLS. Quoting from https://pkg.go.dev/crypto/tls#Conn.SetWriteDeadline : "After a Write has timed out, the TLS state is corrupt and all future writes will return the same error."
I don't understand why it has to be this way, but, in any case, there goes the most elegant way of implementing write timeouts.
Note that the Go HTTP library's WriteTimeout option simply sets a write deadline. If a large file isn't sent within the timeout, the connection closes. It's up to the server admin to choose an appropriate value. A simple implementation like this would actually be okay, in my opinion. To be cautious, you could disable the write timeout by default or set a high default value.
Well, that's annoying!
But thanks for looking into this so closely. I guess I will try the approach of, by default, setting a single write deadline with a time derived from the filesize and a slow download rate of 1KB/s (maybe with a lower bound on the timeout to reduce the risk of false positives under pathological conditions), with an option to disable it entirely. I'll try to implement that, plus the read timeout, this weekend.
I am also currently testing a leaky bucket rate limiting implementation, which immediately sends status code 44 and closes the connection without even reading the request once a particular IP address has overflowed its bucket. I was actually motivated to write this by aggressive crawlers, but it now occurs to me it will also help contribute to defence against DoS attacks.
1 KB/s is what I had on dial-up around the year 2000. I remember spending about an hour to download a 3-4 MB MP3 file. Since Gemini is so retro-friendly and generally serves small files, perhaps the minimum transfer rate should be lower... or you could increase it based on filesize, if you want to get fancy.
As for good connections, be sure to check the performance difference when serving large files with and without this type of timeout. It may be too much to tolerate.
Indeed, even testing locally I've seen plenty of false positives when using a 1 second timeout (and a high minTransferRate, to be fair).
So Tor will tend to be automatically blocked by Molly? If so, please don't make this a default.
I seem to remember that dialup circa 2000 was a little faster than 1KB/s, at least on most nights. Maybe I was slightly downhill from the nearest phone exchange. :p But sure, let's halve it.
Hmm? If we aren't chunking the file into multiple writes with CopyN(), why would there be a performance difference at all?
The rate limiting is off by default and requires RateLimitEnable = true in the config to do anything, so nothing will be blocked by default. If you enable it, you can configure the allowed average rate of requests per second and the burst limit. I presume it will be possible to find settings which are not too Tor-unfriendly while still providing some pushback against broken or inconsiderate bots. Is the issue with Tor just that the exit nodes look like a single hyperactive client?
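In the config that might look something like this (only RateLimitEnable is named above; the other key names here are guesses for illustration, so check the documentation for the real ones):

```
RateLimitEnable = true
# Hypothetical key names, for illustration only:
RateLimitAverage = 2    # allowed average requests per second
RateLimitBurst = 10     # how many requests may arrive in a burst
```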
I wished. But, yes, I hear it was better for others.
Oh, I thought you were going to use CopyN(). My bad, I misread.
Probably, as long as you don't receive a lot of Tor traffic. Otherwise, one solution would be to apply different rules to a specific set of IP addresses (either the nasty bots or the exit nodes), which can be done via the firewall (iptables, nftables, etc).
That's the main issue. Also, if you host an onion service, all of its traffic comes from the local Tor daemon (127.0.0.1).
No, actually, my bad, I think I misunderstood your earlier message, about the inability to extend a write deadline on a TLS connection after it expires. I missed the "after expiry" detail. I got it into my head that you could basically only ever set a single write deadline per TLS connection. In which case the only thing you can do is set a deadline far enough into the future to allow for the whole response to be sent over a non-suspiciously-slow connection and then start shovelling bits as fast as possible - so why not just stick with a single io.Copy()?
I realise now (I think!) that it is allowed to extend a deadline before it expires. In which case you can do a loop over io.CopyN(), extending the deadline after each successfully transferred chunk to allow the next chunk to be downloaded over a reasonable connection, and you abort the whole thing when the first chunk fails. This means that suspiciously-slow downloads of large files time out a lot quicker than in the above scenario, which is good from the point of view of preventing socket exhaustion attacks.
But now I'm confused: if the above is still possible, what was the "most elegant way of implementing write timeouts" you had in mind which is no longer possible when you can't extend deadlines after expiry? Your example code looping over calls to io.CopyN() does not actually rely on extending expired deadlines, does it?
Okay, first baby step, a hard-coded 30 second timeout on reading requests is in there and I have confirmed that time openssl s_client -connect hostname:1965 no longer stays open indefinitely.
Never mind, I have now actually read your earlier comments in their entirety and see where you explicitly explained why you wanted to wait for the deadline to expire. Sorry for my inattentiveness.
No problem. I wrote a reply before I saw this message, so I'm posting it anyway, for the sake of clarity:
Yep, like in my example above.
The downside is that it can seriously degrade download speed if the chunk size is too small. It's important to test this (over a network connection, not just locally).
It's something like this:
This code works only for non-TLS connections. Doing it this way has no performance penalty, because the chunk size adjusts dynamically to the transfer rate. On a good connection, even large files will probably get downloaded without hitting the deadline more than a few times. The CopyN() method with a chunk size of 512 bytes, however, will loop around 2000 times for each MB, no matter how good your connection is.
So if the performance penalty is too much, then the simplest solution of setting a single write deadline might be the best option (maybe that's why Go's HTTP library does it this way).
I appreciate the clear explanation! Not that the cause of my earlier confusion was lack of clarity on your part.
I've not committed it yet, but I have a local implementation now, which I'm currently testing, that just sets a single deadline of 30 seconds or the time taken to download the file at 512 bytes per second, whichever is longer, and then uses io.Copy() as previously: no performance penalty on good connections, but timeouts on large files take a long time, which means if you are serving a multi-megabyte file, and an attacker finds its URL, they might still be able to mount a loris attack. But at least this raises the bar.
I'm starting to wonder how much utility this approach really provides. Big caveat: I'm testing locally so far and maybe that makes a difference to how buffering happens in the TCP/IP stack. What I'm seeing is that sufficiently small files get sent by Molly in a single Write() and so the deadline simply doesn't matter. The client can consume the response arbitrarily slowly without any consequence because it's just talking to the local OS buffer and Molly has already moved on. Files need to be in the ballpark of 5 megabytes to get beyond that point (much bigger than I expected! Maybe that point would come sooner over a real network). For files that large, allowing slow downloads at 512 bytes per second means the deadline is several hours long, which is more than enough time for an attacker to open a great many such connections, even if they do it slowly to avoid triggering the rate limiting an admin might also turn on.
Interesting. After conn.Close(), the server keeps sending data that was written to the buffer (unless lingering is disabled via SetLinger(0)). So I guess you can either SetLinger(0) or set a smaller write buffer. Why doesn't Go do this by default? Beats me.
Note that even with SetWriteBuffer(0), the write buffer doesn't go all the way to zero; its minimum size depends on the client's TCP window size. If you SetWriteBuffer(0) and the write deadline expires, what's left in the buffer will have to be read in no more than 10 minutes or so, otherwise the server will drop the connection. Take this with a grain of salt though, I deduced it by testing locally. Maybe I'll elaborate later.
It raises the bar, as you said. It's also simple to implement, has no performance penalty and is a fine default - Molly Brown users will be able to serve arbitrarily large files to people with slow connections without changing the settings, while also having a basic level of protection against socket descriptor exhaustion. If they get DOSed, they can lower the timeouts and/or raise the minTransferRate.
Now I'll expand upon what I wrote about in the previous post. The tl;dr is this:
The write deadline is not a connection deadline, because calling conn.Close() does not close the connection immediately. The server keeps it open until its write buffer is read by the client. If the client reads slowly, that can take a very long time, possibly more than two days. To prevent that, you can make conn.Close() instantly close the connection by calling SetLinger(0).
Now onto the details. Feel free to skip them, I just need an outlet for the stuff I've just learned. :) You wrote:
Based on my testing, that's not quite right. There's a server write buffer and a client read buffer:
In my case the server's write buffer is ~2.5 MB (for you it's ~5 MB, I guess you have more RAM) and the client's read buffer is around 80 KB by default. Let's say the client requests a 20 MB file and reads it at 256 bytes / second (an artificial limit in the client's code, nothing wrong with the network). The server instantly copies 2.5 MB to its write buffer. As it sends data to the client, it replenishes the write buffer with data from the file. After a while, the deadline is hit and the server calls conn.Close(). However, it actually keeps the connection open until the data remaining in the buffer is sent to the client.
The client requests data from the server in chunks. After sending a chunk, the server waits ~5 minutes for the client to request the next one. The chunk size (the TCP window size) is between 0.5 and 1 multiplied by the client's read buffer size (SO_RCVBUF). In my tests the read buffer size is ~80 KB by default, but I'm able to make it as low as 1152 bytes.
The server sets the size of its write buffer for a given connection based on the size of the client's read buffer. The relationship is nonlinear. Here are some values I found:
Reading 1152 bytes every 5 minutes until the server's 800 KB buffer is empty would take more than 2 days, so the supposed timeout we set via SetWriteDeadline() can be bypassed by a client using a tiny read buffer and reading very slowly.
However, if you SetWriteBuffer(0), the server will use its minimum write buffer size, which, as far as I can tell, is never more than the client's read buffer size. Since the server will only wait ~5 min for the client to request the next chunk, the connection will usually be closed in no more than ~5 minutes, or ~10 minutes in the worst case (actually ~6 or ~11 minutes, because after sending the data, the server waits another minute for the client to close the connection gracefully).
Then again, reducing the write buffer so drastically may hurt performance. A better option (in this case) is to SetLinger(0). This makes it so the connection is, in fact, closed when calling conn.Close(), regardless of whether there's anything left in the write buffer. This would not work well if you were looping over CopyN(), but for the way you're implementing the write timeout, it should be perfect.
Again, take all of this with a grain of salt, as it's pretty much based on testing I did locally. (I didn't expect to get so deep in the weeds!)
Addendum: I ran a quick test on a remote RAM-constrained server and found the buffer sizes to be much smaller: ~180 KB by default and ~14 KB for a read buffer of 1152 B. Still enough to keep the connection open for an extra hour or so, though.
Thanks yet again for both taking the time to wander so deeply into these weeds and also for documenting your findings! Sorry that I let this sit for a while.
I'm going to add the SetLinger(0) to the current implementation and will experiment with this on a remote server with considerably less RAM than my laptop.
Quick update, I have had the current implementation deployed at the Zaibatsu for ~2 weeks now. The only instances of time outs in the log are from my own tests, so that gives some confidence that the rate of false positives is low.
I have written a quick and dirty Python test script which doesn't do anything to explicitly control the read buffer, just calls recv(256) on a socket followed by time.sleep(1) in a loop, reading at half of the minimum speed used to calculate the timeout deadline. When downloading a ~1 MB image file, the script dies due to a closed socket in ~30 minutes. That's much quicker than anything I was seeing previously; the combination of SetLinger(0) and testing on a server with much less RAM than my laptop seems to have made a big difference.
With SetLinger(0), it makes no difference how much RAM the server has. The amount of RAM affects the default size of the write buffer, which is irrelevant, since the server no longer waits for the write buffer to be emptied before closing the connection.
If you ever want to play with the read buffer size, here's how you would set it to 1152 bytes in Python:
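Something along these lines (the kernel may round the value up; on Linux, getsockopt reports double what you set):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Must be set before connect(), so the small receive window is
# advertised during the TCP handshake.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1152)
# Check what the kernel actually granted:
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
# s.connect(("example.com", 1965))  # then connect and read slowly
```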
That's the lowest I could get it on my system. When I used lower values, Wireshark still reported 1152.
OK, so it looks like all that's left to do is make the timeouts and the minTransferRate configurable. I suppose these values will also apply to CGI output later on (#37).
Yep, exposing all this to config and documenting it will be my next step!
tlsConn.NetConn() is, somehow, quite a recent addition to the standard library, appearing only in Go 1.18. Time for more annoying conditional compilation hoop jumping. I am really committed to Molly Brown being able to be compiled with the version of Go shipped by the latest Debian stable (currently 1.15).
Go 1.19 is available in Debian stable via backports. If that doesn't meet your requirements, consider that Debian 12 will be released on June 10.