Messing With the Badguys

Scrolling through my HAProxy logs, I got a bit miffed at the amount of bot traffic I saw attempting to access nonexistent paths, obviously looking for weaknesses to exploit.

This kind of traffic is extremely naïve and obviously doesn’t hurt me other than the minuscule amount of bandwidth and CPU power it wastes, but I’d like to make my environment slightly less hospitable for them.

HAProxy has a fun toy it calls a tarpit. They had a blog post about this a while ago, and when I stumbled over it I figured why not give it a try?

I started out by looking at which paths are actually accessed on my website. With some mental arithmetic I came up with the following pipeline:

zcat -f /var/log/haproxy.log* | awk '{print $18, $19}' | sort | uniq -c | sort -n

In sequence, these commands decompress and concatenate all my HAProxy log history, then pull out the 18th and 19th whitespace-separated fields, which in my log format correspond to the HTTP method and the path it was used on. In the next step I sort the resulting lines, then have the unique lines counted, and finally I sort them numerically by count, so the most frequent requests end up at the bottom.
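As a variation on the pipeline above, here is a minimal sketch that also strips query strings, so that requests for the same path with different parameters are counted together. The function name count_paths is made up for illustration, and the field numbers assume the same log layout as above, with the quoted request line starting at field 18.

```shell
# Hypothetical helper: count requests per method and path, ignoring query strings.
# Assumes the request line starts at field 18, as in the pipeline above.
count_paths() {
    awk '{
        method = $18
        path = $19
        sub(/^"/, "", method)    # drop the opening quote of the request line
        sub(/\?.*$/, "", path)   # drop the query string, if there is one
        print method, path
    }' | sort | uniq -c | sort -n
}

# Possible usage (reads log lines on standard input):
#   zcat -f /var/log/haproxy.log* | count_paths
```

Because the function only reads standard input, it can be tried out on a handful of pasted log lines before pointing it at the real logs.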

Most of the traffic, of course, is valid, but the above sequence lets me create a list of stuff I’m not interested in seeing. I picked up a number of these obvious paths and put them, one on each line, in a file I called badendpoints.lst.
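To illustrate the format, a list like this could be created as follows. Only /wp-admin is taken from the log lines in this post. The other entries are assumed examples of paths that bots commonly probe, and the file is written to the current directory rather than to /etc/haproxy.

```shell
# Illustrative list: one path prefix per line.
# /wp-admin appears in my logs; the other two entries are assumed examples.
cat > badendpoints.lst <<'EOF'
/wp-admin
/wp-login.php
/.env
EOF
```

The real list should of course come from whatever shows up in your own logs.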

Then I opened my HAProxy configuration file. To my defaults section, I added the following line to define for how long I would punish the bots:

        timeout tarpit 30s

Update a day later

While my original solution did what I intended, I saw some improvement potential in terms of observability. Follow along to see the difference.

Old solution

Further down, in my HTTPS listener section, I added the following lines:

        acl badguys path_beg -i -f /etc/haproxy/badendpoints.lst
        http-request tarpit deny_status 403 if badguys

Here I create an Access Control List called badguys, which matches any request whose path begins with (path_beg) one of the lines in the file I created earlier, ignoring case (-i). Then I tell HAProxy to send anybody who triggers the badguys ACL to the tarpit, followed by telling them that they’re not allowed to do what they just did, by giving them an HTTP status code of 403.
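Before deploying a list like this, it can be useful to check that it doesn't accidentally catch a legitimate path. The following is only a rough shell approximation of a case-insensitive prefix match, not HAProxy's own matching code, and the function name would_tarpit is made up for illustration.

```shell
# Hypothetical helper: succeed if the path ($1) starts with any prefix
# listed in the file ($2), compared case-insensitively.
# This only approximates what "path_beg -i -f" does inside HAProxy.
would_tarpit() {
    path=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    while IFS= read -r prefix || [ -n "$prefix" ]; do
        [ -z "$prefix" ] && continue          # skip empty lines
        prefix=$(printf '%s' "$prefix" | tr '[:upper:]' '[:lower:]')
        case "$path" in
            "$prefix"*) return 0 ;;           # path begins with this prefix
        esac
    done < "$2"
    return 1
}

# Possible usage:
#   would_tarpit /blog/ /etc/haproxy/badendpoints.lst && echo "oops, too broad"
```

Pass it the path without the query string, since that is the part of the request path_beg looks at.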

New solution

Instead of immediately tarpitting misbehaving clients in the listener section, I decided to direct them to a backend section which does the exact same thing that I did previously.

Changed frontend section:

        acl badguys path_beg -i -f /etc/haproxy/badendpoints.lst
        use_backend bk_tarpit if badguys

New backend section for this purpose:

backend bk_tarpit
        http-request tarpit deny_status 403

When using the old solution, the log lines show the requested path and the expected 403 error, but there are other reasons to see 403 statuses, of course, and I can’t differentiate between them:

Feb 11 06:06:07 haproxy1 haproxy[3874]: 159.65.<redacted>:56324 [11/Feb/2024:06:05:37.924] web-https~ web-https/<NOSRV> -1/30060/-1/-1/30002 403 192 - - PT-- 4/4/0/0/0 0/0 "GET /wp-admin/admin.php?520=1 HTTP/1.1"

With the new solution, since I call the backend bk_tarpit, I can search specifically for that term in my logs:

Feb 11 11:59:28 haproxy1 haproxy[10929]: 2001:9b1:<redacted>:60964 [11/Feb/2024:11:58:58.117] web-https~ bk_tarpit/<NOSRV> -1/30010/-1/-1/30001 403 192 - - PT-- 1/1/0/0/3 0/0 "GET HTTP/2.0"
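Building on that, here is a small sketch that counts tarpitted requests per client address. It assumes the same log layout as the lines above, with the client address and port in field 6 and the backend/server pair in field 9. The function name tarpit_hits is made up for illustration.

```shell
# Hypothetical helper: count tarpitted requests per client address.
# Assumes field 6 is "address:port" and field 9 is "backend/server".
tarpit_hits() {
    awk '$9 ~ /^bk_tarpit\// {
        client = $6
        sub(/:[0-9]+$/, "", client)   # strip the trailing source port
        print client
    }' | sort | uniq -c | sort -n
}

# Possible usage (reads log lines on standard input):
#   zcat -f /var/log/haproxy.log* | tarpit_hits
```

Stripping only the last colon-separated number keeps IPv6 addresses intact.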

Original post continues

I always test my HAProxy configuration for syntax errors before I activate it:

sudo haproxy -c -f /etc/haproxy/haproxy.cfg

If everything is OK, I reload - not restart - HAProxy. This way the new rules take hold for any new session without breaking existing ones:

sudo systemctl reload haproxy.service
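The two steps can be chained so that a reload only happens when the check passes. This is a minimal sketch for a POSIX-style shell such as sh or bash. The function name safe_reload and the two override variables are made up for illustration, and they default to the exact commands shown above.

```shell
# Hypothetical wrapper: reload HAProxy only if the configuration check passes.
# Both commands can be overridden through the environment, e.g. for a dry run.
HAPROXY_CHECK=${HAPROXY_CHECK:-"sudo haproxy -c -f /etc/haproxy/haproxy.cfg"}
HAPROXY_RELOAD=${HAPROXY_RELOAD:-"sudo systemctl reload haproxy.service"}

safe_reload() {
    # The variables are left unquoted on purpose so they split into
    # a command and its arguments.
    if $HAPROXY_CHECK; then
        $HAPROXY_RELOAD
    else
        echo "configuration check failed, not reloading" >&2
        return 1
    fi
}
```

Calling safe_reload after editing the configuration then either reloads HAProxy or leaves the running instance untouched.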

Then let’s confirm the new rules do what we want them to:

time curl -s > /dev/null 
curl -s > /dev/null  0.01s user 0.01s system 53% cpu 0.039 total

➜ time curl -s > /dev/null
curl -s > /dev/null  0.02s user 0.01s system 0% cpu 30.052 total

As expected, a call to my website responds immediately, while a call to a forbidden path is dropped in the tarpit for 30 seconds.

My private site sees very little traffic, so I’m not particularly worried about negative effects from this config change. It should be noted, though, that tarpitting evil bots keeps a connection open for however long you decide to hold them hostage, so there is a real risk of scoring an own goal by getting DoS’d.