Messing With the Badguys
Scrolling through my HAProxy logs, I got a bit miffed at the amount of bot traffic I saw attempting to access nonexistent paths, obviously looking for weaknesses to exploit.
This kind of traffic is extremely naïve and doesn't really hurt me beyond the minuscule amount of bandwidth and CPU power it wastes, but I'd like to make my environment slightly less hospitable to these bots.
HAProxy has a fun toy it calls a tarpit. The HAProxy folks wrote a blog post about it a while ago, and when I stumbled across it I figured: why not give it a try?
I started out by looking at what paths are actually accessed on my website. With a bit of mental arithmetic I came up with the following pipeline:
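Something like this, assuming HAProxy logs to the systemd journal; the exact column numbers depend on your log format:

```sh
# dump the full HAProxy log history, extract method and path,
# then count and rank the unique combinations
journalctl -u haproxy --no-pager \
  | awk '{print $18, $19}' \
  | sort \
  | uniq -c \
  | sort -n
```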
In sequence, these commands list my entire HAProxy log history, then pull out the 18th and 19th columns, which usually correspond to the HTTP method used and the path it was used on. Next I sort the resulting lines, count the unique ones, and finally sort them by count.
Most of the traffic, of course, is valid, but the above sequence lets me create a list of stuff I'm not interested in seeing. I picked up a number of these obvious paths and put them, one on each line, in a file I called `badendpoints.lst`.
Then I opened my HAProxy configuration file. To my `defaults` section, I added the following line to define for how long I would punish the bots:
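That's a single directive; I went with 30 seconds:

```
# hold tarpitted clients for 30 seconds before responding
timeout tarpit 30s
```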
Update, a day later
While my original solution did what I intended, I saw some room for improvement in terms of observability. Follow along to see the difference.
Old solution
Further down, in my HTTPS listener section, I added the following lines:
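Roughly the following, assuming the list file lives under /etc/haproxy; whether to match exact paths (path) or prefixes (path_beg) is a matter of taste:

```
# flag any request whose path matches a line from the list, case-insensitively
acl badguys path_beg -i -f /etc/haproxy/badendpoints.lst
# tarpit flagged clients, then deny them with a 403
http-request tarpit deny_status 403 if badguys
```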
Here I create an Access Control List called `badguys`, and add to this ACL the case-insensitive (`-i`) versions of any line in the file I created earlier. Then I tell HAProxy to send anybody who triggers the `badguys` ACL to the tarpit, followed by telling them that they're not allowed to do what they just did, by giving them an HTTP status code of 403.
New solution
Instead of immediately tarpitting misbehaving clients in the listener section, I decided to direct them to a backend section which does the exact same thing that I did previously.
Changed frontend section:
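A sketch, keeping the same ACL but routing matches to a dedicated backend instead of tarpitting on the spot:

```
acl badguys path_beg -i -f /etc/haproxy/badendpoints.lst
# hand misbehaving clients over to the tarpit backend
use_backend bk_tarpit if badguys
```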
New backend section for this purpose:
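A minimal version, mirroring the old behavior:

```
backend bk_tarpit
    # the tarpit duration is inherited from "timeout tarpit" in defaults
    http-request tarpit deny_status 403
```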
When using the old solution, the log lines show the requested path and the expected 403 error, but there are other reasons to see 403 statuses, of course, and I can't differentiate between them.
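The closest filter I have is the status code itself, which of course also matches every legitimate 403 (journal assumption as before):

```sh
# matches all 403 responses, tarpitted or not
journalctl -u haproxy --no-pager | grep ' 403 '
```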
With the new solution, since I call the backend `bk_tarpit`, I can search specifically for that term in my logs:
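For instance, again assuming the journal:

```sh
# only tarpitted requests mention the backend name
journalctl -u haproxy --no-pager | grep bk_tarpit
```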
Original post continues
I always test my HAProxy configuration for syntax errors before I activate it:
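Assuming the standard config location:

```sh
# -c validates the configuration without starting the proxy
haproxy -c -f /etc/haproxy/haproxy.cfg
```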
If everything is OK, I reload - not restart - HAProxy. This way the new rules take hold for any new session without breaking existing ones:
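On a systemd-based machine, that's something like:

```sh
# reload spawns new workers; existing connections are allowed to finish
systemctl reload haproxy
```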
Then let’s confirm the new rules do what we want them to:
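Something like the following, with example.org standing in for my real domain and /.env as a stand-in for one of the listed paths:

```sh
# a legitimate path should answer immediately
time curl -s -o /dev/null -w '%{http_code}\n' https://example.org/

# a listed path should hang for ~30 seconds, then return 403
time curl -s -o /dev/null -w '%{http_code}\n' https://example.org/.env
```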
As expected, a call to my website responds immediately, while a call to a forbidden path is dropped in the tarpit for 30 seconds.
My private site sees very little traffic, so I'm not particularly worried about negative effects from this config change. It should be noted, though, that tarpitting evil bots keeps an HTTP session open for however long you decide to hold them hostage, so this is a potential own-goal in the form of getting DoS'd.