iprep - an IP reputation engine

Detects sources of brute force attempts using statistical methods, and bans the originating IP addresses.

The program works by analyzing HTTP server logs and looking for clients where the following is true:

  • total rate of requests is above a certain threshold (you can have separate short-term and long-term thresholds),
  • aggregate cardinality of request URLs is very low (1 or 2).

This follows the idea that brute force attacks differ from normal traffic because they send a large amount of traffic to the same, very limited, set of URLs, without interacting with the site in any other way.

The cardinality estimate uses HyperLogLog++ (a probabilistic data structure), so URL cardinality can be tracked with constant memory usage per client. As a consequence, the resource utilization of iprep grows only with the number of clients, not with their traffic.
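The two criteria above can be illustrated with a minimal sketch. This is not the iprep implementation: the names `clientStats`, `observe` and `suspicious` are hypothetical, the thresholds are arbitrary, and an exact set of URLs stands in for HyperLogLog++ to keep the example dependency-free (so, unlike the real estimator, its memory grows with the number of distinct URLs).

```go
package main

import "fmt"

// clientStats is a hypothetical per-client record. An exact set stands in
// for the HyperLogLog++ estimator, so memory here grows with distinct URLs.
type clientStats struct {
	requests int                 // requests seen in the observation window
	urls     map[string]struct{} // distinct URLs requested
}

func newClientStats() *clientStats {
	return &clientStats{urls: make(map[string]struct{})}
}

// observe records one request for the given URL.
func (c *clientStats) observe(url string) {
	c.requests++
	c.urls[url] = struct{}{}
}

// suspicious reports whether the client exceeds the rate threshold
// (requests per second over the window) while touching at most maxURLs
// distinct URLs.
func (c *clientStats) suspicious(windowSecs, rateThreshold float64, maxURLs int) bool {
	rate := float64(c.requests) / windowSecs
	return rate > rateThreshold && len(c.urls) <= maxURLs
}

func main() {
	// 600 requests in 60 seconds, all to the same URL.
	bot := newClientStats()
	for i := 0; i < 600; i++ {
		bot.observe("/wp-login.php")
	}
	// Same request rate, spread over 50 different URLs.
	visitor := newClientStats()
	for i := 0; i < 600; i++ {
		visitor.observe(fmt.Sprintf("/page/%d", i%50))
	}
	fmt.Println(bot.suspicious(60, 5, 2))     // true
	fmt.Println(visitor.suspicious(60, 5, 2)) // false
}
```

Both clients send the same volume of traffic; only the one with very low URL cardinality is flagged.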

Building

Once you have the source code at the right place in your GOPATH, simply build the iprep executable with:

go build ./cmd/iprep

This will leave you with an iprep executable in the current directory.

Running

While the cardinality-based approach is good enough by itself to detect arbitrary brute-force attempts, iprep doesn't entirely trust it. Instead, it lets you define a set of rules that a request must match in order to be considered. Rules can match any field of the request, though the url field is going to be the most common one. A rule file contains a JSON-encoded list of rules, and every rule can consist of multiple matches (which are logically ANDed). Consider, for instance, the following simple rule file:

[
  {
    "matches": [{
        "field": "url",
        "match": "/wp-login\\.php$"
    }]
  }
]

In this case, iprep will still track all requests from all clients for cardinality estimates, but it will only check the request rate for WordPress logins. A client making lots of requests to, say, /poll.php will not be blocked.
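As a rough illustration of how such a rule file could be loaded and evaluated, here is a minimal sketch. The JSON keys (`matches`, `field`, `match`) come from the example above; the Go type and function names are hypothetical, the request is modeled as a plain map, and the sketch assumes that `match` is a regular expression, as the example pattern suggests.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// Match and Rule mirror the JSON keys of the rule file. The Go names are
// hypothetical and not taken from the iprep source.
type Match struct {
	Field string `json:"field"`
	Match string `json:"match"`
}

type Rule struct {
	Matches []Match `json:"matches"`
}

// parseRules decodes a JSON-encoded list of rules.
func parseRules(data []byte) ([]Rule, error) {
	var rules []Rule
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, err
	}
	return rules, nil
}

// matches reports whether every match in the rule applies to the request
// (logical AND). The request is modeled as a field-name to value map.
func (r Rule) matches(req map[string]string) (bool, error) {
	for _, m := range r.Matches {
		ok, err := regexp.MatchString(m.Match, req[m.Field])
		if err != nil {
			return false, err
		}
		if !ok {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	data := []byte(`[{"matches": [{"field": "url", "match": "/wp-login\\.php$"}]}]`)
	rules, err := parseRules(data)
	if err != nil {
		panic(err)
	}
	for _, url := range []string{"/blog/wp-login.php", "/poll.php"} {
		ok, err := rules[0].matches(map[string]string{"url": url})
		if err != nil {
			panic(err)
		}
		fmt.Println(url, ok)
	}
}
```

A real implementation would likely compile each pattern once when the rule file is loaded rather than on every request.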

Logs can be analyzed in real-time (usually, through a pipe) or in batch by passing a log file. Either way, iprep expects its input on stdin. It knows about only two log formats at the moment, common and combined, matching the equivalent standard Apache log formats.
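For reference, the sketch below shows one way to extract the relevant fields from a line in either format. It is not iprep's parser: the `logLine` type, `lineRx` and `parseLine` are hypothetical names, and the regular expression is a simplification that does not handle escaped quotes or malformed request lines.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// logLine is a hypothetical record holding the fields of interest.
type logLine struct {
	IP     string
	Time   time.Time
	Method string
	URL    string
	Status string
}

// lineRx matches the Apache common format. The optional trailing group
// accepts the referer and user-agent fields added by the combined format.
var lineRx = regexp.MustCompile(
	`^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+(?: "[^"]*" "[^"]*")?$`)

// parseLine extracts client IP, timestamp, method, URL and status code.
func parseLine(line string) (*logLine, error) {
	m := lineRx.FindStringSubmatch(line)
	if m == nil {
		return nil, fmt.Errorf("unparseable log line: %q", line)
	}
	t, err := time.Parse("02/Jan/2006:15:04:05 -0700", m[2])
	if err != nil {
		return nil, err
	}
	return &logLine{IP: m[1], Time: t, Method: m[3], URL: m[4], Status: m[5]}, nil
}

func main() {
	line := `203.0.113.7 - - [10/Oct/2000:13:55:36 -0700] "POST /wp-login.php HTTP/1.1" 200 2326 "-" "Mozilla/5.0"`
	l, err := parseLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s %s %s\n", l.IP, l.Method, l.URL, l.Status)
}
```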

You can define the action to be performed when an IP should be blocked according to iprep with the --action command-line option. A literal %s token in that option will be replaced by the IP address in question, and the result will be sent to a shell for execution.
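The substitution described above can be sketched as follows. The function names are hypothetical, replacing every `%s` occurrence (rather than only the first) is an assumption, and so is the choice of `/bin/sh -c` as the shell; the source only states that the result is sent to a shell.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// buildAction substitutes each literal %s token in the action template
// with the IP address. Replacing all occurrences is an assumption.
func buildAction(template, ip string) string {
	return strings.ReplaceAll(template, "%s", ip)
}

// runAction hands the resulting command line to a shell and returns its
// combined output. Using /bin/sh is an assumption of this sketch.
func runAction(template, ip string) ([]byte, error) {
	return exec.Command("/bin/sh", "-c", buildAction(template, ip)).CombinedOutput()
}

func main() {
	// Print the command that would be executed for a documentation-range IP.
	fmt.Println(buildAction("echo would ban %s", "203.0.113.7"))
}
```

Because the command is interpreted by a shell, the action string can use pipes and redirections, but it should only ever receive IP addresses that were successfully parsed from the log.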

Resource requirements and miscellaneous performance notes

The iprep daemon needs to store aggregate statistical information for every IP it has seen. While lightweight, this does take a certain amount of space:

  • 24 bytes per RateEstimator
  • 2 estimators and 1 timestamp (2 × 24 + 8 = 56 bytes in total), plus 1 HyperLogLogPlus (roughly estimated at on the order of 100 bytes), per IP

That adds up to roughly 156 bytes per IP, so if you talk with, say, 1M peers in a day, you are going to need at least 150 MB of memory to store their data.

The code is still GC-heavy (lots of Log objects) as it has not been optimized from that perspective yet.