
iprep

Design for an "IP reputation" database, meant to detect harmful actors (including distributed ones).

First, an architectural observation: we most likely want to collect data as events originating from multiple sources, which pretty much dictates an RPC submission model, with probes kept separate from the central event database. Furthermore, most operations will amount to per-IP aggregations over the (windowed) event logs.

The system must also perform well under overload (DoS) scenarios, which implies at the very least that traffic from probes should not scale linearly with the number of events: some time-based aggregation must happen on the client side of the data submission protocol. This can be achieved by implementing a "minimum processing delay" and a scheduled report thread (or, more simply, via cron).
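As a minimal sketch of client-side aggregation (not the project's actual probe code), the Python class below counts events locally and submits one batch per flush, so traffic depends on the number of distinct (IP, event type) pairs per interval rather than on the raw event rate. The names `EventAggregator`, `record`, `flush` and the `submit` callable are hypothetical, as is the batch layout.

```python
import threading
from collections import Counter


class EventAggregator:
    """Buffers events in memory and reports them as aggregated counts."""

    def __init__(self, submit):
        # submit: hypothetical callable that sends one batch to the server.
        self._submit = submit
        self._counts = Counter()
        self._lock = threading.Lock()

    def record(self, ip, event_type):
        # Called once per detected event; only bumps a local counter.
        with self._lock:
            self._counts[(ip, event_type)] += 1

    def flush(self):
        # Meant to run on a fixed schedule (timer thread or cron job).
        # Swaps the counter out under the lock, then submits outside it.
        with self._lock:
            batch, self._counts = self._counts, Counter()
        if batch:
            self._submit([
                {"ip": ip, "type": event_type, "count": n}
                for (ip, event_type), n in sorted(batch.items())
            ])
        return len(batch)


# Usage: 1001 events collapse into a single submission with two entries.
sent = []
agg = EventAggregator(sent.append)
for _ in range(1000):
    agg.record("192.0.2.1", "auth_failure")
agg.record("192.0.2.7", "spam")
agg.flush()
```

The flush interval plays the role of the "minimum processing delay": a longer interval means fewer submissions, at the cost of the server seeing events later.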

Database

The event database is a time-based append-only log. It supports two operations, append and scan, plus an internal delete-older-than that periodically removes entries too old to be relevant.
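The three operations can be illustrated with a small in-memory stand-in, assuming events are (timestamp, IP, event type, count) records kept sorted by time. This is only a sketch of the access pattern: the class and method names are hypothetical, and a real store would be persistent.

```python
import bisect


class EventLog:
    """In-memory time-ordered log: append, scan, delete_older_than."""

    def __init__(self):
        self._times = []   # sorted timestamps
        self._events = []  # parallel list of (ip, event_type, count)

    def append(self, ts, ip, event_type, count=1):
        # Insert by timestamp so that slightly out-of-order reports
        # from different probes still leave the log sorted.
        i = bisect.bisect_right(self._times, ts)
        self._times.insert(i, ts)
        self._events.insert(i, (ip, event_type, count))

    def scan(self, ip, since):
        # Return all events for `ip` with timestamp >= since.
        i = bisect.bisect_left(self._times, since)
        return [e for e in self._events[i:] if e[0] == ip]

    def delete_older_than(self, cutoff):
        # Drop every event with timestamp < cutoff; return how many.
        i = bisect.bisect_left(self._times, cutoff)
        del self._times[:i]
        del self._events[:i]
        return i


log = EventLog()
log.append(100, "192.0.2.1", "auth_failure", 5)
log.append(200, "192.0.2.1", "spam", 2)
log.append(150, "198.51.100.9", "spam", 1)
recent = log.scan("192.0.2.1", since=150)
removed = log.delete_older_than(150)
```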

Querying the reputation of an IP consists of scanning the database over a pre-defined window of time in the past and passing the results to a scoring script (currently written in an embedded language) that applies aggregation and weighting and returns the final score.
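To illustrate the aggregation-and-weighting step, here is a sketch in which a plain Python function stands in for the embedded scoring script. The event tuple layout, the event type names and the weights are all assumptions made for the example, not values used by the project.

```python
# Hypothetical per-event-type weights; unknown types get default_weight.
WEIGHTS = {"auth_failure": 1.0, "spam": 5.0}


def score(events, weights=WEIGHTS, default_weight=1.0):
    """Aggregate (ip, event_type, count) tuples into a single score."""
    totals = {}
    for _ip, event_type, count in events:
        # Aggregation: sum the counts per event type over the window.
        totals[event_type] = totals.get(event_type, 0) + count
    # Weighting: each type contributes its total times its weight.
    return sum(weights.get(t, default_weight) * n for t, n in totals.items())


window = [
    ("192.0.2.1", "auth_failure", 3),
    ("192.0.2.1", "spam", 2),
    ("192.0.2.1", "auth_failure", 1),
]
result = score(window)  # 4 * 1.0 + 2 * 5.0
```

Here `window` represents the output of a scan for one IP, and a higher score means a worse reputation.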

RPC interface

The server provides a simple gRPC interface that is used for event submission and querying. The query API is a simple IP lookup that returns a score. This could conceivably be turned into a DNS-based API as well.

Third-party sources

It would be nice to allow the scoring script to consult IP-based third-party sources, such as DNSBLs or GeoIP lookups.
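For reference, a DNSBL lookup follows a simple convention: the address is reversed label by label (octets for IPv4, hex nibbles for IPv6), the list's zone is appended, and an A record in the answer means the address is listed. The sketch below shows how such a check might look using only the Python standard library. The zone `dnsbl.example.net` is a placeholder, and the function names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for `ip` in the DNSBL `zone`."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        labels = str(addr).split(".")                    # four octets
    else:
        labels = list(addr.exploded.replace(":", ""))    # 32 hex nibbles
    return ".".join(reversed(labels)) + "." + zone


def is_listed(ip, zone):
    """Return True if the DNSBL returns an A record for `ip`."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # Simplification: any resolution failure counts as "not listed",
        # including resolver errors unrelated to the list itself.
        return False
    return True


name = dnsbl_query_name("192.0.2.99", "dnsbl.example.net")
```

Since each lookup is a blocking network call, a scoring script using it would probably want caching and a timeout around `is_listed`.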