iprep
Design for an "IP reputation" database, meant to detect harmful actors
(including distributed ones).
First, an architectural observation: we most likely want to collect
data as events originating from multiple sources, which pretty
much dictates some sort of RPC submission scenario with probes
separate from the central event database. Furthermore we recognize
that most operations will amount to per-IP aggregations over the
(windowed) event logs themselves.
The system must also perform well under overload (DoS) scenarios,
which implies at the very least that traffic from probes should not
scale linearly with the number of events: there must be some
time-based aggregation on the client side of the data submission
protocol. This can be achieved by implementing a "minimum processing
delay" and a scheduled report thread (or, more simply, via cron).
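
A minimal sketch of such client-side aggregation, in Python, may make the
idea concrete. It is not the actual probe code: the EventAggregator class,
the submit callable and the event type names are all assumptions made for
illustration. Events are counted in memory per (IP, type) pair and sent as
one batch per interval, so report traffic depends on the number of distinct
pairs rather than on the raw event rate.

```python
import threading
from collections import Counter


class EventAggregator:
    """Buffers events locally and reports per-(IP, type) counts in batches."""

    def __init__(self, submit, interval=10.0):
        self._submit = submit        # hypothetical callable that sends one batch
        self._interval = interval    # the "minimum processing delay", in seconds
        self._counts = Counter()
        self._lock = threading.Lock()

    def record(self, ip, event_type):
        # Cheap in-memory increment: no network traffic per event.
        with self._lock:
            self._counts[(ip, event_type)] += 1

    def flush(self):
        # Swap the buffer out under the lock, then submit outside of it.
        with self._lock:
            batch, self._counts = self._counts, Counter()
        if batch:
            self._submit([
                {"ip": ip, "type": event_type, "count": n}
                for (ip, event_type), n in sorted(batch.items())
            ])

    def run_forever(self, stop_event):
        # Scheduled report thread: at most one submission per interval.
        while not stop_event.wait(self._interval):
            self.flush()
        self.flush()  # send whatever is left on shutdown
```

A probe would call record() for every observed event and run run_forever()
in a background thread (or call flush() once from a cron-driven process).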

Database
The event database is a time-based append-only log. It supports two
operations, append and scan, plus an internal delete-older-than that
periodically wipes entries too old to be relevant.
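
The three operations can be illustrated with a small in-memory stand-in,
written in Python. This is only a sketch under stated assumptions: the
EventLog class, its method names and the record layout are hypothetical,
and the real database presumably persists data to disk.

```python
import bisect
import time


class EventLog:
    """In-memory stand-in for a time-ordered, append-only event log."""

    def __init__(self):
        self._timestamps = []  # non-decreasing, one entry per event
        self._events = []      # parallel list of (ip, event_type, count)

    def append(self, ip, event_type, count=1, now=None):
        ts = time.time() if now is None else now
        # Clamp late timestamps so the log stays ordered by time.
        if self._timestamps and ts < self._timestamps[-1]:
            ts = self._timestamps[-1]
        self._timestamps.append(ts)
        self._events.append((ip, event_type, count))

    def scan(self, since):
        # Binary search for the start of the window, then read forward.
        start = bisect.bisect_left(self._timestamps, since)
        for ts, event in zip(self._timestamps[start:], self._events[start:]):
            yield (ts,) + event

    def delete_older_than(self, cutoff):
        # Drop the prefix of the log that is older than the cutoff.
        n = bisect.bisect_left(self._timestamps, cutoff)
        del self._timestamps[:n]
        del self._events[:n]
        return n
```

Because entries are ordered by time, both scan and delete-older-than only
need to locate one position in the log and never touch individual records
in the middle.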
Querying the reputation of an IP consists of scanning the database over
a pre-defined window of time in the past and passing the results to a
scoring script (currently written in an embedded language) that
applies aggregation and weighting and returns the final score.
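
As an illustration of the aggregation and weighting step, here is a Python
sketch of what a scoring function might compute. It is not the actual
scoring script (which runs in the embedded language); the event tuple
layout, the event type names and the weights are all assumptions.

```python
from collections import Counter

# Hypothetical per-event-type weights; real values would be tuned.
WEIGHTS = {"login_failure": 1.0, "spam": 5.0}


def score(events, ip, now, window=3600.0, weights=WEIGHTS):
    """Weighted sum of event counts for one IP over a time window.

    `events` is an iterable of (timestamp, ip, event_type, count) tuples.
    """
    since = now - window
    totals = Counter()
    for ts, event_ip, event_type, count in events:
        # Keep only events for this IP that fall inside the window.
        if event_ip == ip and since <= ts <= now:
            totals[event_type] += count
    # Event types without a configured weight contribute nothing.
    return sum(weights.get(t, 0.0) * n for t, n in totals.items())
```

With these weights, three login failures and two spam events inside the
window give a score of 3 * 1.0 + 2 * 5.0 = 13.0; events outside the window
or belonging to other IPs are ignored.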

RPC interface
The server provides a simple gRPC interface that is used for event
submission and querying. The query API is a simple IP lookup,
returning a score. This could conceivably be turned into a DNS-based
API as well.

Third-party sources
It would be nice to allow the scoring script to consult other IP-based
third-party sources, such as DNSBLs or GeoIP lookups.
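
For the DNSBL case, a lookup conventionally works by reversing the address
and querying it as a name under the list's zone. The Python sketch below
only builds that query name; the helper name and the zone
"dnsbl.example.org" are placeholders, and no real list is implied.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query for `ip` in a DNSBL-style zone."""
    addr = ipaddress.ip_address(ip)
    # reverse_pointer yields e.g. "1.2.0.192.in-addr.arpa" (or a nibble
    # sequence ending in "ip6.arpa"); strip the two-label suffix.
    reversed_labels = addr.reverse_pointer.rsplit(".", 2)[0]
    return f"{reversed_labels}.{zone}"


print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
# prints: 1.2.0.192.dnsbl.example.org
```

Resolving the resulting name (for example with socket.gethostbyname) would
then tell whether the address is listed: typically a listed address
resolves to an A record, and an unlisted one does not resolve at all.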