

A suite of ML-based tools to detect accounts compromised by spammers, based on mailserver log analysis.

The tools contained in this repository are:

  • loganalyzer: parses mail.log syslog files, reconstructs mail flows, and produces per-user aggregates of a number of interesting metrics
  • account_intelligence/: a TensorFlow-based neural network that classifies the users in the above output as compromised or not.

This particular implementation currently only works with Postfix logs.

Implementation details

The loganalyzer binary computes a number of metrics (features) for each user, which are then used to classify the user by the NN. These features include, for the examined time interval:

  • how many messages were sent
  • how many messages were sent to "freemail" domains
  • how many messages were sent unsuccessfully (bounced, due to spam detection on the receiving end, or other causes)
  • message counts for the top 10 destination domains
  • how many times internal rate-limiting systems were triggered
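As an illustration, the metrics above can be pictured as a per-user feature record. Note that the field names below are assumptions for the sake of the example, not the actual loganalyzer output schema:

```python
# Hypothetical per-user feature record mirroring the metrics listed
# above (field names are illustrative, not the real schema).
example_user_features = {
    "user": "user@example.com",
    "messages_count": 200,     # how many messages were sent
    "freemail_count": 50,      # messages sent to "freemail" domains
    "failed_count": 12,        # bounced / rejected messages
    "top_domains": {           # counts for the top destination domains
        "gmail.com": 80,
        "yahoo.com": 40,
    },
    "ratelimit_hits": 3,       # internal rate-limiting triggers
}

# A simple derived signal: the fraction of a user's traffic that is
# concentrated in the top destination domains.
top_fraction = (sum(example_user_features["top_domains"].values())
                / example_user_features["messages_count"])
```

A spammer-controlled account would typically show a high message count spread thinly over many generic domains, while a legitimate user's `top_domains` usually covers most of their traffic.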

The basic idea is that compromised accounts will try to send large numbers of messages to a very generic set of domains (as their targets are usually pulled out of huge lists of accounts), while normal users will have a much more focused communication pattern.

The other signals are included because they empirically proved to be good indicators of compromise.


Installation

You will need a few dependencies, including a Go language environment and TensorFlow. On a Debian system:

$ sudo apt install golang-go python-dev python-pip
$ pip install tensorflow


1) Build the loganalyzer binary:

$ go build -o loganalyzer loganalyzer.go
$ sudo cp loganalyzer /usr/local/bin/loganalyzer

2) Install the Python NN code:

$ sudo python setup.py install


Input format

The analyzer requires its input to be in the standard syslog format with old-style (i.e. "broken") timestamps, for instance:

Apr 13 06:28:25 hostname program[pid]: foo bar

Unfortunately the input format is currently not flexible.
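To make the expected format concrete, here is a minimal sketch of how such a line can be parsed. This is not loganalyzer's actual parser (that lives in the Go code); the regular expression and the hard-coded year are assumptions for illustration, since old-style syslog timestamps carry no year at all:

```python
import re
from datetime import datetime

# Matches lines like: "Apr 13 06:28:25 hostname program[pid]: message".
# Illustrative only; the real parsing is done by loganalyzer itself.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} [ \d]\d \d\d:\d\d:\d\d) "
    r"(?P<host>\S+) (?P<prog>[^\[:]+)"
    r"(?:\[(?P<pid>[^\]]+)\])?: (?P<msg>.*)$"
)

def parse_syslog_line(line, year=2024):
    """Parse an old-style syslog line into its components.

    The year must be supplied externally because the timestamp
    format does not include one.
    """
    m = LINE_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m.group("ts"), "%b %d %H:%M:%S")
    ts = ts.replace(year=year)
    return ts, m.group("host"), m.group("prog"), m.group("pid"), m.group("msg")

parsed = parse_syslog_line("Apr 13 06:28:25 hostname program[pid]: foo bar")
```

The missing year is exactly why these timestamps are called "broken": any consumer has to guess or be told which year the logs belong to.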


Training

To train the neural network you will need a dataset that is relevant to your situation, with each user manually classified as compromised or not. We'll call this the training dataset. Furthermore, in order to properly validate the accuracy of the network, you will need a second dataset (the test dataset) unrelated to the first (for example, logs from two separate, non-overlapping time intervals).

Let's see an example. Assume mail.log.train and mail.log.test are the two datasets mentioned above. First, we invoke loganalyzer to extract features from the logs:

$ loganalyzer mail.log.train > features.train
$ loganalyzer mail.log.test > features.test

The resulting files will consist of JSON-encoded records, one per line, each corresponding to a different user. Something like:

{"user": "", "messages_count": 200,
 "top_domains": {"": 100, "": 100}, ...}
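Since the output is one JSON object per line, it is straightforward to consume from any language. A minimal sketch in Python (the usernames and field values here are made up; the full schema is defined by loganalyzer itself):

```python
import json

def load_features(lines):
    """Parse loganalyzer output: one JSON-encoded record per line."""
    return [json.loads(line) for line in lines if line.strip()]

# Hypothetical example record in the same shape as the output above.
records = load_features([
    '{"user": "alice", "messages_count": 200, '
    '"top_domains": {"gmail.com": 100, "yahoo.com": 100}}',
])
```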

Now you need to create separate files containing the expected classification for each user in the two datasets. We are going to call them labels.train and labels.test. These files use a simpler format: one username per line, followed by 0 (normal user) or 1 (compromised account). For instance (with made-up usernames):

mallory 1
bob 0

Since usually there are going to be a lot more normal users than compromised accounts, 0 is the default label if a user is not found in this file, so you can just list the compromised accounts in there to save typing.
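The default-to-0 convention can be sketched as follows (a hypothetical loader, not the repository's actual code; usernames are made up):

```python
from collections import defaultdict

def load_labels(lines):
    """Load a labels file: "username label" per line.

    Users not listed in the file default to 0 (normal), so the file
    only needs to enumerate the compromised accounts.
    """
    labels = defaultdict(int)  # missing keys -> 0 (normal user)
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            labels[parts[0]] = int(parts[1])
    return labels

labels = load_labels(["mallory 1"])
```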

Finally, train the NN:

$ account-intelligence-nn --train --labels=labels.train \
    < features.train

and verify its accuracy on the test dataset:

$ account-intelligence-nn --labels=labels.test < features.test

Note that the NN will save its state in the local directory by default. If the results are satisfactory, you can run the analysis on any other log file:

$ loganalyzer mail.log | account-intelligence-nn 

First results

Training the NN on a small amount of A/I mail logs resulted in 92% accuracy on the test set, which is quite good considering the extremely small size of the training corpus (tens of users).