Preface
Historically, our only text classification method has been Bayes, a powerful statistical method that performs well with sufficient training. However, Bayes has its limitations:
- It requires thorough and well-balanced training
- It struggles when classification confidence is low, especially when dealing with a wide variety of spam
Large Language Models (LLMs) offer promising solutions to these challenges. These models can inspect a message in depth with some sort of contextual “understanding”. However, their high computational demands (typically requiring GPUs) make scanning all emails impractical. Separating LLM execution from the scanning engine mitigates resource competition.
Rspamd GPT plugin
In Rspamd 3.9, I have tried to integrate the OpenAI GPT API for spam filtering and assess its usefulness. Here are the basic ideas behind this plugin:
- The selected displayed text part is extracted and submitted to the GPT API for spam probability assessment
- Additional message details such as Subject, displayed From, and URLs are also included in the assessment
- Then, we ask GPT to provide results in JSON format, since free-form, human-readable GPT output generally cannot be parsed reliably
- Some specific symbols (BAYES_SPAM, FUZZY_DENIED, REPLY, etc.) are excluded from the GPT scan
- Obvious spam and ham are also excluded from the GPT evaluation
The last two points spare GPT from processing messages whose verdict is already known and where it cannot add any value. GPT also serves as just one of several classifiers, so we do not rely solely on its evaluation.
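To illustrate why a fixed JSON reply is easier to consume than free-form text, here is a minimal Python sketch of validating such a reply. The field name `probability` and the helper `parse_gpt_reply` are assumptions made for this example, not the plugin's actual reply format.

```python
import json
from typing import Optional


def parse_gpt_reply(raw: str) -> Optional[float]:
    """Return a spam probability in [0, 1], or None if the reply is unusable.

    The `probability` field name is hypothetical; it only illustrates the
    benefit of asking the model for a fixed JSON shape.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # free-form, human-readable text ends up here
    if not isinstance(data, dict):
        return None
    value = data.get("probability")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    if not 0.0 <= value <= 1.0:
        return None
    return float(value)
```

A caller can treat `None` as "no verdict" and fall back to the other classifiers.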
Evaluation results
To evaluate the performance of the GPT-based classifier, we developed the rspamadm classifier_test utility, capable of evaluating both supervised and unsupervised classifiers:
- It divides spam and ham samples into separate training and validation sets
- For supervised classifiers, it uses the training set to train the classifier
- Then both supervised and unsupervised classifiers are evaluated using the validation set to measure their performance
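The splitting step can be sketched in a few lines of Python. This is only an illustration of the idea: the function `split_corpus` and its `train_fraction` parameter are hypothetical and do not reproduce how the utility interprets its own options.

```python
import random
from typing import List, Sequence, Tuple


def split_corpus(files: Sequence[str], train_fraction: float,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the files and split them into (train, validation) lists."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(files)
    # A fixed seed keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names, used only to show the call
spam_files = [f"spam/{i}.eml" for i in range(10)]
train, validation = split_corpus(spam_files, train_fraction=0.3)
```

An unsupervised classifier would simply ignore the training list and be scored on the validation list alone.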
For example, the Bayes engine, trained on a robust corpus, demonstrates the following results:
$ rspamadm classifier_test --ham /ham --spam /spam --cv-fraction 0.3
Spam: 348 train files, 815 cv files; ham: 754 train files, 1762 cv files
Start learn spam, 348 messages, 10 connections
Start learn ham, 754 messages, 10 connections
Learning done: 348 spam messages in 1.61 seconds, 754 ham messages in 3.88 seconds
Start cross validation, 2577 messages, 10 connections
Metric                   Value
------------------------------
True Positives           735
False Positives          22
True Negatives           1717
False Negatives          49
Accuracy                 0.97
Precision                0.97
Recall                   0.94
F1 Score                 0.95
Classified (%)           97.90
Elapsed time (seconds)   12.71
These results are impressive, but they assume the classifier has been properly trained. On a fresh system, or where emails vary widely, gathering reliable statistics might be challenging.
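For readers who want to check the numbers, the derived metrics follow from the four confusion-matrix counts using the standard definitions. The sketch below recomputes the Bayes figures; the function name is hypothetical, and reading "Classified (%)" as the share of validation messages that received a verdict is an inference from the reported numbers (2523 of 2577), not a statement about the utility's code.

```python
def classifier_metrics(tp: int, fp: int, tn: int, fn: int, total: int) -> dict:
    """Standard metrics from confusion-matrix counts.

    `total` is the number of validation messages; messages without a
    verdict are counted in `total` but in none of the four cells.
    Assumes at least one positive prediction and one actual positive.
    """
    classified = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / classified,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "classified_pct": 100.0 * classified / total,
    }


# Counts taken from the Bayes run above (815 spam + 1762 ham validation files)
bayes = classifier_metrics(tp=735, fp=22, tn=1717, fn=49, total=2577)
```

Rounded to two decimals, the values match the reported table.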
In contrast, the GPT engine needs no training on our side and is treated here as an unsupervised classifier. We assume that LLMs have enough “understanding” of the language to distinguish spam from ham without direct training on emails. Moreover, we provide only text data, not raw email content.
Below are the results from different GPT models:
OpenAI GPT-3.5 Turbo
Metric                   Value
------------------------------
True Positives           129
False Positives          35
True Negatives           263
False Negatives          69
Accuracy                 0.79
Precision                0.79
Recall                   0.65
F1 Score                 0.71
Classified (%)           95.20
Elapsed time (seconds)   318.91
This model is cost-effective and can be used as a baseline. The results were obtained on a low-quality sample corpus, which contributes to the high number of false positives and false negatives.
OpenAI GPT-4o
Metric                   Value
------------------------------
True Positives           178
False Positives          25
True Negatives           257
False Negatives          9
Accuracy                 0.93
Precision                0.88
Recall                   0.95
F1 Score                 0.91
Classified (%)           90.02
Elapsed time (seconds)   279.08
Despite its high cost, this more advanced model is suitable, for example, for low-traffic personal email. Its error rates are significantly lower than those of GPT-3.5 on a similarly low-quality sample corpus.
Using GPT to train Bayes classifier
Another interesting approach involves using GPT to supervise Bayes engine training. In this case, we get the best of both worlds: GPT can operate without training, while Bayes can catch up afterward and take over from GPT (or at least serve as a cost-saving alternative).
We therefore tested GPT-supervised Bayes training and compared the two classifiers using the same methodology.
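The idea of one classifier labelling training data for another can be shown with a toy Python sketch. Everything here is an assumption for illustration: `TinyBayes` is a simplified token-count Naive Bayes rather than Rspamd's statistics engine, `fake_teacher` stands in for a real LLM call, and the 0.9 and 0.1 thresholds are arbitrary.

```python
import math
from collections import Counter
from typing import Callable, Iterable


class TinyBayes:
    """Toy token-count Naive Bayes with add-one smoothing (illustration only)."""

    def __init__(self) -> None:
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.learned = {"spam": 0, "ham": 0}

    def learn(self, text: str, label: str) -> None:
        self.tokens[label].update(text.lower().split())
        self.learned[label] += 1

    def _log_score(self, words: list, label: str, vocab: int) -> float:
        counts = self.tokens[label]
        total = sum(counts.values())
        prior = (self.learned[label] + 1) / (sum(self.learned.values()) + 2)
        return math.log(prior) + sum(
            math.log((counts[w] + 1) / (total + vocab)) for w in words
        )

    def spam_probability(self, text: str) -> float:
        words = text.lower().split()
        vocab = len(set(self.tokens["spam"]) | set(self.tokens["ham"])) or 1
        diff = (self._log_score(words, "spam", vocab)
                - self._log_score(words, "ham", vocab))
        # Numerically safe logistic function of the log-odds
        if diff >= 0:
            return 1.0 / (1.0 + math.exp(-diff))
        return math.exp(diff) / (1.0 + math.exp(diff))


def train_from_teacher(bayes: TinyBayes, messages: Iterable[str],
                       teacher: Callable[[str], float],
                       spam_at: float = 0.9, ham_at: float = 0.1) -> int:
    """Learn only messages on which the teacher is confident; return the count."""
    learned = 0
    for text in messages:
        p = teacher(text)
        if p >= spam_at:
            bayes.learn(text, "spam")
        elif p <= ham_at:
            bayes.learn(text, "ham")
        else:
            continue  # teacher is unsure: do not train on this message
        learned += 1
    return learned


def fake_teacher(text: str) -> float:
    """Stand-in for an LLM verdict (hypothetical keyword rule)."""
    if "lottery" in text:
        return 0.95
    if "monday" in text:
        return 0.05
    return 0.5


bayes = TinyBayes()
n_learned = train_from_teacher(bayes, [
    "win the lottery prize now",
    "claim your lottery prize",
    "meeting agenda for monday",
    "lunch on monday",
    "special offer inside",  # teacher is unsure, so this one is skipped
], fake_teacher)
```

Skipping the uncertain messages keeps the teacher's own mistakes from being baked into the student, at the price of slower learning.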
GPT results:
Metric                   Value
------------------------------
True Positives           128
False Positives          13
True Negatives           301
False Negatives          68
Accuracy                 0.84
Precision                0.91
Recall                   0.65
F1 Score                 0.76
Classified (%)           97.89
Elapsed time (seconds)   341.77
Bayes classifier results (trained by GPT in the previous test iteration):
Metric                   Value
------------------------------
True Positives           19
False Positives          43
True Negatives           269
False Negatives          9
Accuracy                 0.85
Precision                0.31
Recall                   0.68
F1 Score                 0.42
Classified (%)           65.26
Elapsed time (seconds)   29.18
Bayes still exhibits uncertainty in classification (it gave a verdict on only about 65% of the messages) and produces more false positives than GPT (43 versus 13). Improvement could be achieved through autolearning and by refining the corpus used for testing (our corpus contains many ham emails that look like spam even to human evaluators).
Plugin design
The GPT plugin operates as follows:
- It selects messages that pass several pre-checks:
  - they must not contain any symbols from the excluded set (e.g. Fuzzy/Bayes spam/Whitelists)
  - they must not clearly appear as ham or spam (e.g. with reject action or no action with a high negative score)
  - they should have enough text tokens in the meaningful displayed part
- If a message satisfies these checks, Rspamd selects the displayed part (e.g. HTML) and sends the following content to GPT:
  - text part content as a single-line string (honoring limits if necessary)
  - message subject
  - displayed From
  - some details about URLs (e.g. domains)
- This data is merged with a prompt to GPT requesting an evaluation of the email’s spam probability, with the output returned in JSON format (other output types may sometimes allow GPT to provide human-readable text that is very difficult to parse)
- After these steps, a corresponding symbol with a confidence score is inserted
- With autolearning enabled, Rspamd also trains the supervised classifier (Bayes)
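The pre-checks and the request content described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's code: the `Message` type, the function names, the token minimum, the ham score cut-off and the character limit are all hypothetical, and the excluded set lists only the three symbols named earlier in this article.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Only the symbols named in the article; the real excluded set is larger
EXCLUDED_SYMBOLS = {"BAYES_SPAM", "FUZZY_DENIED", "REPLY"}
MIN_TOKENS = 10  # hypothetical minimum of text tokens


@dataclass
class Message:
    subject: str
    display_from: str
    text: str
    url_domains: List[str] = field(default_factory=list)
    symbols: Set[str] = field(default_factory=set)
    score: float = 0.0
    action: str = "no action"


def should_ask_gpt(msg: Message, ham_score: float = -5.0) -> bool:
    """Apply the pre-checks; `ham_score` is a hypothetical cut-off."""
    if msg.symbols & EXCLUDED_SYMBOLS:
        return False  # already decided by another check
    if msg.action == "reject":
        return False  # obvious spam
    if msg.action == "no action" and msg.score <= ham_score:
        return False  # obvious ham
    return len(msg.text.split()) >= MIN_TOKENS


def build_request_text(msg: Message, max_chars: int = 2000) -> str:
    """Combine the fields sent for evaluation; text is flattened to one line."""
    one_line = " ".join(msg.text.split())[:max_chars]
    return "\n".join([
        f"Subject: {msg.subject}",
        f"From: {msg.display_from}",
        f"URL domains: {', '.join(msg.url_domains)}",
        f"Text: {one_line}",
    ])
```

The resulting text would then be appended to the prompt that asks for a spam probability in JSON form.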
Pricing considerations and conclusions
OpenAI provides an API for these requests, but it is a paid service (there is currently no free tier). However, for personal email usage or automated Bayes training without manual intervention, GPT presents a viable option. For instance, processing a substantial volume of personal emails with GPT-3.5 costs approximately $0.05 daily (for about 100k tokens).
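The figure above implies a rate of roughly $0.50 per million tokens, which makes it easy to estimate costs at other volumes. The sketch below only does that arithmetic; the function name is hypothetical, the rate is derived from the numbers quoted here rather than from a price list, and the larger daily volume is an invented example.

```python
def daily_cost_usd(tokens_per_day: int, usd_per_million_tokens: float) -> float:
    """Linear cost estimate: tokens times the per-token rate."""
    return tokens_per_day / 1_000_000 * usd_per_million_tokens


# Rate implied by "$0.05 for about 100k tokens" (an inference, not a quoted price)
implied_rate = 0.05 / 100_000 * 1_000_000

personal_mailbox = daily_cost_usd(100_000, implied_rate)
# Hypothetical larger deployment: 10 million tokens per day at the same rate
larger_system = daily_cost_usd(10_000_000, implied_rate)
```

At the same rate, costs grow linearly with token volume, which is why the excluded symbols and pre-checks matter for anything beyond a personal mailbox.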
For large-scale email systems, it may be preferable to use another LLM (e.g. llama) internally on a GPU-powered platform. The current plugin is designed to integrate with different LLM types without significant modifications. This approach also enhances data privacy by avoiding sending email content to a third-party service (though OpenAI claims their models do not learn from API requests).
Despite not achieving 100% accuracy, the GPT plugin demonstrates effectiveness comparable to human-filtered email. Future enhancements will focus on improving accuracy by passing additional metadata to the GPT engine while optimizing token usage. There are also plans to make better use of LLM knowledge in Rspamd, particularly for more fine-grained classification.
The GPT plugin will be available starting from Rspamd 3.9; it requires an OpenAI API key and paid access to OpenAI services.