spamBERT/requirements.txt
Carsten Abele 38efd20b4d Initial commit: SpamLLM - DistilBERT spam classifier for RSpamd
Multilingual spam classifier (DE/EN) with language detection.
Mail in any language other than German or English receives an additional spam score bonus.

- train.py: Fine-tune distilbert-base-multilingual-cased on spam/ham data
- server.py: FastAPI service with langdetect integration
- rspamd/: Lua plugin and config for RSpamd integration
- export_rspamd_data.py: Export Maildir folders to CSV training data
- test_classify.py: Local model validation with DE/EN/foreign test cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:27:05 +01:00
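
The commit message says that mail outside DE/EN gets an extra spam score bonus. The sketch below shows one way such a bonus could be applied after language detection. It is not taken from server.py: the names `SUPPORTED_LANGUAGES`, `FOREIGN_LANGUAGE_BONUS`, and `adjusted_spam_score`, the bonus value, and the clamping to [0, 1] are all assumptions for illustration.

```python
# Hypothetical sketch: names and values are illustrative, not from server.py.
SUPPORTED_LANGUAGES = {"de", "en"}   # languages the classifier targets
FOREIGN_LANGUAGE_BONUS = 0.25        # assumed bonus; the real value is not given


def adjusted_spam_score(model_score: float, language: str) -> float:
    """Add a bonus for mail outside DE/EN, then clamp the result to [0, 1]."""
    score = model_score
    if language.lower() not in SUPPORTED_LANGUAGES:
        score += FOREIGN_LANGUAGE_BONUS
    return min(max(score, 0.0), 1.0)
```

In a running service the `language` argument would presumably come from `langdetect.detect(text)`, which returns a language code such as `"de"` or `"en"`.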


torch>=2.0.0
transformers>=4.36.0
fastapi>=0.104.0
uvicorn>=0.24.0
pydantic>=2.0.0
datasets>=2.16.0
scikit-learn>=1.3.0
accelerate>=0.25.0
langdetect>=1.0.9