Multilingual spam classifier (DE/EN) with language detection.

Non-DE/EN mails receive an additional spam score bonus.

- train.py: Fine-tune distilbert-base-multilingual-cased on spam/ham data
- server.py: FastAPI service with langdetect integration
- rspamd/: Lua plugin and config for RSpamd integration
- export_rspamd_data.py: Export Maildir folders to CSV training data
- test_classify.py: Local model validation with DE/EN/foreign test cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9 lines
157 B
Text
torch>=2.0.0
transformers>=4.36.0
fastapi>=0.104.0
uvicorn>=0.24.0
pydantic>=2.0.0
datasets>=2.16.0
scikit-learn>=1.3.0
accelerate>=0.25.0
langdetect>=1.0.9
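
The commit message says that non-DE/EN mails receive an additional spam score bonus, but the scoring code is not part of this view. A minimal sketch of how such a bonus might be applied is below. The function name `adjusted_spam_score`, the bonus value, and the fallback on detection failure are all assumptions for illustration, not the contents of server.py. The detector is passed in as a callable so the sketch runs without extra packages; `langdetect.detect` has a matching shape (text in, language code out).

```python
from typing import Callable

SUPPORTED_LANGUAGES = {"de", "en"}  # the two languages named in the commit message
FOREIGN_LANGUAGE_BONUS = 2.0        # hypothetical value, not taken from the commit


def adjusted_spam_score(
    text: str,
    model_score: float,
    detect_language: Callable[[str], str],
    bonus: float = FOREIGN_LANGUAGE_BONUS,
) -> float:
    """Return model_score, plus a bonus when the detected language is not DE/EN."""
    try:
        language = detect_language(text)
    except Exception:
        # Detection can fail on empty or symbol-only text. In this sketch such
        # mails get no bonus (an assumption, not the commit's behavior).
        return model_score
    if language not in SUPPORTED_LANGUAGES:
        return model_score + bonus
    return model_score
```

With langdetect installed, a call could look like `adjusted_spam_score(body, score, langdetect.detect)`. Keeping the bonus additive and separate from the classifier output means it can be tuned without retraining the model.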