Security Analytics
Overview
The Security Analytics module exposes the internals of the ThreatOps classical machine learning pipeline to operators and analysts. Rather than treating AI triage as a black box, this module provides a live window into model health, training progress, version history, feature importance rankings, and the ensemble scoring strategy that drives every alert disposition decision on the platform.
The frontend page at /analytics polls the ML backend every 30 seconds and renders the state of three sklearn models — an Alert Classifier, an Anomaly Detector, and a Threat Scorer — alongside a training buffer progress bar, the 70/30 ensemble blend visualization, a ranked bar chart of the top 17 RandomForest feature importances, and a summary of the four possible disposition outcomes and their confidence thresholds. The backend is served entirely through the /api/v1/ml router, which also handles real-time predictions, analyst feedback recording, forced retraining, and model version history retrieval for rollback workflows.
What Was Proposed
- A classical ML pipeline with three specialized models covering alert classification, anomaly detection, and threat risk scoring
- Ensemble scoring that blends ML confidence with rule-based scores at a 70/30 ratio to balance data-driven decisions with deterministic expert rules
- Automatic model retraining triggered by accumulated analyst feedback, without requiring manual intervention or scheduled jobs
- Analyst feedback loop: every resolve, escalate, or false-positive action feeds labeled training data back to the models
- Feature importance transparency so analysts can audit which alert attributes drive high-risk scores
- Model versioning with persistent storage (Azure Blob) so retrained models can be rolled back if performance regresses
- Real-time prediction endpoint for ad-hoc analysis and integration testing
- A frontend analytics page showing live model stats, training buffer fill, ensemble weights, and feature importance rankings
What's Built
| Component | Status |
|---|---|
| Alert Classifier (RandomForest, sklearn) | ✓ Complete |
| Anomaly Detector (IsolationForest, sklearn) | ✓ Complete |
| Threat Scorer (GradientBoosting, sklearn) | ✓ Complete |
| 70/30 ensemble blend (ML + rule-based) | ✓ Complete |
| Auto-retrain at 100 analyst feedback samples | ✓ Complete |
| Analyst feedback recording endpoint with disposition validation | ✓ Complete |
| Feature importance extraction (RandomForest) | ✓ Complete |
| Model version history (Azure Blob persistent storage) | ✓ Complete |
| Real-time prediction endpoint with sample test | ✓ Complete |
| ML health check with latency percentiles | ✓ Complete |
| Performance log with historical retrain records | ✓ Complete |
| Frontend analytics page (model cards, training buffer, ensemble viz, feature chart) | ✓ Complete |
| Four disposition outcomes with threshold documentation | ✓ Complete |
Architecture
ML Router
File: platform/api/app/routers/ml.py — Prefix: /api/v1/ml
A thin FastAPI router that delegates all logic to the ml_pipeline singleton imported from app/ml/training_pipeline.py. The router is stateless — it does not hold any model state itself. The pipeline singleton manages model loading from Azure Blob storage at startup, maintains the training feedback buffer in memory, and schedules retraining when the buffer crosses the configured threshold (default: 100 samples).
ML Training Pipeline (app/ml/training_pipeline.py)
The singleton ml_pipeline encapsulates three sklearn model wrappers:
- AlertClassifier — wraps a RandomForestClassifier; provides predict(alert), returning a classification label and confidence score, and get_feature_importances(), returning a ranked dict of feature names to importance weights
- AnomalyDetector — wraps an IsolationForest with contamination=0.10 (10% expected anomaly rate); unsupervised, so it does not use labeled training data
- ThreatScorer — wraps a GradientBoostingClassifier with 100 estimators; outputs a continuous risk score in the 0–100 range
The pipeline exposes predict_ensemble(alert: dict), which runs all three models and blends their results with the 70/30 formula: confidence = 0.7 * ml_score + 0.3 * rule_score. It also provides record_feedback(), which appends analyst-labeled samples to the training buffer and calls trigger_retrain() when the buffer reaches the threshold.
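The buffer-and-retrain behavior can be sketched in pure Python. This is an illustration only: the real pipeline retrains the three sklearn models inside trigger_retrain(), and the buffer layout shown here is an assumption.

```python
class TrainingPipeline:
    """Sketch of record_feedback() / trigger_retrain() buffer mechanics.
    Internals are assumptions; only the threshold behavior is documented."""

    RETRAIN_THRESHOLD = 100  # default auto-retrain threshold

    def __init__(self):
        self.buffer = []         # analyst-labeled samples since last retrain
        self.total_retrains = 0

    def record_feedback(self, alert: dict, disposition: str, risk_score=None):
        # Append one labeled sample; fire retrain when the buffer fills.
        self.buffer.append({"alert": alert, "disposition": disposition,
                            "risk_score": risk_score})
        if len(self.buffer) >= self.RETRAIN_THRESHOLD:
            self.trigger_retrain()

    def trigger_retrain(self):
        # The real pipeline refits all three sklearn models here; stubbed out.
        self.total_retrains += 1
        self.buffer.clear()

pipeline = TrainingPipeline()
for i in range(100):
    pipeline.record_feedback({"title": f"alert {i}"}, "benign")
print(pipeline.total_retrains, len(pipeline.buffer))  # → 1 0
```

The 100th feedback sample trips the threshold, so the retrain counter increments and the buffer resets to zero — matching the "X / Y" fill bar shown on the analytics page.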
Model Store (app/ml/model_store.py)
Handles persistence of trained model artifacts to and from Azure Blob storage (container: ml-models on storage account stroconmlmodels). Each model version is saved with a timestamp and version number. The store provides get_all_model_versions() and get_version_history(model_name) for the rollback UI.
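The versioned layout can be illustrated with a local-disk stand-in for the store. The Azure Blob calls are omitted; the method names save_model() and the history-file format are assumptions, while get_version_history() and the blob_path naming mirror the documented response shape.

```python
import json, pickle, time
from pathlib import Path

class ModelStore:
    """Local-disk stand-in for the Azure Blob model store (sketch only;
    the real store writes to container ml-models on stroconmlmodels)."""

    def __init__(self, root: str):
        self.root = Path(root)

    def save_model(self, model_name: str, model, version: str, metrics: dict):
        # Persist the pickled artifact plus a version record with timestamp.
        d = self.root / model_name
        d.mkdir(parents=True, exist_ok=True)
        (d / f"v{version}.pkl").write_bytes(pickle.dumps(model))
        record = {"version": version,
                  "saved_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
                  **metrics,
                  "blob_path": f"ml-models/{model_name}/v{version}.pkl"}
        history = self._history(model_name)
        history.append(record)
        (d / "history.json").write_text(json.dumps(history))

    def _history(self, model_name: str):
        f = self.root / model_name / "history.json"
        return json.loads(f.read_text()) if f.exists() else []

    def get_version_history(self, model_name: str) -> dict:
        # Shape matches GET /api/v1/ml/versions/{model_name}.
        history = self._history(model_name)
        return {"model_name": model_name,
                "latest": history[-1] if history else None,
                "history": history,
                "total_versions": len(history)}
```

Rollback then amounts to loading an earlier blob_path from the history list instead of the latest record.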
ML Models
Alert Classifier
Supervised multi-class classifier that categorizes incoming alerts by threat category. Trained on labeled analyst feedback (true_positive, false_positive, suspicious, benign). Provides accuracy metric and feature importances. Version tracked as stats.classifier.version.
Anomaly Detector
Unsupervised anomaly detection with contamination=0.10, meaning 10% of training samples are expected to be anomalous. Does not require labeled data. Displayed as a static 10% contamination rate in the UI since it has no accuracy metric in the traditional sense. Version tracked as stats.anomaly_detector.version.
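What contamination=0.10 means in practice is just a score cutoff chosen so that roughly 10% of training points land on the anomalous side. A stdlib-only illustration of that semantics (not sklearn's actual isolation-tree algorithm):

```python
def fit_threshold(scores, contamination=0.10):
    """Pick the cutoff so the lowest `contamination` fraction of anomaly
    scores is flagged — mirroring what contamination=0.10 means for
    IsolationForest, where lower scores are more anomalous."""
    ranked = sorted(scores)
    k = max(1, int(len(ranked) * contamination))
    return ranked[k - 1]  # scores at or below this cutoff are "anomalous"

scores = [round(0.1 * i, 1) for i in range(1, 21)]  # 20 synthetic scores
cutoff = fit_threshold(scores)
flagged = [s for s in scores if s <= cutoff]
print(cutoff, flagged)  # → 0.2 [0.1, 0.2]
```

With 20 samples and contamination=0.10, exactly 2 samples fall below the fitted cutoff — which is why the UI shows a static 10% rate rather than an accuracy figure.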
Threat Scorer
Produces a continuous risk score from 0 to 100 for each alert. Built with 100 estimators. The score feeds directly into the ensemble blending formula. Higher scores push alerts toward critical_immediate disposition. Version tracked as stats.threat_scorer.version.
API Endpoints
ML Router — /api/v1/ml
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/ml/stats | Current model versions, accuracy, training buffer size, last retrain timestamp, per-class distribution, and data drift indicators |
| GET | /api/v1/ml/health | ML pipeline health: model load status per model, prediction latency percentiles (p50, p95, min, max, mean), buffer size, last prediction timestamp, scheduler status, storage backend info |
| POST | /api/v1/ml/predict | Run full ensemble prediction on a provided alert dict; returns classification, anomaly flag, risk score, blended confidence, and disposition recommendation |
| GET | /api/v1/ml/predict/sample | Run prediction on a hardcoded sample alert (PowerShell execution on DESKTOP-ADMIN01) for integration testing; returns result + sample_alert |
| POST | /api/v1/ml/feedback | Record analyst feedback for training; accepts alert dict, disposition (true_positive / false_positive / suspicious / benign), and optional risk_score override; triggers retrain if buffer threshold reached |
| POST | /api/v1/ml/retrain | Force immediate retrain of all models with accumulated buffer data (admin use; bypasses threshold check) |
| GET | /api/v1/ml/performance | Historical model performance log plus current stats snapshot; shows accuracy trends across retrain events |
| GET | /api/v1/ml/feature-importances | RandomForest feature importance dict sorted descending; used by the analytics page for the ranked bar chart visualization |
| GET | /api/v1/ml/versions | Version history for all three models from the Azure Blob model_store |
| GET | /api/v1/ml/versions/{model_name} | Version history for a specific model (alert_classifier / anomaly_detector / threat_scorer); returns latest version and full history list |
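The feedback endpoint performs disposition validation before a sample reaches the training buffer. A minimal sketch of that check against the documented FeedbackRequest shape (the function name and error wording are assumptions, not the router's actual code):

```python
VALID_DISPOSITIONS = {"true_positive", "false_positive", "suspicious", "benign"}

def validate_feedback(payload: dict) -> dict:
    """Validate a POST /api/v1/ml/feedback body. The four allowed
    disposition values come from the endpoint docs; the error messages
    here are illustrative."""
    disposition = payload.get("disposition")
    if disposition not in VALID_DISPOSITIONS:
        raise ValueError(
            f"disposition must be one of {sorted(VALID_DISPOSITIONS)}, "
            f"got {disposition!r}")
    if not isinstance(payload.get("alert"), dict):
        raise ValueError("alert must be an object with feature fields")
    return payload
```

A rejected disposition never enters the buffer, so a typo in an integration script cannot poison the next retrain cycle.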
Frontend Route
| Route | File | Description |
|---|---|---|
| /analytics | src/app/analytics/page.tsx | Security Analytics page — model cards, training buffer, ensemble strategy, feature importances bar chart, disposition breakdown |
The page is a Next.js App Router client component ("use client"). On mount it fires two requests in parallel via Promise.allSettled: GET /api/v1/ml/stats and GET /api/v1/ml/feature-importances. Either can fail independently without breaking the page — the component renders whatever data arrives. Auto-refreshes every 30 seconds via setInterval. API calls use the shared api client from @/lib/api-client.
Ensemble Strategy
The ensemble combines outputs from all three models into a single blended confidence score that drives the final alert disposition. The formula is applied in ml_pipeline.predict_ensemble():
blended_confidence = (0.7 × ml_score) + (0.3 × rule_score)
The 70% ML weight reflects confidence in the trained models when sufficient labeled data is available. The 30% rule-based weight preserves expert-authored detection logic as a safety net, particularly for novel attack patterns that the models have not yet seen in training data. The split is visualized in the frontend as a two-segment progress bar (cyan for ML, emerald for rule-based).
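Plugging in the numbers from the sample PredictResponse later on this page shows the blend at work:

```python
def blend(ml_score: float, rule_score: float) -> float:
    # 70/30 ensemble blend applied in ml_pipeline.predict_ensemble()
    return 0.7 * ml_score + 0.3 * rule_score

# ml_confidence 0.68 and rule_score 0.72, as in the sample response
print(round(blend(0.68, 0.72), 3))  # → 0.692
```

Because the rule score (0.72) exceeds the ML confidence (0.68) here, the blend pulls the final confidence slightly above the pure-ML value — the "safety net" effect described above.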
Score Sources
| Component | Weight | Source |
|---|---|---|
| ML Score | 70% | Blended output of AlertClassifier + AnomalyDetector + ThreatScorer predictions |
| Rule Score | 30% | Rule-based confidence from RiskScoringModel in the 6-stage triage pipeline |
The 6-stage triage pipeline (entity extraction, threat intel enrichment, UEBA, correlation, rule scoring, ML ensemble) feeds both the rule_score and the raw alert features into the ML models. The full pipeline is documented in the Alerts module.
Disposition Logic
The final blended confidence score maps to one of four disposition labels. These labels drive what the Autonomous SOC Engine does with the alert — auto-close it, notify an analyst, open a SOAR playbook, or page the incident response team.
- Auto-Resolve — Confidence > 85% that the alert is benign. Automatically closed without analyst review. Feeds a "benign" label into the ML training buffer for future retraining.
- Investigate — Confidence between 50% and 85%. Insufficient certainty for automated closure; the alert is queued for analyst review. The ML models receive the analyst's final verdict as feedback.
- Escalate — Confidence below 50%, or a high anomaly score. The alert is escalated to the incidents queue, and a SOAR playbook may be triggered depending on severity. An analyst must investigate and close it.
- Critical Immediate — High risk score (>80 on the 0–100 scale) and high severity. Immediate escalation: an incident is created automatically, the on-call analyst is paged, and a SOAR playbook is triggered without waiting for analyst action.
The thresholds are: >85% blended confidence → auto-resolve; 50–85% → investigate; <50% → escalate; high threat score + high severity → critical immediate. These thresholds are implemented in TriageService in app/services/triage_service.py.
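The threshold logic can be sketched as a single mapping function. The thresholds and labels are from this page; the evaluation order, boundary handling, and the severity check are assumptions about TriageService internals.

```python
def disposition_for(blended_confidence: float, risk_score: float,
                    severity: str) -> str:
    """Map ensemble outputs to one of the four disposition labels.
    Thresholds are documented; ordering and tie-breaking are assumptions."""
    if risk_score > 80 and severity in {"high", "critical"}:
        return "critical_immediate"      # incident + page + SOAR playbook
    if blended_confidence > 0.85:
        return "auto_resolve"            # closed without analyst review
    if blended_confidence >= 0.50:
        return "requires_investigation"  # queued for analyst review
    return "escalate"                    # incidents queue, possible playbook
```

Applied to the sample PredictResponse (blended_confidence 0.692, risk_score 74.2, severity high), this yields requires_investigation — consistent with the disposition shown in that example.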
Data Models
The analytics module does not define its own SQLAlchemy database models. All data is maintained in-memory by the ml_pipeline singleton, with model artifacts persisted to Azure Blob via model_store. The structures below document the API response shapes consumed by the frontend.
MLStats Response (GET /api/v1/ml/stats)
{
"classifier": {
"version": "1.3.0",
"accuracy": 0.923,
"samples_trained": 1450
},
"anomaly_detector": {
"version": "1.1.0",
"accuracy": null, // unsupervised — no accuracy metric
"samples_trained": null
},
"threat_scorer": {
"version": "1.2.0",
"accuracy": 0.891,
"samples_trained": 1450
},
"training_buffer": 34, // samples accumulated since last retrain
"retrain_threshold": 100, // samples needed to trigger auto-retrain
"total_retrains": 14,
"last_retrain": "2026-02-28T18:42:00",
"feature_importances": {
"severity_encoded": 0.187,
"rule_confidence": 0.142,
"correlation_count": 0.118,
// ... up to 17 features
}
}
FeedbackRequest (POST /api/v1/ml/feedback)
{
"alert": {
"title": "Lateral movement detected on DC01",
"severity": "high",
"source_siem": "sentinel",
"mitre_tactic": "lateral_movement",
"rule_confidence": 0.72,
"correlation_count": 5
},
"disposition": "true_positive", // true_positive | false_positive | suspicious | benign
"risk_score": 87.5 // optional analyst-provided override
}
PredictResponse (POST /api/v1/ml/predict)
{
"classification": "suspicious",
"anomaly_score": -0.31, // IsolationForest raw score (lower = more anomalous)
"risk_score": 74.2, // ThreatScorer output (0–100)
"ml_confidence": 0.68,
"rule_score": 0.72,
"blended_confidence": 0.692, // 0.7 * ml_confidence + 0.3 * rule_score
"disposition": "requires_investigation",
"reasoning": "High rule confidence, moderate ML confidence. Correlation count of 5 suggests related activity."
}
FeatureImportances Response (GET /api/v1/ml/feature-importances)
{
"feature_importances": {
"severity_encoded": 0.187,
"rule_confidence": 0.142,
"correlation_count": 0.118,
"hour_of_day": 0.094,
"source_siem_encoded": 0.082,
"mitre_tactic_encoded": 0.079,
"ioc_count": 0.071,
"asset_count": 0.063,
// ... additional features
}
}
ModelVersionHistory (GET /api/v1/ml/versions/{model_name})
{
"model_name": "alert_classifier",
"latest": {
"version": "1.3.0",
"saved_at": "2026-02-28T18:42:00",
"accuracy": 0.923,
"samples": 1450,
"blob_path": "ml-models/alert_classifier/v1.3.0.pkl"
},
"history": [ /* array of version records */ ],
"total_versions": 14
}
Prerequisites
- ML Training Pipeline — app/ml/training_pipeline.py singleton ml_pipeline; must be initialized at startup; models loaded from Azure Blob or initialized fresh if no stored versions exist
- AlertClassifier — sklearn RandomForestClassifier wrapper with predict(), get_feature_importances(), and versioned state
- AnomalyDetector — sklearn IsolationForest wrapper (contamination=0.10)
- ThreatScorer — sklearn GradientBoostingClassifier wrapper (100 estimators) with 0–100 score output
- Model Store — app/ml/model_store.py; requires Azure Blob Storage credentials (storage account stroconmlmodels, container ml-models); falls back to local /tmp/ storage if unavailable (ephemeral)
- scikit-learn — Python package; required for all three model types; must be present in requirements.txt
- numpy / pandas — Required for feature engineering in the training pipeline
- Triage Service — app/services/triage_service.py; calls ml_pipeline.predict_ensemble() during the 6-stage alert triage pipeline; provides the rule_score that feeds the 30% blend weight
- Alerts Router — app/routers/alerts.py; calls ml_pipeline.record_feedback() during bulk actions so analyst dispositions reach the training buffer
- Admin Ops Router — app/routers/admin_ops.py; uses ml_pipeline.get_model_stats(), trigger_retrain(), and get_performance_log() for the admin dashboard and activity log
UI Layout
Analytics Page — /analytics
- Header Row — "Security Analytics" title with Brain icon (blue, #3b82f6). Subtitle: "Classical ML pipeline performance and feature analysis". Right-aligned Refresh button (slate background).
- Error Banner — Conditionally rendered red banner showing the error message if either API call fails. Does not prevent the rest of the page from rendering with available data.
- Model Cards Row — 3-column grid (stacks to 1 on mobile). Each card has:
- Alert Classifier: Target icon (blue), "RandomForest" badge (blue), large accuracy percentage as headline stat, version and samples_trained as subtext
- Anomaly Detector: Activity icon (emerald), "IsolationForest" badge (emerald), "10%" contamination rate as headline stat (static), unsupervised note in subtext
- Threat Scorer: TrendingUp icon (orange), "GradientBoosting" badge (orange), "0–100" as headline stat for risk score range, "100 estimators" note in subtext
- Training Buffer Card — Shows current buffer fill as "X / Y" and a gradient progress bar (cyan to emerald). Subtext shows auto-retrain threshold and total retrains completed. Buffer percentage capped at 100% even if buffer overflows threshold.
- Ensemble Strategy Card — Side-by-side with training buffer. Shows two rows: "ML Weight: 70%" (blue) and "Rule-Based Weight: 30%" (emerald), with a two-segment horizontal bar visually splitting the weights. Formula shown as subtext: "Blended confidence = 0.7 * ML score + 0.3 * rule score".
- Feature Importances Chart — Full-width card with BarChart3 icon. Ranked list of up to 17 features (top-N from RandomForest). Each row: rank number, feature name in monospace font (truncated at w-44), horizontal bar (gradient cyan to emerald), and importance percentage value. Bars are relative to the top feature's importance (not absolute).
- Alert Disposition Actions — Full-width card with 4-column grid (stacks on mobile). One card per disposition: Auto-Resolve (emerald), Escalate (orange), Investigate (yellow), Critical Alert (red). Each shows a colored label, description, and threshold band. Subtext below the grid summarizes the threshold logic: >85% auto-resolve, <50% escalate, 50–85% investigate.
The page uses a white/slate design with colored badges and gradient bars. No chart library dependency — all visualizations are built with plain CSS div elements and inline width styles driven by the data values.
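The relative bar scaling used in the feature importances chart (each bar sized against the top-ranked feature, not as an absolute percentage) amounts to the following; the function name is illustrative, not the component's actual code:

```python
def bar_widths(importances: dict) -> dict:
    """CSS width percentages for the feature chart: each bar is scaled
    against the top feature's importance, so the #1 bar is always 100%."""
    top = max(importances.values())
    return {name: round(100 * value / top, 1)
            for name, value in sorted(importances.items(),
                                      key=lambda kv: -kv[1])}

# Top three features from the sample feature-importances response
print(bar_widths({"severity_encoded": 0.187,
                  "rule_confidence": 0.142,
                  "correlation_count": 0.118}))
```

Scaling to the top feature keeps the chart readable even when all raw importances are small (a 17-feature RandomForest rarely gives any single feature more than ~20%).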