Tag
#monitoring
13 posts tagged monitoring.
- tooling
Federated Learning in Production: What Substra Actually Does for Privacy-Preserving ML
Owkin's Substra framework keeps training data local while sharing only model weights — but federated architectures break standard MLOps assumptions around
- monitoring
OpenAI Tops Gartner's Coding-Agent Quadrant. Now You Own a Production ML System.
Gartner named OpenAI a Leader in its first Magic Quadrant for Enterprise AI Coding Agents. The operational story is the part the press release skips: a
- monitoring
The ML Monitoring Metrics Taxonomy: Drift, Data Quality, and Model Decay
A reference taxonomy of the signals that actually tell you a production ML system is failing — input drift, prediction drift, concept drift, data quality
- monitoring
OpenTelemetry GenAI Semantic Conventions: Instrument LLM Apps
How the OpenTelemetry GenAI semantic conventions standardize spans, metrics, and events for LLM apps, what they skip, and how to instrument without rework.
- mlops
LLM Benchmarks in 2026: Which Still Discriminate, and How to Run
Static benchmarks like MMLU and HumanEval have saturated for frontier models. Here's which LLM benchmarks still produce signal, why contamination is worse
- monitoring
Watermarking Should Be Treated as a Monitoring Primitive
A new paper reframes LLM watermarking from an adversarial evasion problem into a monitoring infrastructure question.
- monitoring
LLM Testing: A Guide to Evals, Metrics, and Production Monitoring
LLM testing spans offline evals, CI gate checks, and live production monitoring — three distinct jobs that need different tools.
- mlops
LLM Benchmarks Explained: What the Numbers Mean and Miss
A practical guide to the major LLM benchmarks — MMLU, HumanEval, GPQA Diamond, SWE-bench — what they actually test, why saturation makes most scores
- mlops
LLM Fine Tuning in Production: A Practical MLOps Guide
When to use LLM fine tuning over RAG, how LoRA and QLoRA cut GPU costs, and what to monitor after you ship a fine-tuned model — for ML engineers who own
- mlops
Machine Learning Pipeline: Stages, Failure Points, and Monitoring
A practitioner's guide to the machine learning pipeline — from data ingestion to production monitoring — covering common failure points, drift types, and
- mlops
ML Model Deployment: A Guide to Shipping Models That Stay Healthy
ML model deployment fails far more often than it should — typically before the model ever serves traffic. Here's what breaks, which deployment patterns
- mlops
MLOps Best Practices: What Keeps Models Running in Production
A practitioner's guide to MLOps best practices, from CI/CD pipeline automation and model versioning to drift detection and continuous retraining, based
- mlops
MLOps Tools: A Practitioner's Map of the Production Stack
A category-by-category breakdown of MLOps tools — experiment tracking, orchestration, feature stores, serving, and monitoring — with honest tradeoffs for