Tag
#serving
2 posts tagged serving.
- mlops
ML Model Deployment: Serving Frameworks, KV Cache, and the Latency Metrics That Matter
Once a model clears staging, the serving stack decision determines whether you hit your latency SLAs or spend a sprint chasing p99 spikes. Here's what to evaluate and what to instrument.
- infra
Local Coding Assistants Crossed the Quality Bar: Now Observe Them
A practitioner's Reddit report on running Qwen3.6-27B locally signals a real inflection point. But moving off managed cloud APIs shifts monitoring