DEV Community

# mlops

Posts

I Built a Complete AI Infrastructure Stack from Scratch — Here's What I Learned

6 min read
Handling Failure: The Most Important Part of AI Systems

2 min read
Model cards vs pre-registration: what counts as evidence under the EU AI Act

4 min read
QAT vs PTQ on our edge vision model: 6 months of A/B data

4 min read
Structured channel pruning got our detector under 12ms on a Jetson

4 min read
Serving 40 LoRA adapters on one base model: the throughput we got

4 min read
torch.compile recompiled our SDXL UNet 38 times in production

4 min read
Semantic caching the VLM step in our product-photo pipeline

4 min read
AI Observability: Stop Flying Blind in Production

4 min read
LLM-as-judge variance broke our DPO training signal for 3 weeks

4 min read
The bf16 grad accumulator that killed our SDXL LoRA training

4 min read
Token-level eval harness for tool-calling agents: what we wired up

4 min read
Capping VLM spend per CV researcher: hierarchical budgets in practice

4 min read
Part 2: Enterprise Decision Intelligence Architecture: AI Governance, Threshold Policy Engines, and Operational AI Systems

11 min read
Auto-labelling 1.2M robotics frames with VLMs: a failover story

4 min read