Stabilizing Production LLM Prompts in Customer Support
The problem
A fintech company used an LLM pipeline to summarize support tickets and suggest responses for agents. Over time the summarization prompt drifted, producing inconsistent output that agents complained about.
The team had no tooling to measure prompt quality across changes.
The solution
The team adopted promptctl, which introduced regression testing and baseline evaluation:
- Prompt versions tracked alongside application code
- Regression tests executed during deployment
- Prompt changes required passing evaluation thresholds
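The gating step above can be sketched as a minimal evaluation harness. Everything here is illustrative: the function names, the keyword-based scoring, and the 0.9 threshold are assumptions for the sketch, not promptctl's actual API.

```python
# Minimal sketch of an evaluation-threshold gate for prompt changes.
# All names and the scoring rule are hypothetical, not promptctl's API.

def score_summary(summary: str, required_facts: list[str]) -> float:
    """Fraction of required facts mentioned in the summary."""
    if not required_facts:
        return 1.0
    hits = sum(1 for fact in required_facts if fact.lower() in summary.lower())
    return hits / len(required_facts)

def evaluate_prompt_version(outputs: dict[str, str],
                            golden: dict[str, list[str]],
                            threshold: float = 0.9) -> bool:
    """Gate a deploy: pass only if the mean score over test tickets meets the threshold."""
    scores = [score_summary(outputs[tid], facts) for tid, facts in golden.items()]
    return sum(scores) / len(scores) >= threshold

# Example: two golden test tickets; the second summary misses a required fact.
golden = {
    "T-1": ["refund", "duplicate charge"],
    "T-2": ["card declined"],
}
outputs = {
    "T-1": "Customer reports a duplicate charge and requests a refund.",
    "T-2": "Customer's payment failed at checkout.",  # misses "card declined"
}
print(evaluate_prompt_version(outputs, golden))  # → False (mean score 0.5 < 0.9)
```

In a deployment pipeline, a `False` result would block the prompt change from shipping, which is how a regression test like this keeps drift out of production.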
The result
- 41% cost reduction
- $2,400 in monthly savings
- 1.7x faster triage
"The biggest benefit wasn't cost. It was stability."
— Head of AI Platform, fintech company