Routine steps
Classification, extraction, validation, routing, and repair do not always need the most expensive model in the stack.
This is a sanitized example of the shape of the deliverable. The real report changes with the repo, deployment path, and evidence available.
Flev workflows can route routine, structured steps to local or private small models while keeping stronger models available for complex reasoning.
Teams should be able to see which model handled which step, why fallback exists, and who can approve changes.
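As a rough illustration of the routing and visibility described above, the following sketch sends routine step types to a small model, falls back to a stronger model on failure, and records which model handled each step and why. It is not Flev's implementation: the `Router` class, the `StepRecord` fields, and the model labels `local-small` and `hosted-strong` are all hypothetical names chosen for this example, and the two models are stand-in callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Step types the text names as routine. Anything else is treated as complex.
ROUTINE_STEPS = {"classification", "extraction", "validation", "routing", "repair"}


@dataclass
class StepRecord:
    """One audit entry: which model handled which step, and why."""
    step: str
    model: str
    reason: str


@dataclass
class Router:
    """Hypothetical router; the model callables and labels are assumptions."""
    small_model: Callable[[str], str]
    strong_model: Callable[[str], str]
    small_name: str = "local-small"
    strong_name: str = "hosted-strong"
    audit_log: List[StepRecord] = field(default_factory=list)

    def run(self, step: str, payload: str) -> str:
        if step in ROUTINE_STEPS:
            try:
                result = self.small_model(payload)
                self.audit_log.append(
                    StepRecord(step, self.small_name, "routine step")
                )
                return result
            except Exception as exc:  # broad on purpose: any failure triggers fallback
                result = self.strong_model(payload)
                self.audit_log.append(
                    StepRecord(step, self.strong_name, f"fallback after error: {exc}")
                )
                return result
        # Non-routine steps go straight to the stronger model.
        result = self.strong_model(payload)
        self.audit_log.append(
            StepRecord(step, self.strong_name, "complex reasoning step")
        )
        return result
```

Keeping the fallback reason in the same record as the model name means a reviewer can read one log to answer both "which model ran this step" and "why was the stronger model used". Approval of routing changes is outside what this sketch covers.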
Better Call evidence shows tool-call accuracy improving from 73.4% to 83.8%, a gain of 10.4 percentage points, on 3,625 BFCL v4 cases run with granite4.1:3b.
A release pipeline fails after a dependency and Docker build change. The team needs to know whether to retry, patch, rollback, or change the runbook.
The failure occurs after dependency install succeeds and before the image is pushed. The failing step is the Docker build stage.
The build context no longer includes the generated runtime artifact expected by the Dockerfile.
The last successful artifact list and the exact Dockerfile diff should be checked before approving a patch.
Add an explicit build-artifact check before Docker build, then update the runbook with the verification command.
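One way the recommended build-artifact check might look is a small script that runs before the Docker build and fails the pipeline when an expected file is absent from the build context. This is a minimal sketch under assumptions: the function names are hypothetical, and the build-context directory and artifact paths are supplied by the caller because the real paths are not given here.

```python
import sys
from pathlib import Path
from typing import Iterable, List


def missing_artifacts(context_dir: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are not files in the build context."""
    return [rel for rel in expected if not (context_dir / rel).is_file()]


def main(argv: List[str]) -> int:
    """Exit-code style entry point: 0 = all present, 1 = missing, 2 = bad usage.

    argv[1] is the build-context directory; argv[2:] are artifact paths
    relative to it (hypothetical calling convention for this sketch).
    """
    if len(argv) < 3:
        print("usage: <script> <build-context-dir> <artifact> [...]", file=sys.stderr)
        return 2
    context_dir = Path(argv[1])
    missing = missing_artifacts(context_dir, argv[2:])
    for rel in missing:
        print(f"missing build artifact: {context_dir / rel}", file=sys.stderr)
    return 1 if missing else 0
```

A CI step could call `sys.exit(main(sys.argv))` just before the image build, so that a changed output directory surfaces as a clear "missing build artifact" message instead of a failure partway through the Docker build. The same command is a candidate for the verification step in the runbook.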
| Source | What it showed | Decision impact |
|---|---|---|
| CI step log | Install completed, Docker build failed on missing runtime artifact. | Rules out package install as the primary failure. |
| Repository diff | Build script changed output directory without matching Dockerfile update. | Supports a patch plan instead of blind retry. |
| Previous successful run | Artifact existed at the old path before image build. | Shows the runbook should verify artifact location. |
Inspect CI logs, diffs, run history, Kubernetes events, and existing runbooks.
Open a PR, push a patch, publish a package, deploy, rollback, or mutate cluster state.
Production deploys, credential changes, destructive cluster commands, or customer-visible communication.
Use this checklist before the first call. The goal is not a perfect brief; the goal is enough context to decide whether a 7-day diagnostic sprint can produce a useful result.