Example: Production Deployment¶
Deploy an agent to production with monitoring and scaling.
Checklist¶
- [ ] Model accuracy meets SLA
- [ ] Data preprocessing tested
- [ ] Error handling implemented
- [ ] Logging configured
- [ ] Monitoring set up
- [ ] Auto-scaling configured
- [ ] Documentation complete
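The first checklist item can be turned into an automated gate that runs before each release. The following is a minimal sketch, assuming a classification model scored by plain accuracy. The function names and the 0.95 threshold are hypothetical placeholders, not values defined by this guide.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def meets_sla(y_true, y_pred, threshold=0.95):
    """Return True when held-out accuracy reaches the SLA threshold.

    The 0.95 default is an illustrative placeholder.
    """
    return accuracy(y_true, y_pred) >= threshold


labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
preds = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]  # one mistake out of ten
print(accuracy(labels, preds))          # 0.9
print(meets_sla(labels, preds, 0.9))    # True
print(meets_sla(labels, preds))         # False with the 0.95 default
```

A CI job could call `meets_sla` on a held-out set and fail the build when it returns `False`.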
Docker Deployment¶
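# Slim base image keeps the final image small
FROM python:3.9-slim
WORKDIR /app
# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY models/ models/
COPY config.yaml .
COPY app.py .
# The app must listen on 0.0.0.0:5000 to be reachable through -p 5000:5000
EXPOSE 5000
CMD ["python", "app.py"]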
Build and run:
docker build -t my-agent:v1.0 .
docker run -p 5000:5000 my-agent:v1.0
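Once the container is running, a quick smoke test confirms that the published port actually serves predictions. The sketch below uses only the standard library and assumes the service exposes the `/predict` endpoint shown later in this guide. The helper names and the payload shape are hypothetical.

```python
import json
import urllib.request


def build_predict_request(base_url, payload):
    """Build a JSON POST request for the /predict endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(base_url, payload, timeout=5.0):
    """Send one prediction request and return (status, decoded JSON body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError when the container is unreachable.
    """
    request = build_predict_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))
```

With the container from the previous step running, `smoke_test("http://localhost:5000", {"features": [1.0, 2.0]})` should return a 200 status if the model accepts that (assumed) input format.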
Kubernetes¶
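apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-agent
  template:
    metadata:
      labels:
        app: my-agent
    spec:
      containers:
        - name: agent
          image: my-agent:v1.0
          ports:
            - containerPort: 5000
          # A CPU request is needed for the HPA's Utilization target;
          # 250m is a placeholder, size it from observed usage.
          resources:
            requests:
              cpu: 250m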
Deploy:
kubectl apply -f deployment.yaml
Monitoring & Logging¶
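import logging

from flask import Flask, jsonify, request

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

app = Flask(__name__)
# `model` is assumed to be loaded at startup, for example from models/


@app.route('/predict', methods=['POST'])
def predict():
    logger.info("Prediction request received")
    try:
        result = model.predict(request.json)
        logger.info("Prediction: %s", result)
        # result must be JSON-serializable (convert arrays to lists first)
        return jsonify(result)
    except Exception:
        # logger.exception records the full traceback, not just the message
        logger.exception("Prediction failed")
        return jsonify({"error": "prediction failed"}), 500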
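The handler above logs that a request arrived and whether it failed, but not how long it took. A small decorator can add per-call latency to the logs. This is a sketch using only the standard library. `log_latency` and `fake_predict` are hypothetical names, and `fake_predict` stands in for the real model call.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def log_latency(func):
    """Log how long each call takes, whether it succeeds or raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.1f ms", func.__name__, elapsed_ms)
    return wrapper


@log_latency
def fake_predict(features):
    # Stand-in for model.predict; it only sums its inputs.
    return sum(features)


logging.basicConfig(level=logging.INFO)
print(fake_predict([1, 2, 3]))  # 6
```

When applied to a Flask view, `@log_latency` would go below `@app.route(...)` so that the route registers the timed wrapper rather than the bare function.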
Auto-Scaling¶
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agent-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
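To see what this manifest does under load, it helps to work through the replica calculation. The Horizontal Pod Autoscaler scales by the ratio of observed to target utilization and rounds up. The sketch below approximates that rule with the values from the manifest as defaults. It is a simplification: it includes the default 10% tolerance band but ignores stabilization windows and pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas, current_utilization,
                     target_utilization=70, min_replicas=2,
                     max_replicas=10, tolerance=0.1):
    """Approximate the HPA's replica calculation for one resource metric."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the controller leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * current_utilization
                            / target_utilization)
    # minReplicas and maxReplicas bound the result.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 90))    # 4: ceil(3 * 90 / 70) = ceil(3.86)
print(desired_replicas(3, 72))    # 3: ratio 1.03 is inside the tolerance
print(desired_replicas(2, 35))    # 2: ceil(1.0) = 1, raised to minReplicas
print(desired_replicas(10, 140))  # 10: ceil(20.0) = 20, capped at maxReplicas
```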
Monitoring¶
# Set up Prometheus
iovalence monitor --model deployed-model --export prometheus
Check metrics:
- Request rate
- Latency (p50, p95, p99)
- Error rate
- Model accuracy drift
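The first three metrics can be derived from raw request records. The sketch below computes them for one observation window using nearest-rank percentiles. The function names and the synthetic data are illustrative assumptions, and accuracy drift is left out because it needs labeled feedback that request logs alone do not provide.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, error_count, window_seconds):
    """Summarize one observation window of request data.

    latencies_ms holds one entry per request, including failed ones.
    """
    total = len(latencies_ms)
    return {
        "request_rate": total / window_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "error_rate": error_count / total,
    }


# Synthetic data: 100 requests in 10 seconds with latencies 1..100 ms.
stats = summarize(list(range(1, 101)), error_count=2, window_seconds=10)
print(stats)
```

For this synthetic window the request rate is 10 per second, the percentiles are 50, 95, and 99 ms, and the error rate is 0.02.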
Rollback Strategy¶
# Quick rollback to a specific known-good version
iovalence rollback --model my-agent --version v1.0
# Or using Kubernetes
kubectl rollout undo deployment/my-agent
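Deciding when to run these commands is usually the harder part. One simple option is to compare the new release's error rate against the previous baseline. The sketch below shows such a check. Both thresholds are illustrative placeholders, and the function names are hypothetical.

```python
def should_roll_back(new_error_rate, baseline_error_rate,
                     max_absolute=0.05, max_relative=2.0):
    """Decide whether a new release looks unhealthy enough to roll back.

    Both thresholds are illustrative placeholders:
    - max_absolute: roll back if the new error rate exceeds this outright.
    - max_relative: roll back if it exceeds the baseline by this factor.
    """
    if new_error_rate > max_absolute:
        return True
    if (baseline_error_rate > 0
            and new_error_rate > baseline_error_rate * max_relative):
        return True
    return False


def rollback_command(deployment="my-agent"):
    """Return the kubectl rollback invocation as an argv list."""
    return ["kubectl", "rollout", "undo", f"deployment/{deployment}"]


print(should_roll_back(0.08, 0.01))   # True: above the 5% absolute limit
print(should_roll_back(0.03, 0.01))   # True: three times the baseline
print(should_roll_back(0.015, 0.01))  # False: within both limits
print(rollback_command())
```

The list returned by `rollback_command` can be passed to `subprocess.run(..., check=True)` on a machine where `kubectl` is configured for the cluster.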
Best Practices¶
- Health checks: monitor the model continuously
- Versioning: track every model version
- Gradual rollout: use canary deployments
- Metrics: track accuracy in production
- Fallback: keep a backup model ready
- Documentation: document the deployment process
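The fallback practice can be as simple as a wrapper that tries the primary model and serves a backup's answer when the primary raises. This is a minimal sketch, assuming both models expose a `predict` method. `FallbackPredictor`, `BrokenModel`, and `MeanModel` are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class FallbackPredictor:
    """Try the primary model first and fall back to a backup on failure."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def predict(self, features):
        try:
            return self.primary.predict(features)
        except Exception:
            # Record the failure, then serve the backup's answer instead.
            logger.exception("Primary model failed; using backup")
            return self.backup.predict(features)


class BrokenModel:
    """Stand-in for a primary model that fails at inference time."""

    def predict(self, features):
        raise RuntimeError("model file missing")


class MeanModel:
    """Trivial backup that returns the mean of its inputs."""

    def predict(self, features):
        return sum(features) / len(features)


predictor = FallbackPredictor(primary=BrokenModel(), backup=MeanModel())
print(predictor.predict([2.0, 4.0]))  # 3.0
```

Logging the primary's failure matters here: a silent fallback would hide the outage from the error-rate metric.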