Example: Production Deployment¶

This example walks through deploying an agent to production with Docker and Kubernetes, then adding logging, monitoring, auto-scaling, and a rollback path.

Checklist¶

[ ] Model accuracy meets SLA
[ ] Data preprocessing tested
[ ] Error handling implemented
[ ] Logging configured
[ ] Monitoring set up
[ ] Auto-scaling configured
[ ] Documentation complete
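
The first checklist item can be turned into an automated gate that runs before a release. The following is a minimal sketch, assuming accuracy is measured on a held-out labelled set; the function names and the 0.95 threshold are illustrative and should be replaced with the figure from your own SLA.

```python
# Hypothetical pre-deployment gate. The names and the 0.95 default are
# illustrative assumptions, not values defined by this guide.
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the expected labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def meets_sla(y_true, y_pred, min_accuracy=0.95):
    """Return True when held-out accuracy reaches the agreed threshold."""
    return accuracy(y_true, y_pred) >= min_accuracy
```

A CI job could call `meets_sla` and fail the build when it returns `False`, so an under-performing model never reaches the image build step.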

Docker Deployment¶

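FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY models/ models/
COPY config.yaml .
COPY app.py .

# The app listens on port 5000 (published by the docker run command below)
EXPOSE 5000

CMD ["python", "app.py"]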

Build and run:

docker build -t my-agent:v1.0 .
docker run -p 5000:5000 my-agent:v1.0
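
Once the container is running, a quick smoke test confirms that the published port actually answers requests. The sketch below uses only the standard library and assumes the app serves the `/predict` route on port 5000; the payload shape is hypothetical, since the expected input depends on your model.

```python
import json
import urllib.request


def build_predict_request(payload, base_url="http://localhost:5000"):
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(payload, base_url="http://localhost:5000", timeout=5.0):
    """Return True if the container answers /predict with HTTP 200."""
    req = build_predict_request(payload, base_url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib.error.URLError is a subclass of OSError).
        return False
```

For example, `smoke_test({"features": [1, 2, 3]})` returns `False` while the container is still starting and `True` once the endpoint responds successfully.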

Kubernetes¶

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-agent
  template:
    metadata:
      labels:
        app: my-agent
    spec:
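      containers:
      - name: agent
        image: my-agent:v1.0
        ports:
        - containerPort: 5000
        # A CPU request is required for the CPU-utilization autoscaler in the
        # Auto-Scaling section; these values are placeholders to tune.
        resources:
          requests:
            cpu: 250m
            memory: 256Mi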

Deploy:

kubectl apply -f deployment.yaml

Logging¶

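import logging

from flask import Flask, jsonify, request

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)
app = Flask(__name__)

# `model` is assumed to be loaded at startup (for example from models/) and
# to expose a predict() method that returns a JSON-serializable value.

@app.route('/predict', methods=['POST'])
def predict():
    logger.info("Prediction request received")
    try:
        result = model.predict(request.get_json())
        logger.info("Prediction: %s", result)
        return jsonify(result)
    except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("Prediction failed")
        return jsonify({"error": "prediction failed"}), 500

if __name__ == '__main__':
    # Bind to all interfaces so the port published by Docker is reachable
    app.run(host='0.0.0.0', port=5000)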
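
Plain-text log lines are easy to read but harder for a log aggregator to parse. As an optional variation on the configuration above, the sketch below emits each record as one JSON object using only the standard `logging` module; the class name, function name, and field names are illustrative choices, not part of any required schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback when the record carries an exception
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_json_logging(level=logging.INFO):
    """Replace the root logger's handlers with one JSON stream handler."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

Calling `configure_json_logging()` once at startup, in place of `logging.basicConfig(...)`, leaves the rest of the request-handling code unchanged.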

Auto-Scaling¶

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agent-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
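
To get a feel for how this configuration behaves, it helps to work through the arithmetic the autoscaler applies: the desired replica count is roughly `ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to the min/max bounds. The function below is a simplified sketch of that rule for a single metric. It assumes the controller's default tolerance of 0.1 and ignores details such as pod readiness and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_utilization,
                     target_utilization=70, min_replicas=2,
                     max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_utilization / target_utilization
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, desired))
```

With the manifest's values, three pods averaging 90% CPU scale out to four, while three pods at 72% stay at three because the ratio falls inside the tolerance band.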

Monitoring¶

# Set up Prometheus
iovalence monitor --model deployed-model --export prometheus

Check metrics:

- Request rate
- Latency (p50, p95, p99)
- Error rate
- Model accuracy drift
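
The latency percentiles and error rate listed above are normally computed by the monitoring backend, but the arithmetic is simple enough to check by hand. This sketch uses the nearest-rank method on raw samples and counts 5xx responses as errors; both choices, and the function names, are assumptions for illustration rather than how any particular exporter works.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(latencies_ms, status_codes):
    """Latency percentiles plus the share of 5xx responses."""
    errors = sum(1 for code in status_codes if code >= 500)
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
        "error_rate": errors / len(status_codes),
    }
```

Comparing p50 with p99 is a quick way to spot tail latency: a healthy median can hide a slow minority of requests.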

Rollback Strategy¶

# Roll back to a known-good version (v1.0 in this example)
iovalence rollback --model my-agent --version v1.0

# Or roll the Deployment back to its previous revision with Kubernetes
kubectl rollout undo deployment/my-agent

Best Practices¶

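Health checks - Expose a health endpoint and probe the model continuously
Versioning - Tag every model and image version (e.g. v1.0) so rollbacks are exact
Gradual rollout - Use canary deployments to expose a new version to a small share of traffic first
Metrics - Track accuracy in production and alert on drift
Fallback - Keep a backup model ready to serve if the primary fails
Documentation - Document the deployment and rollback process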
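
The fallback practice above can be implemented as a thin wrapper around two models. The sketch below is one possible shape, assuming both objects expose a `predict()` method; the `FallbackPredictor` name is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


class FallbackPredictor:
    """Try the primary model first; use the backup if it raises."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def predict(self, features):
        try:
            return self.primary.predict(features)
        except Exception:
            # Log the failure with its traceback, then degrade gracefully
            logger.exception("Primary model failed; using backup model")
            return self.backup.predict(features)
```

Because the wrapper exposes the same `predict()` interface, it can stand in wherever a single model object is used. If the backup also fails, its exception propagates to the caller, so the surrounding error handling still applies.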
