Example: Production Deployment

Deploy an agent to production with monitoring and scaling.

Checklist

  • [ ] Model accuracy meets SLA
  • [ ] Data preprocessing tested
  • [ ] Error handling implemented
  • [ ] Logging configured
  • [ ] Monitoring set up
  • [ ] Auto-scaling configured
  • [ ] Documentation complete

Docker Deployment

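FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY models/ models/
COPY config.yaml .
COPY app.py .

# The app is expected to listen on port 5000 (matches the run command below)
EXPOSE 5000

CMD ["python", "app.py"]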

Build and run:

docker build -t my-agent:v1.0 .
docker run -p 5000:5000 my-agent:v1.0
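
A started container is not necessarily ready to serve traffic. The following is a minimal sketch of a readiness poll that uses only the Python standard library. It assumes the app exposes a `/health` endpoint that answers with HTTP 200, which the original example does not define; the helper name `wait_until_healthy` is also hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers with HTTP 200, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2.0) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, reset, or a non-2xx answer: not ready yet.
            pass
        time.sleep(interval)
    return False


# Example (assumes a hypothetical /health route and the port mapping shown above):
# wait_until_healthy("http://127.0.0.1:5000/health", timeout=10.0)
```

Polling an HTTP endpoint rather than the raw TCP port matters with Docker, because a published port can accept connections before the application inside the container is listening.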

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-agent
  template:
    metadata:
      labels:
        app: my-agent
    spec:
      containers:
      - name: agent
        image: my-agent:v1.0
        ports:
        - containerPort: 5000

Deploy:

kubectl apply -f deployment.yaml

Monitoring & Logging

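import logging

from flask import Flask, jsonify, request

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)

app = Flask(__name__)
# `model` is assumed to be loaded at startup (for example from models/).

@app.route('/predict', methods=['POST'])
def predict():
    logger.info("Prediction request received")
    try:
        result = model.predict(request.json)
        logger.info("Prediction: %s", result)
        # Assumes `result` is JSON-serializable (dict, list, number, or string).
        return jsonify(result)
    except Exception:
        # logger.exception records the full traceback along with the message.
        logger.exception("Prediction failed")
        return jsonify({"error": "prediction failed"}), 500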
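
Plain-text log lines become hard to query once several replicas write to the same log aggregator. One option is to emit each record as a single JSON object. The sketch below uses only the standard `logging` module; the `JsonFormatter` class, the `configure_json_logging` helper, and the field names are illustrative choices, not part of the original example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback so failures stay debuggable after aggregation.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_json_logging(level: int = logging.INFO) -> None:
    """Replace the root logger's handlers with one JSON stream handler."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    # force=True (Python 3.8+) removes handlers installed by earlier calls.
    logging.basicConfig(level=level, handlers=[handler], force=True)
```

Calling `configure_json_logging()` once at startup, in place of the `logging.basicConfig(...)` call above, would leave the `logger.info(...)` calls in the route unchanged.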

Auto-Scaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agent-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
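
To see what this manifest implies, it helps to work through the scaling rule that Kubernetes documents for the HorizontalPodAutoscaler: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0 (0.1 by default) and clamped to the configured bounds. The sketch below is a simplified model of that rule only; the real controller also applies stabilization windows and handles unready pods and missing metrics. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float = 70.0,
                     min_replicas: int = 2, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA rule using the values from the manifest above."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(3, 90.0))   # 3 * 90/70 = 3.86, rounded up to 4
print(desired_replicas(3, 35.0))   # 3 * 0.5 = 1.5 -> 2, which is also the minimum
print(desired_replicas(8, 140.0))  # 8 * 2.0 = 16, capped at maxReplicas = 10
```

With three replicas averaging 90% CPU, the rule asks for a fourth pod; at 72% the ratio falls inside the tolerance band and nothing changes.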

Monitoring

# Set up Prometheus
iovalence monitor --model deployed-model --export prometheus

Check metrics:

  • Request rate
  • Latency (p50, p95, p99)
  • Error rate
  • Model accuracy drift
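
As an illustration of how the latency percentiles and error rate above can be derived from raw request samples, here is a small sketch using the nearest-rank percentile definition. It assumes latencies are recorded in milliseconds and that an "error" means an HTTP 5xx status; both are assumptions, and a Prometheus setup would normally compute these from histograms instead. The function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    # Multiply before dividing to avoid floating-point error in the rank.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float], status_codes: Sequence[int]) -> Dict[str, float]:
    """Summarize one window of requests into the metrics listed above."""
    errors = sum(1 for code in status_codes if code >= 500)
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "error_rate": errors / len(status_codes),
    }
```

For 100 samples with latencies 1 ms through 100 ms and two 5xx responses, `summarize` reports p50 = 50, p95 = 95, p99 = 99, and an error rate of 0.02.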

Rollback Strategy

# Quick rollback to previous version
iovalence rollback --model my-agent --version v1.0

# Or using Kubernetes
kubectl rollout undo deployment/my-agent
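
A rollback is easier to trigger consistently when the decision is written down as a rule. The sketch below is one possible rule, assuming an error rate is already being measured: roll back when it rises more than a fixed margin above the pre-deployment baseline. The function names and the 0.05 margin are hypothetical; `roll_back` only wraps the `kubectl rollout undo` command shown above.

```python
import subprocess


def should_roll_back(error_rate: float, baseline_error_rate: float,
                     max_increase: float = 0.05) -> bool:
    """True when the error rate has risen more than `max_increase` above the baseline."""
    return error_rate - baseline_error_rate > max_increase


def roll_back(deployment: str = "my-agent") -> None:
    """Run `kubectl rollout undo` for the deployment; raises if kubectl fails."""
    subprocess.run(
        ["kubectl", "rollout", "undo", f"deployment/{deployment}"],
        check=True,
    )


# Example: a jump from 1% to 10% errors exceeds the 5-point margin.
print(should_roll_back(error_rate=0.10, baseline_error_rate=0.01))  # True
```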

Best Practices

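  1. Health checks - Probe the running model continuously so unhealthy instances are detected early
  2. Versioning - Track every model version so any release can be identified and restored
  3. Gradual rollout - Use canary deployments to expose a new version to a small share of traffic first
  4. Metrics - Track accuracy in production, not only at training time
  5. Fallback - Keep a backup model ready to serve if the primary one fails
  6. Documentation - Document the deployment process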
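
Two of the practices above, gradual rollout and fallback, can be sketched in a few lines of application code. This is only an illustration under stated assumptions: models are treated as plain callables, canary routing is decided per request ID by a stable hash, and the names `predict_with_fallback` and `route_to_canary` are hypothetical. In a Kubernetes setup, traffic splitting is more commonly handled by the ingress or service mesh than by the app itself.

```python
import hashlib
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def predict_with_fallback(primary: Callable[[Any], Any],
                          backup: Callable[[Any], Any],
                          payload: Any) -> Any:
    """Try the primary model; if it raises, log the failure and use the backup."""
    try:
        return primary(payload)
    except Exception:
        logger.exception("Primary model failed; using backup model")
        return backup(payload)


def route_to_canary(request_id: str, canary_percent: int) -> bool:
    """Send a stable `canary_percent` share of request IDs to the canary version."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < canary_percent
```

Hashing the request ID, rather than choosing at random, keeps a given ID on the same version for the whole rollout, which makes canary results easier to compare.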
