Deploying Multi-Agent Systems to Azure AI Foundry (Azure AI Agent Service Part 5)

  • MyrinNew
    Senior Member
    • Feb 2024
    • 5175


    Deploying Multi-Agent Systems to Azure AI Foundry

    Part 5 of 5: From Development to Production

    Welcome to the finale of our Azure AI Agent Service series! We've covered a lot of ground—from foundational concepts to building your first agent, advanced patterns, and multi-agent orchestration. Now it's time to take everything we've built and deploy it to production.


    In this final installment, we'll walk through deploying multi-agent systems to Azure AI Foundry, covering infrastructure setup, CI/CD pipelines, scaling strategies, and monitoring. By the end, you'll have a complete production-ready deployment pipeline.





    Understanding Azure AI Foundry

    Azure AI Foundry (formerly Azure AI Studio) is Microsoft's unified platform for building, deploying, and managing AI applications. Think of it as the production home for your agents—providing the infrastructure, security, and observability you need for enterprise deployments.


    Key Components for Agent Deployments

    Component | Purpose
    AI Hub | Central governance and resource sharing
    AI Project | Isolated workspace for your agent application
    Connections | Secure credentials for Azure OpenAI, storage, etc.
    Deployments | Model endpoints your agents consume
    Prompt Flow | Orchestration runtime for complex agent flows


    Why Azure AI Foundry for Agents?

    1. Integrated Security - Managed identities, key vault integration, and VNET support
    2. Unified Monitoring - Built-in tracing with Azure Monitor and Application Insights
    3. Model Management - Version control and A/B testing for model deployments
    4. Cost Controls - Budgets, quotas, and consumption tracking
    5. Compliance - Enterprise-grade data residency and privacy controls





    Infrastructure as Code: Setting Up with Bicep

    Let's define our production infrastructure. We'll use Bicep (Azure's declarative IaC language) to create a repeatable, version-controlled deployment.


    The Complete Infrastructure Template





    // main.bicep - Azure AI Foundry Infrastructure for Multi-Agent System

    @description('Environment name (dev, staging, prod)')
    param environment string = 'prod'

    @description('Azure region for deployment')
    param location string = resourceGroup().location

    @description('Base name for all resources')
    param baseName string = 'aiagents'

    // Variables
    var uniqueSuffix = uniqueString(resourceGroup().id) // always 13 characters
    var hubName = '${baseName}-hub-${environment}'
    var projectName = '${baseName}-project-${environment}'
    var openAiName = '${baseName}-openai-${uniqueSuffix}'
    // Storage account names allow 3-24 lowercase letters and digits, so keep the prefix short
    var storageAccountName = take(toLower('${baseName}st${uniqueSuffix}'), 24)
    var appInsightsName = '${baseName}-insights-${environment}'
    // Key Vault names are capped at 24 characters
    var keyVaultName = take('${baseName}-kv-${uniqueSuffix}', 24)

    // Log Analytics Workspace (required for App Insights)
    resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
      name: '${baseName}-logs-${environment}'
      location: location
      properties: {
        sku: {
          name: 'PerGB2018'
        }
        retentionInDays: 30
      }
    }

    // Application Insights for monitoring
    resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
      name: appInsightsName
      location: location
      kind: 'web'
      properties: {
        Application_Type: 'web'
        WorkspaceResourceId: logAnalytics.id
        publicNetworkAccessForIngestion: 'Enabled'
        publicNetworkAccessForQuery: 'Enabled'
      }
    }

    // Key Vault for secrets
    resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
      name: keyVaultName
      location: location
      properties: {
        sku: {
          family: 'A'
          name: 'standard'
        }
        tenantId: subscription().tenantId
        enableRbacAuthorization: true
        enableSoftDelete: true
        softDeleteRetentionInDays: 7
      }
    }

    // Storage Account for agent artifacts
    resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
      name: storageAccountName
      location: location
      sku: {
        name: 'Standard_LRS'
      }
      kind: 'StorageV2'
      properties: {
        minimumTlsVersion: 'TLS1_2'
        supportsHttpsTrafficOnly: true
      }
    }

    // Azure OpenAI Service
    resource openAi 'Microsoft.CognitiveServices/accounts@2023-10-01-preview' = {
      name: openAiName
      location: location
      kind: 'OpenAI'
      sku: {
        name: 'S0'
      }
      properties: {
        customSubDomainName: openAiName
        publicNetworkAccess: 'Enabled'
      }
    }

    // GPT-4o Deployment for agents
    resource gpt4Deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-10-01-preview' = {
      parent: openAi
      name: 'gpt-4o'
      sku: {
        name: 'Standard'
        capacity: 30 // TPM in thousands
      }
      properties: {
        model: {
          format: 'OpenAI'
          name: 'gpt-4o'
          version: '2024-08-06'
        }
      }
    }

    // AI Hub - Central governance
    resource aiHub 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
      name: hubName
      location: location
      kind: 'Hub'
      identity: {
        type: 'SystemAssigned'
      }
      properties: {
        friendlyName: 'AI Agents Hub (${environment})'
        storageAccount: storageAccount.id
        keyVault: keyVault.id
        applicationInsights: appInsights.id
        publicNetworkAccess: 'Enabled'
      }
    }

    // AI Project - Your agent workspace
    resource aiProject 'Microsoft.MachineLearningServices/workspaces@2024-04-01' = {
      name: projectName
      location: location
      kind: 'Project'
      identity: {
        type: 'SystemAssigned'
      }
      properties: {
        friendlyName: 'Multi-Agent System (${environment})'
        hubResourceId: aiHub.id
        publicNetworkAccess: 'Enabled'
      }
    }

    // Connection to Azure OpenAI
    resource openAiConnection 'Microsoft.MachineLearningServices/workspaces/connections@2024-04-01' = {
      parent: aiHub
      name: 'azure-openai-connection'
      properties: {
        category: 'AzureOpenAI'
        target: openAi.properties.endpoint
        authType: 'AAD'
        metadata: {
          ApiType: 'Azure'
          ResourceId: openAi.id
        }
      }
    }

    // Outputs for CI/CD
    output projectId string = aiProject.id
    output projectName string = aiProject.name
    output openAiEndpoint string = openAi.properties.endpoint
    output appInsightsConnectionString string = appInsights.properties.ConnectionString
    output storageAccountName string = storageAccount.name
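Several of the names in this template are derived from `baseName` plus the 13-character `uniqueString()` suffix, and Azure enforces length limits on them: storage account names must be 3-24 lowercase letters and digits, and Key Vault names must be 3-24 characters. The sketch below is one way to check candidate names before running the deployment. The helper names are hypothetical (they are not part of the Azure CLI), and the Key Vault check only covers length, not the other Key Vault naming rules.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of the Azure CLI.

# Storage account names: 3-24 characters, lowercase letters and digits only.
valid_storage_name() {
  local name=$1
  if [ "${#name}" -lt 3 ] || [ "${#name}" -gt 24 ]; then
    return 1
  fi
  case $name in
    *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
  esac
  return 0
}

# Key Vault names: 3-24 characters. Only the length is checked in this sketch.
valid_keyvault_name() {
  local name=$1
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 24 ]
}

# A 13-character placeholder stands in for the uniqueString() output.
suffix="abcdefghijklm"
valid_storage_name "myagentsst${suffix}" && echo "storage name OK"
valid_keyvault_name "myagents-kv-${suffix}" || echo "key vault name too long"
```

With `baseName=myagents`, the storage candidate is 23 characters and passes, while the Key Vault candidate is 25 characters and is flagged.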







    Deploying the Infrastructure





    # Create resource group
    az group create --name rg-aiagents-prod --location eastus2

    # Deploy infrastructure (the deployment is named explicitly so it can be queried below)
    az deployment group create \
      --resource-group rg-aiagents-prod \
      --name main \
      --template-file main.bicep \
      --parameters environment=prod baseName=myagents

    # Get outputs for application configuration
    az deployment group show \
      --resource-group rg-aiagents-prod \
      --name main \
      --query properties.outputs










    Deploying Agents to Production

    With infrastructure in place, let's deploy our multi-agent system. We'll create a containerized application that hosts our agents.


    Production Agent Host





    // Program.cs - Production Agent Host
    using Azure.AI.Projects;
    using Azure.Identity;
    using Azure.Monitor.OpenTelemetry.AspNetCore;
    using Microsoft.AspNetCore.Diagnostics.HealthChecks;
    using System.Text.Json;

    var builder = WebApplication.CreateBuilder(args);

    // Configure Azure Monitor for distributed tracing
    builder.Services.AddOpenTelemetry()
        .UseAzureMonitor(options =>
        {
            options.ConnectionString = builder.Configuration["ApplicationInsights:ConnectionString"];
        });

    // Register AI Project client
    builder.Services.AddSingleton(sp =>
    {
        var connectionString = builder.Configuration["AzureAI:ProjectConnectionString"];
        return new AIProjectClient(connectionString, new DefaultAzureCredential());
    });

    // Register our agent orchestrator
    builder.Services.AddSingleton<MultiAgentOrchestrator>();
    builder.Services.AddHostedService<AgentInitializationService>();

    // Health checks, tagged "ready" so the readiness endpoint below actually runs them
    builder.Services.AddHealthChecks()
        .AddCheck<AgentHealthCheck>("agents", tags: new[] { "ready" })
        .AddCheck<OpenAIHealthCheck>("openai", tags: new[] { "ready" });

    var app = builder.Build();

    // Health endpoints for Kubernetes/Container Apps
    app.MapHealthChecks("/health/live", new HealthCheckOptions
    {
        Predicate = _ => false // Runs no checks; only confirms the app is responding
    });

    app.MapHealthChecks("/health/ready", new HealthCheckOptions
    {
        Predicate = check => check.Tags.Contains("ready")
    });

    // Agent API endpoints
    app.MapPost("/api/orchestrate", async (
        OrchestrationRequest request,
        MultiAgentOrchestrator orchestrator,
        CancellationToken ct) =>
    {
        var result = await orchestrator.ExecuteAsync(request.Task, request.Context, ct);
        return Results.Ok(result);
    });

    app.MapGet("/api/agents", (MultiAgentOrchestrator orchestrator) =>
    {
        return Results.Ok(orchestrator.GetAgentStatus());
    });

    app.Run();

    // Request model
    public record OrchestrationRequest(string Task, Dictionary<string, object>? Context = null);







    Agent Initialization Service





    // Services/AgentInitializationService.cs
    using Azure.AI.Projects;

    public class AgentInitializationService : IHostedService
    {
        private readonly AIProjectClient _client;
        private readonly MultiAgentOrchestrator _orchestrator;
        private readonly ILogger<AgentInitializationService> _logger;
        private readonly IConfiguration _configuration;

        public AgentInitializationService(
            AIProjectClient client,
            MultiAgentOrchestrator orchestrator,
            ILogger<AgentInitializationService> logger,
            IConfiguration configuration)
        {
            _client = client;
            _orchestrator = orchestrator;
            _logger = logger;
            _configuration = configuration;
        }

        public async Task StartAsync(CancellationToken cancellationToken)
        {
            _logger.LogInformation("Initializing production agents...");

            var agentsClient = _client.GetAgentsClient();

            // Load agent configurations from settings
            var agentConfigs = _configuration
                .GetSection("Agents")
                .Get<List<AgentConfiguration>>() ?? new();

            foreach (var config in agentConfigs)
            {
                try
                {
                    var agent = await GetOrCreateAgentAsync(agentsClient, config, cancellationToken);
                    _orchestrator.RegisterAgent(config.Role, agent);
                    _logger.LogInformation("Registered agent: {Role} ({Id})", config.Role, agent.Id);
                }
                catch (Exception ex)
                {
                    _logger.LogError(ex, "Failed to initialize agent: {Role}", config.Role);
                    throw; // Fail fast in production
                }
            }

            _logger.LogInformation("All {Count} agents initialized successfully", agentConfigs.Count);
        }

        private async Task<Agent> GetOrCreateAgentAsync(
            AgentsClient client,
            AgentConfiguration config,
            CancellationToken ct)
        {
            // Try to find existing agent by name (idempotent deployments)
            await foreach (var agent in client.GetAgentsAsync(cancellationToken: ct))
            {
                if (agent.Name == config.Name)
                {
                    _logger.LogInformation("Found existing agent: {Name}", config.Name);

                    // Update if instructions changed
                    if (agent.Instructions != config.Instructions)
                    {
                        return await client.UpdateAgentAsync(
                            agent.Id,
                            instructions: config.Instructions,
                            cancellationToken: ct);
                    }
                    return agent;
                }
            }

            // Create new agent
            _logger.LogInformation("Creating new agent: {Name}", config.Name);
            return await client.CreateAgentAsync(
                model: config.Model,
                name: config.Name,
                instructions: config.Instructions,
                tools: config.GetToolDefinitions(),
                cancellationToken: ct);
        }

        public Task StopAsync(CancellationToken cancellationToken)
        {
            _logger.LogInformation("Agent host shutting down");
            return Task.CompletedTask;
        }
    }

    // Configuration model
    public class AgentConfiguration
    {
        public string Name { get; set; } = "";
        public string Role { get; set; } = "";
        public string Model { get; set; } = "gpt-4o";
        public string Instructions { get; set; } = "";
        public List<string> Tools { get; set; } = new();

        // Maps the tool names from configuration to SDK tool definitions
        public IEnumerable<ToolDefinition> GetToolDefinitions()
        {
            foreach (var tool in Tools)
            {
                yield return tool switch
                {
                    "code_interpreter" => new CodeInterpreterToolDefinition(),
                    "file_search" => new FileSearchToolDefinition(),
                    _ => throw new ArgumentException($"Unknown tool: {tool}")
                };
            }
        }
    }







    Production Configuration





    // appsettings.Production.json
    {
      "AzureAI": {
        "ProjectConnectionString": "Set via environment variable"
      },
      "ApplicationInsights": {
        "ConnectionString": "Set via environment variable"
      },
      "Agents": [
        {
          "Name": "coordinator-agent-prod",
          "Role": "Coordinator",
          "Model": "gpt-4o",
          "Instructions": "You are the coordinator agent. Analyze incoming requests and delegate to specialist agents. Always validate outputs before returning.",
          "Tools": []
        },
        {
          "Name": "research-agent-prod",
          "Role": "Researcher",
          "Model": "gpt-4o",
          "Instructions": "You are a research specialist. Gather and synthesize information accurately. Always cite sources.",
          "Tools": ["file_search"]
        },
        {
          "Name": "analyst-agent-prod",
          "Role": "Analyst",
          "Model": "gpt-4o",
          "Instructions": "You are a data analyst. Process data, generate insights, and create visualizations when helpful.",
          "Tools": ["code_interpreter"]
        }
      ],
      "Orchestration": {
        "MaxConcurrentTasks": 10,
        "TaskTimeoutSeconds": 300,
        "RetryPolicy": {
          "MaxRetries": 3,
          "DelaySeconds": 2
        }
      }
    }
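The two "Set via environment variable" placeholders rely on .NET's configuration layering: an environment variable overrides a JSON setting when its name matches the configuration key with each `:` replaced by `__`. As a small sketch of that naming rule, the hypothetical helper below (`config_key_to_env` is not a real tool) derives the variable name from a key.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a .NET configuration key to its environment
# variable name by replacing each ':' with '__'.
config_key_to_env() {
  printf '%s\n' "$1" | sed 's/:/__/g'
}

config_key_to_env "AzureAI:ProjectConnectionString"        # AzureAI__ProjectConnectionString
config_key_to_env "Orchestration:RetryPolicy:MaxRetries"   # Orchestration__RetryPolicy__MaxRetries
```

The same rule applies to any key in this file, so orchestration settings can be tuned per environment without rebuilding the image.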










    Scaling Considerations

    Multi-agent systems scale along two axes: the number of host replicas serving requests, and the model token quota that all of those replicas share. Here's how to handle each:


    Horizontal Scaling with Azure Container Apps





    // container-app.bicep
    // Assumes containerAppEnvironment, containerRegistry, and keyVault are declared
    // elsewhere in this template (for example as 'existing' resources).
    param location string = resourceGroup().location
    param imageTag string

    resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
      name: 'aiagent-host'
      location: location
      identity: {
        // Needed for the Key Vault secret reference below (identity: 'system').
        // The identity also needs read access to secrets in the vault.
        type: 'SystemAssigned'
      }
      properties: {
        environmentId: containerAppEnvironment.id
        configuration: {
          ingress: {
            external: true
            targetPort: 8080
            transport: 'http'
          }
          secrets: [
            {
              name: 'ai-connection-string'
              keyVaultUrl: '${keyVault.properties.vaultUri}secrets/ai-connection-string'
              identity: 'system'
            }
          ]
        }
        template: {
          containers: [
            {
              name: 'agent-host'
              image: '${containerRegistry.properties.loginServer}/agent-host:${imageTag}'
              resources: {
                cpu: json('1.0')
                memory: '2Gi'
              }
              env: [
                {
                  name: 'AzureAI__ProjectConnectionString'
                  secretRef: 'ai-connection-string'
                }
              ]
              probes: [
                {
                  type: 'Liveness'
                  httpGet: {
                    path: '/health/live'
                    port: 8080
                  }
                  periodSeconds: 10
                }
                {
                  type: 'Readiness'
                  httpGet: {
                    path: '/health/ready'
                    port: 8080
                  }
                  periodSeconds: 5
                }
              ]
            }
          ]
          scale: {
            minReplicas: 2
            maxReplicas: 10
            rules: [
              {
                name: 'http-scaling'
                http: {
                  metadata: {
                    concurrentRequests: '20'
                  }
                }
              }
              {
                name: 'cpu-scaling'
                custom: {
                  type: 'cpu'
                  metadata: {
                    type: 'Utilization'
                    value: '70'
                  }
                }
              }
            ]
          }
        }
      }
    }
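To get a feel for what the HTTP rule above implies, here is a rough back-of-the-envelope estimate. It assumes the replica count is approximately the concurrent request count divided by the per-replica target (20), rounded up and clamped to `minReplicas` and `maxReplicas`. That is a simplification: the real scaler also applies polling intervals and cooldowns, and the CPU rule can raise the count independently. `estimate_replicas` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Rough estimate of the replica count for the HTTP rule alone.
# Assumption: replicas = ceil(concurrent / target), clamped to [min, max].
estimate_replicas() {
  local concurrent=$1
  local target=${2:-20}
  local min=${3:-2}
  local max=${4:-10}
  local wanted=$(( (concurrent + target - 1) / target ))
  if [ "$wanted" -lt "$min" ]; then wanted=$min; fi
  if [ "$wanted" -gt "$max" ]; then wanted=$max; fi
  echo "$wanted"
}

estimate_replicas 10    # 2  (below the floor, so minReplicas applies)
estimate_replicas 95    # 5
estimate_replicas 500   # 10 (capped at maxReplicas)
```

Under these assumptions the configuration tops out at roughly 200 concurrent requests (10 replicas x 20), so sustained load beyond that queues rather than scales.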







    Rate Limiting & Token Management





    // Services/TokenBudgetManager.cs
    // Fixed-window token budget. The counter lives in process memory, so each
    // replica enforces its own limit independently of the others.
    public class TokenBudgetManager
    {
        private readonly SemaphoreSlim _semaphore;
        private readonly ILogger<TokenBudgetManager> _logger;
        private int _tokensUsedThisMinute;
        private DateTime _windowStart = DateTime.UtcNow;
        private readonly int _tokensPerMinute;

        public TokenBudgetManager(IConfiguration config, ILogger<TokenBudgetManager> logger)
        {
            _tokensPerMinute = config.GetValue<int>("RateLimits:TokensPerMinute", 30000);
            _semaphore = new SemaphoreSlim(1, 1);
            _logger = logger;
        }

        public async Task<bool> TryAcquireAsync(int estimatedTokens, CancellationToken ct)
        {
            await _semaphore.WaitAsync(ct);
            try
            {
                ResetWindowIfNeeded();

                if (_tokensUsedThisMinute + estimatedTokens > _tokensPerMinute)
                {
                    _logger.LogWarning(
                        "Token budget exceeded. Used: {Used}, Requested: {Requested}, Limit: {Limit}",
                        _tokensUsedThisMinute, estimatedTokens, _tokensPerMinute);
                    return false;
                }

                _tokensUsedThisMinute += estimatedTokens;
                return true;
            }
            finally
            {
                _semaphore.Release();
            }
        }

        public void RecordActualUsage(int actualTokens, int estimated)
        {
            // Reconcile the estimate under the same lock as TryAcquireAsync so the
            // adjustment cannot race with a window reset.
            _semaphore.Wait();
            try
            {
                ResetWindowIfNeeded();
                var difference = actualTokens - estimated;
                _tokensUsedThisMinute = Math.Max(0, _tokensUsedThisMinute + difference);
            }
            finally
            {
                _semaphore.Release();
            }
        }

        // Must be called while holding _semaphore
        private void ResetWindowIfNeeded()
        {
            if (DateTime.UtcNow - _windowStart > TimeSpan.FromMinutes(1))
            {
                _tokensUsedThisMinute = 0;
                _windowStart = DateTime.UtcNow;
            }
        }
    }
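Because the budget above is tracked in memory, every replica counts tokens on its own. With the model deployment's capacity of 30 (30,000 TPM) and up to 10 replicas, leaving `RateLimits:TokensPerMinute` at 30,000 on each replica would allow far more aggregate traffic than the quota. One conservative option, sketched below with a hypothetical helper, is to divide the quota by the maximum replica count; a shared counter in an external store would be the alternative if that proves too restrictive.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a deployment's TPM quota evenly across replicas.
# capacity_k is the deployment capacity in thousands of tokens per minute.
per_replica_tpm() {
  local capacity_k=$1
  local replicas=$2
  echo $(( capacity_k * 1000 / replicas ))
}

per_replica_tpm 30 10   # 3000  (safe even at maxReplicas)
per_replica_tpm 30 2    # 15000 (only safe while pinned at minReplicas)
```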










    CI/CD Pipeline for Agent Deployments

    Here's a production-ready GitHub Actions pipeline:






    # .github/workflows/deploy-agents.yml
    name: Deploy Multi-Agent System

    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]

    env:
      AZURE_RESOURCE_GROUP: rg-aiagents-prod
      CONTAINER_REGISTRY: myagentscr.azurecr.io
      IMAGE_NAME: agent-host

    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4

          - name: Setup .NET
            uses: actions/setup-dotnet@v4
            with:
              dotnet-version: '8.0.x'

          - name: Restore dependencies
            run: dotnet restore

          - name: Build
            run: dotnet build --no-restore

          - name: Run unit tests
            run: dotnet test --no-build --filter "Category!=Integration" --verbosity normal

          - name: Run integration tests
            run: dotnet test --no-build --filter Category=Integration --verbosity normal
            env:
              AZURE_AI_CONNECTION_STRING: ${{ secrets.AZURE_AI_CONNECTION_STRING_TEST }}

      build-and-push:
        needs: test
        runs-on: ubuntu-latest
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        outputs:
          # Full commit SHA; matches the type=sha,format=long tag pushed below.
          # (The metadata action's "version" output would resolve to "latest" here.)
          image-tag: ${{ github.sha }}

        steps:
          - uses: actions/checkout@v4

          - name: Azure Login
            uses: azure/login@v2
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}

          - name: Login to ACR
            run: az acr login --name myagentscr

          - name: Extract metadata
            id: meta
            uses: docker/metadata-action@v5
            with:
              images: ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}
              tags: |
                type=sha,prefix=,format=long
                type=raw,value=latest

          - name: Build and push
            uses: docker/build-push-action@v5
            with:
              context: .
              push: true
              tags: ${{ steps.meta.outputs.tags }}
              labels: ${{ steps.meta.outputs.labels }}

      deploy-staging:
        needs: build-and-push
        runs-on: ubuntu-latest
        environment: staging

        steps:
          - uses: actions/checkout@v4

          - name: Azure Login
            uses: azure/login@v2
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}

          - name: Deploy to staging
            run: |
              az containerapp update \
                --name aiagent-host-staging \
                --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
                --image ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build-and-push.outputs.image-tag }}

          - name: Run smoke tests
            run: |
              STAGING_URL=$(az containerapp show --name aiagent-host-staging --resource-group ${{ env.AZURE_RESOURCE_GROUP }} --query properties.configuration.ingress.fqdn -o tsv)
              curl -f https://$STAGING_URL/health/ready || exit 1
              curl -f -X POST https://$STAGING_URL/api/orchestrate \
                -H "Content-Type: application/json" \
                -d '{"task": "smoke test: return OK"}' || exit 1

      deploy-production:
        needs: [build-and-push, deploy-staging]
        runs-on: ubuntu-latest
        environment: production

        steps:
          - uses: actions/checkout@v4

          - name: Azure Login
            uses: azure/login@v2
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}

          - name: Deploy to production
            run: |
              az containerapp update \
                --name aiagent-host \
                --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
                --image ${{ env.CONTAINER_REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.build-and-push.outputs.image-tag }}

          - name: Verify deployment
            run: |
              PROD_URL=$(az containerapp show --name aiagent-host --resource-group ${{ env.AZURE_RESOURCE_GROUP }} --query properties.configuration.ingress.fqdn -o tsv)
              # Wait for healthy status
              for i in {1..30}; do
                if curl -sf https://$PROD_URL/health/ready; then
                  echo "Deployment successful!"
                  exit 0
                fi
                sleep 10
              done
              echo "Deployment verification failed"
              exit 1
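The verification step polls the readiness endpoint up to 30 times at 10-second intervals before giving up. If the same pattern is needed in more than one job (staging smoke tests, for instance), it can be pulled into a small function. This is a sketch rather than part of the original pipeline, and `wait_until_ready` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: wait_until_ready <attempts> <delay-seconds> <command> [args...]
wait_until_ready() {
  local attempts=$1
  local delay=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (mirrors the workflow's 30 x 10s polling):
#   wait_until_ready 30 10 curl -sf "https://$PROD_URL/health/ready" \
#     && echo "Deployment successful!" \
#     || { echo "Deployment verification failed"; exit 1; }
```

Passing the check as a command rather than hard-coding `curl` keeps the function reusable for other probes, such as the `/api/orchestrate` smoke test.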










    Monitoring Deployed Agents

    Visibility into your agent system is crucial. Here's a comprehensive monitoring setup:


    Custom Telemetry





    // Services/AgentTelemetry.cs
    using System.Diagnostics;
    using Microsoft.ApplicationInsights;

    public class AgentTelemetry
    {
        private readonly ILogger<AgentTelemetry> _logger;
        private readonly TelemetryClient _telemetry;

        public AgentTelemetry(ILogger<AgentTelemetry> logger, TelemetryClient telemetry)
        {
            _logger = logger;
            _telemetry = telemetry;
        }

        // Returns a scope that records the operation's duration when disposed
        public IDisposable TrackAgentOperation(string agentName, string operation)
        {
            var operationId = Activity.Current?.Id ?? Guid.NewGuid().ToString();

            return new AgentOperationScope(
                _telemetry,
                agentName,
                operation,
                operationId,
                _logger);
        }

        public void TrackAgentMessage(string agentName, string role, int tokenCount)
        {
            _telemetry.TrackEvent("AgentMessage", new Dictionary<string, string>
            {
                ["AgentName"] = agentName,
                ["MessageRole"] = role
            }, new Dictionary<string, double>
            {
                ["TokenCount"] = tokenCount
            });
        }

        public void TrackToolExecution(string agentName, string toolName, bool success, TimeSpan duration)
        {
            _telemetry.TrackDependency(
                "AgentTool",
                toolName,
                agentName,
                DateTimeOffset.UtcNow - duration,
                duration,
                success);
        }

        public void TrackOrchestrationComplete(
            string taskType,
            int agentsInvolved,
            TimeSpan totalDuration,
            bool success)
        {
            _telemetry.TrackEvent("OrchestrationComplete", new Dictionary<string, string>
            {
                ["TaskType"] = taskType,
                ["Success"] = success.ToString()
            }, new Dictionary<string, double>
            {
                ["AgentsInvolved"] = agentsInvolved,
                ["DurationMs"] = totalDuration.TotalMilliseconds
            });

            _telemetry.GetMetric("orchestration.duration").TrackValue(totalDuration.TotalMilliseconds);
            _telemetry.GetMetric("orchestration.agents_per_task").TrackValue(agentsInvolved);
        }

        private class AgentOperationScope : IDisposable
        {
            private readonly TelemetryClient _telemetry;
            private readonly string _agentName;
            private readonly string _operation;
            private readonly Stopwatch _stopwatch;
            private readonly ILogger _logger;
            private bool _disposed;

            public AgentOperationScope(
                TelemetryClient telemetry,
                string agentName,
                string operation,
                string operationId,
                ILogger logger)
            {
                _telemetry = telemetry;
                _agentName = agentName;
                _operation = operation;
                _logger = logger;
                _stopwatch = Stopwatch.StartNew();

                _logger.LogInformation(
                    "Starting agent operation: {Agent}.{Operation} [{OperationId}]",
                    agentName, operation, operationId);
            }

            public void Dispose()
            {
                if (_disposed) return;
                _disposed = true;

                _stopwatch.Stop();

                _telemetry.TrackMetric($"agent.{_agentName}.{_operation}.duration",
                    _stopwatch.ElapsedMilliseconds);

                _logger.LogInformation(
                    "Completed agent operation: {Agent}.{Operation} in {Duration}ms",
                    _agentName, _operation, _stopwatch.ElapsedMilliseconds);
            }
        }
    }







    Azure Monitor Dashboard (ARM Template)





    {
      "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "appInsightsId": {
          "type": "string",
          "metadata": {
            "description": "Resource ID of the Application Insights component"
          }
        }
      },
      "resources": [
        {
          "type": "Microsoft.Portal/dashboards",
          "apiVersion": "2020-09-01-preview",
          "name": "agent-system-dashboard",
          "location": "[resourceGroup().location]",
          "properties": {
            "lenses": [
              {
                "parts": [
                  {
                    "position": { "x": 0, "y": 0, "colSpan": 6, "rowSpan": 4 },
                    "metadata": {
                      "type": "Extension/Microsoft_Azure_Monitoring/PartType/MetricsChartPart",
                      "settings": {
                        "title": "Agent Response Times",
                        "chartType": "Line",
                        "metrics": [
                          {
                            "resourceId": "[parameters('appInsightsId')]",
                            "name": "customMetrics/orchestration.duration",
                            "aggregationType": "Average"
                          }
                        ]
                      }
                    }
                  },
                  {
                    "position": { "x": 6, "y": 0, "colSpan": 6, "rowSpan": 4 },
                    "metadata": {
                      "type": "Extension/Microsoft_Azure_Monitoring/PartType/MetricsChartPart",
                      "settings": {
                        "title": "Token Usage by Agent",
                        "chartType": "Bar"
                      }
                    }
                  }
                ]
              }
            ]
          }
        }
      ]
    }







    Key Metrics to Monitor

    Metric | Alert Threshold | Action
    Agent Response Time (p95) | > 30s | Scale out or optimize prompts
    Token Usage Rate | > 80% of TPM | Upgrade quota or add throttling
    Error Rate | > 5% | Investigate logs, check model health
    Concurrent Tasks | > 80% capacity | Scale out
    Tool Execution Failures | > 10% | Check tool configurations
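The response-time threshold is defined on the 95th percentile rather than the average, so a handful of slow orchestrations will trip it even when typical requests are fast. As an illustration of what that number means, the sketch below computes a nearest-rank p95 over a list of durations in whole seconds and compares it with the 30s threshold. In practice the value would come from an Application Insights query; the helper names and the sample durations here are made up for the example.

```bash
#!/usr/bin/env bash
# Nearest-rank 95th percentile of the integer arguments.
p95() {
  local n=$#
  [ "$n" -gt 0 ] || return 1
  local rank=$(( (95 * n + 99) / 100 ))   # ceil(0.95 * n)
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

# True when an integer value is strictly above its alert threshold.
exceeds_threshold() {
  [ "$1" -gt "$2" ]
}

# Illustrative durations in seconds (not real measurements)
latency=$(p95 12 8 31 9 10 11 7 14 13 45)
if exceeds_threshold "$latency" 30; then
  echo "p95 is ${latency}s: scale out or optimize prompts"
fi
```

With only ten samples the nearest-rank p95 is simply the slowest request (45s here), which is why percentile alerts are usually evaluated over a window with many more data points.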





    Series Wrap-Up

    Congratulations! You've made it through the complete Azure AI Agent Service journey. Let's recap what we've covered:


    Series Summary

    Part | Topic | What We Covered
    1 | Foundations | Understanding agents, Azure AI Agent Service architecture
    2 | First Agent | Creating agents, conversations, and basic tool usage
    3 | Advanced Patterns | Stateful agents, complex tools, error handling
    4 | Multi-Agent Systems | Orchestration patterns, agent communication
    5 | Production Deployment | Infrastructure, CI/CD, scaling, monitoring


    What's Next?

    The agent ecosystem is evolving rapidly. Here's where to focus next:

    1. Explore Semantic Kernel - Microsoft's SDK for AI orchestration complements Azure AI Agent Service beautifully for complex workflows.
    2. Experiment with Specialized Models - As new models emerge, consider specialized agents for different tasks (fast models for routing, powerful models for complex reasoning).
    3. Implement Evaluation Pipelines - Build automated evaluation for agent responses using frameworks like Prompt Flow evaluations.
    4. Consider Hybrid Architectures - Combine agents with traditional APIs and workflows for the best of both worlds.
    5. Stay Current - Follow the Azure AI documentation and Azure updates for new capabilities.


    Resources






    Final Thoughts

    Building production multi-agent systems is a journey that combines software engineering fundamentals with the emerging patterns of AI development. The key principles remain the same: observability, reliability, and iterative improvement.


    Start small, measure everything, and evolve your system based on real-world feedback. The infrastructure and patterns we've covered give you a solid foundation—now it's time to build something amazing.


    Thank you for joining me on this series. I'd love to hear about what you're building with Azure AI Agent Service. Drop a comment below or find me on Twitter [@yourhandle].


    Happy building! 🚀





    This is Part 5 of 5 in the Azure AI Agent Service series. Check out Part 1, Part 2, Part 3, and Part 4 if you haven't already.



