
Enterprise AI Gateway Design: Unified Management of Agent Traffic and Security

A close look at the architecture of an AI Gateway in an enterprise environment, covering traffic management, security, observability, and cost control.

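Once an enterprise runs dozens of Agent applications and hundreds of LLM call sites, managing that traffic in a uniform way becomes a key problem. An AI Gateway acts as the single entry point for all AI traffic and provides core capabilities such as authentication and authorization, rate limiting and circuit breaking, audit logging, and cost control.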

AI Gateway Architecture

┌─────────────────────────────────────────────────────────────┐
│                         AI Gateway                          │
│                                                             │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Authn &    │ │ Rate limit │ │ Routing    │ │ Audit      │ │
│ │ authz      │ │ & breaker  │ │            │ │ logging    │ │
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
│                                                             │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Caching    │ │ Load       │ │ Cost       │ │ Alerting   │ │
│ │            │ │ balancing  │ │ tracking   │ │            │ │
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────┘
           │                   │                   │
      ┌────┴────┐         ┌────┴────┐         ┌────┴────┐
      │ Claude  │         │ GPT-4   │         │ Local   │
      │ API     │         │ API     │         │ Models  │
      └─────────┘         └─────────┘         └─────────┘
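
One way to read the diagram is as an ordered chain of stages that every request passes through before it reaches a provider. The sketch below illustrates that idea only; `Ctx`, `Stage`, `runPipeline`, and the three sample stages are hypothetical names introduced here, not part of any specific gateway product.

```typescript
// Hypothetical request context passed from stage to stage.
interface Ctx {
  apiKey?: string;
  model: string;
  log: string[]; // records which stages ran, for illustration
}

// A stage inspects or mutates the context and throws to reject the request.
type Stage = (ctx: Ctx) => void;

const authenticate: Stage = (ctx) => {
  if (!ctx.apiKey) throw new Error('Missing API key');
  ctx.log.push('auth');
};

const rateLimit: Stage = (ctx) => {
  // A real stage would consult a rate limiter here.
  ctx.log.push('rate-limit');
};

const route: Stage = (ctx) => {
  ctx.log.push(`route:${ctx.model}`);
};

// Run stages in order; the first stage that throws stops the pipeline.
function runPipeline(ctx: Ctx, stages: Stage[]): Ctx {
  for (const stage of stages) stage(ctx);
  return ctx;
}

const accepted = runPipeline(
  { apiKey: 'k1', model: 'gpt-4o', log: [] },
  [authenticate, rateLimit, route]
);
console.log(accepted.log.join(' -> ')); // auth -> rate-limit -> route:gpt-4o
```

Putting authentication first means unauthenticated requests are rejected before they consume rate-limit budget or reach a provider.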

Authentication and Authorization

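// Supporting types, inferred from how the gateway uses them below.
interface AuthContext {
  userId: string;
  teamId: string;
  permissions: string[]; // entries look like "resource:action" or "resource:*"
  rateLimit: number;
}

interface APIKeyInfo extends AuthContext {
  expiresAt?: number; // epoch milliseconds; omitted for keys that never expire
}

interface GatewayRequest {
  headers: Record<string, string | undefined>;
}

class UnauthorizedError extends Error {}

class AuthGateway {
  // In-memory store for illustration, keyed by the raw API key.
  private apiKeyStore: Map<string, APIKeyInfo> = new Map();

  // Resolve the caller's identity from the x-api-key header, or throw.
  async authenticate(request: GatewayRequest): Promise<AuthContext> {
    const apiKey = request.headers['x-api-key'];
    if (!apiKey) {
      throw new UnauthorizedError('Missing API key');
    }

    const keyInfo = this.apiKeyStore.get(apiKey);
    if (!keyInfo) {
      throw new UnauthorizedError('Invalid API key');
    }

    if (keyInfo.expiresAt && keyInfo.expiresAt < Date.now()) {
      throw new UnauthorizedError('API key expired');
    }

    return {
      userId: keyInfo.userId,
      teamId: keyInfo.teamId,
      permissions: keyInfo.permissions,
      rateLimit: keyInfo.rateLimit,
    };
  }

  // Allow an exact "resource:action" grant or a per-resource wildcard "resource:*".
  authorize(context: AuthContext, resource: string, action: string): boolean {
    const permission = `${resource}:${action}`;
    return context.permissions.includes(permission) ||
           context.permissions.includes(`${resource}:*`);
  }
}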
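
To show how the permission check in `authorize` behaves, here is a small standalone example that applies the same matching rule. `PermissionContext`, `isAllowed`, and the sample permission strings are made up for illustration.

```typescript
// Minimal stand-in for the auth context; only permissions matter here.
interface PermissionContext {
  permissions: string[];
}

// Same rule as AuthGateway.authorize: exact "resource:action" or "resource:*".
function isAllowed(ctx: PermissionContext, resource: string, action: string): boolean {
  return ctx.permissions.includes(`${resource}:${action}`) ||
         ctx.permissions.includes(`${resource}:*`);
}

const analyst: PermissionContext = { permissions: ['models:read', 'chat:*'] };

console.log(isAllowed(analyst, 'chat', 'create'));   // true, via the chat:* wildcard
console.log(isAllowed(analyst, 'models', 'read'));   // true, exact match
console.log(isAllowed(analyst, 'models', 'delete')); // false, no matching grant
```

Note that the rule only supports a wildcard on the action. A grant such as `*:*` would not match anything unless the check were extended.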

Traffic Management

Rate Limiting

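class RateLimiter {
  // One token bucket per key (for example, per API key or per user).
  private buckets: Map<string, TokenBucket> = new Map();

  // `limit` is used as both the burst capacity and the refill rate per second.
  // The bucket is created on first use, so a later change to `limit` for the
  // same key has no effect.
  async check(key: string, limit: number): Promise<boolean> {
    let bucket = this.buckets.get(key);
    if (!bucket) {
      bucket = new TokenBucket(limit, limit);
      this.buckets.set(key, bucket);
    }
    return bucket.consume(1);
  }
}

class TokenBucket {
  private tokens: number;
  private lastRefill: number; // epoch milliseconds of the last refill

  constructor(
    private capacity: number,   // maximum number of tokens the bucket can hold
    private refillRate: number  // tokens added per second
  ) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = Date.now();
  }

  // Take `count` tokens if available; return false when the caller should be throttled.
  consume(count: number): boolean {
    this.refill();

    if (this.tokens >= count) {
      this.tokens -= count;
      return true;
    }

    return false;
  }

  // Add tokens in proportion to the elapsed time, capped at capacity.
  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}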
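
Because the token bucket above reads `Date.now()` directly, its behavior is awkward to demonstrate or test. The variant below uses the same algorithm but takes the clock as a constructor argument, which is an assumption added here for illustration. `ClockedTokenBucket` is a hypothetical name.

```typescript
// Token bucket with an injectable clock so behavior can be shown deterministically.
class ClockedTokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,        // maximum burst size
    private refillPerSecond: number, // tokens restored per second
    private now: () => number = Date.now
  ) {
    this.tokens = capacity;
    this.lastRefill = now();
  }

  consume(count: number): boolean {
    // Refill in proportion to elapsed time, capped at capacity.
    const t = this.now();
    const elapsedSeconds = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = t;

    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }
}

let fakeTime = 0;
const demoBucket = new ClockedTokenBucket(2, 1, () => fakeTime);
console.log(demoBucket.consume(1)); // true  (2 tokens -> 1)
console.log(demoBucket.consume(1)); // true  (1 token -> 0)
console.log(demoBucket.consume(1)); // false (bucket is empty)
fakeTime += 1000;                   // one second later, one token has been restored
console.log(demoBucket.consume(1)); // true
```

With capacity 2 and a refill rate of 1 per second, a caller can burst two requests at once and then sustain one request per second.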

Circuit Breaking

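class CircuitBreakerGateway {
  // One breaker state per upstream provider.
  private breakers: Map<string, BreakerState> = new Map();

  // getState, onSuccess, and onFailure are not shown; they are expected to
  // read and update the per-provider entry in `breakers`.
  async call<T>(provider: string, fn: () => Promise<T>): Promise<T> {
    const state = this.getState(provider);

    // Fail fast while the breaker is open instead of calling a failing provider.
    if (state === 'open') {
      throw new CircuitOpenError(`Circuit breaker is open for provider ${provider}`);
    }

    try {
      const result = await fn();
      this.onSuccess(provider);
      return result;
    } catch (error) {
      this.onFailure(provider);
      throw error;
    }
  }
}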
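
The class above calls `getState`, `onSuccess`, and `onFailure` without defining them. The following is one possible sketch of the state machine behind those calls, for a single provider, using the common closed / open / half-open pattern. The class name `SimpleBreaker`, the thresholds, and the injectable clock are all assumptions, not taken from the original code.

```typescript
type BreakerStatus = 'closed' | 'open' | 'half-open';

// Hypothetical breaker for one provider; default thresholds are illustrative.
class SimpleBreaker {
  private failures = 0;
  private openedAt = 0;
  private status: BreakerStatus = 'closed';

  constructor(
    private failureThreshold = 3,  // consecutive failures before opening
    private cooldownMs = 30000,    // how long to stay open before probing
    private now: () => number = Date.now
  ) {}

  // Report the current status, moving open -> half-open once the cooldown ends.
  getStatus(): BreakerStatus {
    if (this.status === 'open' && this.now() - this.openedAt >= this.cooldownMs) {
      this.status = 'half-open';
    }
    return this.status;
  }

  // Any success closes the breaker and clears the failure count.
  onSuccess(): void {
    this.failures = 0;
    this.status = 'closed';
  }

  onFailure(): void {
    this.failures += 1;
    // A failed probe in half-open reopens immediately.
    if (this.status === 'half-open' || this.failures >= this.failureThreshold) {
      this.status = 'open';
      this.openedAt = this.now();
    }
  }
}

let breakerClock = 0;
const demoBreaker = new SimpleBreaker(2, 1000, () => breakerClock);
demoBreaker.onFailure();
demoBreaker.onFailure();
console.log(demoBreaker.getStatus()); // open
breakerClock += 1000;
console.log(demoBreaker.getStatus()); // half-open
demoBreaker.onSuccess();
console.log(demoBreaker.getStatus()); // closed
```

A gateway would keep one such breaker per provider, for example in a `Map` keyed by provider name, and allow a single probe request through while the status is half-open.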

Multi-Model Routing

class ModelRouter {
  private providers: Map<string, ModelProvider> = new Map();

  async route(request: LLMRequest): Promise<LLMResponse> {
    // Route to the provider registered for the requested model name
    const provider = this.selectProvider(request.model);

    // Apply the retry policy (withRetry is not shown here)
    return await this.withRetry(
      () => provider.call(request),
      { maxRetries: 2, backoffMs: 1000 }
    );
  }

  private selectProvider(model: string): ModelProvider {
    const provider = this.providers.get(model);
    if (!provider) {
      throw new Error(`Model ${model} not available`);
    }
    return provider;
  }
}
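
`ModelRouter` relies on a `withRetry` helper that is not defined in the listing. A minimal sketch of such a helper follows, assuming that `maxRetries` counts retries after the first attempt and that `backoffMs` doubles on each retry. Both interpretations, along with the optional `sleep` parameter, are assumptions made for this example.

```typescript
interface RetryOptions {
  maxRetries: number; // retries after the first attempt
  backoffMs: number;  // delay before the first retry; doubles each time
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Call fn until it succeeds or the retry budget is spent, then rethrow the last error.
async function withRetry<T>(
  fn: () => Promise<T>,
  { maxRetries, backoffMs }: RetryOptions,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        // Exponential backoff: backoffMs, 2 * backoffMs, 4 * backoffMs, ...
        await sleep(backoffMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

With `{ maxRetries: 2, backoffMs: 1000 }` this makes up to three attempts, waiting 1 second and then 2 seconds between them. A production version would likely retry only transient failures such as timeouts or rate-limit responses, and add random jitter to the delay.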

Cost Tracking

class CostTracker {
  private usage: Map<string, CostRecord[]> = new Map();

  async record(request: LLMRequest, response: LLMResponse): Promise<void> {
    const cost = this.calculateCost(
      request.model,
      response.usage.inputTokens,
      response.usage.outputTokens
    );

    const record: CostRecord = {
      userId: request.userId,
      model: request.model,
      inputTokens: response.usage.inputTokens,
      outputTokens: response.usage.outputTokens,
      cost,
      timestamp: Date.now(),
    };

    this.addRecord(record);
  }

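  // Prices are in USD per token, written as (USD per million tokens) / 1,000,000.
  // Provider pricing changes over time, so treat these values as examples.
  private calculateCost(model: string, inputTokens: number, outputTokens: number): number {
    const pricing: Record<string, { input: number; output: number }> = {
      'claude-sonnet-4-20250514': { input: 3.0 / 1000000, output: 15.0 / 1000000 },
      'gpt-4o': { input: 2.5 / 1000000, output: 10.0 / 1000000 },
    };

    const price = pricing[model];
    if (!price) {
      // Unknown models are still recorded at zero cost; warn so the gap is visible.
      console.warn(`No pricing configured for model ${model}`);
      return 0;
    }
    return inputTokens * price.input + outputTokens * price.output;
  }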
}
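
As a worked example of the cost formula, the sketch below applies the two prices from the table above and then sums the cost per user. `UsageRecord`, `costOf`, and `totalsByUser` are hypothetical names. Unlike the tracker above, this version throws on an unknown model, which is a deliberate difference rather than the original behavior.

```typescript
// Prices in USD per one million tokens, copied from the pricing table above.
const PRICE_PER_MILLION: Record<string, { input: number; output: number }> = {
  'claude-sonnet-4-20250514': { input: 3.0, output: 15.0 },
  'gpt-4o': { input: 2.5, output: 10.0 },
};

interface UsageRecord {
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// cost = (inputTokens * inputPrice + outputTokens * outputPrice) / 1,000,000
function costOf(r: UsageRecord): number {
  const price = PRICE_PER_MILLION[r.model];
  if (!price) throw new Error(`No pricing for model ${r.model}`);
  return (r.inputTokens * price.input + r.outputTokens * price.output) / 1000000;
}

// Sum cost per user so budgets can be checked or reported.
function totalsByUser(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.userId, (totals.get(r.userId) ?? 0) + costOf(r));
  }
  return totals;
}

const sampleUsage: UsageRecord[] = [
  // (1000 * 3.0 + 500 * 15.0) / 1e6 = 0.0105 USD
  { userId: 'alice', model: 'claude-sonnet-4-20250514', inputTokens: 1000, outputTokens: 500 },
  // (2000 * 2.5 + 1000 * 10.0) / 1e6 = 0.0150 USD
  { userId: 'alice', model: 'gpt-4o', inputTokens: 2000, outputTokens: 1000 },
];

console.log(totalsByUser(sampleUsage).get('alice')?.toFixed(4)); // 0.0255
```

Dividing once at the end keeps the intermediate values as whole numbers of "micro-dollars", which reduces floating-point noise compared with multiplying by very small per-token prices.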

Audit Logging

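import * as fs from 'fs';

class AuditLogger {
  private logStream: fs.WriteStream;

  // `logPath` is an assumed parameter; the original does not show how the stream is opened.
  constructor(logPath: string) {
    // Open in append mode so a restart does not truncate earlier records.
    this.logStream = fs.createWriteStream(logPath, { flags: 'a' });
  }

  // Write one JSON object per line, stamped with the time of writing.
  log(entry: AuditEntry): void {
    this.logStream.write(JSON.stringify({
      ...entry,
      timestamp: new Date().toISOString(),
    }) + '\n');
  }
}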

interface AuditEntry {
  userId: string;
  action: string;
  resource: string;
  model?: string;
  inputTokens?: number;
  outputTokens?: number;
  cost?: number;
  duration: number;
  status: 'success' | 'error';
  error?: string;
}
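
To show where audit entries might come from, the sketch below wraps an operation, measures its duration, and emits one record on success or failure. `AuditRecord` mirrors a subset of the `AuditEntry` fields above. The `withAudit` helper, the `sink` callback, and the choice of milliseconds for `duration` are assumptions for illustration.

```typescript
interface AuditRecord {
  userId: string;
  action: string;
  resource: string;
  duration: number; // milliseconds (assumed unit)
  status: 'success' | 'error';
  error?: string;
}

// Hypothetical helper: run fn, then pass an audit record describing the outcome to sink.
async function withAudit<T>(
  base: Pick<AuditRecord, 'userId' | 'action' | 'resource'>,
  sink: (record: AuditRecord) => void,
  fn: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  const finish = (status: AuditRecord['status'], error?: string): void => {
    const record: AuditRecord = { ...base, duration: Date.now() - start, status };
    if (error !== undefined) record.error = error;
    sink(record);
  };

  let result: T;
  try {
    result = await fn();
  } catch (err) {
    finish('error', err instanceof Error ? err.message : String(err));
    throw err; // the caller still sees the original failure
  }
  finish('success');
  return result;
}
```

In a gateway, `sink` could simply be a call such as `(record) => logger.log(record)`, so that every provider call produces an audit line regardless of outcome.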

Frequently Asked Questions (FAQ)

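What is the difference between an AI Gateway and an API Gateway?

An AI Gateway is a specialized form of API Gateway, optimized for the characteristics of LLM calls: long-lived connections, streaming responses, and token-based billing.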

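How much latency does the gateway add?

Typically 5-15 ms, mostly from authentication and routing. For an LLM call, which usually takes several seconds, this is negligible: 15 ms on a 3-second call is an overhead of 0.5%.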

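How do you handle switching between multiple LLM providers?

Implement unified model routing at the gateway layer, so that the best provider is selected automatically based on availability, cost, and latency.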
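
As a rough illustration of selecting by availability, cost, and latency, the sketch below filters out unhealthy providers and ranks the rest by a weighted score. The `ProviderInfo` fields, the weights, and the sample numbers are all invented for this example and would need tuning against real measurements.

```typescript
interface ProviderInfo {
  name: string;
  healthy: boolean;       // e.g. the provider's circuit breaker is closed
  costPerMTokens: number; // blended USD per million tokens (illustrative)
  p50LatencyMs: number;   // observed median latency
}

// Hypothetical scoring function: lower is better. The weights are placeholders.
function score(p: ProviderInfo, costWeight = 1, latencyWeight = 0.01): number {
  return costWeight * p.costPerMTokens + latencyWeight * p.p50LatencyMs;
}

// Return healthy providers, best first, so callers can fall back down the list.
function rankProviders(candidates: ProviderInfo[]): ProviderInfo[] {
  return candidates
    .filter((p) => p.healthy)
    .sort((a, b) => score(a) - score(b));
}

const ranked = rankProviders([
  { name: 'provider-a', healthy: true, costPerMTokens: 10, p50LatencyMs: 800 },  // score 18
  { name: 'provider-b', healthy: true, costPerMTokens: 5, p50LatencyMs: 1500 },  // score 20
  { name: 'provider-c', healthy: false, costPerMTokens: 1, p50LatencyMs: 100 },  // excluded
]);

console.log(ranked.map((p) => p.name).join(' -> ')); // provider-a -> provider-b
```

Returning a ranked list rather than a single winner lets the router try the next provider when the first one fails.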

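Summary

An AI Gateway is a key component of enterprise AI infrastructure. With unified authentication and authorization, traffic management, cost tracking, and audit logging, it lets an enterprise use a range of AI services securely and under control.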