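AI Gateway Enterprise Design: Unified Management of Agent Traffic and Security
An in-depth look at how to architect an AI Gateway for enterprise environments, covering traffic management, security, observability, and cost control.
Once an organization runs dozens of agent applications making hundreds of LLM calls, managing that traffic in one place becomes a key problem. An AI Gateway acts as the single entry point for all AI traffic and provides core capabilities such as authentication and authorization, rate limiting and circuit breaking, audit logging, and cost control.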
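AI Gateway Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                                 AI Gateway                                  │
│                                                                             │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Authentication │ │ Rate limiting &│ │ Routing and    │ │ Audit logging  │ │
│ │ & authorization│ │ circuit breaker│ │ dispatch       │ │                │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘ │
│                                                                             │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Caching        │ │ Load balancing │ │ Cost tracking  │ │ Alerting       │ │
│ └────────────────┘ └────────────────┘ └────────────────┘ └────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
              │                        │                        │
        ┌─────┴─────┐            ┌─────┴─────┐            ┌─────┴─────┐
        │  Claude   │            │   GPT-4   │            │   Local   │
        │    API    │            │    API    │            │  Models   │
        └───────────┘            └───────────┘            └───────────┘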
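
One way to read the diagram is as a chain of stages that each request passes through before it reaches a provider. The sketch below is a minimal illustration of that idea, not a description of any particular gateway product: the `Middleware` type, the `compose` helper, the toy in-memory limiter, and the `upstream` echo handler are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not taken from any specific gateway product.
interface GatewayRequest { apiKey?: string; model: string; prompt: string }
interface GatewayResponse { status: number; body: string }

type Handler = (req: GatewayRequest) => Promise<GatewayResponse>;
type Middleware = (req: GatewayRequest, next: Handler) => Promise<GatewayResponse>;

// Wrap the terminal handler so that the first middleware in the list runs outermost.
function compose(middlewares: Middleware[], terminal: Handler): Handler {
  return middlewares.reduceRight<Handler>(
    (next, mw) => (req) => mw(req, next),
    terminal,
  );
}

const auditTrail: string[] = [];

// Audit sits outermost so that rejected requests are recorded too.
const audit: Middleware = async (req, next) => {
  const res = await next(req);
  auditTrail.push(`${req.model} -> ${res.status}`);
  return res;
};

// Reject requests without an API key before any other work is done.
const auth: Middleware = async (req, next) =>
  req.apiKey ? next(req) : { status: 401, body: 'Missing API key' };

// Toy limiter: at most `max` requests per key, with no time window.
function rateLimit(max: number): Middleware {
  const counts = new Map<string, number>();
  return async (req, next) => {
    const key = req.apiKey ?? 'anonymous';
    const used = (counts.get(key) ?? 0) + 1;
    counts.set(key, used);
    return used > max ? { status: 429, body: 'Rate limit exceeded' } : next(req);
  };
}

// Stand-in for the provider call behind the gateway.
const upstream: Handler = async (req) => ({ status: 200, body: `echo: ${req.prompt}` });

const handle = compose([audit, auth, rateLimit(2)], upstream);
```

Calling `handle` with a valid key returns the upstream response, a request without a key stops at the auth stage with 401, and a third request on the same key stops at the limiter with 429. Because the audit stage wraps everything else, all three outcomes end up in `auditTrail`.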
Authentication and Authorization

class AuthGateway {
  // Maps each API key to its owner, permissions, and rate limit.
  private apiKeyStore: Map<string, APIKeyInfo> = new Map();

  // Resolves the caller's identity from the x-api-key header.
  // Throws UnauthorizedError if the key is missing, unknown, or expired.
  async authenticate(request: GatewayRequest): Promise<AuthContext> {
    const apiKey = request.headers['x-api-key'];
    if (!apiKey) {
      throw new UnauthorizedError('Missing API key');
    }

    const keyInfo = this.apiKeyStore.get(apiKey);
    if (!keyInfo) {
      throw new UnauthorizedError('Invalid API key');
    }

    // expiresAt is a millisecond timestamp; keys without one never expire.
    if (keyInfo.expiresAt && keyInfo.expiresAt < Date.now()) {
      throw new UnauthorizedError('API key expired');
    }

    return {
      userId: keyInfo.userId,
      teamId: keyInfo.teamId,
      permissions: keyInfo.permissions,
      rateLimit: keyInfo.rateLimit,
    };
  }

  // Permissions are "resource:action" strings; "resource:*" grants every action on that resource.
  authorize(context: AuthContext, resource: string, action: string): boolean {
    const permission = `${resource}:${action}`;
    return context.permissions.includes(permission) ||
      context.permissions.includes(`${resource}:*`);
  }
}
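
To make the permission rule concrete, here is a small standalone sketch of the same check. The permission strings (`models:read`, `chat:*`) and the user ID are made-up examples; the document does not prescribe a naming scheme.

```typescript
// Minimal stand-in for the AuthContext returned by authenticate().
interface AuthContext { userId: string; permissions: string[] }

// Same rule as AuthGateway.authorize: exact match or resource-level wildcard.
function authorize(context: AuthContext, resource: string, action: string): boolean {
  return (
    context.permissions.includes(`${resource}:${action}`) ||
    context.permissions.includes(`${resource}:*`)
  );
}

// Hypothetical grants for one caller.
const analyst: AuthContext = { userId: 'u-1', permissions: ['models:read', 'chat:*'] };

const canChat = authorize(analyst, 'chat', 'invoke');        // true: matched by chat:*
const canListModels = authorize(analyst, 'models', 'read');  // true: exact match
const canEditModels = authorize(analyst, 'models', 'write'); // false: no matching grant
```

Note that the wildcard only works at the action level: a grant such as `*:read` would not match anything under this rule.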
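Traffic Management

Rate Limiting

class RateLimiter {
  // One token bucket per key (for example, per API key or per user).
  private buckets: Map<string, TokenBucket> = new Map();

  // Returns true if the request is allowed. `limit` is used both as the
  // burst capacity and as the refill rate in tokens per second.
  async check(key: string, limit: number): Promise<boolean> {
    if (!this.buckets.has(key)) {
      this.buckets.set(key, new TokenBucket(limit, limit));
    }
    const bucket = this.buckets.get(key)!;
    return bucket.consume(1);
  }
}

class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,   // maximum number of tokens the bucket can hold
    private refillRate: number  // tokens added per second
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  // Takes `count` tokens if available; otherwise leaves the bucket unchanged.
  consume(count: number): boolean {
    this.refill();
    if (this.tokens >= count) {
      this.tokens -= count;
      return true;
    }
    return false;
  }

  // Adds tokens in proportion to the time elapsed since the last refill, capped at capacity.
  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}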
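
The behavior of a token bucket is easiest to see with a controlled clock. The variant below is a sketch that makes two additions not present in the original: an injectable `now` function so the example is deterministic, and a hypothetical `retryAfterSeconds` helper that a gateway could use to tell a throttled client how long to wait.

```typescript
// Token bucket with an injectable clock (added for testability).
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    private readonly now: () => number = Date.now,
  ) {
    this.tokens = capacity;
    this.lastRefill = now();
  }

  consume(count = 1): boolean {
    this.refill();
    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }

  // Whole seconds until `count` tokens are available; 0 if they already are.
  // Assumes count does not exceed the bucket capacity.
  retryAfterSeconds(count = 1): number {
    this.refill();
    const missing = count - this.tokens;
    return missing <= 0 ? 0 : Math.ceil(missing / this.refillPerSecond);
  }

  private refill(): void {
    const t = this.now();
    const elapsedSeconds = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = t;
  }
}

// Simulated clock in milliseconds.
let clock = 0;
const bucket = new TokenBucket(3, 1, () => clock);

// Four back-to-back requests: the fourth exceeds the burst capacity of 3.
const burst = [bucket.consume(), bucket.consume(), bucket.consume(), bucket.consume()];
// burst is [true, true, true, false]

const wait = bucket.retryAfterSeconds(); // 1: one token refills per second

clock += 2000; // two seconds pass, so two tokens are refilled
const afterWait = [bucket.consume(), bucket.consume(), bucket.consume()];
// afterWait is [true, true, false]
```

The capacity controls how large a burst is tolerated, while the refill rate controls the sustained throughput. The original `RateLimiter` sets both to the same `limit`, which allows a burst of `limit` requests followed by `limit` requests per second.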
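Circuit Breaking

class CircuitBreakerGateway {
  // Breaker state per provider. getState, onSuccess, and onFailure are not shown here.
  private breakers: Map<string, BreakerState> = new Map();

  // Runs `fn` against the given provider unless its circuit is open.
  async call<T>(provider: string, fn: () => Promise<T>): Promise<T> {
    const state = this.getState(provider);
    if (state === 'open') {
      // Fail fast instead of sending more traffic to an unhealthy provider.
      throw new CircuitOpenError(`Provider ${provider} is circuit-broken`);
    }

    try {
      const result = await fn();
      this.onSuccess(provider);
      return result;
    } catch (error) {
      // Record the failure, then let the caller see the original error.
      this.onFailure(provider);
      throw error;
    }
  }
}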
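
The class above leaves `getState`, `onSuccess`, and `onFailure` undefined. The following is one possible sketch of that state machine for a single provider, assuming the common closed / open / half-open design. The `failureThreshold` and `cooldownMs` options, the `ProviderBreaker` name, and the injectable clock are assumptions for illustration rather than the author's implementation.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitOpenError extends Error {}

// Hypothetical tuning knobs; the values used below are illustrative only.
interface BreakerOptions { failureThreshold: number; cooldownMs: number }

class ProviderBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly options: BreakerOptions,
    private readonly now: () => number = Date.now,
  ) {}

  getState(): BreakerState {
    if (this.openedAt === null) return 'closed';
    // Once the cooldown has elapsed, allow trial requests through (half-open).
    return this.now() - this.openedAt >= this.options.cooldownMs ? 'half-open' : 'open';
  }

  onSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  onFailure(): void {
    this.failures += 1;
    // A failed trial request in half-open state re-opens the circuit immediately.
    if (this.getState() === 'half-open' || this.failures >= this.options.failureThreshold) {
      this.openedAt = this.now();
    }
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.getState() === 'open') throw new CircuitOpenError('circuit is open');
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
}

// Simulated clock in milliseconds.
let clock = 0;
const breaker = new ProviderBreaker({ failureThreshold: 2, cooldownMs: 30000 }, () => clock);

breaker.onFailure();
breaker.onFailure();                      // second consecutive failure opens the circuit
const whileOpen = breaker.getState();     // 'open'

clock += 30000;                           // cooldown elapses
const afterCooldown = breaker.getState(); // 'half-open'

breaker.onSuccess();                      // a successful trial request closes the circuit
const recovered = breaker.getState();     // 'closed'
```

This sketch counts consecutive failures and lets every request through while half-open. A stricter design might admit only one trial request at a time or use a failure rate over a sliding window.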
Multi-Model Routing

class ModelRouter {
  // Registered providers, keyed by model name.
  private providers: Map<string, ModelProvider> = new Map();

  async route(request: LLMRequest): Promise<LLMResponse> {
    // Route to the provider registered for the requested model name.
    const provider = this.selectProvider(request.model);

    // Apply the retry policy (withRetry is not shown here).
    return await this.withRetry(
      () => provider.call(request),
      { maxRetries: 2, backoffMs: 1000 }
    );
  }

  // Looks up the provider for a model and throws if none is registered.
  private selectProvider(model: string): ModelProvider {
    const provider = this.providers.get(model);
    if (!provider) {
      throw new Error(`Model ${model} not available`);
    }
    return provider;
  }
}
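
The router relies on a `withRetry` helper that the article does not define. Below is a minimal sketch of what it could look like, written as a standalone function. Doubling the delay on each attempt is an assumption (the original only names `backoffMs`), and the injectable `sleep` parameter is added so the example can run without real waiting.

```typescript
interface RetryOptions { maxRetries: number; backoffMs: number }

// Resolves after `ms` milliseconds; injectable so tests can skip real waiting.
type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < options.maxRetries) {
        // Exponential backoff: backoffMs, 2 * backoffMs, 4 * backoffMs, ...
        await sleep(options.backoffMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}

// A call that fails twice and then succeeds.
const delays: number[] = [];
let calls = 0;
const flaky = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error('transient failure');
  return 'ok';
};

// Record the requested delays instead of actually sleeping.
const outcome = withRetry(
  flaky,
  { maxRetries: 2, backoffMs: 1000 },
  async (ms) => { delays.push(ms); },
);
// outcome resolves to 'ok' on the third attempt, after delays of 1000 and 2000 ms.
```

This sketch retries on every error. In practice a gateway would likely retry only failures that are plausibly transient (timeouts, rate-limit responses, server errors) and pass client errors straight through.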
Cost Tracking

class CostTracker {
  // Cost records grouped by key (addRecord is not shown here).
  private usage: Map<string, CostRecord[]> = new Map();

  // Computes the cost of a completed call from its token usage and stores a record.
  async record(request: LLMRequest, response: LLMResponse): Promise<void> {
    const cost = this.calculateCost(
      request.model,
      response.usage.inputTokens,
      response.usage.outputTokens
    );

    const record: CostRecord = {
      userId: request.userId,
      model: request.model,
      inputTokens: response.usage.inputTokens,
      outputTokens: response.usage.outputTokens,
      cost,
      timestamp: Date.now(),
    };

    this.addRecord(record);
  }

  private calculateCost(model: string, inputTokens: number, outputTokens: number): number {
    // Per-token prices, written as the price per million tokens divided by 1,000,000.
    // Hard-coded here for illustration; provider prices change over time.
    const pricing: Record<string, { input: number; output: number }> = {
      'claude-sonnet-4-20250514': { input: 3.0 / 1000000, output: 15.0 / 1000000 },
      'gpt-4o': { input: 2.5 / 1000000, output: 10.0 / 1000000 },
    };

    // Note: models missing from the table are silently costed at zero.
    const price = pricing[model] || { input: 0, output: 0 };
    return inputTokens * price.input + outputTokens * price.output;
  }
}
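
A worked example helps check the arithmetic, and shows how recorded costs could feed a spending limit. The sketch below reuses the pricing table from the article and assumes the prices are in USD. The `BudgetTracker` class, its per-user limit, and the "block when exhausted" policy are hypothetical additions for illustration.

```typescript
// Per-token prices copied from the article's table (assumed to be USD).
const pricing: Record<string, { input: number; output: number }> = {
  'claude-sonnet-4-20250514': { input: 3.0 / 1000000, output: 15.0 / 1000000 },
  'gpt-4o': { input: 2.5 / 1000000, output: 10.0 / 1000000 },
};

function calculateCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = pricing[model] ?? { input: 0, output: 0 };
  return inputTokens * price.input + outputTokens * price.output;
}

// Hypothetical per-user budget: every user gets the same spending limit.
class BudgetTracker {
  private spent = new Map<string, number>();

  constructor(private readonly limitUsd: number) {}

  record(userId: string, cost: number): void {
    this.spent.set(userId, (this.spent.get(userId) ?? 0) + cost);
  }

  remaining(userId: string): number {
    return this.limitUsd - (this.spent.get(userId) ?? 0);
  }

  isExhausted(userId: string): boolean {
    return this.remaining(userId) <= 0;
  }
}

// 2,000 input tokens and 500 output tokens:
// 2000 * $3/1M + 500 * $15/1M = $0.006 + $0.0075 = $0.0135 (up to floating-point rounding)
const cost = calculateCost('claude-sonnet-4-20250514', 2000, 500);

const budget = new BudgetTracker(0.02);
budget.record('u-1', cost);
const stillAllowed = !budget.isExhausted('u-1'); // true: about $0.0065 remains

budget.record('u-1', cost);
const nowBlocked = budget.isExhausted('u-1');    // true: about $0.027 spent against a $0.02 limit
```

Because output tokens cost several times more than input tokens in this table, a short prompt with a long completion can cost more than a long prompt with a short answer.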
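Audit Logging

import * as fs from 'fs';

class AuditLogger {
  private logStream: fs.WriteStream;

  // Opens the log file in append mode so that existing entries are preserved.
  constructor(logPath: string) {
    this.logStream = fs.createWriteStream(logPath, { flags: 'a' });
  }

  // Writes one JSON object per line, stamped with the current time in ISO 8601 format.
  log(entry: AuditEntry): void {
    this.logStream.write(JSON.stringify({
      ...entry,
      timestamp: new Date().toISOString(),
    }) + '\n');
  }
}

interface AuditEntry {
  userId: string;
  action: string;
  resource: string;
  model?: string;
  inputTokens?: number;
  outputTokens?: number;
  cost?: number;
  duration: number;             // elapsed time of the call
  status: 'success' | 'error';
  error?: string;               // error message, set when status is 'error'
}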
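
An audit entry is only useful if it is written for failed calls as well as successful ones. The sketch below shows one way to guarantee that by wrapping the call. The `withAudit` helper, the `AuditSink` type, and the in-memory sink are assumptions for illustration; the entry fields are a subset of the `AuditEntry` interface above, with `duration` taken to be in milliseconds.

```typescript
interface AuditEntry {
  userId: string;
  action: string;
  resource: string;
  duration: number; // milliseconds (assumed unit)
  status: 'success' | 'error';
  error?: string;
}

// Receives one serialized entry per call; a real sink might write to a file or a log service.
type AuditSink = (line: string) => void;
type CallInfo = Pick<AuditEntry, 'userId' | 'action' | 'resource'>;

async function withAudit<T>(
  info: CallInfo,
  fn: () => Promise<T>,
  sink: AuditSink,
  now: () => number = Date.now,
): Promise<T> {
  const startedAt = now();

  // Builds the entry and writes it as a single JSON line.
  const emit = (status: AuditEntry['status'], error?: string): void => {
    const finishedAt = now();
    const entry: AuditEntry = { ...info, duration: finishedAt - startedAt, status };
    if (error !== undefined) entry.error = error;
    sink(JSON.stringify({ ...entry, timestamp: new Date(finishedAt).toISOString() }) + '\n');
  };

  let result: T;
  try {
    result = await fn();
  } catch (err) {
    // Log the failure, then rethrow so the caller still sees the original error.
    emit('error', err instanceof Error ? err.message : String(err));
    throw err;
  }
  emit('success');
  return result;
}

// Collect entries in memory and audit a call that fails.
const lines: string[] = [];
const sink: AuditSink = (line) => { lines.push(line); };

const failing = withAudit(
  { userId: 'u-1', action: 'invoke', resource: 'chat' },
  async (): Promise<string> => { throw new Error('provider timeout'); },
  sink,
).catch(() => 'recovered');
```

After `failing` settles, `lines` holds exactly one entry with `status` set to `error` and the provider's error message, even though the caller handled the exception.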
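Frequently Asked Questions (FAQ)

How does an AI Gateway differ from an API Gateway?
An AI Gateway is a specialized API Gateway, optimized for the characteristics of LLM calls: long-lived connections, streaming responses, and token-based billing.

How much latency does the gateway add?
Typically 5-15 ms, mostly from authentication and routing. Compared with an LLM call, which usually takes several seconds, this overhead is negligible.

How do you handle switching between multiple LLM providers?
Implement unified model routing at the gateway layer so that it selects the best provider automatically based on availability, cost, and latency.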
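
As a rough sketch of what selection by availability, cost, and latency could mean in code, the function below filters out unhealthy providers and then picks the one with the lowest weighted score. The `ProviderStats` fields, the scoring formula, the weights, and the provider names and numbers are all assumptions chosen for illustration.

```typescript
// Hypothetical health and pricing snapshot per provider.
interface ProviderStats {
  name: string;
  healthy: boolean;           // for example, the circuit breaker is not open
  costPer1kTokensUsd: number;
  p95LatencyMs: number;
}

// Relative importance of cost versus latency; lower scores are better.
interface Weights { cost: number; latency: number }

function pickProvider(candidates: ProviderStats[], weights: Weights): ProviderStats | undefined {
  // Availability is a hard filter, not part of the score.
  const healthy = candidates.filter((p) => p.healthy);
  if (healthy.length === 0) return undefined;

  const score = (p: ProviderStats): number =>
    weights.cost * p.costPer1kTokensUsd + weights.latency * (p.p95LatencyMs / 1000);

  return healthy.reduce((best, p) => (score(p) < score(best) ? p : best));
}

// Made-up numbers for three providers.
const candidates: ProviderStats[] = [
  { name: 'provider-a', healthy: false, costPer1kTokensUsd: 0.001, p95LatencyMs: 500 },
  { name: 'provider-b', healthy: true, costPer1kTokensUsd: 0.01, p95LatencyMs: 800 },
  { name: 'provider-c', healthy: true, costPer1kTokensUsd: 0.003, p95LatencyMs: 2000 },
];

// Latency-leaning weights: provider-b scores 1.8, provider-c scores 2.3.
const fastChoice = pickProvider(candidates, { cost: 100, latency: 1 });  // provider-b

// Cost-leaning weights: provider-b scores 10.8, provider-c scores 5.
const cheapChoice = pickProvider(candidates, { cost: 1000, latency: 1 }); // provider-c
```

In both cases `provider-a` is never chosen despite being cheapest and fastest, because it is marked unhealthy. When no provider is healthy the function returns `undefined`, and the gateway has to decide whether to fail the request or queue it.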
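Summary

An AI Gateway is a key component of enterprise AI infrastructure. By centralizing authentication and authorization, traffic management, cost tracking, and audit logging, it lets an organization use a range of AI services securely and under control.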