Background and Value

You can use a unified entry point to call multiple model formats (OpenAI / Claude / Gemini), achieving a consistent experience without modifying existing clients. The platform actively validates quotas, records critical logs, and supports proxies and multipliers, helping you operate securely in a controllable and observable manner.

Use Cases

  • Want to quickly switch to a unified entry point while continuing to use existing OpenAI / Claude / Gemini clients.
  • Need to set multipliers or model wildcards across different channels to balance cost and compatibility.
  • Care about call visibility: want per-call records of model, token, channel, latency, token usage, and so on.
  • Need to configure proxies in restricted networks or use notification capabilities to receive results promptly.

Core Principles

  • Compatibility First: Follows request formats of OpenAI /v1, Claude /claude/v1, and Gemini /gemini/:version. You only need to replace the Base URL and token to make calls.
  • Controllable Quotas: Each call validates and pre-deducts quota; insufficient quota results in immediate rejection. Successful calls confirm deduction, failed calls roll back pre-deduction, ensuring predictable consumption.
  • Flexible Routing: Model names support wildcards, and channels can be assigned multipliers or proxies. When needed, you can pin a route via a token suffix or a dedicated header, combining automatic distribution with precise control.
  • Transparent and Traceable: Logs record model names, token names, channels, request latency (request_time), prompt and completion tokens, etc. Statistics aggregate call counts, quotas, and latency by date, facilitating reconciliation and troubleshooting.
  • Special Capabilities Pass-through: Officially documented parameters are passed through unchanged, including o1/o3 reasoning-effort selection, Claude extended thinking, and Gemini search and code execution, so behavior stays consistent with the respective clients.
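As a sketch of the compatibility-first principle, the helpers below build requests in the three formats against a single gateway. The base URL and token are placeholders, and the concrete resource paths and auth headers (`chat/completions`, `messages`, `:generateContent`, `x-api-key`, `x-goog-api-key`) are assumptions drawn from the respective official APIs rather than from this document:

```python
# Hypothetical deployment values; replace with your own gateway and token.
GATEWAY = "https://gateway.example.com"
TOKEN = "sk-your-token"

def openai_request(model: str, messages: list) -> dict:
    """OpenAI format under /v1: only the Base URL and key change."""
    return {
        "url": f"{GATEWAY}/v1/chat/completions",
        "headers": {"Authorization": f"Bearer {TOKEN}"},
        "json": {"model": model, "messages": messages},
    }

def claude_request(model: str, messages: list) -> dict:
    """Claude format under /claude/v1, using the official x-api-key header."""
    return {
        "url": f"{GATEWAY}/claude/v1/messages",
        "headers": {"x-api-key": TOKEN, "anthropic-version": "2023-06-01"},
        "json": {"model": model, "max_tokens": 1024, "messages": messages},
    }

def gemini_request(model: str, contents: list, version: str = "v1beta") -> dict:
    """Gemini format under /gemini/:version, with the model in the path."""
    return {
        "url": f"{GATEWAY}/gemini/{version}/models/{model}:generateContent",
        "headers": {"x-goog-api-key": TOKEN},
        "json": {"contents": contents},
    }
```

Each helper returns keyword arguments compatible with an HTTP client, e.g. `requests.post(**openai_request("gpt-4o", messages))`.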

Key Usage Points

  • Before the first call, retrieve the model list to confirm available models, then use a test token to verify multiplier and wildcard configurations.
  • When sending requests, keep official fields intact and only replace the Base URL and token. Decide whether to specify a route based on needs; otherwise, use default automatic routing.
  • Monitor quota status: insufficient quota results in immediate rejection. Replenish in advance when continuous calls are needed.
  • After calls, review logs and aggregated statistics to verify models, tokens, latency, and consumption, identifying anomalies or reconciling accounts.

Limitations and Notes

  • Requests are routed only to enabled models and channels; calls to disabled models, or calls made with insufficient quota, are rejected.
  • Proxies and multipliers are configured at the channel level; whether they take effect depends on the channel the current route selects.
  • Special capabilities require passing official parameters; when corresponding parameters are not provided, default behavior applies.
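The last point can be illustrated with small payload helpers that attach the official special-capability fields. The field names (`reasoning_effort`, and the Claude `thinking` block) are taken from the official OpenAI and Anthropic APIs, not from this document; when a request omits them, the gateway applies default behavior:

```python
def with_reasoning_effort(payload: dict, effort: str = "medium") -> dict:
    """OpenAI format: reasoning models accept a `reasoning_effort` field
    ("low" / "medium" / "high") alongside the normal chat payload."""
    return {**payload, "reasoning_effort": effort}

def with_claude_thinking(payload: dict, budget_tokens: int = 1024) -> dict:
    """Claude format: extended thinking is enabled via the official
    `thinking` block with a token budget."""
    return {**payload, "thinking": {"type": "enabled", "budget_tokens": budget_tokens}}
```

Because these are pass-through fields, the payloads stay byte-for-byte compatible with the vendors' own clients.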