Get generation
Metering and Billing Information
Metering (Token Usage)
Metering data (e.g., token usage in the nativeTokens field) is returned synchronously with the request in the protocol’s native format:
- OpenAI Chat Completions protocol: returned in the response usage field
- OpenAI Responses protocol: returned in the response usage field
- Anthropic protocol: returned in the response usage field
- Vertex AI protocol: returned in the response usageMetadata field
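The per-protocol field names above can be captured in a small lookup. The following is a minimal sketch, assuming the response body has already been parsed into a dictionary; the protocol labels used as keys and the function name extract_usage are hypothetical and chosen only for illustration.

```python
from typing import Any, Mapping, Optional

# Response field that carries metering data for each protocol, as listed above.
# The dictionary keys are labels invented for this sketch, not official names.
USAGE_FIELD_BY_PROTOCOL = {
    "openai_chat_completions": "usage",
    "openai_responses": "usage",
    "anthropic": "usage",
    "vertex_ai": "usageMetadata",
}


def extract_usage(protocol: str, response_body: Mapping[str, Any]) -> Optional[Any]:
    """Return the raw usage object from a parsed response body, or None if absent."""
    field = USAGE_FIELD_BY_PROTOCOL[protocol]
    return response_body.get(field)
```

The returned object is left untouched because each protocol uses its own native usage schema.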
Billing (Billing & Costs)
Billing data (cost-related fields such as usage, ratingResponses, etc.) is not currently returned synchronously with the request. After the request completes, you must query it via this endpoint 3–5 minutes later.
We’re improving and upgrading our billing architecture to enable synchronous billing data in responses as soon as possible. Stay tuned!
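Because billing data only becomes available a few minutes after the request, a client may want to wait and then retry. The sketch below is one possible approach, not a documented client: fetch_generation is a hypothetical callable that performs the query and returns the parsed response, and the code assumes ratingResponses is missing or null until billing is ready, which this page does not state.

```python
import time
from typing import Any, Callable, Mapping, Optional


def wait_for_billing(
    fetch_generation: Callable[[], Mapping[str, Any]],
    initial_delay_s: float = 180.0,   # lower bound of the 3-5 minute window
    retry_interval_s: float = 30.0,   # assumed retry spacing, tune as needed
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Mapping[str, Any]]:
    """Poll until the record contains billing data, or give up and return None."""
    sleep(initial_delay_s)
    for attempt in range(max_attempts):
        record = fetch_generation()
        # Assumption: billing fields are absent or null until they are ready.
        if record.get("ratingResponses") is not None:
            return record
        if attempt < max_attempts - 1:
            sleep(retry_interval_s)
    return None
```

With the defaults, the last attempt happens 5 minutes after the call starts (180 s plus four 30 s waits), which matches the upper end of the stated window.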
Request params
Authorization Header Required
Header parameters:
- Name: Authorization
- Format: Bearer <API_KEY>
- Description: Your AGIPower API key
  - Pay As You Go API key: supports querying full metering and billing information
  - Subscription API key (prefixed with sk-ss-v1-): supports metering only; billing information is not supported
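As an illustration of the header format and the key-type distinction, here is a minimal sketch. The helper names build_auth_headers and supports_billing are hypothetical; only the Bearer format and the sk-ss-v1- prefix come from the description above.

```python
SUBSCRIPTION_KEY_PREFIX = "sk-ss-v1-"


def build_auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the 'Bearer <API_KEY>' format."""
    return {"Authorization": f"Bearer {api_key}"}


def supports_billing(api_key: str) -> bool:
    """Subscription keys return metering only, so billing fields are not expected."""
    return not api_key.startswith(SUBSCRIPTION_KEY_PREFIX)
```

A client could call supports_billing before waiting for billing data, since a Subscription API key will not return it.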
generate_id string Required
Query parameters:
The generation id returned by AGIPower API endpoints. You can obtain it from:
- Create Chat Completion - OpenAI Chat Completions protocol
- Create a Model Response - OpenAI Responses protocol
- Create Messages - Anthropic protocol
- Generate Content - Vertex AI protocol
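The query string for this request might be built as follows. This is a sketch only: the endpoint URL is not given on this page, so GENERATION_ENDPOINT is a placeholder that must be replaced, and build_generation_url is a hypothetical helper. The parameter name generate_id is taken from the description above.

```python
from urllib.parse import urlencode

# Placeholder only: the real endpoint URL is not stated in this section.
GENERATION_ENDPOINT = "https://api.example.invalid/generation"


def build_generation_url(generate_id: str, endpoint: str = GENERATION_ENDPOINT) -> str:
    """Append the required generate_id query parameter, URL-encoding its value."""
    if not generate_id:
        raise ValueError("generate_id is required")
    return f"{endpoint}?{urlencode({'generate_id': generate_id})}"
```

Encoding the value with urlencode keeps the URL valid even if a generation id contains reserved characters.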
Returns
api string
API type. Values vary by protocol:
- chat.completions - OpenAI Chat Completions protocol
- responses - OpenAI Responses protocol
- messages - Anthropic protocol
- generateContent - Vertex AI protocol
generationId string
The current generation id.
model string
Model ID.
createAt string
The time when the server received the inference request.
generationTime integer
Total duration of this inference from first token to completion, in milliseconds.
latency integer
Time to first token, in milliseconds.
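The two timing fields can be combined into a rough end-to-end figure. The sketch below assumes that latency (time until the first token) and generationTime (first token to completion) are back-to-back, non-overlapping intervals, so their sum approximates the total request time; that reading is an inference from the field descriptions, and timing_summary is a hypothetical helper.

```python
def timing_summary(latency_ms: int, generation_time_ms: int) -> dict:
    """Convert the millisecond timing fields to seconds and estimate the total."""
    # Assumption: the two intervals are contiguous, so their sum is the total.
    total_ms = latency_ms + generation_time_ms
    return {
        "time_to_first_token_s": latency_ms / 1000,
        "generation_s": generation_time_ms / 1000,
        "approx_total_s": total_ms / 1000,
    }
```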
nativeTokens object
Usage information consumed by this inference, including:
- completion_tokens integer - Tokens used for the completion
- prompt_tokens integer - Tokens used for the prompt
- total_tokens integer - Total tokens
- completion_tokens_details object - Completion token details
  - reasoning_tokens integer - Tokens used for reasoning
- prompt_tokens_details object - Prompt token details
  - cached_tokens integer - Cached tokens
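A sketch of how the nested nativeTokens object might be flattened for reporting. It assumes that cached_tokens is a subset of prompt_tokens and reasoning_tokens is a subset of completion_tokens, which is the usual convention for these field names but is not confirmed here; token_breakdown and its output keys are hypothetical.

```python
from typing import Any, Mapping


def token_breakdown(native_tokens: Mapping[str, Any]) -> dict:
    """Flatten nativeTokens, treating missing detail objects as zero counts."""
    completion_details = native_tokens.get("completion_tokens_details") or {}
    prompt_details = native_tokens.get("prompt_tokens_details") or {}
    prompt = native_tokens.get("prompt_tokens", 0)
    completion = native_tokens.get("completion_tokens", 0)
    cached = prompt_details.get("cached_tokens", 0)
    reasoning = completion_details.get("reasoning_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "cached_prompt_tokens": cached,
        # Assumption: cached tokens are counted inside prompt_tokens.
        "uncached_prompt_tokens": prompt - cached,
        "completion_tokens": completion,
        "reasoning_tokens": reasoning,
        # Assumption: reasoning tokens are counted inside completion_tokens.
        "visible_completion_tokens": completion - reasoning,
        "total_tokens": native_tokens.get("total_tokens", prompt + completion),
    }
```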
streamed boolean
Whether the response is streamed.
finishReason string
The reason the model stopped generating.
usage number
Credits consumed by this inference.
ratingResponses object
Billing response details, including:
- billAmount number - Billed amount
- discountAmount number - Discount amount
- originAmount number - Original amount
- priceVersion string - Price version
- ratingDetails array - Billing detail items, each containing:
  - billAmount number - Billed amount
  - discountAmount number - Discount amount
  - feeItemCode string - Fee item code (e.g., completion, prompt)
  - originAmount number - Original amount
  - rate number - Rate
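To see where the cost of a request comes from, the detail items can be grouped by fee item code. This is a minimal sketch: summarize_rating is a hypothetical helper, and the consistency check assumes the top-level billAmount equals the sum of the ratingDetails billAmount values, which this page does not state.

```python
import math
from collections import defaultdict
from typing import Any, Mapping


def summarize_rating(rating: Mapping[str, Any]) -> dict:
    """Sum billAmount per feeItemCode and compare with the top-level billAmount."""
    by_item = defaultdict(float)
    for detail in rating.get("ratingDetails") or []:
        by_item[detail["feeItemCode"]] += detail["billAmount"]
    details_total = sum(by_item.values())
    return {
        "bill_by_fee_item": dict(by_item),
        "details_total": details_total,
        # Assumption: the detail items add up to the top-level billed amount.
        "matches_bill_amount": math.isclose(
            details_total, rating["billAmount"], rel_tol=1e-9, abs_tol=1e-12
        ),
    }
```

Floating-point comparison with a tolerance is used here for simplicity; a client that needs exact accounting may prefer decimal arithmetic.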