POST /v1/chat
Create chat response
curl --request POST \
  --url https://api.caprioletech.com/v1/chat \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-5.5",
  "input": "Hello World!"
}
'
{
  "id": "49da8eb7-916b-43a3-ab02-442bc2841839",
  "model": "openai/gpt-5.5",
  "result": {
    "text": "Here is a short joke."
  },
  "usage": {
    "input_tokens": 228807,
    "output_tokens": 6,
    "total_tokens": 228813,
    "cached_tokens": 228224,
    "charged_tokens": 23412
  }
}
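
The curl call above can also be expressed in Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and `sk-example` is a placeholder key. The URL, headers, and body fields are taken from the curl example.

```python
import json
import urllib.request

# Endpoint URL copied from the curl example above.
API_URL = "https://api.caprioletech.com/v1/chat"


def build_chat_request(api_key: str, model: str, input_text: str) -> urllib.request.Request:
    """Assemble (but do not send) a POST request for the chat endpoint."""
    body = json.dumps({"model": model, "input": input_text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


# "sk-example" is a placeholder, not a working key.
request = build_chat_request("sk-example", "openai/gpt-5.5", "Hello World!")
```

Calling `send(request)` with a real key would perform the network call; building the request separately keeps it easy to inspect before anything is sent.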

Use this endpoint to send plain text input to a model and receive a plain text response.

Capriole AI web chat and the public API are separate product surfaces. In web chat, Claude Fast and Claude Thinking are product modes over the same Claude Opus 4.7 model: Fast is the lower-latency mode, and Thinking is the deeper-analysis mode. The public API does not expose these web chat modes as separate model IDs; for Claude requests through the API, use anthropic/claude-opus-4-7.

Authorizations

Authorization
string
header
required

Use an API key created on the Capriole AI page. Send it in the request header as Authorization: Bearer sk-....

Body

application/json
model
enum<string>
required

Public model identifier, as returned by GET /v1/models.

Available options:
openai/gpt-5.5,
openai/gpt-5.4-mini,
google/gemini-3.1-pro-preview,
google/gemini-3.5-flash,
anthropic/claude-opus-4-7
input
string
required

Plain text user input.

Enable provider-native web search when the selected model supports it.

temperature
number

Optional sampling temperature.

Required range: x >= 0
max_output_tokens
integer

Optional maximum number of output tokens.

max_retries
integer

Optional maximum number of provider retries.

Required range: x >= 0
timeout
number

Optional provider request timeout in seconds.
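
The body parameters above can be checked client-side before a request is sent. The following is a sketch under stated assumptions: `build_chat_payload` is a hypothetical helper, and the hardcoded model set is only a snapshot of the "Available options" list (GET /v1/models is the source of truth). It enforces the two documented ranges (temperature >= 0, max_retries >= 0) and omits optional fields that are not set.

```python
from typing import Optional

# Snapshot of the model IDs listed under "Available options" above.
KNOWN_MODELS = {
    "openai/gpt-5.5",
    "openai/gpt-5.4-mini",
    "google/gemini-3.1-pro-preview",
    "google/gemini-3.5-flash",
    "anthropic/claude-opus-4-7",
}


def build_chat_payload(
    model: str,
    input_text: str,
    *,
    temperature: Optional[float] = None,
    max_output_tokens: Optional[int] = None,
    max_retries: Optional[int] = None,
    timeout: Optional[float] = None,
) -> dict:
    """Build the JSON body, validating the documented constraints."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model}")
    if temperature is not None and temperature < 0:
        raise ValueError("temperature must be >= 0")
    if max_retries is not None and max_retries < 0:
        raise ValueError("max_retries must be >= 0")

    payload = {"model": model, "input": input_text}
    optional = {
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
        "max_retries": max_retries,
        "timeout": timeout,
    }
    # Only include optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_chat_payload("openai/gpt-5.5", "Hello World!", temperature=0.2)
```

Failing fast on a negative temperature or retry count avoids a round trip for a request that falls outside the documented ranges.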

Response

Chat completion response

id
string
required
model
string
required
result
object
required
usage
object
required
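
The four required response fields can be read into a small typed structure. The following is a sketch, not a published schema: `ChatResponse` and `parse_chat_response` are hypothetical names, and the `result.text` field and the usage keys are assumed from the example response above, since the reference lists `result` and `usage` only as objects.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("id", "model", "result", "usage")


@dataclass
class ChatResponse:
    id: str
    model: str
    text: str    # assumed to live at result.text, as in the example response
    usage: dict  # kept as a plain dict; its keys are not formally documented


def parse_chat_response(raw: str) -> ChatResponse:
    """Parse a JSON response body and check that the required fields are present."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    return ChatResponse(
        id=data["id"],
        model=data["model"],
        text=data["result"].get("text", ""),
        usage=data["usage"],
    )


# Sample body copied from the example response above.
sample = """
{
  "id": "49da8eb7-916b-43a3-ab02-442bc2841839",
  "model": "openai/gpt-5.5",
  "result": {"text": "Here is a short joke."},
  "usage": {
    "input_tokens": 228807,
    "output_tokens": 6,
    "total_tokens": 228813,
    "cached_tokens": 228224,
    "charged_tokens": 23412
  }
}
"""
response = parse_chat_response(sample)
```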