POST /v1/chat
Create chat response
curl --request POST \
  --url https://api.caprioletech.com/v1/chat \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-5.5",
  "input": "Hello World!"
}
'
{
  "id": "49da8eb7-916b-43a3-ab02-442bc2841839",
  "model": "openai/gpt-5.5",
  "result": {
    "text": "Here is a short joke."
  },
  "usage": {
    "input_tokens": 228807,
    "output_tokens": 6,
    "total_tokens": 228813,
    "cached_tokens": 228224,
    "charged_tokens": 23412
  }
}
Use this endpoint to send plain-text input to a model and receive a plain-text response.

Capriole AI web chat and the public API are separate product surfaces. In web chat, Claude Fast and Claude Thinking are product modes built on Claude Opus 4.8. The public API does not expose those web chat modes as separate model IDs. For Claude requests through the API, use anthropic/claude-opus-4-8. Existing integrations that use anthropic/claude-opus-4-7 remain supported.
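As a rough Python counterpart to the curl example above, the sketch below builds the same request with the standard library only. The function names build_chat_request and send are illustrative and not part of any official SDK, and the 30-second default timeout is an assumption.

```python
import json
import urllib.request

# Endpoint URL taken from the curl example on this page.
API_URL = "https://api.caprioletech.com/v1/chat"


def build_chat_request(token: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/chat request. Hypothetical helper."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body.

    The 30-second client-side timeout is an assumed default.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without touching the network. A call would look like send(build_chat_request(token, "anthropic/claude-opus-4-8", "Hello World!")).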

Authorizations

Authorization
string
header
required

Use an API key created on the Capriole AI page. Send it in the request header in the form Authorization: Bearer sk-...
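One common way to keep the key out of source code is to read it from the environment. The sketch below assumes a hypothetical environment variable named CAPRIOLE_API_KEY; neither that name nor the auth_headers helper comes from this documentation.

```python
import os


def auth_headers(env_var: str = "CAPRIOLE_API_KEY") -> dict:
    """Return the Authorization header built from an environment variable.

    CAPRIOLE_API_KEY is an assumed variable name, chosen for illustration.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your API key before calling the API")
    return {"Authorization": f"Bearer {key}"}
```

Failing early with a clear message when the variable is missing is usually easier to debug than an authentication error returned by the server.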

Body

application/json
model
enum<string>
required

Public model identifier, as returned by GET /v1/models.

Available options:
openai/gpt-5.5,
openai/gpt-5.4-mini,
anthropic/claude-opus-4-8,
anthropic/claude-opus-4-7,
google/gemini-3.1-pro-preview,
google/gemini-3.5-flash
input
string
required

Plain-text user input.

Enable provider-native web search when the selected model supports it.

temperature
number

Optional sampling temperature.

Required range: x >= 0
max_output_tokens
integer

Optional maximum number of output tokens.

max_retries
integer

Optional maximum number of provider retries.

Required range: x >= 0
timeout
number

Optional provider request timeout in seconds.
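The body fields above can be assembled and checked on the client before sending. The following is a minimal sketch, assuming the model list on this page is current (GET /v1/models is the authoritative source); build_payload is a hypothetical helper. The web search option is left out because its field name is not shown here.

```python
from typing import Optional

# Model IDs copied from the "Available options" list on this page.
KNOWN_MODELS = frozenset({
    "openai/gpt-5.5",
    "openai/gpt-5.4-mini",
    "anthropic/claude-opus-4-8",
    "anthropic/claude-opus-4-7",
    "google/gemini-3.1-pro-preview",
    "google/gemini-3.5-flash",
})


def build_payload(
    model: str,
    text: str,
    *,
    temperature: Optional[float] = None,
    max_output_tokens: Optional[int] = None,
    max_retries: Optional[int] = None,
    timeout: Optional[float] = None,
) -> dict:
    """Build the JSON body for POST /v1/chat, omitting unset optional fields."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"Unknown model ID: {model!r}")
    # Both ranges are documented as x >= 0.
    if temperature is not None and temperature < 0:
        raise ValueError("temperature must be >= 0")
    if max_retries is not None and max_retries < 0:
        raise ValueError("max_retries must be >= 0")

    payload = {"model": model, "input": text}
    optional = {
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
        "max_retries": max_retries,
        "timeout": timeout,
    }
    # Send only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Leaving unset fields out of the body, rather than sending null, lets the server apply its own defaults.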

Response

Chat completion response

id
string
required
model
string
required
result
object
required
usage
object
required
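To show how the four required response fields might be consumed, here is a small parsing sketch. The ChatResponse class and parse_chat_response function are hypothetical names. The result.text key and the usage keys are taken from the example response on this page, not from a documented schema, so treat them as assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatResponse:
    """Illustrative container for a POST /v1/chat response."""
    id: str
    model: str
    text: str
    usage: dict


def parse_chat_response(raw: str) -> ChatResponse:
    """Parse a response body and check that the required fields are present."""
    data = json.loads(raw)
    missing = [k for k in ("id", "model", "result", "usage") if k not in data]
    if missing:
        raise ValueError(f"Response is missing required fields: {missing}")
    return ChatResponse(
        id=data["id"],
        model=data["model"],
        # Assumed from the example response: the text lives under result.text.
        text=data["result"]["text"],
        usage=data["usage"],
    )
```

With the example response above, parse_chat_response(raw).text yields the model's reply, and usage["total_tokens"] equals input_tokens plus output_tokens (228807 + 6 = 228813).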