openai_responses
Generated from plugins/openai_responses/README.md.
OpenAI Responses API bundle for the AI Agent Platform.
Use it when you want an OpenAI-backed agent with:
- plain chat over the Responses API
- OpenAI API-key auth, ChatGPT auth, or automatic fallback between them
- streaming and non-streaming replies
- direct-send image attachments
- upload-before-send image/PDF attachments
- tool calling
- reasoning controls and surfaced reasoning metadata
- usage and cost metadata in message/UI output
- verbosity controls for supported models
- flex processing controls with configurable timeout and retries
- prompt caching controls with a session-owned cache key
- native conversation compaction for longer sessions
Most users can think of this as a single plugin bundle that adds OpenAI
Responses support to the app. The package name for installation is
openai_responses.
It can also be paired with the chatgpt-auth-app application plugin so
Responses requests can use a saved ChatGPT login against the ChatGPT-hosted
endpoint.
Install
From the repo root:
python -m pip install -e core/python
python -m pip install -e "plugins/openai_responses[dev]"
Environment
- OPENAI_API_KEY is required unless you pass api_key directly in provider config and use auth_mode: "api", or use auth_mode: "auto" with a valid saved ChatGPT login.
- For local use in this repo, keeping OPENAI_API_KEY in the repo root .env is the simplest setup.
- Do not commit real API keys into configs or docs.
The provider defaults to OPENAI_API_KEY automatically, so api_key can be
omitted from config when the env var is present.
When auth_mode: "chatgpt" is selected, or when auth_mode: "auto" resolves
to ChatGPT because a valid saved login exists, the provider reads a saved
credential file instead of requiring OPENAI_API_KEY.
When credentials and network access are available, API-key mode fetches the real OpenAI model list through the OpenAI Python SDK. For ChatGPT auth, the provider merges three Codex-backed sources for model discovery:
- checked-in manual fallback models
- a checked-in Codex models snapshot
- the latest live Codex models catalog from GitHub
If discovery is unavailable, the UI falls back to manual text entry for
model.
Quickstart
This is the smallest useful terminal/app-layer config when you want basic
Responses chat with the whole openai_responses bundle loaded from its plugin
descriptor.
{
  "plugin_cache_dir": "~/.crystal/cache/plugins",
  "plugins": [
    "path:/absolute/path/to/plugins/openai_responses"
  ],
  "providers": {
    "openai": {
      "provider": "openai_responses",
      "model": "gpt-5-mini",
      "timeout": 60
    }
  },
  "agents": {
    "default": {
      "provider": "openai"
    }
  }
}
Run the terminal app:
cd application/python
python -m agent_terminal_app --console --config /path/to/config_openai_responses.json
In config-aware UIs, the provider will try to populate the model selector
from the appropriate live catalog. If that lookup fails, you can still type a
model id manually.
ChatGPT auth quickstart
This is the smallest useful config for OpenAI Responses with ChatGPT login support.
It assumes:
- the chatgpt-auth-app application plugin is loaded
- the system-message feature is loaded
- top-level instructions should come from the system-message feature rather than a provider-only instructions config key
{
  "plugin_cache_dir": "${env:CONFIG_DIR}/.plugin_cache/plugins",
  "plugins": [
    "path:/absolute/path/to/plugins/openai_responses",
    "path:/absolute/path/to/plugins/chatgpt-auth-app",
    "path:/absolute/path/to/plugins/feature-system-message"
  ],
  "providers": {
    "openai": {
      "provider": "openai_responses",
      "model": "gpt-5",
      "auth_mode": "auto",
      "system_message_as_instructions": true,
      "strip_leading_system_or_developer_message": true,
      "system_message": "You are a precise assistant.",
      "timeout": 120
    }
  },
  "application": {
    "chatgpt_auth": {}
  },
  "agents": {
    "default": {
      "provider": "openai"
    }
  }
}
Then:
- Run the app or terminal client.
- Use the chatgpt_auth application actions:
  - Sign in with ChatGPT
  - Show ChatGPT login status
  - Cancel ChatGPT login
  - Log out of ChatGPT
- After login succeeds, auth_mode: "auto" will use ChatGPT-backed auth when available and fall back to API-key mode otherwise.
Full bundle example
Reasoning request shaping depends on the request-options feature, and tool usage depends on loading one or more tool plugins. This example shows the intended combination.
{
  "plugin_cache_dir": "~/.crystal/cache/plugins",
  "plugins": [
    "path:/absolute/path/to/plugins/openai_responses",
    "plugins.request_options_feature.RequestOptionsFeature",
    "plugins.math_tools.MathTools"
  ],
  "providers": {
    "openai": {
      "provider": "openai_responses",
      "model": "gpt-5-mini",
      "api_key": "${env:OPENAI_API_KEY}",
      "base_url": "https://api.openai.com/v1",
      "auth_mode": "api",
      "timeout": 120,
      "enable_flex_processing": false,
      "enable_fast_mode": false,
      "enable_fast_mode_for_api_requests": false,
      "flex_timeout_seconds": 300,
      "enable_flex_retries": true,
      "flex_max_retries": 2,
      "attachment_upload_mode": "upload_before_send",
      "enable_prompt_cache_key": true,
      "enable_prompt_cache_retention_24h": false,
      "reasoning_effort": "minimal",
      "reasoning_summary": "concise",
      "verbosity": "high"
    }
  },
  "agents": {
    "default": {
      "provider": "openai"
    }
  }
}
What this enables:
- OpenAI Responses chat
- reasoning controls and reasoning metadata on assistant messages
- tool schema injection and tool-call handling
- usage and cost metadata
- GPT-5 verbosity controls
- flex processing controls for supported OpenAI accounts
- Fast mode controls for ChatGPT-backed requests, with optional API opt-in
- prompt caching settings plus a regenerate-key session action
- a native compaction session action for longer conversations
ChatGPT auth mode
The bundle supports three auth modes:
- api
  - use api_key or OPENAI_API_KEY
  - use the normal OpenAI API base URL
- chatgpt
  - use saved ChatGPT login credentials
  - use the ChatGPT-hosted Responses endpoint
- auto
  - use ChatGPT when a valid saved login exists
  - otherwise fall back to API-key mode
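The selection rule above can be sketched as a small pure function. This is only an illustration: resolve_auth_mode and its has_valid_chatgpt_login argument are hypothetical names, and how the bundle actually validates a saved login is not described here.

```python
def resolve_auth_mode(auth_mode: str, has_valid_chatgpt_login: bool) -> str:
    """Return the effective auth mode, either "api" or "chatgpt".

    Hypothetical helper: the caller is assumed to have already checked
    whether a valid saved ChatGPT login exists.
    """
    if auth_mode not in ("api", "chatgpt", "auto"):
        raise ValueError(f"unsupported auth_mode: {auth_mode!r}")
    if auth_mode == "auto":
        # Prefer ChatGPT when a saved login is usable, otherwise use the API key.
        return "chatgpt" if has_valid_chatgpt_login else "api"
    return auth_mode
```

Explicit api and chatgpt values pass through unchanged; only auto depends on the saved login.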
Relevant keys:
- auth_mode
- chatgpt_credentials_path
- chatgpt_base_url
- src/openai_responses_plugins/data/openai_chatgpt_manual_models.json
  - checked-in ChatGPT-only fallback models and estimate-only pricing
  - pricing entries may define pricing, service_tiers, or both
  - used only when a model is missing from upstream sources
- src/openai_responses_plugins/data/openai_codex_models.json
  - checked-in offline snapshot of the public Codex model catalog
  - refresh it with python scripts/refresh_openai_responses_models.py
  - used when live Codex fetching fails or when you want offline coverage from a recent repo update
- chatgpt_account_id
- chatgpt_client_version
- strip_leading_system_or_developer_message
ChatGPT model-discovery precedence:
- live Codex catalog overrides the checked-in Codex snapshot
- checked-in Codex snapshot overrides manual fallback models
- manual fallback models only fill gaps
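One way to picture this precedence is a layered dictionary merge, applied from lowest to highest priority. The sketch below assumes each source is a mapping from model id to a metadata dict and that a higher-priority entry replaces a lower one wholesale; merge_model_catalogs is a hypothetical name, not the bundle's API.

```python
from typing import Dict, Mapping, Optional

ModelCatalog = Mapping[str, dict]


def merge_model_catalogs(
    manual: ModelCatalog,
    snapshot: ModelCatalog,
    live: Optional[ModelCatalog] = None,
) -> Dict[str, dict]:
    """Merge catalogs so later sources override earlier ones per model id."""
    merged: Dict[str, dict] = {}
    # Order matters: manual fallback first, then snapshot, then live catalog.
    for source in (manual, snapshot, live or {}):
        merged.update(source)
    return merged
```

Passing live=None models the case where the live fetch failed, so the snapshot and manual entries are all that remain.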
Defaults:
- ChatGPT base URL defaults to https://chatgpt.com/backend-api/codex
- saved credentials default to CONFIG_DIR/auth/chatgpt-auth.json when the auth feature can discover CONFIG_DIR from runtime env
Instructions path for ChatGPT mode
For the ChatGPT endpoint, the recommended shape is:
- set system_message_as_instructions: true on the provider config
- let the system-message feature compile the prompt into state["request"]["instructions"]
- avoid a provider-only instructions config key
This mirrors how Codex-style requests send top-level instructions.
Migrating older sessions
If older session histories contain a leading provider-native system or
developer message from before top-level instructions were used, enable:
strip_leading_system_or_developer_message: true
When enabled, the provider checks the first native message before sending the
request and drops it only if its role is system or developer.
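A minimal sketch of that rule, assuming native history is a list of dicts with a role key (the function name is hypothetical):

```python
from typing import List


def strip_leading_instruction_message(messages: List[dict]) -> List[dict]:
    """Drop the first message only when its role is system or developer."""
    if messages and messages[0].get("role") in ("system", "developer"):
        return messages[1:]
    # Return a copy so callers never mutate the stored history by accident.
    return list(messages)
```

Only the leading message is considered; a system or developer message later in the history is left in place.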
Features
Tools
Load one or more tool plugins, and the bundle will pass their schemas to the Responses API and surface tool calls/results in the normal tool loop.
Image attachments
The bundle supports two attachment paths for images:
- direct send through Responses input_image items
- upload-before-send through the OpenAI Files API, followed by Responses file_id references
Direct-send image shapes:
- multipartContent[].content.url with http:// or https:// image URLs
- multipartContent[].content.url with data:image/... URLs
- multipartContent[].content.data_base64 plus mime_type, which the provider converts into a data: URL before sending
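The three direct-send shapes reduce to one Responses input_image item. The sketch below is an assumption about how that normalization could look; to_input_image is a hypothetical helper, and content stands for one multipartContent[].content entry.

```python
def to_input_image(content: dict) -> dict:
    """Build a Responses input_image item from one attachment content dict."""
    url = content.get("url")
    if url is None:
        # Inline bytes: wrap the base64 payload in a data: URL.
        url = f"data:{content['mime_type']};base64,{content['data_base64']}"
    return {"type": "input_image", "image_url": url}
```

http(s) and data: URLs are passed through as-is; only the data_base64 shape needs conversion.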
When attachment_upload_mode: "upload_before_send" is enabled, inline image
attachments are uploaded first and then sent as input_image items using
OpenAI file references.
PDF attachments
PDF attachments are supported only when
attachment_upload_mode: "upload_before_send" is enabled.
Supported PDF input shapes:
- multipartContent[].content.url with data:application/pdf;base64,...
- multipartContent[].content.data_base64 plus mime_type: "application/pdf"
In this mode, the provider uploads the PDF through the OpenAI Files API and
sends the Responses request using an input_file item with a file_id
reference.
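As a rough illustration of the two PDF shapes, the sketch below decodes either form to raw bytes (ready for an upload step, which is omitted) and builds the input_file item once a file id is known. Both helper names are hypothetical.

```python
import base64

PDF_DATA_URL_PREFIX = "data:application/pdf;base64,"


def decode_pdf_attachment(content: dict) -> bytes:
    """Return raw PDF bytes from either supported input shape."""
    url = content.get("url")
    if url is not None:
        if not url.startswith(PDF_DATA_URL_PREFIX):
            raise ValueError("expected a data:application/pdf;base64 URL")
        payload = url[len(PDF_DATA_URL_PREFIX):]
    else:
        if content.get("mime_type") != "application/pdf":
            raise ValueError("expected mime_type application/pdf")
        payload = content["data_base64"]
    return base64.b64decode(payload)


def to_input_file(file_id: str) -> dict:
    """Build the Responses input_file item that references an uploaded file."""
    return {"type": "input_file", "file_id": file_id}
```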
Explicit upload actions are still follow-up work for Stage 3.
When the package is loaded through its plugin path, it also exposes the
openai_responses_attachments provider extension. That extension:
- advertises upload UI for supported attachment types
- exposes store_attachment, delete_attachment, and download_attachment actions backed by the application-owned session asset store
- lets clients upload an attachment once, send it later via asset_ref, and download the same stored asset again from message history
Reasoning
Reasoning controls are available through:
- reasoning_effort
- reasoning_summary
To use those options, also load
plugins.request_options_feature.RequestOptionsFeature.
When the reasoning extension is active, it also requests
include: ["reasoning.encrypted_content"] so OpenAI reasoning items can be
preserved in raw metadata for future native replay.
Usage
Usage metadata is attached to assistant messages, including formatted prompt, completion, cached-token, and cost fields. This is useful in the terminal UI and other clients that render message footers or status bars.
Prompt caching
Prompt caching controls are available through:
- enable_prompt_cache_key
- prompt_cache_key
- enable_prompt_cache_retention_24h
When prompt cache key support is enabled:
- new sessions automatically receive a generated session-owned key
- older sessions missing a key are repaired on first request
- forked sessions keep the same key by default
- users can manually edit the key through normal session settings
- users can regenerate the key through the regenerate_prompt_cache_key session action
By default, prompt cache key support is enabled when this extension is loaded.
Set enable_prompt_cache_key: false if you want to turn off automatic
session-owned key generation and request injection.
When the effective auth mode selects the ChatGPT endpoint:
- prompt_cache_key behavior remains available
- prompt_cache_retention: "24h" is suppressed automatically even if enable_prompt_cache_retention_24h is true
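The key lifecycle and the ChatGPT-endpoint rule can be sketched together. This assumes the session is a plain dict that stores prompt_cache_key and that keys are random hex strings; the real key format and session storage are not specified here, and build_prompt_cache_params is a hypothetical name.

```python
import uuid


def build_prompt_cache_params(session: dict, config: dict, effective_auth_mode: str) -> dict:
    """Return prompt-caching request parameters for one Responses call."""
    params = {}
    if config.get("enable_prompt_cache_key", True):
        # Reuse the session-owned key, or repair a session that lacks one.
        key = session.get("prompt_cache_key") or uuid.uuid4().hex
        session["prompt_cache_key"] = key
        params["prompt_cache_key"] = key
    wants_retention = config.get("enable_prompt_cache_retention_24h", False)
    if wants_retention and effective_auth_mode != "chatgpt":
        # Retention is suppressed when the ChatGPT endpoint is selected.
        params["prompt_cache_retention"] = "24h"
    return params
```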
Flex processing
Flex processing controls are available through:
- enable_flex_processing
- flex_timeout_seconds
- enable_flex_retries
- flex_max_retries
When flex processing is enabled, the extension writes request overrides so the
provider can apply service_tier: "flex", the configured timeout, and the
configured retry policy on both streaming and non-streaming Responses calls.
When the effective auth mode selects the ChatGPT endpoint, flex processing is disabled automatically even if the config flags are set. The ChatGPT-hosted endpoint does not support this OpenAI API feature.
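A hedged sketch of how those overrides might be derived from config. The override key names (timeout, max_retries) and the function name are assumptions for illustration; the defaults mirror the documented ones (300 seconds, retries on, 2 retries).

```python
def build_flex_overrides(config: dict, effective_auth_mode: str) -> dict:
    """Return per-request overrides for flex processing, or {} when inactive."""
    if effective_auth_mode == "chatgpt":
        # The ChatGPT-hosted endpoint does not support flex processing.
        return {}
    if not config.get("enable_flex_processing", False):
        return {}
    retries_enabled = config.get("enable_flex_retries", True)
    return {
        "service_tier": "flex",
        "timeout": config.get("flex_timeout_seconds", 300),
        "max_retries": config.get("flex_max_retries", 2) if retries_enabled else 0,
    }
```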
Verbosity
verbosity controls the Responses text.verbosity option for supported
models. In the current implementation, that is enabled for models whose id
contains gpt-5.
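The model gate described above amounts to a substring check before setting text.verbosity. A minimal sketch, with hypothetical helper names and a plain dict standing in for the outgoing request:

```python
from typing import Optional


def supports_verbosity(model_id: str) -> bool:
    """Mirror the documented rule: the model id contains gpt-5."""
    return "gpt-5" in model_id


def apply_verbosity(request: dict, model_id: str, verbosity: Optional[str]) -> dict:
    """Set text.verbosity on the request only for supported models."""
    if verbosity and supports_verbosity(model_id):
        request.setdefault("text", {})["verbosity"] = verbosity
    return request
```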
Native compaction
The bundle includes a session action that compacts provider-native history for future Responses turns. This is intended for context maintenance, not for producing a human-readable summary message.
Configuration reference
Configuration is a flat dict shared across the provider and enabled extensions.
Provider keys
- provider: must be openai_responses in app-layer provider config
- model: required model id
- auth_mode: optional, one of api, chatgpt, auto; defaults to api
- api_key: optional if OPENAI_API_KEY is set or ChatGPT auth is used
- base_url: optional, defaults to https://api.openai.com/v1
- api_base_url: optional explicit base URL for API-key mode
- chatgpt_base_url: optional, defaults to https://chatgpt.com/backend-api/codex
- chatgpt_credentials_path: optional path to saved ChatGPT credentials
- chatgpt_account_id: optional explicit ChatGPT account id override
- chatgpt_client_version: optional, used for ChatGPT /models discovery
- strip_leading_system_or_developer_message: optional compatibility flag for migrated sessions
- compaction_model: optional model id for native compaction; leave empty or omit it to use model
- use_chatgpt_auth_for_compaction: optional, defaults to true; set to false to force native compaction onto the API endpoint
- use_compaction_placeholder_transcript: optional, defaults to false; set to true to show compacted native ranges as a single visible placeholder message instead of rebuilding the compacted native output directly
- timeout: optional request timeout in seconds, defaults to 60
Attachment keys
- attachment_upload_mode: direct or upload_before_send; defaults to direct
Use upload_before_send when you want the provider to upload inline image or
PDF attachment data through the OpenAI Files API before the Responses request.
PDF support is only available in this mode.
Flex processing keys
- enable_flex_processing: enable service_tier: "flex" for Responses calls
- flex_timeout_seconds: per-request timeout to apply when flex is enabled; defaults to 300
- enable_flex_retries: enable request-scoped retries for flex calls; defaults to true
- flex_max_retries: max retry count when flex retries are enabled; defaults to 2
Note: flex processing is ignored automatically when the effective auth mode selects the ChatGPT endpoint.
Fast mode
Fast mode controls are available through:
- enable_fast_mode: enable service_tier: "priority" for ChatGPT-backed Responses calls
- enable_fast_mode_for_api_requests: config-only opt-in that lets enable_fast_mode also apply service_tier: "priority" to API-backed Responses calls
By default, Fast mode applies only to ChatGPT-backed requests. When
enable_fast_mode_for_api_requests: true is set in config, the same Fast mode
toggle also applies service_tier: "priority" to API-backed requests. The
usage footer renders tier: priority when OpenAI reports that tier, and cost
estimation uses priority-tier pricing when the checked-in or manual pricing
data defines it.
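The interaction between the two Fast mode flags can be summarized in a few lines. This is a sketch under the assumption that the effective auth mode is already known; fast_mode_service_tier is a hypothetical name.

```python
from typing import Optional


def fast_mode_service_tier(config: dict, effective_auth_mode: str) -> Optional[str]:
    """Return "priority" when Fast mode applies to this request, else None."""
    if not config.get("enable_fast_mode", False):
        return None
    if effective_auth_mode == "chatgpt":
        return "priority"
    # API-backed requests need the extra config-only opt-in.
    if config.get("enable_fast_mode_for_api_requests", False):
        return "priority"
    return None
```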
Benchmark script
The repo includes a benchmark helper at
scripts/run_openai_responses_tier_matrix.py.
Useful options:
- --model or --models
- --repeat
- --execution-mode sequential|parallel
- --run-scope api|chatgpt|both
- --desired-input-tokens
- --desired-output-tokens
The script saves a text artifact under artifacts/ containing:
- the benchmark configuration
- one section per model/tier/repeat run
- actual usage and duration metrics
- aggregate summaries per model/tier row
Reasoning keys
These require plugins.request_options_feature.RequestOptionsFeature to be
loaded.
- reasoning_effort: model-specific effort value. Current UI options are:
  - GPT-5 models: minimal, low, medium, high
  - GPT-5.1 models: none, low, medium, high
  - GPT-5.2 and GPT-5.4 models: none, low, medium, high, xhigh
  - GPT-5 pro: high
  - unknown models: fallback UI shows all documented values
- reasoning_summary: one of auto, concise, detailed
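The per-family options above can be expressed as a lookup. The id-matching rules in this sketch (prefixes such as gpt-5.1 and gpt-5-pro) are assumptions made for illustration only; how the UI actually maps model ids to families is not documented here.

```python
from typing import List

ALL_EFFORTS = ["none", "minimal", "low", "medium", "high", "xhigh"]

# Assumed id prefixes, checked in order; more specific families come first.
_FAMILY_EFFORTS = [
    ("gpt-5-pro", ["high"]),
    ("gpt-5.1", ["none", "low", "medium", "high"]),
    ("gpt-5.2", ["none", "low", "medium", "high", "xhigh"]),
    ("gpt-5.4", ["none", "low", "medium", "high", "xhigh"]),
]


def reasoning_effort_options(model_id: str) -> List[str]:
    """Return the selectable reasoning_effort values for a model id."""
    for prefix, options in _FAMILY_EFFORTS:
        if model_id.startswith(prefix):
            return list(options)
    if model_id == "gpt-5" or model_id.startswith("gpt-5-"):
        return ["minimal", "low", "medium", "high"]
    # Unknown models fall back to every documented value.
    return list(ALL_EFFORTS)
```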
Assistant messages always expose the frontend-friendly field
metadata.reasoning. OpenAI-specific raw reasoning data is preserved under
metadata.openai_responses_reasoning.
Verbosity keys
- verbosity: one of low, medium, high
- show_verbosity_ui: set to false to hide the verbosity selector in UI
Verbosity only activates when the selected model is tagged with
supports_verbosity. In the current implementation, that tag is added for
models whose id contains gpt-5.
Usage keys
- pricing_snapshot_path: optional override for the checked-in pricing snapshot
- show_usage: set to false to hide usage footer/status UI elements
- show_estimated_cached_tokens: set to true to show ChatGPT estimated cached-token footers
- show_openai_auth_mode: set to true to show the effective auth mode (API or ChatGPT) in usage footers
ChatGPT-backed Responses usage currently estimates cached tokens using
turn-aware heuristics. That estimate is kept separate from the raw provider
usage payload and is used for ChatGPT-mode pricing instead of any
backend-reported cached-token field. By default it is not displayed; enable
show_estimated_cached_tokens if you want the footer.
Prompt caching keys
- enable_prompt_cache_key: defaults to true; set to false to disable automatic session-owned prompt cache keys
- prompt_cache_key: optional manual cache key override shown in normal session settings UI
- enable_prompt_cache_retention_24h: set to true to send prompt_cache_retention: "24h"
Note: enable_prompt_cache_retention_24h is ignored automatically when the
effective auth mode selects the ChatGPT endpoint.
Tools
Tool schemas come from loaded tool plugins such as plugins.math_tools.MathTools.
No extra config key is required beyond loading tools in the application.
Native compaction
The bundle exposes a session action with id compact_native_history that uses
the native compaction API, rewrites provider-native history, and then rebuilds
the visible transcript directly from the compacted native output.
If you prefer the older placeholder-style transcript behavior, enable:
use_compaction_placeholder_transcript: true
When enabled, compacted native segments are shown as a single visible placeholder message and are expanded back into native compaction artifacts on future requests.
The bundle also exposes a session action that toggles this view mode:
- Collapse compacted messages
- Expand compacted messages
The action changes the visible projection only. Retained native history stays canonical and is rebuilt using the updated session override.
By default, native compaction follows the same effective auth mode as normal
Responses requests. If use_chatgpt_auth_for_compaction is enabled, a session
running in ChatGPT mode, or auto mode that selected ChatGPT, will attempt
native compaction against the ChatGPT-backed endpoint too. If that backend does
not support native compaction, the action fails with a clear endpoint-specific
error. Set use_chatgpt_auth_for_compaction: false to force compaction onto
the normal API endpoint instead.
By default, compaction uses the same model as normal requests. Set
compaction_model to use a different model for the compaction API call, for
example:
{
  "model": "gpt-5.5",
  "compaction_model": "gpt-5.4-mini"
}
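Put together, the model fallback and the auth-mode rule for compaction might look like the sketch below. resolve_compaction_target is a hypothetical helper, and the returned dict shape is an assumption chosen for readability.

```python
def resolve_compaction_target(config: dict, effective_auth_mode: str) -> dict:
    """Pick the model and endpoint family for a native compaction call."""
    # An empty or missing compaction_model falls back to the normal model.
    model = config.get("compaction_model") or config["model"]
    use_chatgpt = config.get("use_chatgpt_auth_for_compaction", True)
    if effective_auth_mode == "chatgpt" and use_chatgpt:
        auth_mode = "chatgpt"
    else:
        auth_mode = "api"
    return {"model": model, "auth_mode": auth_mode}
```

With the config shown above and an API-key session, this returns gpt-5.4-mini on the API endpoint.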
The compacted native output shape is model-dependent. Some models return visible compacted assistant messages alongside opaque compaction artifacts; others return only the retained visible user messages plus opaque compaction state. Both shapes are valid as long as future requests continue from the compacted native history.
The action no longer accepts a freeform instructions field. When the backend
requires top-level instructions, native compaction now uses the same
config-driven system-message pipeline as normal requests:
- enable system_message_as_instructions: true
- set system_message as usual
- enable strip_leading_system_or_developer_message: true if older sessions still contain a persisted leading system or developer message
When system_message_as_instructions: false, native compaction also follows
the same request-time message-injection path as a normal request:
- if state["request"]["messages"] is the effective outbound message set, compaction uses that rendered message list as its compact input
- _metadata is stripped from outbound compact input
- if the compacted output preserves the leading system or developer message, the stored native history restores the original retained version of that message, including template-form content and _metadata
- when a retained compaction_summary item is present, the visible rebuilt transcript surfaces it as a clear assistant placeholder message instead of letting that native item drift into unrelated later message ownership
Prompt caching session action
The bundle exposes a session action with id regenerate_prompt_cache_key that
replaces the current session-owned prompt cache key without changing the
visible transcript.
Capabilities and caveats
- API-key mode supports both non-streaming and streaming Responses requests.
- ChatGPT mode is streaming-only in the current implementation. Non-streaming requests fail with a clear error.
- The provider retains Responses-native history so future turns can replay native items rather than flattening everything to plain chat text.
- Reasoning requests include reasoning.encrypted_content so raw reasoning can be preserved even when the model does not expose readable reasoning text.
- Verbosity support is model-gated and currently follows the provider tag check for gpt-5 models.
- ChatGPT-backed requests use top-level instructions and store: false.
- Flex processing and prompt-cache retention are automatically disabled when the ChatGPT endpoint is selected.
- ChatGPT /models discovery may still require client_version; if discovery is unavailable, pin model explicitly in config.
- Native compaction is intentionally different from human-readable summary insertion. It preserves provider-native compacted artifacts for future Responses requests instead of replacing history with a plain assistant summary.
Native compaction behavior
Native compaction is meant for Responses-native context maintenance, not transcript summarization.
- It can compact the full native history or a visible prefix.
- It uses the OpenAI Python SDK rather than application-layer summary logic.
- It uses compaction_model when configured, otherwise it uses the normal request model.
- It rebuilds visible messages after compaction directly from the compacted native output, whose visible assistant/user shape may vary by model.
- If use_compaction_placeholder_transcript: true is enabled, it instead shows the compacted region as a single placeholder message in the visible transcript while still preserving the underlying retained native compaction artifacts.
- Future requests continue from the compacted native history.
Bundle layout
If you are extending or debugging the bundle, the responsibilities are split like this:
- provider: transport, SDK calls, Responses-native conversion, and native history replay
- attachments: attachment upload policy/UI plus PDF composer UI
- tools: tool schema injection and tool-call reconstruction
- reasoning: reasoning request shaping and reasoning metadata
- usage: usage formatting and pricing metadata
- verbosity: text.verbosity request shaping
- prompt caching: prompt_cache_key / retention request shaping and session-key lifecycle
- ChatGPT auth feature: saved-credential loading plus api / chatgpt / auto auth-mode selection
- native compaction: session action for provider-native compaction
Implementation classes registered by the bundle:
- openai_responses_plugins.openai_responses_provider.OpenAIResponsesProvider
- openai_responses_plugins.openai_responses_attachments_extension.OpenAIResponsesAttachmentsExtension
- openai_responses_plugins.openai_responses_chatgpt_auth_feature.OpenAIResponsesChatGPTAuthFeature
- openai_responses_plugins.openai_responses_flex_processing_extension.OpenAIResponsesFlexProcessingExtension
- openai_responses_plugins.openai_responses_fast_mode_extension.OpenAIResponsesFastModeExtension
- openai_responses_plugins.openai_responses_native_compaction_extension.OpenAIResponsesNativeCompactionExtension
- openai_responses_plugins.openai_responses_prompt_caching_extension.OpenAIResponsesPromptCachingExtension
- openai_responses_plugins.openai_responses_tools_extension.OpenAIResponsesToolsExtension
- openai_responses_plugins.openai_responses_reasoning_extension.OpenAIResponsesReasoningExtension
- openai_responses_plugins.openai_responses_usage_extension.OpenAIResponsesUsageExtension
- openai_responses_plugins.openai_responses_verbosity_extension.OpenAIResponsesVerbosityExtension
Troubleshooting
- Authentication errors: check OPENAI_API_KEY in the repo root .env, set api_key in config, or verify the saved ChatGPT credential file exists when using chatgpt / auto.
- ChatGPT mode says streaming is required: use the streaming request path, or switch auth_mode to api / auto.
- ChatGPT requests behave oddly on older sessions: enable system_message_as_instructions: true and strip_leading_system_or_developer_message: true.
- Reasoning controls do nothing: make sure plugins.request_options_feature.RequestOptionsFeature is loaded.
- Verbosity control does not appear or has no effect: verify the model is tagged with supports_verbosity; today that means a gpt-5 model id.
- Tool calls never appear: load at least one tool plugin and use a model/prompt that will actually call tools.
- Native compaction action is missing: make sure the active provider is openai_responses and the bundle was loaded from path:/absolute/path/to/plugins/openai_responses or equivalent.
Tests
For contributors and local validation:
Run the fast default package suite:
pytest plugins/openai_responses/tests -q
This package defaults to -m 'not integration and not manual_integration'.
Run hosted API integration tests:
pytest plugins/openai_responses/tests -m 'integration and api and not slow_integration and openai' -q
Run manual ChatGPT-auth integration tests separately:
pytest plugins/openai_responses/tests -m 'manual_integration' -q
The OpenAI Responses test suite bootstraps the repo root .env during local
use in this repo and does not overwrite already-exported environment variables.
Hosted API tests typically require OPENAI_API_KEY. Manual integration tests
also require saved ChatGPT auth state.
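The non-overwriting .env behavior can be illustrated with a tiny loader. This is not the test suite's actual bootstrap code, only a sketch of the same idea; load_env_text is a hypothetical name and the parser handles just simple NAME=value lines.

```python
from typing import MutableMapping


def load_env_text(text: str, environ: MutableMapping[str, str]) -> None:
    """Apply NAME=value lines to environ without replacing existing values."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # setdefault keeps any value that was already exported.
        environ.setdefault(name.strip(), value.strip().strip("\"'"))
```

In real use environ would be os.environ; a plain dict works the same way for experimentation.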
License
Copyright 2026 Dynamic Programming Solutions Kft.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.