argo-openai-proxy

This project is a proxy application that forwards requests to an ARGO API and optionally converts the responses to be compatible with OpenAI's API format. It can be used in conjunction with autossh-tunnel-dockerized or other secure connection tools.

For detailed information, please refer to documentation at argo-proxy ReadtheDocs page

TL;DR

pip install argo-proxy # install the package
argo-proxy # run the proxy

Function calling is available for Chat Completions endpoint starting from v2.7.5. Try with pip install "argo-proxy>=2.7.5"

NOTICE OF USAGE

The machine or server making API calls to Argo must be connected to the Argonne internal network or through a VPN on an Argonne-managed computer if you are working off-site. Your instance of the argo proxy should always be on-premise at an Argonne machine. The software is provided "as is," without any warranties. By using this software, you accept that the authors, contributors, and affiliated organizations will not be liable for any damages or issues arising from its use. You are solely responsible for ensuring the software meets your requirements.

Notice of Usage
Deployment
Usage
Bug Reports and Contributions

Deployment

Prerequisites

Python 3.10+ is required.
It is recommended to use conda, mamba, or pipx, etc., to manage an exclusive environment.
Conda/Mamba Download and install from: https://conda-forge.org/download/
pipx Download and install from: https://pipx.pypa.io/stable/installation/

Install dependencies:

PyPI current version:

pip install argo-proxy

To upgrade:

argo-proxy --version  # Display current version
# Check against PyPI version
pip install argo-proxy --upgrade

or, if you decide to use dev version (make sure you are at the root of the repo cloned):

pip install .

Configuration File

If you don't want to manually configure it, the First-Time Setup will automatically create it for you.

The application uses config.yaml for configuration. Here's an example:

argo_embedding_url: "https://apps.inside.anl.gov/argoapi/api/v1/resource/embed/"
argo_stream_url: "https://apps-dev.inside.anl.gov/argoapi/api/v1/resource/streamchat/"
argo_url: "https://apps-dev.inside.anl.gov/argoapi/api/v1/resource/chat/"
port: 44497
host: 0.0.0.0
user: "your_username" # set during first-time setup
verbose: true # can be changed during setup

Running the Application

To start the application:

argo-proxy [config_path]

Without arguments: search for config.yaml under:
- current directory
- ~/.config/argoproxy/
- ~/.argoproxy/ The first one found will be used.
With path: uses specified config file, if exists. Otherwise, falls back to default search.
```
argo-proxy /path/to/config.yaml
```
With --edit flag: opens the config file in the default editor for modification.

First-Time Setup

When running without an existing config file:

The script offers to create config.yaml from config.sample.yaml
Automatically selects a random available port (can be overridden)
Prompts for:
- Your username (sets user field)
- Verbose mode preference (sets verbose field)
Validates connectivity to configured URLs
Shows the generated config in a formatted display for review before proceeding

Example session:

$ argo-proxy
No valid configuration found.
Would you like to create it from config.sample.yaml? [Y/n]:
Creating new configuration...
Use port [52226]? [Y/n/<port>]:
Enter your username: your_username
Enable verbose mode? [Y/n]
Created new configuration at: /home/your_username/.config/argoproxy/config.yaml
Using port 52226...
Validating URL connectivity...
Current configuration:
--------------------------------------
{
    "host": "0.0.0.0",
    "port": 52226,
    "user": "your_username",
    "argo_url": "https://apps-dev.inside.anl.gov/argoapi/api/v1/resource/chat/",
    "argo_stream_url": "https://apps-dev.inside.anl.gov/argoapi/api/v1/resource/streamchat/",
    "argo_embedding_url": "https://apps.inside.anl.gov/argoapi/api/v1/resource/embed/",
    "verbose": true
}
--------------------------------------
# ... proxy server starting info display ...

Configuration Options Reference

Option	Description	Default
`argo_embedding_url`	Argo Embedding API URL	Prod URL
`argo_stream_url`	Argo Stream API URL	Dev URL (for now)
`argo_url`	Argo Chat API URL	Dev URL (for now)
`host`	Host address to bind the server to	`0.0.0.0`
`port`	Application port (random available port selected by default)	randomly assigned
`user`	Your username	(Set during setup)
`verbose`	Debug logging	`true`
`real_stream`	Enable real streaming mode (default since v2.7.7)	`true`

Streaming Modes: Real Stream vs Pseudo Stream

Argo Proxy supports two streaming modes for chat completions:

Real Stream (Default since v2.7.7)

Default behavior: Enabled by default since v2.7.7 (real_stream: true or omitted in config)
How it works: Directly streams chunks from the upstream API as they arrive
Advantages:
- True real-time streaming behavior
- Lower latency for streaming responses
- More responsive user experience
- Recommended for production use

Pseudo Stream

Enable via: Set real_stream: false in config file or use --pseudo-stream CLI flag
How it works: Receives the complete response from upstream, then simulates streaming by sending chunks to the client
Status: Available for compatibility with previous behavior and function calling

Configuration Examples

Via config file:

# Enable real streaming (experimental)
real_stream: true

# Or explicitly use pseudo streaming (default)
real_stream: false

Via CLI flag:

# Use default real streaming (since v2.7.7)
argo-proxy

# Enable legacy pseudo streaming
argo-proxy --pseudo-stream

Function Calling Behavior

When using function calling (tool calls):

Native function calling support: Available for OpenAI and Anthropic models. Gemini models is in development
Real streaming compatible: Native function calling works with both streaming modes
OpenAI format: All input and output remains in OpenAI format regardless of underlying model
Legacy support: Prompting-based function calling available via --tool-prompting flag

`argo-proxy` CLI Available Options

$ argo-proxy -h
usage: argo-proxy [-h] [--host HOST] [--port PORT] [--verbose | --quiet]
                  [--real-stream | --pseudo-stream] [--tool-prompting]
                  [--edit] [--validate] [--show] [--version]
                  [config]

Argo Proxy CLI

positional arguments:
  config                Path to the configuration file

options:
  -h, --help            show this help message and exit
  --host HOST, -H HOST  Host address to bind the server to
  --port PORT, -p PORT  Port number to bind the server to
  --verbose, -v         Enable verbose logging, override if `verbose` set False in config
  --quiet, -q           Disable verbose logging, override if `verbose` set True in config
  --real-stream, -rs    Enable real streaming (default behavior), override if `real_stream` set False in config
  --pseudo-stream, -ps  Enable pseudo streaming, override if `real_stream` set True or omitted in config
  --tool-prompting      Enable prompting-based tool calls/function calling, otherwise use native tool calls/function calling
  --edit, -e            Open the configuration file in the system's default editor for editing
  --validate, -vv       Validate the configuration file and exit
  --show, -s            Show the current configuration during launch
  --version, -V         Show the version and check for updates

Management Utilities

The following options help manage the configuration file:

--edit, -e: Open the configuration file in the system's default editor for editing.
- If no config file is specified, it will search in default locations (~/.config/argoproxy/, ~/.argoproxy/, or current directory)
- Tries common editors like nano, vi, vim (unix-like systems) or notepad (Windows)
--validate, -vv: Validate the configuration file and exit without starting the server.
- Useful for checking config syntax and connectivity before deployment
--show, -s: Show the current configuration during launch.
- Displays the fully resolved configuration including defaults
- Can be used with --validate to just display configuration without starting the server

# Example usage:
argo-proxy --edit  # Edit config file
argo-proxy --validate --show  # Validate and display config
argo-proxy --show  # Show config at startup

Usage

Endpoints

OpenAI Compatible

These endpoints convert responses from the ARGO API to be compatible with OpenAI's format:

/v1/responses: Available from v2.7.0. Response API.
/v1/chat/completions: Chat Completions API.
/v1/completions: Legacy Completions API.
/v1/embeddings: Embedding API.
/v1/models: Lists available models in OpenAI-compatible format.

Not OpenAI Compatible

These endpoints interact directly with the ARGO API and do not convert responses to OpenAI's format:

/v1/chat: Proxies requests to the ARGO API without conversion.
/v1/embed: Proxies requests to the ARGO Embedding API without conversion.

Utility Endpoints

/health: Health check endpoint. Returns 200 OK if the server is running.
/version: Returns the version of the ArgoProxy server. Notifies if a new version is available. Available from 2.7.0.post1.

Timeout Override

You can override the default timeout with a timeout parameter in your request. This parameter is optional for client request. Proxy server will keep the connection open until it finishes or client disconnects.

Details of how to make such override in different query flavors: Timeout Override Examples

Models

Chat Models

OpenAI Series

Original ARGO Model Name	Argo Proxy Name
`gpt35`	`argo:gpt-3.5-turbo`
`gpt35large`	`argo:gpt-3.5-turbo-16k`
`gpt4`	`argo:gpt-4`
`gpt4large`	`argo:gpt-4-32k`
`gpt4turbo`	`argo:gpt-4-turbo`
`gpt4o`	`argo:gpt-4o`
`gpt4olatest`	`argo:gpt-4o-latest`
`gpto1preview`	`argo:gpt-o1-preview`, `argo:o1-preview`
`gpto1mini`	`argo:gpt-o1-mini`, `argo:o1-mini`
`gpto3mini`	`argo:gpt-o3-mini`, `argo:o3-mini`
`gpto1`	`argo:gpt-o1`, `argo:o1`
`gpto3`	`argo:gpt-o3`, `argo:o3`
`gpto4mini`	`argo:gpt-o4-mini`, `argo:o4-mini`
`gpt41`	`argo:gpt-4.1`
`gpt41mini`	`argo:gpt-4.1-mini`
`gpt41nano`	`argo:gpt-4.1-nano`

Google Gemini Series

Original ARGO Model Name	Argo Proxy Name
`gemini25pro`	`argo:gemini-2.5-pro`
`gemini25flash`	`argo:gemini-2.5-flash`

Anthropic Claude Series

Original ARGO Model Name	Argo Proxy Name
`claudeopus4`	`argo:claude-opus-4`, `argo:claude-4-opus`
`claudesonnet4`	`argo:claude-sonnet-4`, `argo:claude-4-sonnet`
`claudesonnet37`	`argo:claude-sonnet-3.7`, `argo:claude-3.7-sonnet`
`claudesonnet35v2`	`argo:claude-sonnet-3.5`, `argo:claude-3.5-sonnet`

Embedding Models

Original ARGO Model Name	Argo Proxy Name
`ada002`	`argo:text-embedding-ada-002`
`v3small`	`argo:text-embedding-3-small`
`v3large`	`argo:text-embedding-3-large`

Tool Calls

The tool calls (function calling) interface has been available since version v2.7.5.alpha1, now with native function calling support.

Native Function Calling Support

OpenAI models: Full native function calling support
Anthropic models: Full native function calling support
Gemini models: Native function calling support in development
OpenAI format: All input and output remains in OpenAI format regardless of underlying model

Availability

Available on both streaming and non-streaming chat completion endpoints
Only supported on /v1/chat/completions endpoint
Argo passthrough endpoint (/v1/chat) and response endpoint (/v1/chat/response) not yet implemented due to limited development time
Legacy completion endpoints (/v1/completions) do not support tool calling

Tool Call Examples

Function Calling OpenAI Client: function_calling_chat.py
Function Calling Raw Request: function_calling_chat.py

For more usage details, refer to the OpenAI documentation.

ToolRegistry

A lightweight yet powerful Python helper library is available for various tool handling: ToolRegistry. It works with any OpenAI-compatible API, including Argo Proxy starting from version v2.7.5.alpha1.

Examples

Raw Requests

For examples of how to use the raw request utilities (e.g., httpx, requests), refer to:

Direct Access to ARGO

Direct Chat Example: argo_chat.py
Direct Chat Stream Example: argo_chat_stream.py
Direct Embedding Example: argo_embed.py

OpenAI Compatible Requests

Chat Completions Example: chat_completions.py
Chat Completions Stream Example: chat_completions_stream.py
Legacy Completions Example: legacy_completions.py
Legacy Completions Stream Example: legacy_completions_stream.py
Responses Example: responses.py
Responses Stream Example: responses_stream.py
Embedding Example: embedding.py
o1 Mini Chat Completions Example: o1_mini_chat_completions.py

OpenAI Client

For examples demonstrating the use case of the OpenAI client (openai.OpenAI), refer to:

Chat Completions Example: chat_completions.py
Chat Completions Stream Example: chat_completions_stream.py
Legacy Completions Example: legacy_completions.py
Legacy Completions Stream Example: legacy_completions_stream.py
Responses Example: responses.py
Responses Stream Example: responses_stream.py
Embedding Example: embedding.py
O3 Mini Simple Chatbot Example: o3_mini_simple_chatbot.py

Bug Reports and Contributions

This project is developed in my spare time. Bugs and issues may exist. If you encounter any or have suggestions for improvements, please open an issue or submit a pull request. Your contributions are highly appreciated!

Name		Name	Last commit message	Last commit date
Latest commit History 614 Commits
dev_scripts		dev_scripts
docs		docs
examples		examples
src/argoproxy		src/argoproxy
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
config.sample.yaml		config.sample.yaml
pyproject.toml		pyproject.toml
run_app.sh		run_app.sh
timeout_examples.md		timeout_examples.md

License

Oaklight/argo-proxy

Folders and files

Latest commit

History

Repository files navigation

argo-openai-proxy

TL;DR

NOTICE OF USAGE

Deployment

Prerequisites

Configuration File

Running the Application

First-Time Setup

Configuration Options Reference

Streaming Modes: Real Stream vs Pseudo Stream

Real Stream (Default since v2.7.7)

Pseudo Stream

Configuration Examples

Function Calling Behavior

argo-proxy CLI Available Options

Management Utilities

Usage

Endpoints

OpenAI Compatible

Not OpenAI Compatible

Utility Endpoints

Timeout Override

Models

Chat Models

OpenAI Series

Google Gemini Series

Anthropic Claude Series

Embedding Models

Tool Calls

Native Function Calling Support

Availability

Tool Call Examples

ToolRegistry

Examples

Raw Requests

Direct Access to ARGO

OpenAI Compatible Requests

OpenAI Client

Bug Reports and Contributions

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 30

Uh oh!

Contributors 2

Uh oh!

Languages

`argo-proxy` CLI Available Options