OpenAI Chat Completions API

The OpenAI Chat Completions API enables chat interactions with OpenAI Large Language Models (LLMs). It can be used in systems that need to integrate chatbots or other AI-based language features, and it offers a high degree of customization and control over the model’s behavior.

In addition, to help structure and monitor requests and responses, the API has the following interceptors already configured in its flow:

  • Header: adds or adjusts headers.

  • Log: there are two instances of this interceptor configured, one in the request flow and another in the response flow; they generate logs that let you inspect the request and the response.

  • JSON Schema Validation: validates the content of the request body (payload) against the schema accepted by the API (illustrated in the sketch after this list).

  • Rate Limit AI Tokens: allows you to control the consumption of tokens in applications that consume LLMs.
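
To illustrate the kind of check the JSON Schema Validation interceptor performs, the sketch below validates a payload against a minimal schema that requires the model and messages fields. The schema shown is an assumption for illustration only; the schema actually configured in the API flow may be stricter.

    # Illustrative sketch: the schema below is an assumption, not the
    # exact schema configured in the JSON Schema Validation interceptor.
    from jsonschema import ValidationError, validate

    schema = {
        "type": "object",
        "required": ["model", "messages"],
        "properties": {
            "model": {"type": "string"},
            "messages": {"type": "array"},
        },
    }

    payload = {"model": "gpt-4o"}  # invalid: missing the "messages" field

    try:
        validate(instance=payload, schema=schema)
    except ValidationError as err:
        print(f"Payload rejected: {err.message}")  # "'messages' is a required property"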

Importing the API

To use the OpenAI Chat Completions API, you first need to import it into your Platform by following the steps below:

  1. Download the API documentation.

  2. Unzip the OpenAI_Chat_Completions_API.zip file.

  3. Import the OpenAI_Chat_Completions_API.yaml file into the Platform by using the Import/Export feature.

  4. Deploy the API in an environment.

For authentication, you must insert the API key received from OpenAI in the Authorization header of the request, as a Bearer token.
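
For example, in Python with the requests library, the key is sent as a Bearer token in the Authorization header. The gateway address below is a placeholder for the environment where you deployed the API:

    import requests

    API_KEY = "sk-..."  # API key received from OpenAI
    BASE_URL = "https://your-gateway.example.com/v1"  # placeholder: address of your deployed environment

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }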

API Reference

General Description

  • Title: OpenAI Chat Completions API

  • Purpose: To allow interaction with language models to generate replies based on messages sent by users.

  • Base Path: /v1

  • Main Method: POST in the endpoint /chat/completions

  • Security: It uses authentication via Bearer Token in the request header.

Main Feature

The endpoint /chat/completions allows you to generate a chat interaction. It:

  1. Receives a list of messages in JSON format.

  2. Uses these messages to interact with a language model chosen by the client.

  3. Returns the replies generated by the model.
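
For instance, the list of messages received in step 1 could look like the sketch below (the content is illustrative; the role and content fields are described in the Input section that follows):

    # A minimal conversation, using the role/content structure
    # detailed under "Input (Payload)" below.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what an API gateway does."},
    ]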

Input (Payload)

Format: JSON

Main fields:

  • model (string, required): Defines which language model will be used (e.g. gpt-4o).

  • messages (array, required): List of messages in the conversation, in which each message contains:

    • role (string): Role of the author of the message, such as user, system or assistant.

    • content (string): Textual content of the message.

  • max_tokens (integer, optional): Limits the number of tokens generated in the reply.

  • temperature (number, optional): Controls the creativity of the reply (smaller values generate more deterministic replies).

  • top_p (number, optional): Nucleus sampling; restricts generation to the smallest set of tokens whose cumulative probability reaches top_p.

  • n (integer, optional): Number of replies that should be generated.

  • stop (array, optional): Defines sequences that terminate the text generation.

  • presence_penalty and frequency_penalty (numbers, optional): Penalize tokens that have already appeared (presence) or that appear frequently (frequency), discouraging repetition and encouraging originality.

  • logit_bias (object, optional): Adjusts the likelihood of specific tokens appearing in the reply.

  • user (string, optional): Unique identifier to track a user’s interaction.
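
Putting these fields together, a minimal request sketch in Python could look like this. The gateway address and API key are placeholders; only model and messages are required:

    import requests

    response = requests.post(
        "https://your-gateway.example.com/v1/chat/completions",  # placeholder gateway address
        headers={"Authorization": "Bearer sk-..."},  # API key received from OpenAI
        json={
            "model": "gpt-4o",  # required: which language model to use
            "messages": [       # required: the conversation so far
                {"role": "user", "content": "Write a haiku about APIs."},
            ],
            "max_tokens": 100,   # optional: cap the reply length
            "temperature": 0.2,  # optional: lower values are more deterministic
            "n": 1,              # optional: number of replies to generate
        },
        timeout=60,  # aligned with the API's 60-second timeout (see Extra Settings)
    )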

Response

Status codes:

  • 200 OK: Response generated successfully. It returns:

    • id (string): Unique identifier of the reply.

    • choices (array): List of the generated replies, each with:

      • message: Message generated by the model, with role and content.

      • finish_reason: Reason for ending the generation (e.g. stop).

    • usage: Usage metrics for tokens (prompt, completion, total).

  • 400 Bad Request: Invalid request (for example, a payload that fails the JSON Schema validation).

  • 500 Internal Server Error: Internal server or service error.
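
Assuming the request sketch above, handling the response could look like this; a 200 status carries the fields listed, and other statuses are treated as errors:

    if response.status_code == 200:
        data = response.json()
        print(data["id"])  # unique identifier of the reply
        for choice in data["choices"]:
            print(choice["message"]["role"])     # e.g. "assistant"
            print(choice["message"]["content"])  # text generated by the model
            print(choice["finish_reason"])       # e.g. "stop"
        print(data["usage"])  # token metrics: prompt, completion, total
    else:
        print(f"Request failed with status {response.status_code}")  # e.g. 400 or 500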

Extra Settings

  • Timeout: 60 seconds.

  • API lifecycle: Available (status AVAILABLE).

  • Authentication: It requires a token in the Authorization header.
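
Given the 60-second timeout, clients should be prepared for slow or failing calls. A minimal sketch with the requests library, where url, headers, and payload stand in for the values shown in the earlier sketches:

    import requests

    try:
        # Client-side timeout aligned with the API's 60-second limit.
        response = requests.post(url, headers=headers, json=payload, timeout=60)
        response.raise_for_status()  # raises on 4xx/5xx statuses such as 400 or 500
    except requests.Timeout:
        print("The request exceeded the 60-second timeout.")
    except requests.HTTPError as err:
        print(f"The API returned an error: {err}")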
