Support for rate limited embeddings calculation #877

@martin-eder-zeiss

For large input files, you might encounter this error from the Azure backend during embeddings calculation:

Error: Failed to create embedding

Caused by:
    0: Failed to call embeddings api
    1: Requests to the Embeddings_Create Operation under Azure OpenAI API version 2024-02-01 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 55 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit. (code: 429)

Please provide a way to limit the number of requests per second. It would also be great if a rate-limit response (code 429) triggered some kind of back-off and retry instead of failing immediately with an error. Because multiple users may share a single endpoint, one client has no control over the total load, so aichat could still fail with this error even if client-side rate limiting is implemented.
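
To make the request concrete, here is a minimal sketch of the two behaviors combined: a client-side limiter that spaces out calls, and a retry loop that backs off on a 429 and honors the "retry after N seconds" hint from the error message. It is written in Python purely for illustration and is not aichat code. All names (`RateLimitError`, `MinIntervalLimiter`, `call_with_backoff`, `parse_retry_after`) and all default values are hypothetical.

```python
import random
import re
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 429 response from the embeddings API."""


def parse_retry_after(message):
    """Return the 'retry after N seconds' hint from an error message, or None."""
    match = re.search(r"retry after (\d+) seconds?", message, re.IGNORECASE)
    return float(match.group(1)) if match else None


class MinIntervalLimiter:
    """Client-side limiter that spaces calls at least 1/max_per_second apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def wait(self):
        # Block until the next call is allowed, then reserve the following slot.
        delay = self.next_allowed - self.clock()
        if delay > 0:
            self.sleep(delay)
        self.next_allowed = max(self.clock(), self.next_allowed) + self.interval


def call_with_backoff(fn, limiter, max_retries=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError instead of failing right away.

    The server's retry hint is preferred when present. Otherwise the delay
    grows exponentially with jitter, because other clients may share the
    endpoint and local limiting alone cannot prevent a 429.
    """
    for attempt in range(max_retries + 1):
        limiter.wait()
        try:
            return fn()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # give up only after all retries are exhausted
            hint = parse_retry_after(str(err))
            if hint is not None:
                delay = hint
            else:
                backoff = min(max_delay, base_delay * 2 ** attempt)
                delay = backoff * random.uniform(0.5, 1.0)
            sleep(delay)
```

A caller would wrap each embeddings request, for example `call_with_backoff(lambda: embed(batch), MinIntervalLimiter(max_per_second=2))`, where `embed` is a placeholder for the actual API call. The limiter keeps a single client under its configured rate, and the retry loop covers the case where the shared quota is used up by someone else.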

Labels: enhancement (New feature or request)
