bug: AssertionError with input_spec when calling BentoML service from Gradio #5418

@restato

Describe the bug

I get an AssertionError when calling a BentoML service from a Gradio UI whenever the @api decorator on the service's predict method is given an input_spec argument. The failing statement is assert issubclass(cls, IODescriptor), raised in from_inputs in _bentoml_sdk/io_models.py (see the traceback under the actual behavior section).

To reproduce

Define Pydantic models for the input (Input) and the output (ResponseModel):

from pydantic import BaseModel

# Placeholder for the real ResponseModel, which is defined elsewhere
class ResponseModel(BaseModel):
    category: str
    # Other fields omitted; add them to match your actual ResponseModel

class Input(BaseModel):
    env: str
    user_id: str
    team_name: str
    log_type: str
    event: str
    log_hour: int

Define a BentoML service with a predict method decorated with @api and input_spec=Input:

import bentoml
import typing as t
from bentoml import api

# ... (Input and Output models as defined above)

class QuickStart(bentoml.Service):

    @api(input_spec=Input)  # Passing input_spec here triggers the error when predict is called
    def predict(self, **params: t.Any) -> ResponseModel:
        # Validate the raw keyword arguments against the Input model
        params = Input(**params)
        # ... actual prediction logic
        return ResponseModel(category="some_category")  # Placeholder result

Create a Gradio interface that calls this service:

import bentoml
# ... (Input model defined above)

def predict(env, user_id, team_name, log_type, event, log_hour):
    data = Input(
        env=env,
        user_id=user_id,
        team_name=team_name,
        log_type=log_type,
        event=event,
        log_hour=log_hour,
    )

    service = bentoml.get_current_service()
    response = service.predict(**data.model_dump())
    return response.data.category

Run the Gradio UI and attempt to make a prediction.
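
The call pattern in the steps above (dump the model to a dict, pass it as keyword arguments, rebuild the model inside predict) can be checked in isolation. Below is a minimal stdlib-only sketch, assuming a dataclass as a stand-in for the Pydantic Input model and a plain function as a stand-in for the service method. Neither goes through BentoML, and the returned string is a hypothetical placeholder.

```python
from dataclasses import dataclass, asdict
import typing as t


@dataclass
class Input:
    # Same fields as the Input model in the reproduction steps
    env: str
    user_id: str
    team_name: str
    log_type: str
    event: str
    log_hour: int


def predict(**params: t.Any) -> str:
    # Rebuild the typed input from keyword arguments, as the service method does
    data = Input(**params)
    # Hypothetical placeholder result, standing in for the real prediction
    return f"{data.team_name}:{data.log_hour}"


payload = Input(
    env="dev",
    user_id="u1",
    team_name="ml",
    log_type="click",
    event="open",
    log_hour=9,
)
# asdict() plays the role of model_dump() here
result = predict(**asdict(payload))
```

Outside BentoML this round trip completes without error, which is consistent with the traceback: the failure is raised while the BentoML client builds the request, before predict runs.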

Expected behavior

The predict method should be called successfully, and the Gradio UI should display the prediction result without an AssertionError.

Actual behavior

The following AssertionError occurs:

Traceback (most recent call last):
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/gradio/queueing.py", line 625, in process_events
    response = await route_utils.call_process_api(
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 2220, in process_api
    result = await self.call_function(
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/gradio/blocks.py", line 1731, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/gradio/utils.py", line 904, in wrapper
    response = f(*args, **kwargs)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/gradio_ui.py", line 39, in predict
    response = service.predict(**data.model_dump())
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_impl/client/base.py", line 54, in method
    return self.call(name, *args, **kwargs)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_impl/client/proxy.py", line 118, in call
    return self._sync.call(__name, *args, **kwargs)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_impl/client/http.py", line 401, in call
    return self._call(endpoint, args, kwargs)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_impl/client/http.py", line 593, in _call
    req = self._build_request(endpoint, args, kwargs, headers or {})
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_impl/client/http.py", line 211, in _build_request
    model = endpoint.input_spec.from_inputs(*args, **kwargs)
  File "/Users/junsu.lee/workspace/ml-projects/projects/quick_start/serving/.venv/lib/python3.11/site-packages/_bentoml_sdk/io_models.py", line 183, in from_inputs
    assert issubclass(cls, IODescriptor)
AssertionError
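
For triage, the last frame is a bare assert issubclass(cls, IODescriptor) inside a classmethod. The sketch below is not BentoML source code. It is a stdlib-only toy with hypothetical names (Descriptor, Mixin, PlainInput, Wrapped, Proper, try_build) showing one way an assertion of this shape can fire: a classmethod provided by a mixin is reached through a class that has the mixin but does not derive from the expected base. Whether this is what happens inside BentoML is an assumption, not something the traceback confirms.

```python
class Descriptor:
    """Hypothetical stand-in for the base class the assertion expects."""


class Mixin:
    """Hypothetical mixin whose classmethod assumes a Descriptor subclass."""

    @classmethod
    def from_inputs(cls, **kwargs):
        # Same shape as the failing line in the traceback
        assert issubclass(cls, Descriptor)
        return cls(**kwargs)


class PlainInput:
    """Hypothetical plain model that does not derive from Descriptor."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


class Wrapped(Mixin, PlainInput):
    """Gets from_inputs from Mixin, but is not a Descriptor subclass."""


class Proper(Mixin, PlainInput, Descriptor):
    """Derives from Descriptor, so the assertion passes."""


def try_build(cls, **kwargs):
    """Return the built object, or the string 'AssertionError' if the assert fires."""
    try:
        return cls.from_inputs(**kwargs)
    except AssertionError:
        return "AssertionError"


failed = try_build(Wrapped, env="dev")
built = try_build(Proper, env="dev")
```

In this toy, Wrapped fails and Proper succeeds. If BentoML behaves the same way, declaring the input model as a subclass of its IODescriptor class rather than a plain BaseModel may avoid the assertion, but that has not been verified here.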

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.4.15
python: 3.11.11
platform: macOS-15.3.2-arm64-arm-64bit
uid_gid: 501:20

pip_packages
a2wsgi==1.10.10
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.14
aiosignal==1.4.0
aiosqlite==0.21.0
alembic==1.16.4
annotated-types==0.7.0
anyio==4.9.0
appdirs==1.4.4
asgiref==3.9.1
async-timeout==5.0.1
attrs==25.3.0
bentoml==1.4.15
blinker==1.9.0
boto3==1.36.26
botocore==1.36.26
cattrs==23.1.2
cbor2==5.4.6
certifi==2025.7.14
charset-normalizer==3.4.2
click==8.1.3
click-option-group==0.5.7
cloudpickle==3.1.1
colorlog==6.7.0
contourpy==1.3.2
cycler==0.12.1
databricks-cli==0.18.0
deprecated==1.2.18
docker==6.1.3
entrypoints==0.4
fakeredis==1.9.2
fastapi==0.116.1
ffmpy==0.6.0
filelock==3.18.0
flask==3.1.1
fonttools==4.58.5
frozenlist==1.7.0
fs==2.4.16
fsspec==2025.5.1
gitdb==4.0.12
gitpython==3.1.44
gradio==5.34.2
gradio-client==1.10.3
groovy==0.1.2
gunicorn==21.2.0
h11==0.16.0
hf-xet==1.1.5
httpcore==1.0.9
httpx==0.28.1
httpx-ws==0.7.2
huggingface-hub==0.33.4
idna==3.10
importlib-metadata==7.2.1
itsdangerous==2.2.0
jinja2==3.1.6
jmespath==1.0.1
joblib==1.5.1
jsonpatch==1.33
jsonpointer==3.0.0
kafka-python==2.0.3
kantoku==0.18.3
kiwisolver==1.4.8
langchain-core==0.3.68
langsmith==0.4.5
lightning-utilities==0.14.3
llvmlite==0.44.0
mako==1.3.10
markdown==3.8.2
markdown-it-py==3.0.0
markupsafe==3.0.2
matplotlib==3.10.3
mdurl==0.1.2
mlflow==2.9.2
mpmath==1.3.0
multidict==6.6.3
networkx==3.5
numba==0.61.2
numpy==1.26.4
nvidia-ml-py==12.575.51
oauthlib==3.3.1
opentelemetry-api==1.35.0
opentelemetry-instrumentation==0.56b0
opentelemetry-instrumentation-aiohttp-client==0.56b0
opentelemetry-instrumentation-asgi==0.56b0
opentelemetry-sdk==1.35.0
opentelemetry-semantic-conventions==0.56b0
opentelemetry-util-http==0.56b0
orjson==3.10.18
packaging==23.2
pandas==2.1.3
pathspec==0.12.1
pillow==11.3.0
pip-requirements-parser==32.0.1
polars==0.19.19
prometheus-client==0.22.1
prompt-toolkit==3.0.51
propcache==0.3.2
protobuf==4.25.8
psutil==7.0.0
pyarrow==14.0.2
pydantic==2.8.2
pydantic-core==2.20.1
pydub==0.25.1
pygments==2.19.2
pyjwt==2.10.1
pymysql==1.1.1
pyparsing==3.2.3
python-dateutil==2.9.0.post0
python-dotenv==1.1.1
python-json-logger==3.3.0
python-multipart==0.0.20
pytorch-lightning==2.0.6
pytz==2023.4
pyyaml==6.0.2
pyzmq==27.0.0
querystring-parser==1.2.4
questionary==2.1.0
redis==4.3.4
requests==2.32.4
requests-toolbelt==1.0.0
rich==14.0.0
ruff==0.12.3
s3transfer==0.11.3
safehttpx==0.1.6
schema==0.7.7
scikit-learn==1.7.0
scipy==1.16.0
semantic-version==2.10.0
sentry-sdk==1.40.6
setuptools==80.9.0
shellingham==1.5.4
simple-di==0.1.5
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
sortedcontainers==2.4.0
sqlalchemy==2.0.41
sqlparse==0.5.3
starlette==0.47.1
sympy==1.14.0
tabulate==0.9.0
tenacity==9.1.2
threadpoolctl==3.6.0
tomli-w==1.2.0
tomlkit==0.13.3
torch==2.1.1
torchmetrics==1.7.4
tornado==6.5.1
tqdm==4.67.1
trino==0.324.0
typer==0.16.0
typing-extensions==4.14.1
tzdata==2025.2
tzlocal==5.3.1
urllib3==2.5.0
uvicorn==0.35.0
watchfiles==1.1.0
wcwidth==0.2.13
websocket-client==1.8.0
websockets==15.0.1
werkzeug==3.1.3
wrapt==1.17.2
wsproto==1.2.0
yarl==1.20.1
zipp==3.23.0
zstandard==0.23.0
