Re-authorize submodule imports if top was allowed #1103

Merged (6 commits) on Mar 31, 2025
3 changes: 0 additions & 3 deletions docs/source/en/guided_tour.mdx
@@ -181,9 +181,6 @@ agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['request
agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?")
```

-Additionally, as an extra security layer, access to submodules is forbidden by default, unless explicitly authorized within the import list.
-For instance, to access the `numpy.random` submodule, you need to add `'numpy.random'` to the `additional_authorized_imports` list.
-
> [!WARNING]
> The LLM can generate arbitrary code that will then be executed: do not add any unsafe imports!

15 changes: 11 additions & 4 deletions docs/source/en/tutorials/secure_code_execution.mdx
@@ -55,15 +55,22 @@ One could argue that on the [spectrum of agency](../conceptual_guides/intro_agen

So you need to be very mindful of security.

To improve safety, we propose a range of measures that offer increasing levels of security, at a higher setup cost.

We advise you to keep in mind that no solution will be 100% safe.

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/code_execution_safety_diagram.png">

### Our local Python executor

To add a first layer of security, code execution in `smolagents` is not performed by the vanilla Python interpreter.
We have re-built a more secure `LocalPythonExecutor` from the ground up.

To be precise, this interpreter works by loading the Abstract Syntax Tree (AST) from your code and executing it operation by operation, making sure to always follow certain rules:
- By default, imports are disallowed unless they have been explicitly added to an authorization list by the user.
- Furthermore, access to submodules is disabled by default, and each must be explicitly authorized in the import list as well.
- Note that some seemingly innocuous packages like `random` can give access to potentially harmful submodules, as in `random._os`.
- The total count of elementary operations processed is capped to prevent infinite loops and resource bloating.
- Any operation that has not been explicitly defined in our custom interpreter will raise an error.
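The rules in the list above can be illustrated with a toy AST walker. This is only a minimal sketch, not the actual `LocalPythonExecutor`: the `evaluate` function, the `MAX_OPERATIONS` value, and the error messages are assumptions made for illustration, and only a handful of node types are handled.

```python
import ast
import importlib
import operator


class InterpreterError(Exception):
    """Raised when the toy interpreter refuses to run something."""


MAX_OPERATIONS = 10_000  # hypothetical cap, chosen only for this sketch

BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
COMPARE_OPS = {ast.Lt: operator.lt, ast.Gt: operator.gt, ast.Eq: operator.eq}


def evaluate(source, authorized_imports=()):
    """Walk the AST of `source` node by node instead of calling exec()."""
    state = {}
    counter = {"ops": 0}

    def visit(node):
        # Rule: cap the number of elementary operations.
        counter["ops"] += 1
        if counter["ops"] > MAX_OPERATIONS:
            raise InterpreterError("Reached the maximum number of operations")

        if isinstance(node, ast.Module):
            result = None
            for statement in node.body:
                result = visit(statement)
            return result
        if isinstance(node, ast.Import):
            # Rule: imports are refused unless explicitly authorized.
            for alias in node.names:
                if alias.name not in authorized_imports:
                    raise InterpreterError(f"Import of {alias.name} is not allowed")
                module = importlib.import_module(alias.name)
                if alias.asname:
                    state[alias.asname] = module
                else:
                    top_level = alias.name.split(".")[0]
                    state[top_level] = importlib.import_module(top_level)
            return None
        if isinstance(node, ast.Expr):
            return visit(node.value)
        if isinstance(node, ast.Assign):
            value = visit(node.value)
            for target in node.targets:
                if not isinstance(target, ast.Name):
                    raise InterpreterError("Only simple names can be assigned")
                state[target.id] = value
            return value
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            if node.id not in state:
                raise InterpreterError(f"Undefined variable: {node.id}")
            return state[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and type(node.ops[0]) in COMPARE_OPS
        ):
            compare = COMPARE_OPS[type(node.ops[0])]
            return compare(visit(node.left), visit(node.comparators[0]))
        if isinstance(node, ast.While):
            while visit(node.test):
                for statement in node.body:
                    visit(statement)
            return None
        # Rule: anything not explicitly handled raises an error.
        raise InterpreterError(f"{type(node).__name__} is not supported")

    return visit(ast.parse(source))
```

In this sketch, `evaluate("import os")` and an endless `while` loop both end in an `InterpreterError`, while `evaluate("import math", authorized_imports=["math"])` is accepted. A real executor has to cover far more of the language than this.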

You could try these safeguards as follows:

37 changes: 33 additions & 4 deletions src/smolagents/local_python_executor.py
@@ -24,6 +24,7 @@
 from collections.abc import Mapping
 from functools import wraps
 from importlib import import_module
+from importlib.util import find_spec
 from types import BuiltinFunctionType, FunctionType, ModuleType
 from typing import Any, Callable, Dict, List, Optional, Set, Tuple

@@ -113,6 +114,20 @@ def custom_print(*args):
     "complex": complex,
 }

+# Non-exhaustive list of dangerous modules that should not be imported
+DANGEROUS_MODULES = [
+    "builtins",
+    "io",
+    "multiprocessing",
+    "os",
+    "pathlib",
+    "pty",
+    "shutil",
+    "socket",
+    "subprocess",
+    "sys",
+]
+
 DANGEROUS_FUNCTIONS = [
     "builtins.compile",
     "builtins.eval",
@@ -224,11 +239,25 @@ def _check_return(
         result = func(expression, state, static_tools, custom_tools, authorized_imports=authorized_imports)
         if "*" not in authorized_imports:
             if isinstance(result, ModuleType):
-                if result.__name__ not in authorized_imports:
-                    raise InterpreterError(f"Forbidden access to module: {result.__name__}")
+                for module_name in DANGEROUS_MODULES:
+                    if (
+                        module_name not in authorized_imports
+                        and result.__name__ == module_name
+                        # builtins has no __file__ attribute
+                        and getattr(result, "__file__", "")
+                        == (getattr(import_module(module_name), "__file__", "") if find_spec(module_name) else "")
+                    ):
+                        raise InterpreterError(f"Forbidden access to module: {module_name}")
             elif isinstance(result, dict) and result.get("__spec__"):
-                if result["__name__"] not in authorized_imports:
-                    raise InterpreterError(f"Forbidden access to module: {result['__name__']}")
+                for module_name in DANGEROUS_MODULES:
+                    if (
+                        module_name not in authorized_imports
+                        and result["__name__"] == module_name
+                        # builtins has no __file__ attribute
+                        and result.get("__file__", "")
+                        == (getattr(import_module(module_name), "__file__", "") if find_spec(module_name) else "")
+                    ):
+                        raise InterpreterError(f"Forbidden access to module: {module_name}")
             elif isinstance(result, (FunctionType, BuiltinFunctionType)):
                 for qualified_function_name in DANGEROUS_FUNCTIONS:
                     module_name, function_name = qualified_function_name.rsplit(".", 1)
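The module check in the hunk above can be tried in isolation. The following is a simplified sketch under stated assumptions, not the library's code: `check_module_access` is a hypothetical helper name, it covers only the `ModuleType` branch, and the `InterpreterError` class is redefined locally so the snippet runs without `smolagents`. The `DANGEROUS_MODULES` list is copied from the diff.

```python
import json
import threading
from importlib import import_module
from importlib.util import find_spec
from types import ModuleType

# Copied from the diff: non-exhaustive list of modules to block.
DANGEROUS_MODULES = [
    "builtins", "io", "multiprocessing", "os", "pathlib",
    "pty", "shutil", "socket", "subprocess", "sys",
]


class InterpreterError(Exception):
    """Local stand-in for the library's error type."""


def check_module_access(result, authorized_imports):
    """Hypothetical helper: refuse a result that is an unauthorized dangerous module."""
    if "*" in authorized_imports or not isinstance(result, ModuleType):
        return result
    for module_name in DANGEROUS_MODULES:
        if module_name in authorized_imports or result.__name__ != module_name:
            continue
        # Compare file paths so that only the real module of that name matches;
        # modules such as builtins have no __file__, hence the "" defaults.
        real_file = (
            getattr(import_module(module_name), "__file__", "")
            if find_spec(module_name)
            else ""
        )
        if getattr(result, "__file__", "") == real_file:
            raise InterpreterError(f"Forbidden access to module: {module_name}")
    return result


# A module that is not on the list passes, even if it was never authorized.
check_module_access(json, authorized_imports=[])

# The same os module reached through another module's attribute is refused.
try:
    check_module_access(threading._os, authorized_imports=[])
except InterpreterError as error:
    print(error)
```

Compared with the removed lines, which rejected any module whose name was not in `authorized_imports`, this variant rejects only modules on the deny list. That appears to be what lets `np.random` work once `numpy` alone is authorized.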
32 changes: 11 additions & 21 deletions tests/test_local_python_executor.py
@@ -27,6 +27,7 @@
 from smolagents.default_tools import BASE_PYTHON_TOOLS, FinalAnswerTool
 from smolagents.local_python_executor import (
     DANGEROUS_FUNCTIONS,
+    DANGEROUS_MODULES,
     InterpreterError,
     LocalPythonExecutor,
     PrintContainer,
@@ -40,21 +41,6 @@
 )


-# Non-exhaustive list of dangerous modules that should not be imported
-DANGEROUS_MODULES = [
-    "builtins",
-    "io",
-    "multiprocessing",
-    "os",
-    "pathlib",
-    "pty",
-    "shutil",
-    "socket",
-    "subprocess",
-    "sys",
-]
-
-
 # Fake function we will use as tool
 def add_two(x):
     return x + 2
@@ -526,6 +512,10 @@ def test_imports(self):
         code = "from numpy.random import default_rng as d_rng\nrng = d_rng(12345)\nrng.random()"
         result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=["numpy.random"])

+        # Test that importing numpy imports submodules
+        code = "import numpy as np\nnp.random.default_rng(12345)\nnp.random.random()"
+        result, _ = evaluate_python_code(code, BASE_PYTHON_TOOLS, state={}, authorized_imports=["numpy"])
+
     def test_additional_imports(self):
         code = "import numpy as np"
         evaluate_python_code(code, authorized_imports=["numpy"], state={})
@@ -1815,7 +1805,7 @@ def test_vulnerability_via_importlib(self, additional_authorized_imports, expect
             (
                 "import queue; queue.threading._os.system(':')",
                 [],
-                InterpreterError("Forbidden access to module: threading"),
+                InterpreterError("Forbidden access to module: os"),
             ),
             (
                 "import queue; queue.threading._os.system(':')",
@@ -1831,7 +1821,7 @@ def test_vulnerability_via_importlib(self, additional_authorized_imports, expect
             (
                 "import doctest; doctest.inspect.os.system(':')",
                 ["doctest"],
-                InterpreterError("Forbidden access to module: inspect"),
+                InterpreterError("Forbidden access to module: os"),
             ),
             (
                 "import doctest; doctest.inspect.os.system(':')",
@@ -1842,23 +1832,23 @@ def test_vulnerability_via_importlib(self, additional_authorized_imports, expect
             (
                 "import asyncio; asyncio.base_events.events.subprocess",
                 ["asyncio"],
-                InterpreterError("Forbidden access to module: asyncio.base_events"),
+                InterpreterError("Forbidden access to module: subprocess"),
             ),
             (
                 "import asyncio; asyncio.base_events.events.subprocess",
                 ["asyncio", "asyncio.base_events"],
-                InterpreterError("Forbidden access to module: asyncio.events"),
+                InterpreterError("Forbidden access to module: subprocess"),
             ),
             (
                 "import asyncio; asyncio.base_events.events.subprocess",
-                ["asyncio", "asyncio.base_events", "asyncio.events"],
+                ["asyncio", "asyncio.base_events", "asyncio.base_events.events"],
                 InterpreterError("Forbidden access to module: subprocess"),
             ),
             # sys submodule
             (
                 "import queue; queue.threading._sys.modules['os'].system(':')",
                 [],
-                InterpreterError("Forbidden access to module: threading"),
+                InterpreterError("Forbidden access to module: sys"),
             ),
             (
                 "import queue; queue.threading._sys.modules['os'].system(':')",
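The parametrized cases above rely on the fact that standard-library modules keep references to other modules as plain attributes. The sketch below checks, outside of any sandbox, that some of the chains used in these tests (plus `random._os` from the documentation) resolve to the real `os` and `sys` modules in CPython. The private attribute names (`_os`, `_sys`) are implementation details and may differ between Python versions.

```python
import os
import queue
import random
import sys

# Attribute chains that start from innocuous-looking modules.
chains = {
    "queue.threading._os": queue.threading._os,
    "queue.threading._sys": queue.threading._sys,
    "random._os": random._os,
}

for chain, module in chains.items():
    print(f"{chain} resolves to module {module.__name__!r}")

# Identity checks: these are the very same module objects, not copies.
reaches_os = chains["queue.threading._os"] is os
reaches_sys = chains["queue.threading._sys"] is sys
```

This is why the executor inspects every intermediate evaluation result, not just `import` statements: with the new check, the error names the dangerous module that was finally reached (`os`, `sys`, `subprocess`) instead of the first unauthorized intermediate module.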