Proposal: Combine settings, metadata, static, etc. into a single datasette.yaml File #2093

@asg017

Very often I get tripped up when trying to configure my Datasette instances. For example: if I want to change the port my app listens on, do I do that with a CLI flag, a --setting flag, inside metadata.json, or an env var? If I want to up the time limit of SQL statements, is that under metadata.json or a setting? Where does my plugin configuration go?

Normally I need to look it up in the Datasette docs, and I quickly find my answer, but the number of places where "config" goes is overwhelming.

  • Flat CLI flags like --port, --host, --cors, etc.
  • --setting options, like default_page_size, sql_time_limit_ms, etc.
  • Inside metadata.json, including plugin configuration

Typically my Datasette deploys are extremely long shell commands, with multiple --setting and other CLI flags.

Proposal: Consolidate all "config" into datasette.toml

I propose that we add a new datasette.toml that combines "settings", "metadata", and other common CLI flags like --port and --cors into a single file. It would be similar to "Cargo.toml" in Rust projects, "package.json" in Node projects, and "pyproject.toml" in Python, etc.

A sample of what it could look like:

# "top level" configuration that are currently CLI flags on `datasette serve`
[config]
port = 8020
host = "0.0.0.0"
cors = true

# replaces multiple `--setting` flags
[settings]
base_url = "/app/datasette/"
default_allow_sql = true
sql_time_limit_ms = 3500

# replaces `metadata.json`.
# The contents of datasette-metadata.json could be defined in this file instead, but supporting separate files is nice (since those are easy to machine-generate)
[metadata]
include="./datasette-metadata.json"

# plugin-specific configuration
[plugins]
[plugins.datasette-auth-github]
client_id = {env = "DATASETTE_AUTH_GITHUB_CLIENT_ID"}
client_secret = {env = "GITHUB_CLIENT_SECRET"}

[plugins.datasette-cluster-map]

latitude_column = "lat"
longitude_column = "lon"
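A minimal sketch of how the `{env = "..."}` values in the sample above might be resolved once the file has been parsed. This is only an illustration of the idea, not how Datasette does it: `resolve_env_refs` is a hypothetical helper, the dict stands in for the parsed [plugins] table, and the environment values are made up.

```python
import os


def resolve_env_refs(value, environ=os.environ):
    """Recursively replace {"env": "NAME"} tables with the value of NAME.

    Hypothetical helper: returns None when the variable is not set.
    """
    if isinstance(value, dict):
        # A table whose only key is "env" is treated as a reference.
        if set(value) == {"env"}:
            return environ.get(value["env"])
        return {key: resolve_env_refs(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item, environ) for item in value]
    return value


# Stand-in for the parsed [plugins] table from the sample above.
plugins = {
    "datasette-auth-github": {
        "client_id": {"env": "DATASETTE_AUTH_GITHUB_CLIENT_ID"},
        "client_secret": {"env": "GITHUB_CLIENT_SECRET"},
    },
    "datasette-cluster-map": {
        "latitude_column": "lat",
        "longitude_column": "lon",
    },
}

# Made-up environment, passed explicitly so the example does not depend
# on the real process environment.
fake_env = {
    "DATASETTE_AUTH_GITHUB_CLIENT_ID": "abc",
    "GITHUB_CLIENT_SECRET": "xyz",
}

resolved = resolve_env_refs(plugins, fake_env)
```

Passing the environment as an argument keeps secrets out of the file itself and makes the lookup easy to test.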

Pros

  • Instead of multiple files and CLI flags, everything could be in one tidy file
  • Editing config in a separate file is easier than editing CLI flags, since you don't have to kill a process + edit a command every time
  • New users will know "just edit my datasette.toml" instead of needing to learn metadata + settings + CLI flags
  • Better dev experience for multiple environments. For example, could have datasette -c datasette-dev.toml for local dev environments (enables SQL, debug plugins, long timeouts, etc.), and a datasette -c datasette-prod.toml for "production" (lower timeouts, fewer plugins, monitoring plugins, etc.)

Cons

  • Yet another config-management system. Now Datasette users will need to know about metadata, settings, CLI flags, and datasette.toml. However, with enough documentation + announcements + examples, I think we can get ahead of it.
  • If toml is chosen, would need to add a toml parser for Python versions <3.11
  • Multiple sources of config require priority. For example: Would --setting default_allow_sql off override the value inside [settings]? What about --port?
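One possible answer to the priority question, sketched under the assumption that explicit CLI flags should win over the file, and the file over built-in defaults. Nothing here is decided; `merge_settings` is a hypothetical helper and the default values are only illustrative.

```python
def merge_settings(defaults, file_settings, cli_settings):
    """Merge three layers of settings; later layers override earlier ones.

    Assumed order (lowest to highest priority):
    built-in defaults < [settings] in datasette.toml < --setting on the CLI.
    """
    merged = dict(defaults)
    merged.update(file_settings)
    merged.update(cli_settings)
    return merged


# Illustrative values only.
defaults = {"default_allow_sql": True, "sql_time_limit_ms": 1000}
file_settings = {"default_allow_sql": True, "sql_time_limit_ms": 3500}
cli_settings = {"default_allow_sql": False}  # e.g. --setting default_allow_sql off

effective = merge_settings(defaults, file_settings, cli_settings)
# Under this policy, default_allow_sql comes from the CLI and
# sql_time_limit_ms comes from the file.
```

The same layering could apply to --port and the [config] table, though whether it should is exactly the open question.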

Other Notes

Toml

I chose toml over json because toml supports comments. I chose toml over yaml because Python 3.11 has builtin support for it. I also find toml easier to work with since it doesn't have the odd "gotchas" that YAML has (e.g. 3.10 resolving to 3.1, Norway's NO resolving to false, etc.). It also mimics pyproject.toml, which is nice. Happy to change my mind about this, however.
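A short sketch of what parsing could look like with the standard library on Python 3.11+. The fallback to the third-party tomli package on older versions is an assumption about how the extra dependency might be handled, and the sample string is a trimmed version of the proposal's file.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    # Assumption: the tomli backport is installed on older Pythons.
    import tomli as tomllib

SAMPLE = """
# comments are allowed, unlike in JSON
[config]
port = 8020
cors = true

[settings]
sql_time_limit_ms = 3500
"""

# loads() parses a string; load() expects a file opened in binary mode.
config = tomllib.loads(SAMPLE)
port = config["config"]["port"]
cors = config["config"]["cors"]
time_limit = config["settings"]["sql_time_limit_ms"]
```

Values come back as native Python types (int, bool, str), so there is no string-to-boolean guessing of the kind YAML is known for.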

Plugin config will be difficult

Plugin config is currently in metadata.json in two places:

  1. Top level, under "plugins.[plugin-name]". This fits well into datasette.toml as [plugins.plugin-name]
  2. Table level, under "databases.[db-name].tables.[table-name].plugins.[plugin-name]". This doesn't fit that well into datasette.toml, unless it's nested under [metadata]?
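To make the nesting concern concrete, here is a sketch of what table-level plugin config could look like if it lived under [metadata], along with a lookup that falls back to the top-level [plugins] table. Everything here is hypothetical: the `mydb` and `places` names, the `lookup_plugin_config` helper, and the choice that table-level config replaces (rather than merges with) the top-level config.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # assumed backport on older Pythons

SAMPLE = """
[plugins.datasette-cluster-map]
latitude_column = "lat"
longitude_column = "lon"

# Table-level config: valid TOML, but a long header.
[metadata.databases.mydb.tables.places.plugins.datasette-cluster-map]
latitude_column = "xlat"
"""


def lookup_plugin_config(config, plugin, database=None, table=None):
    """Return table-level config for a plugin if present, else top-level."""
    if database is not None and table is not None:
        table_plugins = (
            config.get("metadata", {})
            .get("databases", {})
            .get(database, {})
            .get("tables", {})
            .get(table, {})
            .get("plugins", {})
        )
        if plugin in table_plugins:
            return table_plugins[plugin]
    return config.get("plugins", {}).get(plugin)


config = tomllib.loads(SAMPLE)
top_level = lookup_plugin_config(config, "datasette-cluster-map")
table_level = lookup_plugin_config(
    config, "datasette-cluster-map", database="mydb", table="places"
)
other_table = lookup_plugin_config(
    config, "datasette-cluster-map", database="mydb", table="other"
)
```

The dotted header works, but its length shows why table-level config is the awkward case for a flat-looking format like TOML.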

Extensions, static, one-off plugins?

We could also include equivalents of --plugins-dir, --static, and --load-extension in datasette.toml, but I'd imagine there are a few security concerns there to think through.

Explicitly list which plugins to use?

I believe Datasette by default will load all installed plugins on startup, but maybe datasette.toml can specify a list of plugins to use? For example, a dev version of datasette.toml can specify datasette-pretty-traces, but the prod version can leave it out.
