
Conversation

@plukevdh (Contributor) commented Aug 6, 2025

Checklist for Pull Requests

  • All tests pass (yarn test:all)
  • Code follows the style guide and passes lint checks
  • Documentation is updated (README, docs, etc)
  • Linked to corresponding issue, if applicable

Summary of Changes

In super-high-throughput scenarios, queuing thousands of jobs a second causes pool availability issues with the default Knex pooling config. It would be very helpful to expose the underlying Knex pool configuration. This PR attempts to do that.

We could optionally expose some of the other config options, but those don't concern my needs at the moment. Happy to revise if it would help.
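To make the need concrete, here is a minimal sketch of the kind of config this change exposes. The `PoolConfig` fields mirror Knex's (tarn.js) pool settings; the local interfaces and the example values are illustrative stand-ins, not the package's actual API.

```typescript
// Self-contained stand-ins: PoolConfig mirrors Knex's (tarn.js) pool
// settings, and PostgresBackendConfig follows the shape this PR adds.
interface PoolConfig {
  min?: number;
  max?: number;
  acquireTimeoutMillis?: number;
  idleTimeoutMillis?: number;
}

interface PostgresBackendConfig {
  connection?: string | object;
  pool?: PoolConfig;
}

// A high-throughput enqueuer might raise `max` well above Knex's default
// of 10 so bursts of enqueues don't starve the pool. Values are assumed.
const enqueuerConfig: PostgresBackendConfig = {
  connection: "postgres://localhost:5432/jobs",
  pool: { min: 2, max: 50, acquireTimeoutMillis: 10_000 },
};

console.log(enqueuerConfig.pool?.max); // 50
```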

@plukevdh (Contributor, Author) commented Aug 6, 2025

A quick aside (happy to open a separate issue): I can't find any docs on how to run test:all locally without Postgres (or other DBs) running in a container or otherwise locally. It'd be super helpful for future contributors to have a docker compose file or other instructions on what is required to run all the tests.

```ts
/**
 * Configuration for the PostgreSQL backend, limited to the supported
 * Knex configuration options (pool and connection).
 */
export type PostgresBackendConfig = Pick<Knex.Config, "pool" | "connection">;
```
@plukevdh (Contributor, Author) commented:

The goal here is to limit the options we can pass to Knex. I didn't want to allow overriding everything, just the pool and connection options (which you already partially allow via the Knex.ConnectionConfig typing).
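A self-contained sketch of what the `Pick` accomplishes; the `KnexConfig` interface here is a reduced stand-in for the real `Knex.Config` from the knex package.

```typescript
// Stand-in for Knex.Config so this sketch stays self-contained; the real
// type has many more keys (client, migrations, debug, ...).
interface KnexConfig {
  client?: string;
  connection?: string | object;
  pool?: { min?: number; max?: number };
  migrations?: object;
}

// Pick narrows the surface area: only `pool` and `connection` survive.
type PostgresBackendConfig = Pick<KnexConfig, "pool" | "connection">;

const ok: PostgresBackendConfig = {
  connection: "postgres://localhost/jobs",
  pool: { max: 25 },
};

// const bad: PostgresBackendConfig = { client: "pg" };
// ^ type error: 'client' does not exist in type 'PostgresBackendConfig'

console.log(ok.pool?.max); // 25
```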

@plukevdh (Contributor, Author) commented:


Feel free to combine this into the other test file. I just split it out since it's a separate flow.

@merencia (Contributor) commented Aug 6, 2025

Hello @plukevdh thanks for the PR.

About having a docker compose file: that's a valid point.

Currently we have a task yarn db:all that starts some docker containers without a docker compose. Have you tried it?

@GiovaniGuizzo (Contributor) commented:

> A quick aside (happy to open a separate issue): I can't find any docs on how to run test:all locally without Postgres (or other DBs) running in a container or otherwise locally. It'd be super helpful for future contributors to have a docker compose file or other instructions on what is required to run all the tests.

Indeed, I think we haven't described it properly.
@merencia I'm creating a new issue to better describe development and tests. We do have a docs page, but it's not comprehensive.

@plukevdh (Contributor, Author) commented Aug 6, 2025

> Currently we have a task yarn db:all that starts some docker containers without a docker compose. Have you tried it?

Have not! Didn't see it, but I can try it. I ran the tasks in GH Actions in my own repo and they passed. I'll give that a try as well. Thanks!

@plukevdh (Contributor, Author) commented Aug 6, 2025

Using db:all worked and all tests pass now. Thanks for the quick guidance!

@merencia (Contributor) commented Aug 6, 2025

Also, I was thinking about the connection pool.
Currently, the dashboard creates a new backend instance to access the database, reusing the config provided during start().

I don't think having two separate pools is necessarily a problem, but maybe we should allow a dedicated configuration specifically for the dashboard connection.

Anyway, we can tackle that in another issue.

@plukevdh (Contributor, Author) commented Aug 6, 2025

> Currently, the dashboard creates a new backend instance to access the database, reusing the config provided during start().

So a couple of usage comments to that end. I am currently trying to spin out three distinct components from this project given how projects like bullmq work:

  1. I want to be able to queue jobs independently of job runners or the frontend dashboard. Here, throughput needs to happen as fast as possible, so pooling is pretty critical here.
  2. I want to be able to scale my worker pool independently of the FE or the queuing input point. How fast jobs are worked is not as critical as how fast I can get them into the queue. The point of background jobs isn't how fast a thing gets worked; it's that the frontend doesn't get held up. I can scale to meet the number of background jobs more easily than I can scale the frontend to meet the demand of the job input. Pooling is generally not as critical here, but it still needs to be considered as it relates to my DB max connection counts.
  3. I'd love to be able to run the FE separately from the workers OR the queuing endpoint, and typically, this only needs a single connection.

The way this project is architected at the moment, these three components are somewhat tied together. The dashboard can run independently, but it still requires workers to be running, which is less than ideal. For the workers and queuing components, I can just set dashboard: { enabled: false } and then choose which of the three components will be responsible for migrations.

@plukevdh (Contributor, Author) commented Aug 6, 2025

I realize my goals are maybe not the goals of this project. The upside of what you all have here is a super simple, single-runtime model that contains the queuing mechanism, the workers, and a dashboard all in a single process. I'm looking at this project particularly because of the simplicity and efficiency of the data model. In high-throughput tests, Knex blows up because of connection pool starvation when trying to queue jobs above roughly 2-3k req/sec.
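For intuition, a back-of-envelope model of the pool ceiling. The 5 ms per-INSERT hold time is an assumed figure for illustration, not a measurement from this project.

```typescript
// A pool of `poolMax` connections, each INSERT holding a connection for
// `holdMs` milliseconds, can sustain at most poolMax * (1000 / holdMs)
// enqueues per second before requests queue up waiting for a connection.
function maxThroughput(poolMax: number, holdMs: number): number {
  return poolMax * (1000 / holdMs);
}

// Knex's default pool (max = 10) tops out around 2000 enqueues/sec at an
// assumed 5 ms per INSERT, consistent with starvation at ~2-3k req/sec.
console.log(maxThroughput(10, 5)); // 2000
// Raising max to 50 lifts the ceiling to 10000 enqueues/sec.
console.log(maxThroughput(50, 5)); // 10000
```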

@merencia (Contributor) commented Aug 6, 2025

@plukevdh just to clarify a few things:

  • Sidequest doesn't run everything in a single process. When you call start(), it forks a separate process that acts as the main Sidequest runtime. That process manages job scheduling, queuing, and dispatching. Jobs themselves are executed in isolated threads via Piscina, so job execution is sandboxed and doesn’t block your main app.

  • The sidequest package is just a facade that wires up the engine, dashboard, and config for simpler usage. But you can use @sidequest/engine and @sidequest/dashboard directly, giving you full control to:

    • enqueue jobs from one process (e.g., your API server)
    • run workers in a separate service
    • host the dashboard in its own process
  • You can configure the engine without starting it. That allows you to load your queues and call enqueue() from any process without running schedulers or workers.

This isn’t in the docs yet, but we’ll make sure to add it soon.

@GiovaniGuizzo (Contributor) commented Aug 6, 2025

> > Currently, the dashboard creates a new backend instance to access the database, reusing the config provided during start().
>
> So a couple of usage comments to that end. I am currently trying to spin out three distinct components from this project given how projects like bullmq work:
>
>   1. I want to be able to queue jobs independently of job runners or the frontend dashboard. Here, throughput needs to happen as fast as possible, so pooling is pretty critical here.
>   2. I want to be able to scale my worker pool independently of the FE or the queuing input point. How fast jobs are worked is not as critical as how fast I can get them into the queue. The point of background jobs isn't how fast a thing gets worked; it's that the frontend doesn't get held up. I can scale to meet the number of background jobs more easily than I can scale the frontend to meet the demand of the job input. Pooling is generally not as critical here, but it still needs to be considered as it relates to my DB max connection counts.
>   3. I'd love to be able to run the FE separately from the workers OR the queuing endpoint, and typically, this only needs a single connection.
>
> The way this project is architected at the moment, these three components are somewhat tied together. The dashboard can run independently, but it still requires workers to be running, which is less than ideal. For the workers and queuing components, I can just set dashboard: { enabled: false } and then choose which of the three components will be responsible for migrations.

  1. If you run Sidequest.configure({ ... }) you can use Sidequest to enqueue jobs and start the dashboard. Like Lucas said, you can start only the dashboard if you use SidequestDashboard directly (it's from @sidequest/dashboard but re-exported in sidequest). I have created a PR to describe this behavior in the docs.
  2. I see. Well, if you want the absolute best performance, we suggest you simply run Sidequest.configure({ ... }). It will create a single DB connection and start no workers in your API. We haven't tested such a high throughput of thousands/s while enqueueing. This is something we should definitely investigate, but I find it hard to see it being feasible the way things are, with a single Sidequest instance handling all inputs. If you absolutely MUST, you can use @sidequest/engine for that. If you instantiate and configure multiple engines with the same config (not start, just configure), then you will have a pool of engines ready to enqueue, each with its own Knex pool and config.
  3. You can :). If you need to:
    3.1. Run only the dashboard: use SidequestDashboard directly.
    3.2. Run only the enqueuer: use Sidequest.configure
    3.3. Run only the worker: use Sidequest.start({ dashboard: { enabled: false } })

Maybe we need to rethink those entrypoints to make it easier to understand. It all makes sense in our heads 😂
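The three entrypoints above can be summarized in a small lookup. The role names are ours; the entrypoint expressions are quoted from the comments in this thread, so treat them as a reading aid rather than verified API documentation.

```typescript
// One deployment role per process, mapped to the Sidequest entrypoint
// the maintainers describe above.
type Role = "enqueuer" | "worker" | "dashboard";

const entrypoint: Record<Role, string> = {
  // Configures the engine without starting schedulers/workers; enqueue only.
  enqueuer: "Sidequest.configure({ ... })",
  // Full runtime minus the dashboard.
  worker: "Sidequest.start({ dashboard: { enabled: false } })",
  // Dashboard alone, via the re-exported SidequestDashboard.
  dashboard: "SidequestDashboard (from @sidequest/dashboard)",
};

console.log(entrypoint.enqueuer);
```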

@GiovaniGuizzo (Contributor) commented:

BTW, feel free to merge this PR :)

@GiovaniGuizzo GiovaniGuizzo merged commit 7db5d6b into sidequestjs:master Aug 6, 2025
2 checks passed
sidequest-release bot pushed a commit that referenced this pull request Aug 6, 2025
# [1.3.0](v1.2.0...v1.3.0) (2025-08-06)

### Features

* add pooling control for PG knex config ([#53](#53)) ([7db5d6b](7db5d6b))
@sidequest-release

🎉 This PR is included in version 1.3.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
