Performance and database issues #2111

Hi,

We're seeing Concourse jobs stuck in the pending state for a very long time, along with lots of slow queries on our Postgres server.  Resources are also slow to trigger, sometimes taking hours, even when `check_every` is not set (defaulting to 1m).  When things get really bad, the list of jobs on the left in the GUI is not updated.

The first time this happened, we got some relief by reducing the amount of logs retained, via `build_logs_to_retain`, for a job on a 3m timer.
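For anyone unfamiliar with the setting, here is a minimal sketch of that kind of change. It is not our actual pipeline: the resource name, job name, task, and the retention value of 20 are all placeholders chosen for illustration.

```yaml
resources:
- name: every-3m              # hypothetical name for the 3m timer
  type: time
  source:
    interval: 3m

jobs:
- name: periodic-check        # hypothetical job name
  build_logs_to_retain: 20    # arbitrary example value; older build logs get reaped
  plan:
  - get: every-3m
    trigger: true
  - task: run-check           # placeholder task that does nothing
    config:
      platform: linux
      image_resource:
        type: docker-image
        source: {repository: busybox}
      run:
        path: "true"
```

A job on a 3m timer produces roughly 480 builds a day, so without a retention limit its build logs pile up quickly.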

The problem recently struck again and we found another job that needed the `build_logs_to_retain` treatment.  However, once that was addressed, we still see the pending state and slow triggers.  We also see jobs start running but then take minutes for their tasks to get going.

Clues in database:
* Several instances of the query `REFRESH MATERIALIZED VIEW CONCURRENTLY latest_completed_builds_per_job` are often running in parallel.  Each normally takes a few seconds when things seem fine, but often minutes or over an hour when we are experiencing the issues above.
* Load average on the database is often 30-40, with 8 cores available.
* Usually seeing 40-50 active connections into the database from Concourse, sometimes over 100 reported.
* Table `concourseinfrastructureprod.builds` went from 20000+ rows down to 4000 when we most recently removed/refactored a job with excessive build logs retained.

Other details:
* Concourse version: 3.8.0
* Deployment type: BOSH
* Infrastructure/IaaS: Openstack
* Browser (if applicable): Chrome
* Database - Postgres 9.5, 8 core CPU, no significant io wait
* 5 workers
* Concourse web and worker VMs all sized reasonably to our knowledge (low cpu/load)
* Did this used to work? - Yes

Please help the Concourse fans on my team keep using Concourse, so that mgmt doesn't force us on to Jenkins.  :)

Thanks!

Aaron
