pgx Migration (Removing lib/pq) #9066
Merged
Conversation
There's no need to cast the int array into a pg type array; it seems pgx does the work for us! Signed-off-by: Taylor Silva <dev@taydev.net>
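As a rough illustration of what this means in practice (the table and column names below are placeholders, not Concourse's actual schema):

```go
import "database/sql"

// Sketch only. With lib/pq the slice had to be wrapped (e.g. pq.Array(ids));
// pgx's database/sql driver accepts a Go []int directly as a Postgres array parameter.
func buildsByIDs(db *sql.DB, ids []int) (*sql.Rows, error) {
	return db.Query(`SELECT id, name FROM builds WHERE id = ANY($1)`, ids)
}
```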
We had to create our own implementation of pq's Listener. Thankfully we already had that wrapped in an interface. While digging into the pq implementation I found we couldn't implement things in exactly the same way they did: pq receives notifications through a sort of "escape hatch" built into its driver. A function reads every single piece of data sent to the driver, checks its type, and if it is a "Notification" it reads the raw data from a buffer inside the driver. We cannot do the same thing with the pgx driver. I'm not 100% sure my implementation is good. It's passing most tests, but I'm not sure how robust it is; I have no clue whether pgx will recover from connection errors for us, for example. Signed-off-by: Taylor Silva <dev@taydev.net>
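For context, the pgx primitive the new listener is built around looks roughly like this. This is a minimal sketch, not the actual PgxListener; the connection string and channel name are placeholders:

```go
import (
	"context"
	"log"

	"github.com/jackc/pgx/v5"
)

// listen sketches pgx's notification primitive: LISTEN must be issued on the
// same connection that later blocks in WaitForNotification.
func listen(ctx context.Context, connStr, channel string) error {
	conn, err := pgx.Connect(ctx, connStr)
	if err != nil {
		return err
	}
	defer conn.Close(ctx)

	if _, err := conn.Exec(ctx, "LISTEN "+pgx.Identifier{channel}.Sanitize()); err != nil {
		return err
	}

	for {
		// Blocks until a notification arrives or ctx is cancelled.
		n, err := conn.WaitForNotification(ctx)
		if err != nil {
			return err
		}
		log.Printf("channel=%s payload=%s", n.Channel, n.Payload)
	}
}
```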
This interface is embedded in the LogConn struct. Because we expect the interface to have a method named "Conn", embedding it caused a naming conflict: the method name collided with the embedded interface's name. Confusing to explain, but the gist of this commit is that Conn had to be renamed so LogConn could satisfy the Conn interface. Signed-off-by: Taylor Silva <dev@taydev.net>
and further decrease sleep time Signed-off-by: Taylor Silva <dev@taydev.net>
Using a pool so we can easily reset the connection if needed; the lib/pq Listener had this behaviour and we're just trying to replicate it here. Uses exponential backoff, which times out after 15 minutes. We use this package elsewhere in our codebase. Signed-off-by: Taylor Silva <dev@taydev.net>
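A hand-rolled sketch of that behaviour, assuming a pgxpool.Pool. The actual commit uses an existing backoff package rather than this loop, and the 15-minute deadline and doubling delay here are only illustrative:

```go
import (
	"context"
	"fmt"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

// acquireWithBackoff keeps trying to get a fresh connection from the pool,
// doubling the wait between attempts and giving up after roughly 15 minutes.
func acquireWithBackoff(ctx context.Context, pool *pgxpool.Pool) (*pgxpool.Conn, error) {
	deadline := time.Now().Add(15 * time.Minute)
	wait := time.Second
	for {
		conn, err := pool.Acquire(ctx)
		if err == nil {
			return conn, nil
		}
		if time.Now().After(deadline) {
			return nil, fmt.Errorf("giving up on listener connection: %w", err)
		}
		time.Sleep(wait)
		if wait < time.Minute {
			wait *= 2
		}
	}
}
```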
In this case we had row.BuildID with a type of int, and the column we were trying to insert it into was of type "text" in Postgres. We have to convert the int to a string before passing it into the query. Signed-off-by: Taylor Silva <dev@taydev.net>
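The fix amounts to a one-line conversion; the table and column below are hypothetical:

```go
import (
	"database/sql"
	"strconv"
)

// insertBuildID: buildID is an int in Go, but the (hypothetical) build_id_text
// column is of type text in Postgres, so convert before binding the parameter.
func insertBuildID(tx *sql.Tx, buildID int) error {
	_, err := tx.Exec(`INSERT INTO some_table (build_id_text) VALUES ($1)`, strconv.Itoa(buildID))
	return err
}
```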
WOW, is the pgx driver ever strict about types when it comes to arrays. In Postgres, if `array_agg()` aggregates nothing, you get back `{NULL}`. This _should_ cast just fine to the sql.NullString type from the Go sql package, but pgx cannot figure out how to cast to this type at all; you just get errors like: `sql: Scan error on column index 3, name "passed": cannot convert to sql.Scanner: cannot find registered type for *[]sql.NullString`. Initially I was getting this error: `sql: Scan error on column index 3, name "passed": unsupported Scan, storing driver.Value type string into type *[]sql.NullString`. In that second error I found it weird that it said it was trying to turn a `string` into a `*[]sql.NullString`. I kept playing around with the types and using dlv to debug, and eventually figured out pgx would safely scan into a slice of type `[]*string`, i.e. a slice of pointers to strings, which is a VERY annoying slice to use! This turned out to be because `array_agg()` returns `{NULL}` when it has no results. By wrapping it as `array_remove(array_agg(...), NULL)` I could then get pgx to scan into a normal `[]string` slice. A bunch of tests still failed at this point because `Passed` on the various structs would be non-nil; this is why there are length checks on the slices and we set the var to `nil` if the length is zero before passing it to the returned struct. I think this is actually a bit of a bug in pgx. The workaround isn't bad, but it was VERY annoying to figure out, especially since it worked just fine with lib/pq. Signed-off-by: Taylor Silva <dev@taydev.net>
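Putting the workaround together, as a sketch with placeholder table and column names (not the actual Concourse query):

```go
import "database/sql"

// passedJobs shows the shape of the workaround: with no matching rows the LEFT JOIN
// makes array_agg() produce {NULL}; array_remove(..., NULL) turns that into {},
// which pgx then scans into a plain []string.
func passedJobs(db *sql.DB, buildID int) ([]string, error) {
	var passed []string
	err := db.QueryRow(`
		SELECT array_remove(array_agg(j.name), NULL)
		FROM builds b
		LEFT JOIN jobs j ON j.build_id = b.id
		WHERE b.id = $1
		GROUP BY b.id
	`, buildID).Scan(&passed)
	if err != nil {
		return nil, err
	}
	// Callers and tests expect nil rather than an empty, non-nil slice.
	if len(passed) == 0 {
		passed = nil
	}
	return passed, nil
}
```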
Also learned that pgx does not support sending multiple commands in a single statement; you have to send them as two separate statements: jackc/pgx#1090. Signed-off-by: Taylor Silva <dev@taydev.net>
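For example, something like the following (hypothetical statements; the point is the split, not the SQL itself):

```go
import "database/sql"

// Under pgx's default extended protocol, a single Exec cannot contain
// "DELETE ...; DELETE ..." the way it could with lib/pq. Issue the commands
// separately; running them inside one transaction keeps them atomic.
func cleanup(tx *sql.Tx, pipelineID int) error {
	if _, err := tx.Exec(`DELETE FROM build_events WHERE pipeline_id = $1`, pipelineID); err != nil {
		return err
	}
	if _, err := tx.Exec(`DELETE FROM builds WHERE pipeline_id = $1`, pipelineID); err != nil {
		return err
	}
	return nil
}
```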
It seems the pgx driver gives up on the initial connection very quickly. Presumably lib/pq made some re-attempts, which is why we never saw this behaviour previously. Signed-off-by: Mathias Bogaert <mathias.bogaert@gmail.com> Co-authored-by: Taylor Silva <dev@taydev.net>
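If retries are needed on our side, a simple sketch is to ping in a loop after opening; the attempt count and delay below are arbitrary and not what the commit actually does:

```go
import (
	"context"
	"database/sql"
	"time"

	_ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx" driver name
)

// openWithRetry retries the initial ping, since pgx appears to fail fast where
// lib/pq seemed to retry for us.
func openWithRetry(ctx context.Context, dsn string) (*sql.DB, error) {
	db, err := sql.Open("pgx", dsn)
	if err != nil {
		return nil, err
	}
	for i := 0; i < 10; i++ {
		if err = db.PingContext(ctx); err == nil {
			return db, nil
		}
		time.Sleep(time.Second)
	}
	db.Close()
	return nil, err
}
```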
Disables fsync, full_page_writes and synchronous_commit for Postgres in CI environments. This greatly improves performance by removing disk durability guarantees which aren't needed in ephemeral test environments.
Sets random_page_cost=1.1 which tells PostgreSQL's query planner that random disk access is relatively fast, as is typical with SSDs. This encourages the planner to consider index scans over sequential scans when appropriate.
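For reference, these settings can be passed as server flags (the exact invocation depends on how the CI Postgres is started); they trade durability for speed and should never be used in production:

```
postgres -c fsync=off \
         -c full_page_writes=off \
         -c synchronous_commit=off \
         -c random_page_cost=1.1
```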
…tion in PgxListener. With the locks and one channel, we kept ending up in deadlocks: one goroutine held one lock while another held the other, or a goroutine would wait for a message on the opsDone channel when there was zero chance of any message being sent from another goroutine. Deadlock! Definitely due to bad coding/design on my part. I couldn't figure out a nice way to make the two-lock approach work without eventually deadlocking, so I went back to doing the whole thing with channels, which I had tried earlier but failed at. I re-read Effective Go's section on concurrency and "share by communicating": https://go.dev/doc/effective_go#concurrency > Do not communicate by sharing memory; instead, share memory by communicating. That got me in the right mindset and I was able to put together the setup below. There is now zero chance of someone blocking forever while sending or receiving on a channel, and we have fewer vars: we went from two shared locks and one shared channel down to one shared channel, plus a second private channel used only by listenerLoop() to keep track of its own state. I think there is some potential to refactor listenerLoop() here. I would also love to add a unit test that could reproduce the race condition, but I'm not sure of the best way to achieve that. Regardless, all the db tests pass, and testflight, the suite that originally caught this error, is passing too. Signed-off-by: Taylor Silva <dev@taydev.net>
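A stripped-down sketch of that shape (not the actual PgxListener code): a single goroutine owns the connection and everyone else hands it work over a channel, so no mutexes are involved.

```go
import (
	"context"

	"github.com/jackc/pgx/v5"
)

// op is a unit of work that needs the notification connection.
type op func(ctx context.Context, conn *pgx.Conn) error

type listener struct {
	ops chan op // requests for the connection-owning goroutine
}

// loop is the only goroutine that ever touches conn, so no locks are needed;
// callers communicate with it exclusively through the ops channel.
func (l *listener) loop(ctx context.Context, conn *pgx.Conn) {
	for {
		select {
		case <-ctx.Done():
			return
		case f := <-l.ops:
			if err := f(ctx, conn); err != nil {
				// The real listener would reset the connection here.
				return
			}
		}
	}
}

// Listen asks the owning goroutine to issue LISTEN; the caller never sees conn.
func (l *listener) Listen(channel string) {
	l.ops <- func(ctx context.Context, conn *pgx.Conn) error {
		_, err := conn.Exec(ctx, "LISTEN "+pgx.Identifier{channel}.Sanitize())
		return err
	}
}
```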
There was a race condition here where we would try listening for notifications with a nil conn. Had to make sure the check on the connection happened AFTER we were given back control via the comms channel. Also made it explicit that l.conn is set to nil during close(), even though it appears Release() does this for us; this avoids any race conditions with Release(). Signed-off-by: Taylor Silva <dev@taydev.net>
analytically approved these changes on Feb 18, 2025
taylorsilva added a commit to concourse/flag that referenced this pull request on Feb 18, 2025
We've changed the driver we're using upstream from lib/pq to pgx: concourse/concourse#9066. It seems this setting was only needed for folks using PgBouncer and was based on info from a 2019 blog post:
- concourse/concourse#6936
- https://blog.bullgare.com/2019/06/pgbouncer-and-prepared-statements/
For PgBouncer to work with pgx it was previously suggested that one change the pgx ExecMode to the simple protocol. Recent pgx user reports indicate that PgBouncer >1.21.0 should work without any config changes, though: jackc/pgx#1784. The most recent PgBouncer release, 1.24.0, also indicates that prepared statements are now supported: https://www.pgbouncer.org/2025/01/pgbouncer-1.24.0 Therefore I don't think we need to change any configuration at the sql driver level now. We definitely don't need the BinaryParameters flag anymore, since it was specific to lib/pq and doesn't work at all with pgx; trying to set this flag with pgx results in an error. Signed-off-by: Taylor Silva <dev@taydev.net>
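For reference, if the simple protocol were ever needed again for an older PgBouncer, it could be set on the pgx config before handing the connection to database/sql; per the notes above this is not required now. A sketch:

```go
import (
	"database/sql"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/stdlib"
)

// openSimpleProtocol sketches the previously suggested workaround only.
func openSimpleProtocol(dsn string) (*sql.DB, error) {
	cfg, err := pgx.ParseConfig(dsn)
	if err != nil {
		return nil, err
	}
	// Avoids prepared statements, which very old PgBouncer versions can't proxy.
	cfg.DefaultQueryExecMode = pgx.QueryExecModeSimpleProtocol
	return stdlib.OpenDB(*cfg), nil
}
```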
Changes proposed by this PR
This PR removes all usage of lib/pq and replaces it with pgx. Everything is still wrapped by the database/sql interface, which made the transition mostly easy. Further work could be done to move to the pgx interface, which would provide significant performance gains when scanning data from PostgreSQL into Go structs. What we do get from switching to pgx right now is support for newer Postgres features; the recent example was the addition of binary_params, which ended up not really working with lib/pq.

Notes to reviewer
The hardest part of this transition was replacing lib/pq's Listener. pgx does not come with a similar interface and only provides a single blocking function, WaitForNotification(). On its own that function does not provide the same concurrent functionality as lib/pq's Listener interface. I had to implement the Listener interface and its existing behaviour using pgx. This work went through two iterations, both preserved in this PR.

Iteration #1: Using sync.Mutex

I got to a working version of the Listener that would pass all unit tests. Running the testflight suite would fail, though: my implementation had two locks, and eventually two goroutines would deadlock with each other. This resulted in builds appearing to run but never actually running. This deadlocking and trying to manage two locks is why I eventually gave up on locks.

Iteration #2: Channels; share by communicating

The current implementation uses channels to control who can use the single connection object. Once I had wrapped my head around channels (again), this proved to be a much more straightforward implementation and easier to reason about. It's working, passing all unit and integration tests.
Release Note