Skip to content

Conversation

erikjohnston
Copy link
Member

Currently SyTest waits for synapse process to fully start by repeatedly
checking if it can connect to the HTTP socket. This won't work for
starting worker processest that do not listen on any HTTP socket.

Instead, we use the sd_notify mechanism. This works by SyTest binding
to a unix socket, passing the socket address to Synapse via env, and
waiting to receive a READY=1 notification on the socket.

Currently SyTest waits for synapse process to fully start by repeatedly
checking if it can connect to the HTTP socket. This won't work for
starting worker processest that do not listen on any HTTP socket.

Instead, we use the `sd_notify` mechanism. This works by SyTest binding
to a unix socket, passing the socket address to Synapse via env, and
waiting to receive a `READY=1` notification on the socket.
@erikjohnston erikjohnston requested a review from a team August 28, 2020 13:48
@erikjohnston
Copy link
Member Author

Oh, this does't work with pydron of course. Whoops

@erikjohnston erikjohnston removed the request for review from a team August 28, 2020 14:15
@erikjohnston
Copy link
Member Author

Ok, so I've made nothing use it yet, and instead will use it when I rejig things to remove pydron.py.

@erikjohnston erikjohnston requested a review from a team August 28, 2020 15:12
Comment on lines 286 to 294
)->else_with_f( sub {
my ( $f ) = @_;

# We need to manually kill child procs here as we don't seem to have
# registered the on finish handler yet.
$self->kill_and_await_finish()->then( sub {
$f
})
} );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels wrong, for a couple of reasons:

  • failure to start one subprocess should not necessarily mean that all other subprocesses get killed (at least at the abstract level of a ProcessManager class)
  • it's asymmetric with _start_process_and_await_connectable.

what is the "finish handler" of which the comment speaks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, turns out that there is a timeout several layers up. Its set to 60s so I probably got bored waiting before it fired :/

Comment on lines 319 to 326
# Create a random abstract socket name. Abstract sockets start with a null
# byte.
my $random_id = join "", map { chr 65 + rand 25 } 1 .. 20;
my $path = "\0sytest-$random_id.sock";

# We replace null byte with '@' to allow us to pass it in via env. (This is
# as per the sd_notify spec).
$env->{"NOTIFY_SOCKET"} = $path =~ s/\0/\@/rg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaner to do this outside _await_ready_notification (possibly in another helper function) and just pass in the $path to _await_ready_notification ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at this, and basically either a) you move it to a function that accepts $env which I don't think buys you very much, or b) move this block of code up to the _start_process_and_await.. function which just makes it a bit messy.

I kind of like have all the sd notify bits in one place tbh. Maybe it'd be better to rename it _create_socket_and_await_notification or something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, maybe. whatever.

Comment on lines 319 to 326
# Create a random abstract socket name. Abstract sockets start with a null
# byte.
my $random_id = join "", map { chr 65 + rand 25 } 1 .. 20;
my $path = "\0sytest-$random_id.sock";

# We replace null byte with '@' to allow us to pass it in via env. (This is
# as per the sd_notify spec).
$env->{"NOTIFY_SOCKET"} = $path =~ s/\0/\@/rg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like it modifies $path itself, which isn't what you want?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aahaha. Hahaha. Ha.

😭

The r flag there means don't overwrite the variable. Apparently that is the way that you do non-destructive replace or something. Would my $path_env = $path; $path_env =~ s/\0/\@/g; work, do you know?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right. Yes, that should work. As, I think, should ($env->{"NOTIFY_SOCKET"} = $path) =~ s/\0/\@/g;

erikjohnston and others added 3 commits September 2, 2020 13:28
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Turns out there is already a time out.
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@erikjohnston erikjohnston merged commit f209bce into develop Sep 2, 2020
@erikjohnston erikjohnston deleted the erikj/sd_notify_start branch September 2, 2020 17:08
pull bot pushed a commit to valkum/sytest that referenced this pull request Sep 2, 2020
Currently SyTest waits for synapse process to fully start by repeatedly
checking if it can connect to the HTTP socket. This won't work for
starting worker processest that do not listen on any HTTP socket.

Instead, we use the `sd_notify` mechanism. This works by SyTest binding
to a unix socket, passing the socket address to Synapse via env, and
waiting to receive a `READY=1` notification on the socket.
anoadragon453 added a commit that referenced this pull request Oct 21, 2020
…c_release_1_21_x

* origin/release-v1.20.1: (27 commits)
  Pin Alien::Sodium to an old version, to fix install errors (#949)
  Disable rate limiting on Dendrite (#947)
  Wait for synapse to start by using sd_notify. (#943)
  Fix bad tests (#939)
  Add ability to exclude tests for deprecated endpoints (#881)
  Abide by the spec when making /keys/query requests
  When putting device keys, may the JSON body spec compliant
  Remove trailing slashes from /state_ids and /backfill in ACL tests (#936)
  Fix /send server ACL test to match spec (#935)
  Call matrix_get_e2e_keys correctly
  Removed unused params from pydron invocation (#931)
  Deflake 'Server correctly resyncs when server leaves and rejoins a room' (#932)
  Dendrite configuration format v1 (#921)
  Disable TLS validation for Dendrite
  Set Dendrite loglevel to trace (#933)
  Merge Synapse::ViaHaproxy and -I Synapse::Dendron (#930)
  Use ProcessManager in Synapse.pm (#929)
  Add redis support to synapse docker image (#928)
  Update the README for the docker images (#927)
  Some minor cleanups to the startup scripts (#926)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants