Skip to content

=rem #18339 Use explicit handshake timeout #18542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 19, 2015

Conversation

patriknw
Copy link
Contributor

  • instead of using transport failure detector
  • add a new config property akka.remote.handshake-timeout, but
    for netty.tcp and netty.ssl the existing netty.tcp.connection-timeout
    setting will be used

@patriknw
Copy link
Contributor Author

Refs #18339

@patriknw
Copy link
Contributor Author

This should be merged for 2.4.1, it is probably only a problem when using test transport.

case Event(HandshakeTimer, InboundUnassociated(_, wrappedHandle)) ⇒
sendDisassociate(wrappedHandle, Unknown)
stop(FSM.Failure(TimeoutReason("No response from remote for inbound association. Handshake timed out after " +
s"[${settings.HandshakeTimeout.toMillis} ms].")))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure about the timeout for inbound associations. It was not handled previously, and is perhaps not needed, but it might not hurt either?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should have this timeout.

@akka-ci akka-ci added the validating PR is currently being validated by Jenkins label Sep 23, 2015

# The timeout for outbound associations to perform the handshake.
# If the transport is akka.remote.netty.tcp or akka.remote.netty.ssl
# the configured connection-timeout for the transport will be used instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about that. The connection-timeout relates to TCP events, which is handled by the kernel, while the Akka Protocol is at the JVM level. I am not sure these timeouts mix well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember a customer issue in which SSL was taking so long to init that connections timed out. So from that perspective, if I read this PR correctly it would be easier to tune for their case as well.

The error message would also have been nicer, so I'm for this PR, though maybe I'm missing some more fuzzy interplay with other remoting things

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that otherwise the user would always have to change both those settings. When would you use different timeout here than the connection-timeout?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on a busy system the actual TCP connection might finish much quicker than a remoting handshake, but maybe I am just too paranoid. Less knobs to tune is maybe better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but isn't that same issue from the user's perspective? at least all sane users.

There is also a backwards compatibility aspect of this.
Let's say that someone has configured connection-timeout = 30s, then we can't suddenly introduce a new property that fail the connection attempt after 15s (which is the default for handshake-timeout)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both are good points, convinced :)

@drewhk
Copy link
Contributor

drewhk commented Sep 23, 2015

Can you add the two timeout test cases to AkkaProtocolSpec? That tests this actor in isolation.

@akka-ci akka-ci added needs-attention Indicates a PR validation failure (set by CI infrastructure) and removed validating PR is currently being validated by Jenkins labels Sep 23, 2015
@akka-ci
Copy link

akka-ci commented Sep 23, 2015

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.akka.io/job/pr-validator-per-commit-jenkins/3730/

@patriknw patriknw force-pushed the wip-18339-handshake-timeout-patriknw branch from c19651c to 8418b14 Compare September 30, 2015 15:37
@patriknw
Copy link
Contributor Author

@drewhk I added a test for the timeouts. I'm a bit lost in those low level test utilities, but at least it tests something. See 8418b14

I decided to start the timer for outbound connections immediately when the actor is started so that also timeouts before WaitingHandshake are covered.

@akka-ci akka-ci added validating PR is currently being validated by Jenkins needs-attention Indicates a PR validation failure (set by CI infrastructure) and removed needs-attention Indicates a PR validation failure (set by CI infrastructure) validating PR is currently being validated by Jenkins labels Sep 30, 2015
@akka-ci
Copy link

akka-ci commented Sep 30, 2015

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.akka.io/job/pr-validator-per-commit-jenkins/3784/

@patriknw
Copy link
Contributor Author

PLS BUILD

@patriknw
Copy link
Contributor Author

@drewhk please take a look at the latest here so this can be merged

@akka-ci akka-ci added validating PR is currently being validated by Jenkins and removed needs-attention Indicates a PR validation failure (set by CI infrastructure) labels Oct 19, 2015
@drewhk
Copy link
Contributor

drewhk commented Oct 19, 2015

This PR received the Seal of approval from the Remoting Department.

@akka-ci akka-ci added needs-attention Indicates a PR validation failure (set by CI infrastructure) and removed validating PR is currently being validated by Jenkins labels Oct 19, 2015
@akka-ci
Copy link

akka-ci commented Oct 19, 2015

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.akka.io/job/pr-validator-per-commit-jenkins/3928/

@patriknw
Copy link
Contributor Author

thanks, ah bin compatibility issue

* instead of using transport failure detector
* add a new config property akka.remote.handshake-timeout, but
  for netty.tcp and netty.ssl the existing netty.tcp.connection-timeout
  setting will be used
* add test of the timeouts
* mima filter for internal ProtocolStateActor
@patriknw patriknw force-pushed the wip-18339-handshake-timeout-patriknw branch from 8418b14 to 94896e8 Compare October 19, 2015 12:35
@akka-ci akka-ci added validating PR is currently being validated by Jenkins tested PR that was successfully built and tested by Jenkins and removed needs-attention Indicates a PR validation failure (set by CI infrastructure) labels Oct 19, 2015
@akka-ci akka-ci removed the validating PR is currently being validated by Jenkins label Oct 19, 2015
@akka-ci
Copy link

akka-ci commented Oct 19, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://jenkins.akka.io/job/pr-validator-per-commit-jenkins/3932/

patriknw added a commit that referenced this pull request Oct 19, 2015
@patriknw patriknw merged commit 82f67f8 into master Oct 19, 2015
@patriknw patriknw deleted the wip-18339-handshake-timeout-patriknw branch October 19, 2015 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tested PR that was successfully built and tested by Jenkins
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants