Skip to content

[ssh/ssh_session] different timeouts for connect & established #3971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 18, 2025

Conversation

xmkg
Copy link
Member

@xmkg xmkg commented Mar 7, 2025

Sometimes, ssh_connect blocks while a VM is booting up, where it won't return to the caller at all. Since the timeout is set to ~infinity the ssh_connect would block forever.

This patch fixes that by using two different timeout values for the different session connection states.

MULTI-1863

@xmkg xmkg requested review from georgeliao and Sploder12 March 7, 2025 15:12
Copy link

codecov bot commented Mar 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.13%. Comparing base (1c07b6e) to head (f7364a8).
Report is 3532 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3971   +/-   ##
=======================================
  Coverage   89.13%   89.13%           
=======================================
  Files         257      257           
  Lines       14626    14630    +4     
=======================================
+ Hits        13037    13041    +4     
  Misses       1589     1589           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sploder12
Sploder12 previously approved these changes Mar 7, 2025
Copy link
Contributor

@Sploder12 Sploder12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! sshfs mounts didn't give me any issues and other ssh features worked as expected.

@xmkg xmkg force-pushed the bugfix/libssh-stuck-connection branch from 7af7943 to 2d1f6cd Compare March 10, 2025 11:40
@xmkg
Copy link
Member Author

xmkg commented Mar 10, 2025

Reduced the connect timeout from 10 to 5 as discussed. @ricab @andrei-toterman @georgeliao

xmkg added 2 commits March 11, 2025 16:02
Sometimes ssh_connect blocks while a VM is booting up where it won't
return to the caller at all. Since the timeout is set to ~infinity
the ssh_connect would block forever.

This patch fixes that by using two different timeout values for the
different session connection states.

Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
as discussed.

Signed-off-by: Mustafa Kemal Gilor <mustafa.gilor@canonical.com>
@xmkg xmkg force-pushed the bugfix/libssh-stuck-connection branch from 2d1f6cd to f7364a8 Compare March 11, 2025 13:02
@Sploder12
Copy link
Contributor

Interestingly, this indirectly fixes #3965

@xmkg
Copy link
Member Author

xmkg commented Mar 11, 2025

Interestingly, this indirectly fixes #3965

Good to hear! Two birds with one stone 😉

@georgeliao
Copy link
Contributor

The change is sane to me. Basically, we set the ssh_connect time out to be 5 seconds and all subsequent ssh operations to be infinite timeout value once the connection is established. One thing is not clear to me is that why ssh_connect with infinite timeout value can not wait the vm boots up and run ssh service. If we try to connect and wait infinitely, it should connect once the vm is up and sshd service is up instead of hanging there. That is just my intuitive understanding about the API, maybe there are some nuances do not know about.

@xmkg
Copy link
Member Author

xmkg commented Mar 18, 2025

One thing is not clear to me is that why ssh_connect with infinite timeout value can not wait the vm boots up and run ssh service. If we try to connect and wait infinitely, it should connect once the vm is up and sshd service is up instead of hanging there

That's the question. I think some quirks are going on regarding how libssh handles the situation. Maybe the server is disconnecting the connection (because it's too early for a client to connect), but libssh somehow not picking it up. The timeout is designed to handle situations like this, but I still believe there's a bug in ssh_connect that leads to it.

@xmkg
Copy link
Member Author

xmkg commented Mar 18, 2025

Thanks for the reviews @georgeliao, @Sploder12!

@xmkg xmkg added this pull request to the merge queue Mar 18, 2025
Merged via the queue into main with commit 5c4302a Mar 18, 2025
16 checks passed
@xmkg xmkg deleted the bugfix/libssh-stuck-connection branch March 18, 2025 15:40
@ricab ricab added this to the 1.16.0 milestone Jun 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants