Skip to content

Conversation

jpitc
Copy link
Contributor

@jpitc jpitc commented Apr 7, 2017

closes #1659.

First PR here. I've read and agree to C4.1.

This change requires (some) version change, because code that depends on this fix must have a way to check for it. Not sure how new minor/patch versions are handled in the project.

Follow up to #1649 (as it turns out), extending it to reset all global state, not just dangling pointers.

@jpitc jpitc changed the title Problem: zsock_* + fork() causes undefined behaviour Problem: zsys_init() + fork() causes undefined behaviour Apr 7, 2017
Solution: zsys_shutdown() calls (new function) zsys_reset()
as part of its cleanup, which resets all global state to
their initial values. Which allows subsequent calls to
zsys_init() to work as expected.
@bluca bluca merged commit 715828d into zeromq:master Apr 7, 2017
@bluca
Copy link
Member

bluca commented Apr 7, 2017

Thanks.

I understand wanting to reset the global state, but there are 2 issues:

  1. uncoditionally clearing the context variable means leaking when the shutdown was not done properly due to errors in the application code - it's better to leave it as-is and cause an assert, so that it can be looked at
  2. this causes a backward incompatible change in the side effects of a public API, so we can't just release it, it will have to be behind the DRAFT flag and released only the next time there's an API breakage
    alternatively, a new public API can be added, eg: zsys_shutdown_reset - if you need that, feel free to send a followup PR

@jpitc jpitc deleted the fork_supprt branch April 7, 2017 09:18
@jpitc
Copy link
Contributor Author

jpitc commented Apr 7, 2017

@bluca Thanks for explaining. It sounds like you didn't mean to merge this, but you did. Slip of the mouse?

@bluca
Copy link
Member

bluca commented Apr 7, 2017

It was intentional - as you have read in the C4, we always merge PRs, apart from a few corner cases where things are really, really, really broken. This is not one of them.
I'll send a followup PR with the improvements I mentioned in a few minutes

@jpitc
Copy link
Contributor Author

jpitc commented Apr 7, 2017

@bluca In that case, there's no reason not to call zsys_reset() in the case of a successful shutdown, right? There's no contract that promises dirty state after a proper shutdown and restart (in fact this entire use case is undefined afaik). And doing that would actually be fine indefinitely afaic.

@bluca
Copy link
Member

bluca commented Apr 7, 2017

Release engineering of a public library is a delicate problem - even if there's no explicit promise, de-facto the current behaviour of these public APIs is that a shutdown follow by a re-init does not discard the previous settings of those various global context variables.
Whether we like it or not, this is how this API currently works, and there might be users relying on this, intentionally or not. For example, one might be enabling IPv6 at the beginning of an application, and then fork. Changing it would remove IPv6 from the child process.

This means this change can't go live (without the DRAFT flag) until we are ready to do an API-breaking release, which is very rare and should be at most once every couple of years.
If you'd like this change to go live sooner, as I mentioned you can add a new API call, for example called zsys_shutdown_reset, which combines both. This removes the risk of affecting existing users.

As a famous developer once said, if people are using it, it's not a bug, it's a feature.

@jpitc
Copy link
Contributor Author

jpitc commented Apr 7, 2017

@bluca, I've read that carefully (thanks for explaining carefully). I have every respect for back-compat, and in other projects have acted the curmudgeon against other developers on that point. But I think this vigilance is simply misplaced here. To reiterate, the behavior this changes has never been exposed in an official release and is not documented (either way) anywhere. So changing it cannot qualify as a breaking change by any reasonable standard.

Even the mere question of what the behavior is was a moot point until two weeks ago when #1649 was merged, because anyone who tried to exercise it was stopped by an assert before they got to. I know this because I pulled czmq 3 weeks ago, after 4.0.2, and was stopped in this way the moment I tried to invoke a czmq function in a new process after zsys_shutdown().

Seeing as it's standard C4 practice to merge a PR and then immediately change its behavior, it's clear that "it's in git between versions" isn't a de-facto contract to users. So how can changing another behavior that has only been public "in git between versions" be considered an API breaking change?
Before the upcoming 4.0.3, zsys restart wasn't even supported by czmq because it was actually made impossible by the code. How could anyone depend on its "behavior"?

Please reconsider whether you're actually protecting users (who ship stable products against an unreleased git version of czmq) or throwing a way an opportunity to correct an oversight, without hurtin anyone, that is about to become written in stone for a year or two (and then cause porting issues).

If you still refuse to reconsider, I'll drop the issue....

... And consider you wrong wrong wrong till my dying day 😉

@bluca
Copy link
Member

bluca commented Apr 7, 2017

Yes that's true, I hadn't considered that it was broken before, only that it was available

@jpitc
Copy link
Contributor Author

jpitc commented Apr 7, 2017

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problem: zsys_init() + fork() causes unxpected behavior
2 participants