Use legally-correct copyright notices. #1661
The existing copyright support uses the (legally-unrecognized) (C), and does not support year lists due to the literal-matching nature of copyrightText.

This change fixes the above problems by adding a new {copyright} variable for use with copyrightText. The default value for copyrightText is also updated to use {copyright} instead of the hard-coded "Copyright (C)" text.

{copyright} provides the following features:

- When suggesting a fix, automatically expands to "Copyright © " followed by the current year.
- When checking copyright headers, {copyright} matches "Copyright © " followed by a comma-separated list of four-digit years, and/or year-ranges (two four-digit years separated with a hyphen).

This change adds new unit tests to verify this new functionality, and updates the existing tests to work correctly with the new behaviors.
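The two behaviors described above can be sketched roughly as follows. This is in Python for brevity, not the analyzer's C#, and the names `COPYRIGHT_PATTERN`, `expand_copyright`, and `matches_copyright` are hypothetical rather than taken from the pull request. The regular expression is only one plausible reading of "a comma-separated list of four-digit years and/or year-ranges".

```python
import re
from datetime import date

# Hypothetical pattern: "Copyright © " followed by one or more items
# separated by commas, where each item is a four-digit year or a
# hyphenated range of two four-digit years.
_YEAR_ITEM = r"\d{4}(?:-\d{4})?"
COPYRIGHT_PATTERN = re.compile(rf"Copyright © {_YEAR_ITEM}(?:, ?{_YEAR_ITEM})*")


def expand_copyright(today=None):
    """Text a code fix would insert: the phrase plus the current year."""
    today = today or date.today()
    return f"Copyright © {today.year}"


def matches_copyright(text):
    """True if the whole text is a notice in the accepted form."""
    return COPYRIGHT_PATTERN.fullmatch(text) is not None
```

Under these assumptions, `matches_copyright("Copyright © 2012, 2014-2015")` is accepted, while the old `"Copyright (C) 2015"` form is not.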
I would prefer the text |
📝 Note that the default header's use of |
I've marked this as needs discussion. The primary new feature added by this pull request is support for year lists. This is related to #1357 (#1357 appears to be a subset of this proposal). The main areas of concern I have with this proposal are the following:
The OSS license used for this repository of analyzers does permit users to create a new analyzer which specifically handles copyright headers in a custom format and use that analyzer within their project. As long as the diagnostic IDs for file header diagnostics are changed, the custom analyzer and StyleCop.Analyzers could be used together in the same repository. |
Yes, Copyright is valid, but (C) is meaningless; perpetuating its use is just silly, so fix it or drop it. Omission of a year is the primary flaw here, regardless. The implementation being tied to the placement of Copyright was intentional for multiple reasons; I'll elaborate further once I have a real keyboard. http://www.copyright.gov/circs/circ01.pdf Page 4, "Form of Notice for Visually Perceptible Copies" specifies the year as one of three critical points. I'm perfectly happy to not handle multi-years, but ONE year is at minimum required. Travis |
The current default text is the closest we can get to © while only using characters which have the same binary representation in Windows-1252 and UTF-8. If any change is made to the default, I expect it to be the result of ongoing discussions in the .NET Foundation. Note that you can currently customize the
We use this at my office by creating a variable which holds a fixed year (currently 2015). It also uses a © symbol instead of

```json
{
  "$schema": "https://raw.githubusercontent.com/DotNetAnalyzers/StyleCopAnalyzers/master/StyleCop.Analyzers/StyleCop.Analyzers/Settings/stylecop.schema.json",
  "settings": {
    "documentationRules": {
      "companyName": "Our Company",
      "copyrightText": "Copyright © {year} {companyName}. All Rights Reserved.",
      "xmlHeader": false,
      "variables": {
        "year": "2015"
      }
    }
  }
}
```

I think you can make a great case for expanding on this. A special variable (name TBD) which could be used in the
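To illustrate how the configuration above yields a concrete header line, here is a minimal Python sketch of placeholder substitution. The function name `expand_copyright_text` and the handling of unknown placeholders are assumptions for illustration; the thread does not show how the analyzer itself resolves variables.

```python
import re


def expand_copyright_text(documentation_rules):
    """Replace each {name} in copyrightText with its configured value."""
    # Assumption: companyName acts as a built-in variable, and entries
    # under "variables" are consulted as well.
    values = {"companyName": documentation_rules.get("companyName", "")}
    values.update(documentation_rules.get("variables", {}))

    def substitute(match):
        name = match.group(1)
        # Unknown placeholders are left untouched in this sketch.
        return values.get(name, match.group(0))

    return re.sub(r"\{(\w+)\}", substitute, documentation_rules["copyrightText"])


rules = {
    "companyName": "Our Company",
    "copyrightText": "Copyright © {year} {companyName}. All Rights Reserved.",
    "variables": {"year": "2015"},
}
print(expand_copyright_text(rules))
# Copyright © 2015 Our Company. All Rights Reserved.
```

Because the year is a fixed string, every file is expected to carry exactly that year.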
|
Still not at a kb, so bear with me -- The placeholder needs to match a valid Copyright symbol as the prev. token. Thus {copyright} => ((Copyright)|©) [12]\d{3} or similar. Trying to date-match in isolation is an ultimately useless activity, as we can't tell a date from any other sort of 4-digit # -- it's the direct proximity to the magic Copyright word that gives this sequence of digits importance, and the Copyright word in turn needs those digits. I considered trying to do full regex support as the ultimate generic way to solve this, but that plan fell over when suggested fixes had to be considered. Re. encoding, that shouldn't be a concern: either tag the necessary files with a UTF-8 "BOM", or set up the HTTP server to serve the appropriate file encoding in the headers, and recode as appropriate to the output file's encoding. That said, AFAIK, all .NET languages need to support Unicode source files anyway*. If it's a Super Big Deal, then just drop the (C) altogether. Being squeamish about Unicode vs. legacy text seems oddly quaint, though. Very 90s. :) Cheers, Travis
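The proximity argument can be seen in a small Python sketch. The `NOTICE` pattern is the one written in the comment above; the `YEAR_ONLY` pattern and the sample header text are made up for comparison.

```python
import re

# The pattern from the comment: the word or the symbol, a space, then a
# four-digit number starting with 1 or 2.
NOTICE = re.compile(r"((Copyright)|©) [12]\d{3}")

# A year-only pattern, for comparison (hypothetical).
YEAR_ONLY = re.compile(r"[12]\d{3}")

# Made-up header containing a four-digit number that is not a year.
header = "Build 2048 of the parser. Copyright 2015 Our Company."

print(YEAR_ONLY.findall(header))
# ['2048', '2015']
print([m.group(0) for m in NOTICE.finditer(header)])
# ['Copyright 2015']
```

The year-only pattern cannot tell the build number from the copyright year; anchoring on the preceding token can.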
|
I think that the original test cases should not have been modified. Those were valid test cases and should not be modified for an optional feature. |
❓ What is the point of supporting a comma-separated list of years or year ranges? |
50% sure I only touched tests that failed after my change (a shocking number did), but it's entirely possible some got dragged along for the ride. The refactoring was a rather boring autopilot haze. Regardless, none of the tests' underlying flavor should have changed. If you have specific cases you'd like to try reverting, feel free to point them out. Travis |
This was caused by changing the default value of the |
In addition, Circular 3 gives more information about the proper form of copyright notices. http://www.copyright.gov/circs/circ03.pdf The (c) has no meaning. There was a court case where the C was enclosed in an octagon instead of a circle and it was ruled invalid. The year should be present. In the United States since 1 March 1989, not having a copyright notice does not mean that the work is uncopyrighted; however, it does cause problems. (Not registering the copyright with the Library of Congress limits the damages that can be collected.) |
Now that I'm at a keyboard... wall of text engage!
[...]
From what I can tell, the year should come immediately after either "Copyright", "Copr.", or "©". Thus, a generic year-only match is not a semantically equivalent substitute (although it would technically work, essentially you could put {year} tags anywhere, and it would not guarantee the enforcement of a full, legally recognized copyright statement). A full regex/pattern system would be the logical extension as a fully-flexible and data-driven solution, but that would require a lot of plumbing, and lead to interesting requirements in terms of interacting with the suggested fixes code. It's a lot of work just to support a single templating scenario. What I could do is put in another configuration knob that lets the customer choose an appropriate value (e.g., copyrightPhrase set by an enum like Full, Abbreviated, Symbol), and have that do the special expansion that checks for a word. Sadly, this can't be an implicit rule that requires a four-digit year-like thing after any instance of "Copyright", "Copr.", or "©", because those words/symbols could potentially show up in the copyright header and not actually be the intended statement of copyright. For example: "This program is protected by U.S. Copyright law" would trigger improperly. The user has to be able to convey the explicit semantic intent of "this is a copyright notice," and that means, as far as I can tell, the whole notice needs to be handled as a single, explicitly-stated token.
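A rough Python sketch of the difference between the implicit rule and an explicitly configured phrase follows. The `CopyrightPhrase` enum mirrors the Full, Abbreviated, Symbol idea from the comment, but its values, the two function names, and the sample header are all hypothetical.

```python
import re
from enum import Enum


class CopyrightPhrase(Enum):
    # Hypothetical values for the proposed copyrightPhrase setting.
    FULL = "Copyright"
    ABBREVIATED = "Copr."
    SYMBOL = "©"


def implicit_rule_violations(header):
    """Implicit rule: every occurrence of any phrase must precede a year."""
    phrases = "|".join(re.escape(p.value) for p in CopyrightPhrase)
    bad = re.compile(rf"(?:{phrases})(?! [12]\d{{3}})")
    return [m.group(0) for m in bad.finditer(header)]


def explicit_notice_ok(header, phrase):
    """Explicit rule: the configured phrase appears followed by a year."""
    pattern = rf"{re.escape(phrase.value)} [12]\d{{3}}"
    return re.search(pattern, header) is not None


header = "Copyright 2015 Our Company. This program is protected by U.S. Copyright law."
print(implicit_rule_violations(header))
# ['Copyright']
print(explicit_notice_ok(header, CopyrightPhrase.FULL))
# True
```

The implicit rule flags the second "Copyright" (in "Copyright law") even though the notice itself is well formed, which is the false trigger described above.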
[...]
[...]
[...]
Guidance given by the copyright office specifies a single year after the "Copyright", "Copr.", or "©". This represents, at least in the US, the "best practice," and that is exactly what StyleCop is supposed to be helping us conform to. By the same reasoning, I intend to remove the multi-year support from this change.
[...]
[...]
Best practice would seem to be to not have (C) -- everyone seems to acknowledge it's not a legally recognized sequence. It's cruft. Kill the cruft. :)
[...]
[...]
Omitting copyright years may seem like a trivial choice, but as Greg points out, there are real legal ramifications of making this choice. Upon learning this, I would consider it to be not only incorrect, but professionally unethical to continue with a default that does not enforce a year-labeled copyright notice.
This was the first option I reached for when I tried to get my codebase going... sadly, it leads to StyleCop errors in the source code unless all the files use the current year. Since it is improper and incorrect to change all the files to the latest year (especially if the code in the files hasn't changed), this is sadly a non-solution to the bigger problem. Again, we need an unambiguous semantic declaration of "this is a copyright notice here," and the engine needs to be smart enough to treat it properly. |
I'm going to go a bit out of order on the responses.
The suggested guidance is the recommended practice to follow in order to maximize the ability to obtain statutory damages following an infringement. In my experience (and IANAL), source code files tend to derive only marginal benefits from this practice because they fall into one of two categories:
Rather than focus on a hypothetical outcome of obtaining statutory damages, the default behavior of the copyright headers in this project focuses on behavior which can be easily adopted and is appropriate at minimum for a majority of open source projects. The default header is essentially as easy to maintain as not having a header at all (maximum developer efficiency), but does provide a notice regarding the copyright holder for the work. Individual projects may expand on this by customizing the copyright text; as you can see in this project we added brief information regarding the license the source code is provided under and the location where the complete license text may be obtained.
Copyright laws vary by country. By using simple replacement variables and customizable copyright text, we maximize the ability of the code to meet the needs of a varied audience. If a generic year could be written as
IMO, the long-term benefits to the developer community as a whole outweigh the marginal legal benefits provided by the more complete notice. This sentiment appears to be backed by de-facto use of notices without years in projects recently published by major organizations (including Microsoft), and is now being considered as a more official recommendation coming from the .NET Foundation itself. If you can get the .NET Foundation to change its recommendation such that it includes a date in the copyright notice, then it would be much more compelling evidence that the StyleCop Analyzers default is not suitable for most .NET projects. |
Two-and-a-half things first...

First: I'm disappearing until next week (this was supposed to be a small one-night change to take care of a snag in getting a project spun up). Don't be alarmed.

Half: The way the current patch works relies on "Copyright ©" being rare/unique enough in actual header text that it can be back-converted into a token. If that is straight-out unacceptable, reject this pull request.

Second: I can do a year-only solution. However, it will be quite difficult, and pretty much be 100% based on regexes. It will be much slower than the line-by-line ordinal comparison presently used. Before I embark, I want some level of guarantee that if I code that up, get it to work, document it, and write passing tests for it, that I'm not going to get a bunch of philosophical, ideological, or theoretical-performance pushback. :)

Onward to ideological debate...
Name one benefit of not having a year. None of the benefits listed on the link provided are valid for years. Point by point:
The only explicit reference to removing the copyright year states it's done solely "to avoid unnecessary churn in the code base." That argument is completely invalid if single-year notices are used. When the file is created, the copyright notice should be in it: there should be exactly 0 churn.
Now, as in right now, as in two days ago, the same day I submitted this patch. It's not a standard, it's only just been proposed, and it has made exactly zero good points for actually removing the year. Trying to dodge to "de facto" over an actual codified legal standard also doesn't seem like a particularly convincing point. Refusing to look at the consequences of the choice because they are hypothetical also doesn't strengthen this position: if the hypothetical consequences are unimportant, we shouldn't even put a copyright notice in. Copyright is automatic in the US, and even then, copyrights are only important in the hypothetical situation that somebody tries to violate our license. :)

Microsoft, as a large corporation, has its own agenda. It isn't going to be interested in chasing down infringement of its sample code or other such items it releases to the wild. Microsoft is also a predominantly closed-source operation, and it is less likely (though still possible, with so many people internally having access to so much code) that single file(s) will be stolen and then used. If infringement happens, and they care, they're only going to care because huge profits were reaped, or Microsoft's huge profits were eaten into -- statutory damages are absolute chump change to begin with for Microsoft, and they would never opt to take them. This policy makes perfect sense for a company like Microsoft.

For open-source work, where actual damages to the copyright holder may be zero (and where the profits of the infringer may be small by comparison to the whole product the infringement took place in), statutory damages may be the only actual value a "little guy" can get. I refer you to BusyBox, who, out of half a dozen settlements, actually won the one case that went to court... with statutory damages only. This code was embedded into hundreds, if not thousands of real devices that went out, but BusyBox still went for statutory rather than actual damages+profits... hardware is, after all, a low-margin industry. |
Based on information in Circular 3, we see that the impact of including inexact dates is the following:
During certain operations such as refactoring which can create new files or move content from one existing file to another, it may be difficult to determine if the content of the new file is original content which was not previously created, or simply relocation of content which was already created (and potentially published). Failure to be exactly correct when choosing a date for the header could result in the invalidation of the notice. The safest way to ensure indisputable protection for all content in the project is to use a copyright notice on all source files which is dated in the year that work on the project was started. For example, this repository would contain a notice dated 2014 on all files, regardless of the time when they were added. The protection would then extend to the year 2109. Since this strategy uses fixed dates, it is already supported by StyleCop Analyzers. Considering that C# 1.0 was released in the year 2000, we can assume that projects using StyleCop Analyzers have an earliest date of publication for C# code no earlier than the year 2000. As of today, the discussions in this issue therefore only impact the ability to seek statutory damages for copyright infringements which occur between 2095 (at the earliest) and 2110. I feel like this buys us a little bit of time to really think about a solution which protects developers while minimizing the amount of work required to maintain accurate copyright notices.
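The dates in this comment follow from simple arithmetic, sketched here in Python. The 95-year term is an assumption inferred from the comment's own example (a 2014 notice extending to 2109), and `last_protected_year` is a hypothetical helper name.

```python
# Assumption: a 95-year term counted from the year in the notice, which is
# the figure implied by the comment's own example (2014 -> 2109).
TERM_YEARS = 95


def last_protected_year(notice_year):
    """Final year covered by a notice dated notice_year."""
    return notice_year + TERM_YEARS


print(last_protected_year(2014))  # 2109: this repository's example
print(last_protected_year(2000))  # 2095: earliest C# code, per the comment
print(last_protected_year(2015))  # 2110: a notice dated in the year of this discussion
```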
The best you can get on this front is filing a proposal describing the specific behavior you want to support and the manner in which you would expose it (e.g. changes to semantics of stylecop.json, etc.), and then waiting for it to be approved before starting the implementation. This covers the "...philosophical, ideological..." concerns. In addition, there are several active participants in the project who expressed a great deal of interest in finding ways to incorporate approved functionality in a manner that does not negatively impact performance, including but definitely not limited to myself, @pdelvo, and @vweijsters. That said, if things don't work out here then remember two things:
As a final note, please consider writing up some of your concerns in the .NET Foundation thread on this topic. So far the conversation has been quite one-sided. |
Thanks for putting in a reference back to this thread on the .NET Foundation discussion -- I don't have an account over there, and really I'd rather be coding than adding more accounts to my keyring. :) Would it be preferred to file a fresh issue with the "proposal" tag, or pile on to #1357? |
Either way works. 😄 |