-
Notifications
You must be signed in to change notification settings - Fork 77
Parse SAS format strings for XPT outputs (#257) #258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* SAS format roundtrip uses $CHAR3 instead of $CHAR3. - the trailing period is dropped when roundtripping XPT files but it represents the same format * Check if label_set has been set before running roundtrip test
I've made a few more updates after seeing the test failures. The XPT reader code has been updated to reflect the changes to the XPT writer. I had to make a couple of changes to the tests for what looked like false positives, and I think they make sense:
|
I've updated the format parsing code to use a Ragel parser and added an xport_format_t struct to store the parser output, and added a few more tests to check it's throwing errors correctly. I've also had a go at the fuzzer setup but a bit less sure of what I'm doing there! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Just found one whitespace issue, otherwise I'm happy with it.
Thanks for taking the time to implement the fuzzer – and of course the word dictionary is a nice touch.
It looks like Appveyor is failing: https://ci.appveyor.com/project/evanmiller/readstat/builds/41638815 You may need to modify the Visual Studio project file to ensure all the new files are being built. See https://github.com/WizardMac/ReadStat/blob/master/VS17/ReadStat.vcxproj |
Awesome, thanks 🙂 I've made the appropriate fixes to the VS17 project file, appveyor is looking happier now. |
Thanks, much better now! |
@evanmiller - any idea when these changes might be able to be merged into a new release of ReadStat? These changes would open up some important doors for haven, which will have a significant impact in some pharmaceutical industry projects. |
This PR adds a simple parser to extract format name, width and decimals from a SAS format string to be used when writing XPT files, closing #257.
It parses the format string in reverse, assigns width and decimals based on the pattern of numbers and periods at the end, and uses the remainder of the string as the format name.
There are also a couple of tweaks to the section that writes the long format name for XPT v8 files: