-
Notifications
You must be signed in to change notification settings - Fork 174
ORCA TDDFT and ROCIS Parsing #398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Increases the amount of the TD header checked. Adds ROCIS test cases (currently not integrated into unit tests).
First of all, thanks for the contribution! Our convention has mostly been to parse the last (best?) results. So in this situation we would overwrite the attribute just as you've done, and add the regressions you need to cover ROCIS cases. But, I would defer to Eric about the decision making here, I think he has more context and knowledge about these types of calculations. @berquist do you think parsing the final result is good here, or do we need to consider parsing multiple spectra for some reason? Concerning the regression you broke - is that also ROCIS, causing it to fail? |
What we do for unrestricted test cases is take the same divinylbenzene geometry from restricted cases and make the cation doublet. This may be an appropriate time to start incrementally making restricted open-shell tests. See more on this later.
See the next paragraph.
I think that for now, we should test against results comparable to what we already have, which is However, I would like to expand the types of properties we can parse, since multiple programs can print multiple forms. See a related old PR #227 that parses electric length, electric velocity, and magnetic transition dipole moments from Gaussian. This is good incentive to figure out how these less general attributes or properties should be stored. This is (to me) just a ROTDDFT calculation with the TDA/CIS approximation and scaling factors for some integral terms. I think this test can be made more general (across parsers) if the molecule is switched from a boron atom to DVB, and the rest of the input is changed to something like
At least DALTON, Q-Chem, I think NWChem, and presumably Gaussian can perform ROCIS/ROTDDFT calculations, and I think we should try and write a common test for all those parsers. Without running calculations it isn't obvious to me how sensitive the results will be to changing the
It's only the old unit tests now in the regression suite, so whatever the code fix is will fix both the unit and regression tests. |
self.etoscs.append(osc) | ||
|
||
# Parse the various absorption spectra for TDDFT and ROCIS | ||
if 'ABSORPTION SPECTRUM' in line or 'ELECTRIC DIPOLE' in line: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would anything fail if this was changed to only if 'ABSORPTION SPECTRUM' in line:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, the quadrupole corrections do not mention ABSORPTION SPECTRUM
. They usually look like: COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM
.
Also, the X-Ray emission spectrum would be missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that makes sense.
src/cclib/parser/orcaparser.py
Outdated
return energy, intensity | ||
|
||
# Check for variations | ||
if (line[16:24] == 'COMBINED' and not 'ROCIS' in line) or \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this isn't the full string, say if line.strip() == 'COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM'
? Same for the others. If not, this is harder to see the logic for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, maybe that is a better way to do it for most of them.
src/cclib/parser/orcaparser.py
Outdated
2 61793079.3 0.2 0.00000 0.00000 2.85949 0.00000 -0.00000 0.00000285948800 0.00000 0.00000 1.00000 0.00000 -0.00000 | ||
""" | ||
try: | ||
state, energy, wavelength, d2, m2, q2, dm, do, intensity, d2_contrib, m2_contrib, q2_contrib, dm_contrib, do_contrib = line.split() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be broken up into multiple lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the most compact yet readable format that I could think of. I'll see what I can do with multiple lines.
Thanks for the input Eric, I totally missed the connection to the gauge thing. |
The DFT/ROCIS I added is basically just ROCIS with DFT orbitals and some fudge factors. The scaling parameters included are the defaults; they have been optimized for B3LYP/def2-tzvp(-f) (Roemelt, M.; Neese, F. (2013) J. Phys. Chem. A, 117, 3069). If we wanted to simplify the test input, we could drop the DFT part and do standard ROCIS, which prints the exact same way. Since DVB is symmetric, the targeted core orbital must be localized, or the orbital window ( I think saving all forms akin to PR #227 would be a great idea. This would make it easier to compare the effects of different approximations and also help protect the user from accidently grabbing the wrong absorption spectrum (that happened to me, which is what prompted the rewrite). As a side note, such a module might be nice for thermochemistry, where there are a lot of small pieces that are spit out that someone may want to look at. |
Ok, sorry for letting this sit for so long.
As long as the printing is exactly the same way, I'd like this, so it gives us a chance of reusing the test for other parsers. I discovered that Q-Chem can't do ROCIS, so this may be less relevant than I wanted.
Can we get the same output from a non-XAS/valence-only calculation? I don't think other programs can do orbital window selection like ORCA, so they'd have to calculate all possible states, which might be expensive.
A problem that I see here is that program-specific field names will be almost unavoidable as we move to less-general properties. Suggestions for what these should be named and how they're structured are welcome. If we're going to be program-specific, the overly verbose but safe solution is to use the section headers ( If any of the above changes break possible added functionality and the test can't be made general, then I'd prefer the added functionality and an ORCA-specific test. For now, if there is a section that's clearly identifiable as the preferable one, we should use that, and the general parsing strategy can be changed later when we make breaking changes.
If you could explain this in more detail (opening another issue for a question/discussion would be good), that would help our discussion about how "custom" or more complex attributes will be made user-facing in the future. |
Adds a dictionary containing all spectra computed with TDDFT, ROCIS, etc. Keys are the titles for the spectra. Values are a tuple (etenergies, etoscs).
I added a transprop dictionary that has the title of each TDDFT, ROCIS, etc. calculation as the key and a tuple of (etenergies, etoscs) as the values. The etenergies and etoscs are then set to the last parsed spectrum. Thoughts? As for testing, I added DVB with ROCIS and a test case subclassed from GenericTDTest. Then I overwrote most of the tests. I feel like keeping it with the TD test cases makes the most sense. |
I think this could be dealt with easily if we expanded our concept of attribute. If used a custom Attribute class we could develop some convention (sub-attributes, lookups, dictionary-like behavior, whatever) that would cover less widespread variants or even program-specific results. |
Now I'm unsure of what to do about this PR, since I don't want to make @jevandezande redo a lot of this work. Should we figure out the new structure before merging this, or take the "best result" and alter the structure to use the new API later? |
@@ -236,7 +236,8 @@ TD Gaussian GaussianTDDFTTest basicGaussian09 dvb_td.out | |||
TD Jaguar JaguarTDDFTTest basicJaguar8.3 dvb_td.out | |||
TD ORCA OrcaTDDFTTest basicORCA2.9 dvb_td.out | |||
TD ORCA OrcaTDDFTTest basicORCA3.0 dvb_td.out | |||
TD QChem OrcaTDDFTTest basicQChem4.2 dvb_td.out | |||
TD ORCA OrcaROCISTest basicORCA4.0 dvb_rocis.out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@langner I am fine with adding a new unit test as long as you are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timeit says reading the file takes 9.9 ms on my laptop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not so much timing as whether or not the test type is general enough, though we are clearly starting to move away from that, and correctness is the most important thing.
Should the discussion of making a custom Attribute class be moved to a separate issue (it could potentially be used for other attributes)? Currently I only use ROCIS sparingly, so I am in no hurry to merge this code in. |
I was worried about that, so this is good to hear. I've made a new issue: #419. |
Well, I'd also like this particular output to be supported, since it's needed. I'm OK with any of the temporary solutions we outlined above, new attribute or a dictionary - please implement what you guys think is most appropriate. But, could you please add a comment that it is likely to change in the future. I don't think it'll happen very soon, but there's a change we'll get to 2.x in 2018. |
@jevandezande Any idea if you'll have the time to work on this some time soon, like in September? |
I'm currently at a conference, but should be able to work on stuff again
early next week.
…On Wed, Sep 6, 2017, 10:33 Karol M. Langner ***@***.***> wrote:
@jevandezande <https://github.com/jevandezande> Any idea if you'll have
the time to work on this some time soon, like in September?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#398 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABzcCM5eA30RifnREK2nGKh_UFVU8vqeks5sfljOgaJpZM4OfHQO>
.
|
I've added a comment about |
I'm happy with things as they are now. @langner? |
I'm OK with this as well. I'll leave the clicking to Eric. |
Thanks for tolerating us, I'm glad we have a contributor that's making us think about more complicated stuff. |
This pull request updates the parsing of TDDFT and adds additional parsing for ROCIS. At the moment it parses all of the spectra, consistently overwriting until the final spectra is parsed. This is obviously less than optimal behavior, and can take a few seconds for larger systems.
Questions
TODO
Status