Skip to content

Conversation

drroe
Copy link
Contributor

@drroe drroe commented Jan 15, 2021

This merge request is a large rewrite of the way CPPTRAJ handles unit cells, and is the first major version increase in a long time (to version 5.0.0). It started as a way of addressing #860; it does that and much more. Previously, CPPTRAJ was assuming that all unit cell vectors read from trajectory files were X-aligned; i.e., the cell A vector is aligned with the Cartesian X vector, and the B vector is in the XY plane. While this is true for many MD software packages, it is by no means universally true, and part of the problem in #860 is that the symmetric shape matrix was in fact not X-aligned, causing imaging to fail.

This merge request is primarily a rewrite of the Box class to hold in addition to the unit cell 3x lengths and 3x angles the unit cell vectors, the fractional matrix (for converting to fractional coordinates), and the cell volume. All of these are now calculated whenever the Box is assigned to (via a series of new routines depending on the how the cell is originally defined). In the course of this rewrite, I managed to find and fix several bugs, as well as improve the performance of several actions. In general, performance has not suffered too much. Most actions show a small (<2%) or no performance hit, and some actions have improved speed between about 10% and 40% (detailed below).

The biggest effect of this change is that imaging now occurs on a per-frame basis (so things like unit cells that change shape from frame to frame are now properly handled). It also means that coordinate rotation that would previously make imaging impossible (from e.g. rms, principal, etc) is now handled properly. In other words, rms-fitting before imaging will no longer fail! However, being able to write this out is trajectory-format-dependent since most assume an X-aligned cell. Currently the only formats that support writing "rotated" unit cell vectors are Gromacs XTC and TRR (since they store the unit cell vectors themselves). Also, currently the helPME library which handles the reciprocal part of the PME calculation doesn't yet support directly setting unit cell vectors, so PME doesn't support rotated unit cells at this time. @andysim is working on that.

Another major change (done initially to support rotated unit cells) is that the COORDS data set has been rewritten so that no data is lost from the Frame class (aside from precision); i.e. values like time, replica indices, etc are no longer lost when converting to COORDS sets.

Bug fixes:

  • gist: Fix 'order' calculation for non-orthogonal cells when the energy calculation is run. This issue only affected the CPU code, not the GPU code.
  • lie: Fix imaging in the lie action; previously no imaging was done even if unit cell info was present.
  • pairdist: Fix histogram bins when the maxdist value is too small; previously the averaging was wrong. Results are now independent of maxdist and number of processes in parallel.

General behavior changes:

  • Output now better differentiates between different unit cell types (there are now 9 cell types recognized by cpptraj).
  • Unit cell rotation no longer a problem. can be written to formats that support it.
  • Imaging done per frame, allows for unit cell to change shape.
  • COORDS sets now store all frame info.
  • Trajectory write errors are now properly trapped (previously some formats would e.g. keep writing to a full disk).
  • NetCDF trajectory: disable empty variables (e.g. if the time variable is present but all values are zero, skip).
  • Charmm Trajectory: Unit cell is now properly stored if X-aligned but not symmetric (can happen e.g. with Amber truncated octahderon cells).

Speed-ups:

  • About 1.1-1.2x: hbond (both imaging and no imaging), nativecontacts, radial (solvent-solute)
  • About 1.3x: radial (solvent-solvent), check (non-pairlist)
  • About 1.6x: pairdist

@hainm This will require changes to pytraj. I'm trying to work on them here (https://github.com/drroe/pytraj/tree/fixshapematrix) but you might be able to do it faster. A lot of the changes in this MR are important fixes, so I'd like to merge this soon.

@drroe
Copy link
Contributor Author

drroe commented Jan 18, 2021

@hainm I think I see. The main difference is in how Frame is allocated. If using AllocateFrame() things work as anticipated (i.e. there is box info after GetFrame()). However if allocating "from scratch" (via new) there ends up being no box info. Does that sum up the issue?

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

Does that sum up the issue?

@drroe Yeah.

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

What are you loading as the DataSet_Coords_TRJ? What format is the file? Is it PDB by any chance? Does the same behavior happen with a different format?

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

What are you loading as the DataSet_Coords_TRJ? What format is the file? Is it PDB by any chance?

@drroe You're right again.

It's in pytraj's regression test:
https://github.com/Amber-MD/pytraj/blob/master/tests/data/small_pbc.pdb

https://github.com/Amber-MD/pytraj/pull/878/files
Amber-MD/pytraj#876

Does the same behavior happen with a different format?

I don't think so (there are only several failures in pytraj after I update the code).

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

ha, that pdb file is the trouble.

In [4]: pt.load('./small_pbc.pdb').unitcells
Out[4]: array([[0., 0., 0., 0., 0., 0.]])

In [5]: pt.load('./tz2.ortho.nc', 'tz2.ortho.parm7').unitcells[0]
Out[5]: 
array([35.26277966, 41.84554768, 36.16862953, 90.        , 90.        ,
       90.        ])

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

OK, I think the issue might be an unnecessary HasBox() check here: https://github.com/drroe/cpptraj/blob/9f69f4e790d5914aff79257a348bb48e2a07b1d2/src/Traj_PDBfile.cpp#L204

It didn't really do anything before and was innocuous, but the recent improvements to box triggered it. Testing now.

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

OK - try now @hainm

@AmberJenkins
Copy link
Collaborator

The PGI build in Jenkins failed.

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

OK - try now @hainm

it's ok now. thanks.

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

@drroe another question for that pdb file. The origonal box type is ortho but it is changed to cubic type. Is that correct?

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

@lgtm-com
Copy link

lgtm-com bot commented Jan 19, 2021

This pull request fixes 2 alerts when merging aa80e5b into 1faa9b2 - view on LGTM.com

fixed alerts:

  • 2 for FIXME comment

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

another question for that pdb file. The origonal box type is ortho but it is changed to cubic type. Is that correct?

As long as the cell has all 90 degree angles and all lengths equal this is correct. There are two fundamental changes at work. One is that the unit cell "type" is no longer set once and then fixed. No assumption is made about the cell when parameters are assigned; instead, the "shape" is determined as needed. This allows the box angles to be free parameters (I.e. cpptraj allows the box to change shape during processing as well as orientation). The second change is that more box shapes are recognized (9 total). The old ORTHO (all angles 90 degrees) is covered by the new cubic, tetragonal, and orthorhombic shapes depending on lengths. Does that make sense?

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

Does that make sense?

yeah, thanks.

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

@drroe Please try this patch (to master branch; git apply pytraj.patch.4.txt). I've tested in my mac.
I will do the cleanup later.
Cheers.

pytraj.patch.4.txt

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

@hainm OK, I will try it now. Thanks Hai!

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

@hainm patch applied, pytraj installed (python 3.7.9 via anaconda), tests ran (pytest -vs --ignore=test_parallel_pmap). Results:

========================== 411 passed, 38 skipped, 52 warnings in 31.14s ===========================

I will list the warnings below. If the results look good to you we can proceed. I'll have to temporarily disable the "required" status of the Jenkins build in order to merge this. @swails what do you think about making a separate merge gate for pytraj to simplify cases like this, where pytraj is blocking a merge request, but needs to be updated before it is compatible with that merge request?

Warnings:

test_io/test_io.py:89
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:89: DeprecationWarning: invalid escape sequence \ 
    fname = "abc\ xyz.nc"

tests/test_progress_bar.py::test_progress_log
tests/test_progress_bar.py::test_pickle_progress_bar
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/pytraj-2.0.5.dev0-py3.7-linux-x86_64.egg/pytraj/utils/progress.py:148: DeprecationWarning: deprecated, use crdinfo
    setattr(self, att, getattr(traj, att))

tests/test_read_to_array.py::Test::test_0
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/pytraj-2.0.5.dev0-py3.7-linux-x86_64.egg/pytraj/utils/tools.py:654: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
    arr0 = np.array([[x for x in line.split()] for line in fh.readlines()])

tests/test_analysis/test_modes.py::test_analyze_modes
  /home/droe/GitHub/pytraj/tests/test_analysis/test_modes.py:26: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
    cpp_dict = state.data[1:].to_dict()

tests/test_analysis/test_order_parameters.py: 10 warnings
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/pytraj-2.0.5.dev0-py3.7-linux-x86_64.egg/pytraj/parallel/dataset.py:47: DeprecationWarning: Calling np.sum(generator) is deprecated, and in the future will give a different result. Use np.sum(np.fromiter(generator)) or the python sum builtin instead.
    for val in self.data)) / self.traj.n_frames

tests/test_analysis/test_order_parameters.py::TestNHOrderParamters::test_nh_paramters
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/pytraj-2.0.5.dev0-py3.7-linux-x86_64.egg/pytraj/parallel/dataset.py:48: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
    vecs = np.column_stack(val[0][0] for val in self.data)

tests/test_analysis/test_rmsd.py::TestSimpleRMSD::test_list_of_masks
tests/test_io/test_convert.py::Test::test_0
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
    return array(a, dtype, copy=False, order=order)

tests/test_io/test_io.py::test_write_velocity_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:447: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_velocity']

tests/test_io/test_io.py::test_write_velocity_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:448: DeprecationWarning: deprecated, use crdinfo
    assert not traj2.metadata['has_force']

tests/test_io/test_io.py::test_write_force_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:472: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_force']

tests/test_io/test_io.py::test_write_force_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:473: DeprecationWarning: deprecated, use crdinfo
    assert not traj2.metadata['has_velocity']

tests/test_io/test_io.py::test_write_both_force_and_velocity_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:501: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_force']

tests/test_io/test_io.py::test_write_both_force_and_velocity_from_scratch
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:502: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_velocity']

tests/test_io/test_io.py::test_write_force_and_velocity
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:509: DeprecationWarning: deprecated, use crdinfo
    assert traj.metadata['has_force']

tests/test_io/test_io.py::test_write_force_and_velocity
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:510: DeprecationWarning: deprecated, use crdinfo
    assert traj.metadata['has_velocity']

tests/test_io/test_io.py::test_write_force_and_velocity
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:514: DeprecationWarning: deprecated, use crdinfo
    traj.save(fn2, overwrite=True, crdinfo=traj.metadata)

tests/test_io/test_io.py::test_write_force_and_velocity
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:517: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_force']

tests/test_io/test_io.py::test_write_force_and_velocity
  /home/droe/GitHub/pytraj/tests/test_io/test_io.py:518: DeprecationWarning: deprecated, use crdinfo
    assert traj2.metadata['has_velocity']

tests/test_math/test_matrix3x3.py: 19 warnings
tests/test_trajectory/test_trajectory.py: 2 warnings
  /home/droe/Programs/anaconda/3/lib/python3.7/site-packages/numpy/matrixlib/defmatrix.py:69: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
    return matrix(data, dtype=dtype, copy=False)

tests/test_trajectory/test_fancy_indexing.py::TestSlicingTrajectory::test_array_like
  /home/droe/GitHub/pytraj/tests/test_trajectory/test_fancy_indexing.py:49: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
    new_xyz = traj[bool_arr].xyz

tests/test_trajectory/test_fancy_indexing.py::TestSlicingTrajectory::test_array_like
  /home/droe/GitHub/pytraj/tests/test_trajectory/test_fancy_indexing.py:50: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
    aa_eq(traj[bool_arr].xyz, traj_mem[bool_arr].xyz)

-- Docs: https://docs.pytest.org/en/stable/warnings.html

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

If the results look good to you we can proceed.

yeah, looks good.

@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

OK - I will for the meantime make Travis the required test so this can get merged. Can change back after.

@AmberJenkins
Copy link
Collaborator

The PGI build in Jenkins failed.

@lgtm-com
Copy link

lgtm-com bot commented Jan 19, 2021

This pull request fixes 2 alerts when merging 226713c into 1faa9b2 - view on LGTM.com

fixed alerts:

  • 2 for FIXME comment

@drroe drroe merged commit e9a7720 into Amber-MD:master Jan 19, 2021
@drroe
Copy link
Contributor Author

drroe commented Jan 19, 2021

@hainm It's live! Go ahead and update pytraj and I'll test again.

@hainm
Copy link
Contributor

hainm commented Jan 19, 2021

I will do that tonight. Cheers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants