Currently the cluster command requires you to specify some sort of data set to cluster on, even when a pairwise distance matrix is already available and the data set is not strictly necessary. The issue was raised by user Tim Davis on the Amber mailing list. Here's what I posted in response:
Hi,
You can load a pairwise distance file either via a 'readdata' command
prior to cluster or by using the 'loadpairdist' option, e.g.
readdata Cmatrix.cmatrix name PW
runanalysis cluster crd1 pairdist PW ...
or
cluster C1 loadpairdist pairdist pw.out ...
If you don't want to do any clustering you can use the 'readinfo'
option to read the results of previous clustering.
I guess it can be a bit annoying to have to specify something to
cluster when all you want is a summary. To be honest, I never really
considered that case. If you want, you can "fool" cpptraj by creating a
"fake" data set to cluster that has the same size as the data you
want, then read that in and "cluster" on that after reading in your
pairwise distance matrix. Here's an example where I've modified one of
cpptraj's cluster test cases to do just that.
Original:
# Test loading PW distances from Cmatrix file
cat > cluster.in <<EOF
readdata Cmatrix.cmatrix name PW
parm ../tz2.parm7
loadtraj ../tz2.nc name MyTraj
runanalysis cluster crd1 crdset MyTraj :2-10 clusters 3 epsilon 4.0 \
summary summary2.dat \
complete nofit pairdist PW \
cpopvtime normpop.agr normpop
EOF
cpptraj -i cluster.in
The trajectory tz2.nc has 101 frames, so I create a fake data set to
cluster with 101 entries via something like:
#!/bin/bash
rm -f fakedata.dat
for ((i=1; i<= 101; i++)) ; do
echo "$i" >> fakedata.dat
done
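A shorter way to build the same file, as a sketch: this assumes the
'seq' utility is available (it is on most Linux systems), and the
variable name 'nframes' is just something I made up for illustration.
Set it to the number of frames your pairwise distance matrix was
computed from.

```shell
#!/bin/bash
# Number of frames the pairwise matrix covers (101 for tz2.nc here).
# 'nframes' is a hypothetical variable name, not anything cpptraj knows about.
nframes=101
# Write the integers 1..nframes, one per line. Using '>' truncates any
# existing fakedata.dat, so no separate 'rm' is needed.
seq 1 "$nframes" > fakedata.dat
# Sanity check: the line count should equal the frame count.
wc -l < fakedata.dat
```

The values themselves don't matter since the distances come from the
pairwise matrix; only the number of entries has to match.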
Then you can use the following modified input: note the replacement of
'crdset' with 'nocoords' and 'data':
# Test loading PW distances from Cmatrix file
cat > cluster.in <<EOF
readdata Cmatrix.cmatrix name PW
parm ../tz2.parm7
readdata fakedata.dat name MyData
runanalysis cluster crd1 data MyData nocoords :2-10 clusters 3 epsilon 4.0 \
summary summary2.dat \
complete nofit pairdist PW \
cpopvtime normpop.agr normpop
EOF
cpptraj -i cluster.in
That seems to work just fine. I'll add a feature request to cpptraj
GitHub to make clustering on existing pairwise distance matrices
easier. Thanks for the report!