Technical Reference
Basic help
BitQT help is displayed in the console when typing bitqt -h:
$ bitqt -h
usage: bitqt -traj trajectory [options]

BitQT: A Graph-based Approach to the Quality Threshold Clustering of Molecular
Dynamics

optional arguments:
  -h, --help           show this help message and exit

Trajectory options:
  -traj trajectory     Path to trajectory file [required]
  -top topology        Path to the topology file
  -first first_frame   First frame to analyze (counting from 0) [default: 0]
  -last last_frame     Last frame to analyze (counting from 0) [default: last
                       frame]
  -stride stride       Stride of frames to analyze [default: 1]
  -sel selection       Atom selection (MDTraj syntax) [default: all]

Clustering options:
  -cutoff k            RMSD cutoff [default: 2]
  -min_clust_size m    Minimum size of returned clusters [default: 2]
  -nclust n            Number of clusters to retrieve [default: 2]

Output options:
  -odir bitQT_outputs  Output directory to store analysis [default:
                       bitQT_outputs]
As simple as that ;)
Arguments in Detail
-traj (str):
This is the only argument that is always required. Valid extensions for trajectories are .dcd, .dtr, .hdf5, .xyz, .binpos, .netcdf, .prmtop, .lh5, .pdb, .trr, .xtc, .xml, .arc, .lammpstrj and .hoomdxml.
-top (str):
If the trajectory format (automatically inferred from the file extension) includes topological information, this argument is not required. Otherwise, the user must pass a path to a topology file. Valid topology extensions are .pdb, .pdb.gz, .h5, .lh5, .prmtop, .parm7, .prm7, .psf, .mol2, .hoomdxml, .gro, .arc and .hdf5.
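The extension lists above can be checked before launching a run. The following is a minimal sketch, not BitQT's own validation code: the helper name has_valid_extension is hypothetical, and the case-sensitive match is an assumption.

```python
from pathlib import Path

# Extensions copied from the -traj and -top descriptions above.
TRAJ_EXTS = {".dcd", ".dtr", ".hdf5", ".xyz", ".binpos", ".netcdf", ".prmtop",
             ".lh5", ".pdb", ".trr", ".xtc", ".xml", ".arc", ".lammpstrj",
             ".hoomdxml"}
TOP_EXTS = {".pdb", ".pdb.gz", ".h5", ".lh5", ".prmtop", ".parm7", ".prm7",
            ".psf", ".mol2", ".hoomdxml", ".gro", ".arc", ".hdf5"}


def has_valid_extension(path, valid_exts):
    """Return True if the file name ends with one of the listed extensions.

    Hypothetical helper: matching on the end of the name (rather than on
    Path.suffix) lets double extensions such as .pdb.gz be recognized.
    """
    name = Path(path).name
    return any(name.endswith(ext) for ext in valid_exts)


print(has_valid_extension("md.xtc", TRAJ_EXTS))       # True
print(has_valid_extension("system.pdb.gz", TOP_EXTS))  # True
print(has_valid_extension("system.gro", TRAJ_EXTS))    # False
```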
-first (int, default=0):
The first frame to analyze (counting from 0).
-last (int, default=last):
The last frame to analyze (counting from 0).
By default, the last frame of the trajectory is detected automatically.
-stride (int, default=1):
Stride of frames to analyze. You might want to use this argument to reduce the trajectory size when performing exploratory analysis.
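To see which frames a given combination of -first, -last and -stride would keep, the slicing can be sketched in plain Python. This is an illustration only: the function selected_frames is hypothetical, and treating -last as inclusive is an assumption that the text above does not confirm.

```python
def selected_frames(n_frames, first=0, last=None, stride=1):
    """Return the 0-based frame indices kept by -first, -last and -stride.

    Assumption: `last` is inclusive and defaults to the final frame.
    """
    if last is None:
        last = n_frames - 1
    return list(range(first, last + 1, stride))


# A 10-frame trajectory, analyzing every third frame from frame 2 to frame 8.
print(selected_frames(10, first=2, last=8, stride=3))  # [2, 5, 8]
```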
-sel (str, default='all'):
Atom selection. BitQT inherits MDTraj's very flexible selection syntax. Common cases are listed in the Syntax Selection section.
-cutoff (int, default=2):
RMSD cutoff for similarity measures, given in Angstroms (1 Å = 0.1 nm).
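Because MDTraj reports distances and RMSD values in nanometers, it can help to convert the cutoff when comparing it with values computed directly in MDTraj. A minimal sketch of that conversion follows; the function name cutoff_in_nm is hypothetical, and how BitQT performs the conversion internally is not specified here.

```python
ANGSTROM_TO_NM = 0.1  # 1 Angstrom = 0.1 nm


def cutoff_in_nm(cutoff_angstrom):
    """Convert an RMSD cutoff given in Angstroms to nanometers."""
    return cutoff_angstrom * ANGSTROM_TO_NM


print(cutoff_in_nm(2))  # 0.2
```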
-min_clust_size (int, default=2):
Minimum number of frames inside returned clusters. 0 is not a meaningful value, and 1 implies an unclustered frame (no other frame is similar). Greater values of this parameter may speed up the algorithm at the cost of uniformity in the retrieved clusters.
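The effect of the size threshold on the returned clusters can be pictured with a simple filter. This sketch only illustrates the outcome: the function filter_clusters is hypothetical, and since the text says larger values may speed up the algorithm, BitQT presumably applies the threshold during the search rather than as a final filter.

```python
def filter_clusters(clusters, min_clust_size=2):
    """Keep only clusters holding at least `min_clust_size` frames."""
    return [c for c in clusters if len(c) >= min_clust_size]


# Each inner list holds the frame indices of one cluster.
clusters = [[0, 1, 2], [3], [4, 5]]
print(filter_clusters(clusters, min_clust_size=2))  # [[0, 1, 2], [4, 5]]
```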
-nclust (int, default=all):
The maximum number of clusters to calculate. Change the default for better performance whenever you only need to inspect the first clusters.
-ref (int, default=0):
Reference frame to align trajectory.
-odir (str, default="./bitQT_outputs"):
Output directory to store analysis.
BitQT checks whether the output directory already exists to avoid overwriting it.
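An existence check of this kind can be sketched as follows. This is not BitQT's code: the function prepare_output_dir is hypothetical, and raising an error when the directory exists is an assumption about how the overwrite is avoided.

```python
import os


def prepare_output_dir(odir="bitQT_outputs"):
    """Create `odir`, refusing to reuse a directory that already exists.

    Assumption: an existing directory is treated as an error so that
    previous results are never overwritten.
    """
    if os.path.exists(odir):
        raise FileExistsError(f"{odir} already exists; choose another -odir")
    os.makedirs(odir)
    return odir
```

With this behavior, repeating a run requires either a new -odir value or removing the old directory first.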
Syntax Selection
BitQT inherits its atom selection syntax from MDTraj, which is similar to that of VMD. We reproduce some of the MDTraj examples below. Note that in BitQT, all keywords (or their synonyms) are passed as a string directly to the -sel argument. For more details on the possible syntax, please refer to the original MDTraj documentation.
BitQT recognizes the following keywords.
Keyword | Synonyms | Type | Description
all | everything | bool | Matches everything
none | nothing | bool | Matches nothing
backbone | is_backbone | bool | Whether atom is in the backbone of a protein residue
sidechain | is_sidechain | bool | Whether atom is in the sidechain of a protein residue
protein | is_protein | bool | Whether atom is part of a protein residue
water | is_water, waters | bool | Whether atom is part of a water residue
name | | str | Atom name
index | | int | Atom index (0-based)
type | element, symbol | str | 1 or 2-letter chemical symbols from the periodic table
mass | | float | Element atomic mass (daltons)
residue | resSeq | int | Residue Sequence record (generally 1-based, but depends on topology)
resid | resi | int | Residue index (0-based)
resname | resn | str | Residue name
rescode | code, resc | str | 1-letter residue code
chainid | | int | Chain index (0-based)
Operators
Standard boolean operations (and, or, and not) as well as their C-style aliases (&&, ||, !) are supported. The expected comparison operators (<, <=, ==, !=, >=, >) are also available, along with their FORTRAN-style synonyms (lt, le, eq, ne, ge, gt).
Range queries
Range queries are also supported. The range condition is an expression of the form <expression> <low> to <high>, which resolves to <low> <= <expression> <= <high>. For example:
# The following queries are equivalent
-sel "resid 10 to 30"
-sel "(10 <= resid) and (resid <= 30)"
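The equivalence of the two forms, including the fact that both bounds are inclusive, can be checked with a small plain-Python sketch. The two helper functions are hypothetical and only mimic the stated semantics; they do not call MDTraj's selection parser.

```python
def resid_range(resids, low, high):
    """Mimic 'resid <low> to <high>': both bounds are inclusive."""
    return [r for r in resids if low <= r <= high]


def resid_boolean(resids, low, high):
    """Mimic '(<low> <= resid) and (resid <= <high>)'."""
    return [r for r in resids if (low <= r) and (r <= high)]


resids = list(range(50))  # residue indices 0..49 of an imaginary topology
same = resid_range(resids, 10, 30) == resid_boolean(resids, 10, 30)
print(same)                              # True
print(len(resid_range(resids, 10, 30)))  # 21 (residues 10 through 30)
```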