Principle Componant Analysis with Gromacs Instructions
The energy minimised protein structure is needed, protein-vacuum-min.gro, as
produced in Gromacs Instructions; along with protein.top, and the .itp files
produced.
== Generation of the Solvated Protein ==
The first step to producing the solvated protein is to place it in periodic
boundary conditions. The type of periodic boundary condition used will depend
on the shape of the protein, a spherical-esk protein will want dodecahedron
while a rectangular-esk protein will want triclinic. A 1nm spacing will also
be required around the box.
editconf_d -f protein-vacuum-min.gro -o protein-PBC.gro -bt dodecahedron -d 1.0
Water can now be added to your protein in periodic boundaries
genbox_d -cp protein-PBC.gro -cs spc216.gro -p protein.top -o protein-water.gro
You now have a solvated protein, the original protein.top has been backed up as
#protein.top.1#.
If the protein has a net charge this needs to be fixed by adding some counter ions,
to do this we need a .tpr file ({{:pca-em.mdp|pca-em.mdp}}).
grompp_d -v -f pca-em.mdp -c protein-water.gro -p protein.top -o protein-water.tpr
Now this can be used to generate ions. Depending on if you just want to neutralise the
protein or place in a box of a certain concentration (generally atom type SOL will
want to be selected to be replaced).
For just neutralise:
genion_d -s protein-water.tpr -o protein-solvated.gro -p protein.top -np ? -nn ?
-pname NA -nname CL
where the ?'s need to be replace with the number of positive and negative ions respectively.
For a given concentration, e.g. 0.15M:
genion_d -s protein-water.tpr -o protein-solvated.gro -p protein.top -conc 0.15
-neutral -pname NA -nname CL
The protein should be energy minimised, for this the relevant gromacs binary run file,
protein-solvated.tpr, needs to be created. For this you need a control parameter file,
pca-em.mdp, and then run the command:
grompp_d -v -f pca-em.mdp -c protein-solvated.gro -p protein.top -o protein-solvated.tpr
This then allows you to run the minimization on the structure:
mdrun_d -v -deffnm protein-solvated -c protein-solvated.pdb -s protein-solvated.tpr
The greatest strain has been removed from the system by the energy minimisation; however,
there will be areas in the system where the water is not in an optimum position. Therefore
we need to let the water move keeping the protein restrained ({{:pca-pr.mdp|pca-pr.mdp}}).
trjconv_d -f protein-solvated.trr -o protein-solvated-min.gro -b {last frame} -sep
-s protein-solvated.tpr -ndec 32
mv protein-solvated-min1.gro protein-solvated-min.gro
grompp_d -v -f pca-pr.mdp -c protein-solvated-min.gro -p protein.top -o protein-PR.tpr
mdrun_d -v -deffnm protein-PR
Check the system to make sure it at an equilibrium. If not re-run.
The system now needs to be coupled to the temperature and relaxed at these new
conditions ({{:pca-nvt.mdp|pca-nvt.mdp}}).
trjconv_d -f protein-PR.trr -o protein-PR-min.gro -b {last frame} -s protein-PR.tpr -ndec 32
grompp_d -v -f pca-nvt.mdp -c protein-PR-min.gro -p protein.top -o protein-nvt.tpr
mdrun_d -v -deffnm protein-nvt
Again check to make sure the system is equilibrated. If not-re-run.
The system now needs to be coupled to the pressure and relaxed at these new conditions
({{:pca-npt.mdp|pca-npt.mdp}}).
grompp_d -v -f pca-npt.mdp -c protein-nvt.gro -p protein.top -o protein-npt.tpr
mdrun_d -v -deffnm protein-npt
Again check to make sure the system is equilibrated. If not-re-run.
== Protein Simulation ==
After all this equilibration, the actual simulation can be run. For the principle component
analysis 40ns of simulation should be sufficient ({{:pca-md.mdp|pca-md.mdp}}).
grompp_d -v -f pca-md.mdp -c protein-npt.gro -p protein.top -o protein-md.tpr
mdrun_d -v -deffnm protein-md
== Principle Component Analysis ==
To perform the principle component analysis you need to produce a covariance matrix of all
the atom motions. The protein-md.trr file can be used to generate this, if you have more
than one for the same run you will need to join them with:
trjcat_d -f protein-md.trr protein-md2.trr -o protein-mdtotal.trr -settime
It seems to be better to centre your protein first and fit onto it's self removing the largest
rotational and translational movement (see Ah! Stupid Gromacs, why won't you centre my protein?).
g_covar_d -f protein-md.trr -s protein-md.tpr -av average.gro -last 50 -mwa -nofit
Outputs are eigenvec.trr which contains the eigenvectors (first eigenvector labelled 0 is
the minimised coordinates) and eigenval.xvg which contain the mass weighted eigenvalues.
Some of the analysis options can be seen with:
g_anaeig_d -h