Durhamlogo                                                                                                                                        BSIlogo

 

Principle Componant Analysis with Gromacs Instructions



The energy minimised protein structure is needed, protein-vacuum-min.gro, as
produced in Gromacs Instructions; along with protein.top, and the .itp files
produced.

== Generation of the Solvated Protein ==
The first step to producing the solvated protein is to place it in periodic
boundary conditions. The type of periodic boundary condition used will depend
on the shape of the protein, a spherical-esk protein will want dodecahedron
while a rectangular-esk protein will want triclinic. A 1nm spacing will also
be required around the box.
editconf_d -f protein-vacuum-min.gro -o protein-PBC.gro -bt dodecahedron -d 1.0

Water can now be added to your protein in periodic boundaries
genbox_d -cp protein-PBC.gro -cs spc216.gro -p protein.top -o protein-water.gro

You now have a solvated protein, the original protein.top has been backed up as
#protein.top.1#.

If the protein has a net charge this needs to be fixed by adding some counter ions,
to do this we need a .tpr file ({{:pca-em.mdp|pca-em.mdp}}).
grompp_d -v -f pca-em.mdp -c protein-water.gro -p protein.top -o protein-water.tpr

Now this can be used to generate ions. Depending on if you just want to neutralise the
protein or place in a box of a certain concentration (generally atom type SOL will
want to be selected to be replaced).
For just neutralise:
genion_d -s protein-water.tpr -o protein-solvated.gro -p protein.top -np ? -nn ?
-pname NA -nname CL

where the ?'s need to be replace with the number of positive and negative ions respectively.
For a given concentration, e.g. 0.15M:
genion_d -s protein-water.tpr -o protein-solvated.gro -p protein.top -conc 0.15
-neutral -pname NA -nname CL

The protein should be energy minimised, for this the relevant gromacs binary run file,
protein-solvated.tpr, needs to be created. For this you need a control parameter file,
pca-em.mdp, and then run the command:
grompp_d -v -f pca-em.mdp -c protein-solvated.gro -p protein.top -o protein-solvated.tpr

This then allows you to run the minimization on the structure:
mdrun_d -v -deffnm protein-solvated -c protein-solvated.pdb -s protein-solvated.tpr

The greatest strain has been removed from the system by the energy minimisation; however,
there will be areas in the system where the water is not in an optimum position. Therefore
we need to let the water move keeping the protein restrained ({{:pca-pr.mdp|pca-pr.mdp}}).
trjconv_d -f protein-solvated.trr -o protein-solvated-min.gro -b {last frame} -sep
-s protein-solvated.tpr -ndec 32
mv protein-solvated-min1.gro protein-solvated-min.gro
grompp_d -v -f pca-pr.mdp -c protein-solvated-min.gro -p protein.top -o protein-PR.tpr
mdrun_d -v -deffnm protein-PR

Check the system to make sure it at an equilibrium. If not re-run.

The system now needs to be coupled to the temperature and relaxed at these new
conditions ({{:pca-nvt.mdp|pca-nvt.mdp}}).
trjconv_d -f protein-PR.trr -o protein-PR-min.gro -b {last frame} -s protein-PR.tpr -ndec 32
grompp_d -v -f pca-nvt.mdp -c protein-PR-min.gro -p protein.top -o protein-nvt.tpr
mdrun_d -v -deffnm protein-nvt

Again check to make sure the system is equilibrated. If not-re-run.

The system now needs to be coupled to the pressure and relaxed at these new conditions
({{:pca-npt.mdp|pca-npt.mdp}}).
grompp_d -v -f pca-npt.mdp -c protein-nvt.gro -p protein.top -o protein-npt.tpr
mdrun_d -v -deffnm protein-npt

Again check to make sure the system is equilibrated. If not-re-run.

== Protein Simulation ==
After all this equilibration, the actual simulation can be run. For the principle component
analysis 40ns of simulation should be sufficient ({{:pca-md.mdp|pca-md.mdp}}).
grompp_d -v -f pca-md.mdp -c protein-npt.gro -p protein.top -o protein-md.tpr
mdrun_d -v -deffnm protein-md

== Principle Component Analysis ==
To perform the principle component analysis you need to produce a covariance matrix of all
the atom motions. The protein-md.trr file can be used to generate this, if you have more
than one for the same run you will need to join them with:
trjcat_d -f protein-md.trr protein-md2.trr -o protein-mdtotal.trr -settime

It seems to be better to centre your protein first and fit onto it's self removing the largest
rotational and translational movement (see Ah! Stupid Gromacs, why won't you centre my protein?).
g_covar_d -f protein-md.trr -s protein-md.tpr -av average.gro -last 50 -mwa -nofit

Outputs are eigenvec.trr which contains the eigenvectors (first eigenvector labelled 0 is
the minimised coordinates) and eigenval.xvg which contain the mass weighted eigenvalues.

Some of the analysis options can be seen with:
g_anaeig_d -h