Copyright (C) 2009 Jim Warwicker

    These programs are free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License.

    These programs are distributed in the hope that they will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

This README relates to the Warwicker laboratory code for calculating pKas in
protein structures, using the Debye-Huckel (DH) method.
Calculations can be made for multiple proteins, followed by analysis of
grouped properties (e.g. by subcellular location), which is how the code was
used by Pedro Chan and Jim Warwicker (July 2009).

(Manuscript submitted to BMC Genomics: Chan and Warwicker)

Current distribution: July 2009

The shell scripts, Fortran code, and this README are in the proteomepk.tar
archive.

Assuming that you have unpacked the downloaded archive (e.g. tar xvf
proteomepk.tar), then the following will help you.

FIRST, all files (source, compiled programs, data files, coordinate files, and
output) are currently assumed to be in the working directory.  You can change
the paths in the script "pdb_to_DHpk.scr" if you wish.

SECOND, to build the Fortran programs,

run:		proteomepk_build.scr

This uses an F77 compiler; it works on a 32-bit machine running SuSE Linux 9.2, and
should be portable at least to other 32-bit Linux systems.

THIRD, the analysis is carried out in two stages:

run:		pdb_to_DHpk.scr
input:		DHpdb.list and pdb files

nb CURRENTLY the DHpdb.list and DHpdb_scan.list files refer to 2 PDB files that
are included in this distribution, so you can test the programs before looking
at your own data.
nb The DHpdb.list and DHpdb_scan.list files are similar:
DHpdb.list has one PDB/chain per record:
1ci4.A
1qrv.A
etc, where the (single) chain to be used is separated by a dot from the PDB code.
DHpdb_scan.list also has one PDB/chain per record:
1ci4A 01
1qrvA 01
etc, where the PDB code and chain are now combined, followed by a single space
and a two-digit integer that identifies the subcellular location (see submitted
manuscript).
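
The relation between the two record formats can be sketched as a small shell
function.  This is only an illustration and is not part of the distribution:
the function name to_scan_record is hypothetical, the syntax is POSIX sh
(sh/bash) rather than csh, and the two-digit location code must be supplied by
you (its meaning is defined in the submitted manuscript).

```shell
# Hypothetical helper (not part of proteomepk.tar): convert one DHpdb.list
# record, e.g. "1ci4.A", into a DHpdb_scan.list record, e.g. "1ci4A 01".
#   $1 = DHpdb.list record (PDB code, dot, chain)
#   $2 = two-digit subcellular location code, passed through unchanged
to_scan_record() {
    pdb_id=${1%%.*}     # text before the first dot (PDB code)
    chain_id=${1#*.}    # text after the first dot (chain)
    printf '%s%s %s\n' "$pdb_id" "$chain_id" "$2"
}
```

For example, the command: to_scan_record 1ci4.A 01
prints the record: 1ci4A 01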

In the first stage, the list of PDB chains in DHpdb.list is analysed through to
calculation of ionisable group pKas.  These calculations use the compiled
Fortran code (much of which has come from the Warwicker lab electrostatics
programs).

run:		pk_energy_scan
input:		DHpdb_scan.list and output files from the first stage

Now the pKas and pH-dependent properties are calculated and averaged over
subcellular locations (see submitted manuscript), provided that the following
environment variable is set as shown (csh/tcsh syntax) before running
pk_energy_scan:

setenv	LOCALISE	'yes'
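
The setenv command above is csh/tcsh syntax.  If you are working in sh or bash
instead, the following minimal sketch should set the same variable; it assumes
only that pk_energy_scan reads LOCALISE from its environment, as described
above.

```shell
# sh/bash equivalent of:  setenv LOCALISE 'yes'
LOCALISE=yes
export LOCALISE
# pk_energy_scan would then be run from this same shell session.
```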

This run of pk_energy_scan then prints a list of various pH-dependent
properties to the terminal (this can also be redirected to a file), and writes
a file (combo_scan.txt) of pH-dependent properties calculated for each protein
chain.

These are the properties used in the Chan and Warwicker manuscript.