Copyright (C) 2009 Jim Warwicker

These programs are free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License.

These programs are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see .

This README relates to the Warwicker laboratory code for calculating pKas in protein structures, using the Debye-Huckel (DH) method. Calculations can be made for multiple proteins, followed by analysis of grouped properties, e.g. grouped by subcellular location, which was the usage made by Pedro Chan and Jim Warwicker (July 2009). (Manuscript submitted to BMC Genomics: Chan and Warwicker.)

Current distribution: July 2009

The shell scripts, Fortran code, and this README are in the proteomepk.tar archive. Assuming that you have unpacked the downloaded archive (e.g. tar xvf proteomepk.tar), the following will help you.

FIRST, all files (source, compiled programs, data files, coordinate files, output) are currently assumed to be in the working directory. You can change the paths in the script "pdb_to_DHpk.scr" if you wish.

SECOND, to build the Fortran programs, run: proteomepk_build.scr. This uses an F77 compiler; it works on a 32-bit machine running SuSE Linux 9.2, and should be portable at least to other 32-bit Linux systems.

THIRD, the analysis is carried out in two stages:

run: pdb_to_DHpk.scr
input: DHpdb.list and pdb files

nb CURRENTLY the DHpdb.list file and the DHpdb_scan.list file use 2 PDB files that are included in this distribution, so you can test the programs before looking at your own data.
nb The DHpdb.list and DHpdb_scan.list files are similar:

DHpdb.list has one PDB/chain per record:
1ci4.A
1qrv.A
etc, where the (single) chain to be used is separated from the pdb identifier by a dot.

DHpdb_scan.list also has one PDB/chain per record:
1ci4A 01
1qrvA 01
etc, where the PDB/chain are now combined, followed by a single space and a two-digit integer that identifies the subcellular location (see submitted manuscript).

In the first stage, the list of pdb chains in DHpdb.list is analysed through to calculation of ionisable group pKas. These calculations use the compiled Fortran code (much of which has come from the Warwicker lab electrostatics programs).

run: pk_energy_scan
input: DHpdb_scan.list and output files from the first stage

In this second stage, the pKas and pH-dependence properties are calculated and averaged over subcellular locations (see submitted manuscript), so long as the following environment variable is set prior to running pk_energy_scan:

setenv LOCALISE 'yes'

This run of pk_energy_scan then writes a list of various pH-dependent properties to the terminal (which can also be redirected to a file), and a file (combo_scan.txt) of pH-dependent properties calculated for each protein chain. These are the properties used in the Chan and Warwicker manuscript.
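
The two list formats described above differ only in the dot and the trailing location code, so a DHpdb_scan.list can be derived from a DHpdb.list. The following Python sketch is not part of this distribution; it is only an illustration of the record layouts. The function names (to_scan_record, convert_list) are hypothetical, and the code assumes a four-character PDB identifier and a single-character chain identifier, as in the two example records. The integer location codes themselves are defined in the submitted manuscript and must be supplied by the user.

```python
import re

# Assumed record shapes, taken from the two examples in this README:
#   DHpdb.list      -> "1ci4.A"   (PDB id, dot, single chain id)
#   DHpdb_scan.list -> "1ci4A 01" (PDB id + chain, one space, two-digit code)
# Assumption: 4-character PDB id and 1-character chain id.
_RECORD = re.compile(r"^([0-9A-Za-z]{4})\.([0-9A-Za-z])$")


def to_scan_record(record, location_code):
    """Convert one DHpdb.list record into a DHpdb_scan.list record."""
    match = _RECORD.match(record.strip())
    if match is None:
        raise ValueError("not a PDB.chain record: %r" % record)
    if not 0 <= location_code <= 99:
        raise ValueError("location code must fit in two digits")
    pdb_id, chain = match.groups()
    # Drop the dot, then append a space and the zero-padded two-digit code.
    return "%s%s %02d" % (pdb_id, chain, location_code)


def convert_list(records, location_codes):
    """Convert many records, skipping blank lines.

    location_codes maps each "PDB.chain" record to its integer
    subcellular location code (a user-supplied, hypothetical mapping).
    """
    return [
        to_scan_record(r, location_codes[r.strip()])
        for r in records
        if r.strip()
    ]
```

For example, convert_list(["1ci4.A", "1qrv.A"], {"1ci4.A": 1, "1qrv.A": 1}) returns the two records shown for DHpdb_scan.list above. Records that do not match the assumed layout raise an error rather than being passed on silently.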