ParsePDB.pm – A Perl Parser for PDB Files

NAME

ParsePDB – A Perl-based parser for PDB files (Protein Data Bank format).

SYNOPSIS

Over 60 methods provide access to the PDB content. Most of them accept parameters that filter the PDB for specific entities:

Model
Chain
Residue (number)
Residue (label, e.g. the amino acid)
Atom (number)
Atom (label, e.g. all alpha carbons)
Atom (element)

A few examples of the available methods:

# initialize a PDB object
$PDB = ParsePDB->new (FileName => '4mbn.pdb');
$PDB->Parse;

# renumber items in the PDB
$PDB->RenumberModels (ModelStart => '1');
$PDB->RenumberChains (ChainStart => 'A');
$PDB->RenumberResidues (ResidueStart => '1');
$PDB->RenumberAtoms (AtomStart => '1');

# count items in the PDB
$PDB->CountModels;
$PDB->CountChains;
$PDB->CountResidues;
$PDB->CountAtoms;

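# write the chains to separate files
$PDB->WriteChains;

# identify the items present in the PDB
$PDB->IdentifyModels;
$PDB->IdentifyChains;
$PDB->IdentifyResidues;
$PDB->IdentifyAtoms;

# get the whole PDB
$PDB->Get;

# get a single chain, or only its alpha carbons
$PDB->Get (ChainLabel => 'A');
$PDB->Get (ChainLabel => 'A', AtomType => 'CA');

# get residues by number or by label
$PDB->Get (Residue => 5);
$PDB->Get (ResidueLabel => 'ALA');

# get several atom types at once
$PDB->Get (AtomType => 'C | O | N');

# derived quantities for a selection
$PDB->GetCoordinates (AtomType => 'C | O | N');
$PDB->GetAngles (Residue => 1);
$PDB->GetCentreOfMass (ChainLabel => 'B');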
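
The filter entities listed above correspond to fixed-width columns of the ATOM and HETATM records in the PDB format. The following is a minimal, self-contained sketch of that idea in plain Perl. It is not ParsePDB's implementation: the function parse_atom_line, the hash keys and the demo records (including their coordinates) are made up for illustration, and only the column positions are taken from the PDB format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one ATOM/HETATM line into the fields that the filters refer to.
# The PDB format counts columns from 1, substr counts offsets from 0.
# Returns nothing (undef in scalar context) for other record types.
sub parse_atom_line {
    my ($line) = @_;
    return unless $line =~ /^(?:ATOM|HETATM)/;
    $line = sprintf '%-80s', $line;               # pad so every substr stays in range
    my %atom = (
        AtomNumber    => substr ($line,  6, 5),   # columns  7-11
        AtomType      => substr ($line, 12, 4),   # columns 13-16
        ResidueLabel  => substr ($line, 17, 3),   # columns 18-20
        ChainLabel    => substr ($line, 21, 1),   # column  22
        ResidueNumber => substr ($line, 22, 4),   # columns 23-26
        Element       => substr ($line, 76, 2),   # columns 77-78
    );
    s/^\s+|\s+$//g for values %atom;              # trim the column padding
    return \%atom;
}

# Made-up demo records, not taken from a real structure.
my $demo = <<'END_PDB';
REMARK demo records for illustration only
ATOM      1  N   ALA A   1      10.000  20.000  30.000  1.00  0.00           N
ATOM      2  CA  ALA A   1      11.000  20.000  30.000  1.00  0.00           C
ATOM      3  CA  GLY B   1      12.000  20.000  30.000  1.00  0.00           C
END_PDB

# Non-atom lines yield an empty list and are dropped by map.
my @atoms = map { parse_atom_line ($_) } split /\n/, $demo;

# Similar in spirit to filtering for chain 'A' and atom type 'CA'.
my @hits = grep { $_->{ChainLabel} eq 'A' && $_->{AtomType} eq 'CA' } @atoms;

printf "atom %s: %s in residue %s %s, chain %s\n",
    @{$_}{qw(AtomNumber AtomType ResidueLabel ResidueNumber ChainLabel)} for @hits;
```

Reading fields by column position rather than splitting on whitespace matters because adjacent PDB fields may touch or be blank.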

DESCRIPTION

ParsePDB grew out of the need to break a protein into its subgroups (models and chains). It was written to be powerful enough to handle PDB files with a fair range of functions while remaining easy to use.

To keep complexity to a minimum, a protein can be read, parsed and its chains written into separate files with just three commands: new, Parse and WriteChains. Given suitable parameters, atoms and residues can be counted, renumbered and filtered (i.e. only certain elements or residues are extracted). Most command names may take a little longer to type, but they are easy to remember and meaningful when read.
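
The three-command workflow can be put together as a short script. This is only a sketch: the wrapper split_chains and its return values are assumptions made for illustration, while new, Parse, WriteChains and the FileName parameter are the calls shown in the SYNOPSIS. The names of the files written by WriteChains are not covered here; see the manual for that.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a PDB file, parse it and write out its chains.
# Returns 1 on success and 0 if ParsePDB is not installed.
# (split_chains is a hypothetical helper, not part of ParsePDB.)
sub split_chains {
    my ($file) = @_;

    # Load ParsePDB at run time so that a missing installation is reported cleanly.
    unless (eval { require ParsePDB; 1 }) {
        warn "ParsePDB could not be loaded: $@";
        return 0;
    }

    my $PDB = ParsePDB->new (FileName => $file);   # 1. create the object
    $PDB->Parse;                                   # 2. parse the file
    $PDB->WriteChains;                             # 3. write the chains
    return 1;
}

# Usage: perl splitchains.pl 4mbn.pdb
split_chains ($ARGV[0]) if @ARGV;
```

Errors raised by the parser itself, for example for a missing input file, are not caught in this sketch.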

The parser comes with an extensive manual and the script testparser, which contains examples for all available functions of the library, so that the features can be tested quickly and different method parameters tried out.

INSTALLATION

The PDB parser itself is a Perl package, indicated by the extension .pm. To install the package globally on your system, use the provided makefile to copy it to your library path. To do this, log in as root and follow the standard routine:

perl Makefile.PL
make
make install

Since the package uses Error.pm to handle exceptions, Error.pm needs to be installed as well. If it is not already installed, make will issue a warning about that. Error.pm was written by Graham Barr and is maintained by Shlomi Fish. Version 0.15 was used during the development of the PDB parser.
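
Whether the dependency and the parser are visible to Perl can also be checked from a small script. The sketch below uses only core Perl features; the helper module_version is an assumption made for illustration and is not part of ParsePDB or Error.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the version of an installed module, '' if the module is installed
# but declares no version, or undef if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : '';
}

for my $module (qw(Error ParsePDB)) {
    my $version = module_version ($module);
    if (defined $version) {
        print "$module is installed",
              ($version ne '' ? " (version $version)" : ''), "\n";
    }
    else {
        print "$module is NOT installed\n";
    }
}
```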

After installation, the parser can be used in a script by including the library with the use command:

use ParsePDB;

AUTHOR

Benjamin M. Bulheller
Jonathan D. Hirst

COPYRIGHT

Many thanks to Dr. Daniel Barthel for useful discussions during the development of the library.

The PDB parser is an integral part of the DichroCalc web interface. If you present any data in a publication, please cite:

DichroCalc – Circular and Linear Dichroism Online
B.M. Bulheller & J.D. Hirst, Bioinformatics, 25, 539–540 (2009).