Bioperl course

Catherine Letondal

Katja Schuerer

March 14, 2005

Abstract

Introduction to Bioperl (www.bioperl.org). This course introduces the bioperl modules through examples and exercises. The course content has been upgraded to bioperl 1.0 (differences from bioperl version 0.7 are shown in yellow in the diagrams). Apart from this upgrade, the presentation has been reorganized by topic rather than by daily schedule, without any major change in content. A PDF version is now available.

Contact: help@pasteur.fr

Comments are welcome.


Table of Contents

1. General introduction
1.1. Bioperl documentation
1.2. General bioperl classes
2. Sequences
2.1. The Bio::SeqIO class
2.1.1. Format Converter
2.2. Sequence classes
2.2.1. Introduction
2.2.2. Building mechanisms summary
2.2.3. A deeper insight into the Bio::Seq class
2.2.4. Bioperl 'Sequence' classes structure
2.3. Features and Location classes
2.3.1. Feature
2.3.2. Code reading: extracting CDS
2.3.3. Tag system
2.3.4. Location
2.3.5. Graphical view of features
2.4. Sequence analysis tools
3. Alignments
3.1. AlignIO
3.2. SimpleAlign
3.3. Code reading: protal2dna
4. Analysis
4.1. Blast
4.1.1. Running Blast
4.1.2. Parsing Blast
4.1.3. Bio::Tools::BPlite family parsers
4.1.4. PSI-BLAST (Position Specific Iterative Blast)
4.1.5. bl2seq: Blast 2 sequences
4.1.6. Blast Internal classes structure
4.2. Genscan
5. Databases
5.1. Database classes
5.2. Accessing a local database with golden
6. Perl Reminders
6.1. UML
6.2. Perl reminders to use bioperl modules
6.2.1. References
6.2.2. Filehandles and streams
6.2.3. Exceptions
6.2.4. Getopt::Std
6.2.5. Classes
6.2.6. BEGIN block
6.3. Perl reminders for a more advanced understanding of bioperl modules
6.3.1. Modules
6.3.2. Compiler instructions
6.3.3. Tie
A. Solutions
A.1. Sequences
A.2. Features
A.3. Alignments
A.4. Analysis
A.5. Databases

List of Figures

2.1. Bio::Seq class structure
2.2. Relation of a SwissProt entry and the corresponding Bio::Seq object components
2.3. Bio::Annotation package structure
2.4. Bioperl 'Sequence' classes structure
2.5. Features Classes structure
2.6. Correspondence between an EMBL entry and bioperl tags
2.7. Location Classes Structure
2.8. Graphical view of some features of the SwissProt entry BACR_HALHA
3.1. AlignIO Classes diagram
3.2. Align Classes diagram
4.1. Blast Classes diagram
4.2. BPLite Classes diagram
4.3. Blast internal classes diagram
4.4. Genscan Classes Structure
5.1. Database Classes structure
6.1. UML meanings
A.1. Bio::SeqIO structure

List of Examples

2.1. SwissProt -> Fasta
2.2. Loading a sequence from a remote server
2.3. Find the references to the PDB database entries
2.4. Adding a feature to a Genbank entry
3.1. Format conversions with AlignIO
3.2. Basic methods of SimpleAlign
3.3. Filter gap columns
4.1. StandAloneBlast run
4.2. StandAloneBlast parsing
4.3. Parsing from a Blast file
4.4. Parsing with BPLite
4.5. Running PSI-blast
4.6. PSI-Blast SearchIO class
4.7. Running bl2seq
4.8. Genscan parsing
4.9. Genscan parsing, with sub-sequences
5.1. Database class use
5.2. Database Index creation

List of Exercises

2.1. Bio::SeqIO
2.2. A universal converter
2.3. A universal converter using new
2.4. Display a sequence in fasta format
2.5. More on annotations
2.6. Transmembrane helices
2.7. extractcds
2.8. Extract translations from a Genbank entry
3.1. Run Clustalw
3.2. Create an alignment without gaps
4.1. Running Blast on a SwissProt entry
4.2. Running Blast: Setting parameters
4.3. Running Blast: Saving output
4.4. Running a Remote Blast
4.5. Display Blast hits
4.6. Class of a Blast report
4.7. Parse a Blast output file
4.8. Parse Blast results on standard input
4.9. Filtering hits by length
4.10. Filtering hits by position
4.11. Display the best hit by databank
4.12. Multiple queries
4.13. Extracting the subject sequence
4.14. Extracting alignments
4.15. Locate EST in a genome
4.16. Record Blast hits as sequence features
4.17. Print information from a BPLite report
4.18. Create a Bio::Tools::BPlite from a file
4.19. Create a Bio::Tools::BPlite from standard input
4.20. Parse a PSI-blast report
4.21. Build a Bio::SimpleAlign object
4.22. Code reading: Bio::Tools::Genscan module
5.1. Parse a Genscan report and build a database entry
5.2. Parse a Genscan report and build a database entry with the genomic sequence
5.3. Build a small bioperl module (for the golden program)