[CMBI Logo] Centre for Molecular and Biomolecular Informatics
University of Nijmegen, Toernooiveld 1, 6525 ED Nijmegen
Dictionary

| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |

Warning

This dictionary contains links to WWW pages that are scattered all over the world. Although we do our best to use only pointers to pages that seem likely to stay in place for a long time, we cannot guarantee that this page is free of so-called dead links.

A

Alanine amino acid Ala, A
Small, hydrophobic residue with a strong preference for alpha-helix.

Annotation Annotation is the process of adding previous knowledge to (often new) data. For example, we call it annotation if the curator of SwissProt adds the DE record "This is a molecule that functions... etc".

Arginine amino acid Arg, R
Large, polar residue that is normally positively charged.

Asparagine amino acid Asn, N
Intermediately sized, polar residue; bad for helices, good for turn formation. Can easily form hydrogen bonds with its own or the local backbone.

Aspartic acid amino acid Asp, D
Intermediately sized, polar residue that is normally negatively charged. Asp is bad for helices, except in the N-terminal turn. It is a good residue for turn formation. Can easily form hydrogen bonds with its own or the local backbone. Although its side chain is only one carbon shorter than that of glutamic acid, its characteristics are very different. This difference is mainly caused by the fact that aspartic acid can easily form hydrogen bonds with its own or the local backbone, while glutamic acid cannot do that so easily.

B

BLAST Basic Local Alignment Search Tool
Look at NIH for a detailed explanation. Another explanation can be found at the University of Connecticut, from where I quote: (Basic Local Alignment Search Tool) is a method to ascertain sequence similarity (for definitions of similarity and homology click here). The program takes a query sequence and searches it against the database selected by the user. It aligns the query sequence against every subject sequence in the database. The results are reported in the form of a ranked list followed by a series of individual sequence alignments, plus various statistics and scores. Every hit in that list is assigned a similarity score S. Further, that score is analyzed for how likely it is to arise by chance. For that purpose a so-called E-value is calculated for every hit. The E-value for the score S tells the expected number of hits of score S or higher in the database.
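
The relation between a score S and its E-value can be illustrated with the Karlin-Altschul estimate E = K * m * n * exp(-lambda * S), where m is the query length and n the database length. This formula is not given in the quoted text; it is added here only as a rough sketch. The default values of lam and k below are illustrative assumptions (of the order reported for ungapped BLOSUM62 scoring), and real BLAST runs use their own parameters and effective lengths.

```python
import math

def expected_hits(score, query_len, db_len, lam=0.318, k=0.134):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S).

    lam and k are illustrative assumptions, not values taken from this page.
    """
    return k * query_len * db_len * math.exp(-lam * score)

# Made-up query and database sizes, just to show how E falls as S rises.
e_low_score = expected_hits(score=40, query_len=250, db_len=50_000_000)
e_high_score = expected_hits(score=80, query_len=250, db_len=50_000_000)
print(f"S=40: E = {e_low_score:.3g}")
print(f"S=80: E = {e_high_score:.3g}")
```

A higher score gives an exponentially smaller E-value, while a larger database gives a proportionally larger one.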

Beta-turn Point where a beta strand turns by 180 degrees to fold back on itself, forming two anti-parallel beta strands. Making such a sharp turn in a protein costs bending energy. This energy has to be regained. Therefore five residues are seen more often than all the other ones in beta-turns: Ser, Asp, Asn, Gly, and Pro. The first three are good for a turn because they can form hydrogen bonds with their own or the local backbone. Glycine is good because it doesn't have a side chain, and therefore there is no strain, and proline is OK, because its covalent bond puts it in a conformation that is just nice for certain positions in certain turn types. There are several types of turns, and the central positions of different turn types prefer different residue types.

B-factor. In theory, the B-factor indicates the mobility of an atom. Low B-factor = low mobility; high B-factor = high mobility. In practice many forms of experimental data are absorbed in the B-factor. Normally B-factors can be interpreted as follows:

Bifurcated hydrogen bond. Most authors only talk about bifurcation when a hydrogen points in between two acceptors, thereby seemingly forming two hydrogen bonds at the same time (I am not 100% sure, but I think they actually form one after the other in rapid succession). A few authors also talk about bifurcation when two hydrogens, both attached to the same heavy atom, make hydrogen bonds to the same hydrogen bond acceptor.

C

C-alpha trace. The picture of a protein gets considerably simpler when fewer atoms are displayed. A commonly used reduced-complexity representation is the so-called C-alpha trace. In this representation only the alpha carbons of the amino acids (or phosphates of the nucleic acids) are shown, connected by straight lines.
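
As a minimal sketch of how such a trace could be extracted, the Python below keeps only the alpha carbons from PDB-style ATOM records. It assumes the fixed-column PDB layout (atom name in columns 13-16, coordinates in columns 31-54); the function names and the three example atoms are hypothetical and only serve the illustration.

```python
def calpha_trace(pdb_lines):
    """Return (x, y, z) tuples for the alpha carbons, in file order."""
    trace = []
    for line in pdb_lines:
        # Assumed fixed-column layout: atom name in columns 13-16,
        # x, y, z in columns 31-38, 39-46 and 47-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            trace.append((x, y, z))
    return trace

def make_atom_line(serial, name, res_name, res_num, x, y, z):
    # Hypothetical helper that writes a minimal ATOM record in those columns.
    return (f"ATOM  {serial:5d} {name:^4s} {res_name:3s} A{res_num:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Made-up coordinates: one backbone nitrogen and two alpha carbons.
example = [
    make_atom_line(1, "N", "ALA", 1, 0.0, 0.0, 0.0),
    make_atom_line(2, "CA", "ALA", 1, 1.5, 0.0, 0.0),
    make_atom_line(3, "CA", "GLY", 2, 3.0, 3.5, 0.0),
]
trace = calpha_trace(example)
print(trace)
```

Drawing straight lines between consecutive points in the returned list gives the trace.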

C-Cap Last residue in a helix.
Helices need capping at both ends. The first residue in the helix is called the N-cap, the last one (at the C-terminal end) is called C-cap. The definition of the ends of a helix is not clear because different people use different definitions. Some consensus of all these methods is that a residue belongs to the helix if it has at least one normal helical hydrogen bond with the helix, and the alpha carbon of that residue is not too far away from the helix cylinder.

Cysteine amino acid, Cys, C
Amino acid of small to intermediate size. The S-H group in the side chain is very reactive: it can easily get oxidized, or form a so-called cys-cys bridge with the S-H group of another cysteine in the same or in another protein. Cysteines are often involved in binding metal ions such as copper or zinc. Cysteine is hydrophobic.

Chi-angle. Torsion angles in the side chain are called chi-angles. Actually, one should use the Greek character pronounced chi, but this stupid editor cannot put that in my WWW page...
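
A torsion angle is defined by four consecutive atoms; for chi-1 these are normally N, C-alpha, C-beta and the first gamma atom. The following is a minimal sketch of the calculation from four coordinates, using a common atan2 formulation. The function names and test coordinates are made up for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p1, p2, p3, p4):
    """Torsion angle in degrees around the p2-p3 bond, in (-180, 180]."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the fourth atom is rotated a quarter turn
# away from the first, seen along the central bond.
quarter_turn = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(quarter_turn, 1))
```

With real coordinates, the four points would be the positions of the four atoms that define the chi-angle of interest.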

D


E

EBI European Bioinformatics Institute.
This is an outstation of the EMBL, which is the European Molecular Biology Laboratory in Heidelberg, Germany.
The EBI's main task is to maintain the big world-wide centralised databases such as the nucleotide database, etc. They also work on bioinformatics tools, and a large series of smaller (niche) databases.

EMBL European Molecular Biology Laboratory.
This is, as the name already suggests, a central European (not EC though) research facility specialising in molecular biology and related fields. As molecular biology is nearly impossible without strong bioinformatics support, the EMBL has a large (and influential) bioinformatics department.


F

Forced marriage
One speaks of a forced marriage if an observation is not caused by the event studied. For example, when studying the accessibility of amino acids one observes that the hydrophobic residue proline sits very, very often at the surface of proteins. This is caused by the fact that prolines often sit in beta-turns, and beta-turns almost exclusively sit at the surface. So, proline is forced to sit at the surface because it likes to sit in beta-turns, and beta-turns have a marriage with the protein surface.

G

Glutamic acid amino acid, Glu, E
This is a large, polar residue that is normally negatively charged. Glutamate is good for helices. Although its side chain is only one carbon longer than that of aspartic acid, its characteristics are very different. This difference is mainly caused by the fact that aspartic acid can easily form hydrogen bonds with its own or the local backbone, while glutamic acid cannot do that as easily.

Glycine amino acid, Gly, G
Very small residue without side chain (instead of a side chain, there is just a proton). Due to the absence of the side chain, glycine does not suffer from all kinds of strain between the side chain and its own or the local backbone. This makes glycine a good candidate for beta-turns. In the unfolded form of the protein, glycine has more degrees of backbone freedom than all other residues. Therefore, glycine is not as good for protein folding as the other residues. Consequently, Gly -> Ala mutations are often attempted when a protein needs to be made more thermostable by means of mutagenesis.

H

Histidine amino acid, His, H
A large, polar amino acid with a five membered ring containing two nitrogens in its side chain. These nitrogens can both be protonated or deprotonated around neutral pH values. Therefore histidine can be observed in four states:

  1. Both nitrogens protonated, positive
  2. Only the N-delta nitrogen protonated, neutral
  3. Only the N-epsilon nitrogen protonated, neutral
  4. None of the nitrogens protonated, negative (rare at physiological pH)
This ease in modifying its protonation state makes histidine a prime candidate for functions in active sites. Histidine is also often involved in binding metal ions such as copper or zinc (often also zinc binding in the active site).

Hydrophobicity (detailed explanation).
Hydrophobic literally means afraid of water. The term hydrophobicity is commonly used to indicate the degree to which something is hydrophobic. The more hydrophobic a residue is, the more it likes to be at the inside of a protein.

Homology Two sequences are homologous if they share a common evolutionary history, or in other words, if there existed a molecule in the past that was ancestral to both of the sequences.

Hydrogen bond If an atom carrying a hydrogen comes close to an atom that has a lone pair (too many electrons), these two atoms start sharing the proton and the electrons. This bond, which has a partly covalent character, is stronger than when the atoms attract each other by induced dipoles or something similar. Such a bond that involves a shared proton is called a hydrogen bond.

Haplotype. Haplotypes are simply long stretches of DNA--including perhaps as many as 100,000 bases--at a given location on a chromosome. To their surprise, genome researchers have found that many such blocks come in just a few different versions (see SNP), a discovery that should simplify the search for associations between DNA variations and complex diseases such as cancer, diabetes, and mental illness.
A haplotype map will thus, it is hoped, be a tool for pinning down the genes that contribute to the development of those conditions.
The ethical problem is that haplotype maps can more easily than any other genetic information be linked to race, etc.

I

Isoleucine amino acid, Ile, I
Isoleucine is a hydrophobic residue of intermediate size. Its side chain branches at the beta carbon. This beta-branched residue therefore likes to sit in beta strands. Funnily enough, it is also not bad for a helix.

Isomer
A chemical entity with the same number and types of atoms as another chemical entity, but possessing different properties. There are structural isomers, geometric isomers, optical isomers, and stereo-isomers.


J


K


L

Leucine amino acid, Leu, L
This is a hydrophobic residue of intermediate size. It very much likes to sit in a helix.

Lysine amino acid, Lys, K
This is a large, polar residue that normally is positively charged.


M

Methionine amino acid, Met, M
Methionine is a large, hydrophobic residue. Its sulphur is sometimes involved in metal binding, and is a bit reactive (can be oxidized), but clearly less reactive than the S-H group in cysteine.


N

N-Cap. First residue in a helix.
Helices need capping at both ends. The first residue in the helix is called the N-cap, the last one (at the C-terminal end) is called C-cap. The definition of the ends of a helix is not clear because different people use different definitions. Some consensus of all these methods is that a residue belongs to the helix if it has at least one normal helical hydrogen bond with the helix, and the alpha carbon of that residue is not too far away from the helix cylinder.

NMR Nuclear magnetic resonance.
Every atomic nucleus has a so-called spin. Using strong magnetic fields and radio pulses, these spins can be manipulated. The spins of nuclei that are close to each other in space influence each other. These interactions can often be measured, and from that data it is possible to get a rough estimate of interatomic distances, torsion angles, etc. This information can be used to calculate atomic coordinates.

O


P

PAM Point Accepted Mutation.
Click here for detailed information.

PDB Protein data bank.
This is the site that collects (with help from the EBI) macromolecular structures that are solved using NMR or Xray.

Phenylalanine amino acid, Phe, F
This is a big, hydrophobic, aromatic residue.

Proline amino acid, Pro, P
This is an intermediately large hydrophobic residue. Proline is actually not an amino acid, but an imino acid (because the delta-carbon in the side chain is bound to the backbone nitrogen; this nitrogen therefore lacks one proton because of which we call it imino instead of amino), but that is only a matter of semantics. Due to the bond between the delta carbon in the side chain and the backbone nitrogen, proline is very rigid. This rigidity makes it a good candidate for beta-turns.

Q


R

R-factor. The R-factor is a quality indicator used by crystallographers. Formally speaking, it is related to the percentage error between the experimentally observed reflection intensities and the intensities that can be calculated from the coordinates. In principle, the protein structure is solved once the R-factor is below 20 percent. Unrefined structures (in the past refinement was often too CPU-time consuming; nowadays refinement is more or less standard) often have R-factors up to 60. Refined structures with an R-factor in the 20 - 25 range are probably for the largest part correct. Refined structures with an R-factor above 25 are normally wrong. Read the PDB file header to see if a structure was refined (in or near REMARK 3 it will read REFINEMENT, and on the next line the name of the program used for refinement).
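
The conventional crystallographic R-factor is usually written as R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), taken over structure factor amplitudes rather than raw intensities. The sketch below computes it as a percentage so that it can be compared with the numbers above. The function name and the amplitude values are made up for illustration, and real refinement programs apply scaling and other corrections that are left out here.

```python
def r_factor_percent(f_obs, f_calc):
    """R = sum(|Fobs - Fcalc|) / sum(Fobs), as a percentage.

    Both lists hold non-negative structure factor amplitudes,
    one pair per reflection.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("need one calculated amplitude per observed one")
    total_observed = sum(f_obs)
    if total_observed == 0:
        raise ValueError("observed amplitudes sum to zero")
    total_difference = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return 100.0 * total_difference / total_observed

# Made-up amplitudes for four reflections.
observed = [120.0, 85.0, 40.0, 15.0]
calculated = [110.0, 90.0, 35.0, 20.0]
r = r_factor_percent(observed, calculated)
print(f"R = {r:.1f}")
```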

Resolution. The resolution of a structure is the best indicator of its potential quality. The explanation of what exactly the resolution is, is too difficult, even for many crystallographers. One additional problem is that high resolution (which is good) is indicated by low numbers. Resolution is expressed in units of Ångstrom. A resolution of 1.3 is very good, whereas a resolution of 2.8 is poor. In general, at and above 3.0 structures have a good chance of being wrong in parts, and the precision of the atomic coordinates is plus or minus an Ångstrom. Between 2 and 3 Ångstrom molecules should be checked extensively with the WHAT_CHECK server. Better than 2.0 Ångstrom we start talking about structures that have a good chance of being correct in most of their aspects, although the precision of coordinates at around 2.0 Ångstrom resolution is still on the order of 0.5 Ångstrom. Around 1.5 Ångstrom, one must really be a determined idiot to get a structure wrong, and at this resolution or better you can trust the precision data that the crystallographers give in the article.

S

Serine amino acid, Ser, S
This is a small, alcoholic, hydrophilic residue. It likes beta-turns.

Similarity Quantity that indicates, for example, the percentage of identical amino acids between two sequences. Similarity is an observed quantity that might, for example, be expressed as the percentage of residues that are similar between two aligned sequences. Similarity is a bad measure, because it is subjective: the author of the software decides whether Gln and Asp are similar or not. The percentage identity is a much better measure.
There is an important difference between similarity and homology. Similarity is a value between 0.0 and 1.0, or between 0 and 100%. On the other hand, there are no degrees of homology. The sequences are either homologous or not.
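
Percentage identity can be computed from two already aligned sequences, as in the minimal sketch below. It assumes that gaps are written as "-" and that positions with a gap in either sequence are left out of the count; other definitions of the denominator exist, so this is one possible choice rather than the standard one. The function name and example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped positions are not counted
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

# Made-up aligned fragments: eight ungapped positions, six of them identical.
identity = percent_identity("MKT-AYIAK", "MKTQAHLAK")
print(f"{identity:.1f}% identical")
```

Unlike a similarity score, this number does not depend on anybody's opinion about which residue types resemble each other.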

Sheet, or beta-sheet. Two or more beta-strands that are hydrogen-bonded together form a beta-sheet.

SNP. Single nucleotide polymorphism, or point mutation in the DNA. SNPs are the main reason that you and I (fortunately) look different...

T

Threonine amino acid, Thr, T
This is a small, alcoholic residue that is somewhat hydrophilic. Due to the branched nature of the beta carbon in its side chain, it likes beta strands.

Tryptophan amino acid, Trp, W
This is the biggest amino acid. It is hydrophobic, despite the hydrogen bond donor in the side chain. It is aromatic.

Tyrosine amino acid, Tyr, Y
This is a large, aromatic, alcoholic amino acid.

U

V

Valine amino acid, Val, V
This is a small, hydrophobic residue. Due to the branched nature of the beta carbon in its side chain, it likes beta strands.

W

X

Xray Röntgen diffraction. A technique to determine the three-dimensional coordinates of atoms in molecular structures. Structure determination by Xray requires that crystals are grown of the molecule of interest. These crystals are subjected to X-rays (Röntgen radiation). The regular nature of the crystal causes the diffracted X-rays to form distinct patterns on a Röntgen-sensitive film or a specialised Röntgen radiation detector. From these patterns the coordinates of the atoms can be calculated.

Y

Z

© 2000 Feb 21 G Vriend
