Virus Structure

Human immunodeficiency virus (HIV-1) will be discussed in detail here as an example of virus structure. In some cases, comparisons between HIV-1 structure and that of other viruses will be made, either because other viruses are better understood or to illustrate variations in these processes.

Schematic representation of the structure of HIV:

HIV is a fairly complex virus, although by no means the most complicated known. The virus is thought to contain two identical copies of a positive sense (i.e. mRNA sense) single-stranded RNA about 9,500 nucleotides long. These may be linked to each other to form a genomic RNA dimer.

The RNA dimer is in turn associated with a basic nucleocapsid (NC) protein (p9/6). By analogy with other RNA viruses, this nucleoprotein filament may be helical, although this has not actually been determined in the case of HIV.

The ribonucleoprotein particle is encapsidated by a capsid made up of a capsid protein (CA), p24. The capsid environment also contains other viral proteins such as integrase and reverse transcriptase. It also contains a wide variety of other macromolecules derived from the cell, including tRNA(Lys3), which serves as a primer for reverse transcription. The capsid has an icosahedral structure.

The capsid is in turn encapsidated by a layer of matrix protein (MA), p17. This matrix protein is associated with a lipid bilayer or envelope. The matrix protein may be:

The HIV envelope is derived from the host cell plasma membrane and is acquired when the virus buds through the cell membrane. An envelope is a common feature in animal viruses but uncommon in plant viruses. In the case of herpesviruses, the envelope is derived from the nuclear membrane. Other viruses such as vaccinia derive an envelope from the Golgi body. A viral envelope contains the lipid and protein constituents of the membrane from which it is derived. In addition, it contains viral proteins, often forming spikes or peplomers. The major HIV protein associated with the envelope is gp120/41. This functions as the viral antireceptor or attachment protein. gp41 traverses the envelope, whereas gp120 is present on the outer surface and is noncovalently attached to gp41. The precursor of gp120/41 (gp160) is synthesized in the endoplasmic reticulum and is transported via the Golgi body to the cell surface.

Other types of virus structure are known:

  1. Naked helical nucleocapsids, e.g. tobacco mosaic virus (TMV).
  2. Naked icosahedral capsids, e.g. members of the picornavirus family.
  3. Encapsidated nucleocapsids, two concentric shells of protein surrounding a nucleoprotein filament, e.g. the orbivirus bluetongue. This virus is not enveloped.

Rhabdoviruses have a similar structure to that described for HIV: they are enveloped and have a matrix protein. However, the capsid is a helical nucleofilament.

Helical Nucleocapsids:

TMV, an RNA virus, is a particularly well understood example of virus structure. Protein subunits can be placed around the circumference of a circle to form a disc. If the discs are stacked, a tube is created with room for the nucleic acid down the middle. A closer examination of these virus structures shows that the coat proteins are not arranged cylindrically but helically. This is because of the propensity for nucleic acids to adopt helical structures. By arranging the protein subunits helically, equivalent bonds between the proteins and the nucleic acid can be made, except for the two end subunits. All known filamentous viruses are helical. Typically they are 15-19 nm wide. The length depends on the size of the genome, but 300-500 nm is within the normal range. The structure of TMV can be described in terms of the number of subunits per turn of the helix (16.3), the pitch or rise per turn of the helix (2.28 nm) and the axial rise per subunit (0.14 nm).
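The three helical parameters quoted for TMV are not independent: the axial rise per subunit is the pitch divided by the number of subunits per turn. The short Python sketch below checks this with the figures from the text. The function names are illustrative only, and the 300 nm rod length is simply the lower end of the range given above, used here as an assumed example length.

```python
# Helical parameters for TMV as quoted in the text.
SUBUNITS_PER_TURN = 16.3   # coat protein subunits per turn of the helix
PITCH_NM = 2.28            # rise per full turn of the helix, in nm


def axial_rise_per_subunit(pitch_nm: float, subunits_per_turn: float) -> float:
    """Axial rise contributed by a single subunit, in nm."""
    return pitch_nm / subunits_per_turn


def subunits_for_length(length_nm: float, pitch_nm: float,
                        subunits_per_turn: float) -> float:
    """Approximate number of subunits in a helical rod of the given length."""
    return length_nm / axial_rise_per_subunit(pitch_nm, subunits_per_turn)


rise = axial_rise_per_subunit(PITCH_NM, SUBUNITS_PER_TURN)
print(f"axial rise per subunit: {rise:.2f} nm")

# 300 nm is an assumed example length taken from the range in the text.
count = subunits_for_length(300.0, PITCH_NM, SUBUNITS_PER_TURN)
print(f"subunits in a 300 nm rod: about {count:.0f}")
```

On these figures a 300 nm rod needs roughly two thousand coat protein subunits, which illustrates how many copies of a single small protein a helical particle uses.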

It is possible that the nucleoprotein filament of HIV has a similar structure. Nucleocapsid proteins are usually basic (+) proteins to neutralize and facilitate the packaging of acidic (-) nucleic acid.

Icosahedral Nucleocapsids:

Electron microscopy suggests that many viruses are roughly spherical. A detailed examination shows that they are actually icosahedral. Icosahedral structures are very common among plant and animal viruses. The HIV capsid layer is thought to have an icosahedral structure. At the moment precise details of the HIV capsid structure are not known, but some general considerations are described below:

The subunits of the capsid are located around the vertices or faces of an icosahedron. An icosahedron has 20 equilateral triangular faces arranged around the surface of a sphere. It is defined by having 2-, 3- and 5-fold axes of symmetry.
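The geometry of the icosahedron can be worked out from the 20 faces alone. The sketch below is a minimal illustration, not part of the original notes: it uses Euler's formula (V - E + F = 2) to get the vertices and edges, counts the 5-, 3- and 2-fold axes, and adds up the rotations. The constant names are chosen here for illustration.

```python
FACES = 20                      # equilateral triangular faces (from the text)
EDGES = FACES * 3 // 2          # each edge is shared by two faces
VERTICES = 2 - FACES + EDGES    # Euler's formula: V - E + F = 2

# Each rotation axis passes through a pair of opposite features.
FIVEFOLD_AXES = VERTICES // 2   # through opposite vertices
THREEFOLD_AXES = FACES // 2     # through opposite face centres
TWOFOLD_AXES = EDGES // 2       # through opposite edge midpoints

# An n-fold axis contributes n - 1 non-identity rotations; add 1 for the identity.
ROTATIONS = 1 + FIVEFOLD_AXES * 4 + THREEFOLD_AXES * 2 + TWOFOLD_AXES * 1

print(f"vertices={VERTICES}, edges={EDGES}, faces={FACES}")
print(f"5-fold axes={FIVEFOLD_AXES}, 3-fold axes={THREEFOLD_AXES}, 2-fold axes={TWOFOLD_AXES}")
print(f"distinct rotations={ROTATIONS}")
```

The total of 60 rotations is the number of positions an asymmetric object can occupy while remaining strictly equivalent, which is where the figure of 60 identical subunits comes from.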

There are several reasons why viruses adopt icosahedral symmetry.
One is that triangulating a dome into 20 is the best way of producing a shell of equivalently bonded identical structures. It is the minimum free energy structure. The real situation is more complicated than this, because all known viruses have more than 20 subunits. 60 subunits can obviously be arranged symmetrically around an icosahedron. Only a very few viruses have such a small number of subunits, e.g. φX174. This may be because it is difficult to maintain an integral particle with a small number of subunits. Viruses generally fit 60 x N subunits into their capsids. N is sometimes called the triangulation number (usually written T), and values of 1, 3, 4, 7, 9, 12 and more are permitted. However, more than 60 subunits cannot be arranged in an equivalent fashion around an icosahedron.
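The permitted values listed above follow a simple rule. The notes do not state it, but the standard Caspar and Klug formula is T = h^2 + hk + k^2 for non-negative integers h and k. The sketch below, with an illustrative function name, enumerates the permitted values up to a limit and the corresponding 60 x T subunit counts.

```python
def triangulation_numbers(limit: int) -> list[int]:
    """Return the permitted T values up to `limit`, using T = h^2 + h*k + k^2."""
    values = set()
    for h in range(limit + 1):
        for k in range(limit + 1):
            t = h * h + h * k + k * k
            if 1 <= t <= limit:
                values.add(t)
    return sorted(values)


for t in triangulation_numbers(12):
    print(f"T={t}: {60 * t} subunits")
```

Values such as 2, 5 or 6 never appear because no pair (h, k) produces them.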

Consider a T=4 virus:

The minimum free energy solution is to divide each triangle into 4 further triangles and place a subunit at each corner. Each of the 20 faces then carries 12 subunits, giving 12 x 20 = 240 subunits arranged as 12 pentamers and 30 hexamers. They are obviously not all equivalent, but 180 are, and the remaining 60 make similar contacts. They are said to be quasi-equivalent.
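The T=4 arithmetic can be checked in two ways: by counting corners of the subdivided faces, and by counting pentamers and hexamers. In the sketch below the function names are illustrative, and the rule of 10 x (T - 1) hexamers is the standard Caspar and Klug count rather than something stated in these notes.

```python
FACES = 20  # triangular faces of an icosahedron


def subunits_by_subdivision(triangles_per_face: int) -> int:
    """Count subunits by placing one at each corner of every small triangle."""
    return FACES * triangles_per_face * 3


def capsomer_counts(t: int) -> tuple[int, int]:
    """Return (pentamers, hexamers) for triangulation number t."""
    pentamers = 12             # one at each vertex of the icosahedron
    hexamers = 10 * (t - 1)    # standard rule, assumed here (not from the text)
    return pentamers, hexamers


T = 4
total = subunits_by_subdivision(T)
pentamers, hexamers = capsomer_counts(T)
print(f"T={T}: {total} subunits")
print(f"{pentamers} pentamers hold {pentamers * 5} subunits")
print(f"{hexamers} hexamers hold {hexamers * 6} subunits")
```

Both counts agree on 240 subunits, split as 60 in pentamers and 180 in hexamers.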

Protein subunits are not spaced independently but cluster because this maximizes the intermolecular interactions which stabilize the particle. To illustrate this consider a particle with 180 subunits, 3 kinds of clustering are possible:

One consequence of this clustering is that bonds between subunits in a capsomer are stronger than bonds between capsomers, which means capsomers can be isolated for functional and structural studies.

Why subunit construction?

Necessity: A triplet codon has a MW of approximately 1000 and codes for an amino acid of average MW 150. So at best a nucleic acid can only code for 15% of its weight as protein. As viruses are composed of 50-90% protein by weight, the coat must contain many copies of the encoded protein(s), and subunit construction is essential.
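The necessity argument can be turned into a rough number. The sketch below uses the two molecular weights from the text and makes two simplifying assumptions that are not in the notes: the particle consists only of protein and nucleic acid, and the whole genome codes for structural protein. The function name is illustrative.

```python
from fractions import Fraction

CODON_MW = 1000        # approximate MW of a triplet codon (from the text)
AMINO_ACID_MW = 150    # average MW of the encoded amino acid (from the text)

# Weight of protein encoded per unit weight of nucleic acid (15%).
CODING_RATIO = Fraction(AMINO_ACID_MW, CODON_MW)


def min_protein_copies(protein_fraction: Fraction) -> Fraction:
    """Lower bound on the average copy number of the encoded protein.

    Assumes the particle is only protein plus nucleic acid and that the
    whole genome encodes structural protein (both simplifications).
    """
    protein_per_nucleic_acid = protein_fraction / (1 - protein_fraction)
    return protein_per_nucleic_acid / CODING_RATIO


for percent in (50, 90):
    copies = min_protein_copies(Fraction(percent, 100))
    print(f"{percent}% protein by weight: at least {float(copies):.1f} copies")
```

Even under these generous assumptions, every encoded protein has to be used several times over, and far more often in a particle that is mostly protein.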

Self Assembly: In seminal experiments in 1955, Fraenkel-Conrat and Williams showed that TMV formed spontaneously when mixtures of purified coat protein and its genomic RNA were incubated together. This means that the structure that TMV adopts is self-ordered and therefore corresponds to a free energy minimum.

Fidelity: DNA, RNA and protein synthesis are all subject to occasional error. Using a smaller protein, and hence a smaller gene, means there is less chance of an error occurring.
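The fidelity argument can be illustrated with a simple independent-error model: if each residue is wrong with probability e, a protein of n residues is error-free with probability (1 - e)^n. The sketch below is only an illustration. The error rate is a hypothetical round number chosen for the example, not a measured value from these notes, and the protein lengths are arbitrary.

```python
def error_free_probability(length: int, error_rate: float) -> float:
    """Probability that a chain of `length` residues contains no errors,
    assuming independent errors at a fixed per-residue rate."""
    return (1.0 - error_rate) ** length


ERROR_RATE = 1e-4  # hypothetical per-residue error rate, for illustration only

for length in (200, 2000, 20000):  # arbitrary example protein lengths
    p = error_free_probability(length, ERROR_RATE)
    print(f"{length:>6} residues: {p:.3f} chance of an error-free copy")
```

Under this model the chance of a perfect copy falls steadily as the chain gets longer, so a shell built from many short chains wastes far less material than one built from a single very long chain.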

Economy: The correct structure can be formed with the minimum of waste since if a subunit is synthesised or folded incorrectly then only a small unit has to be discarded.

Complexity: There are physical constraints which prevent the tight packing of, say, an octahedron or tetrahedron. Put crudely, the holes between the subunits would be too big and the particle too leaky. A small number of contacts would be insufficient for stability.

The larger the number of subunits, the more stable the virus becomes. The larger the virus particle, the bigger and more complex its genome can be.

Structure of bean pod mottle virus (BPMV):

The structures of several viruses have now been determined using X-ray crystallography. We will consider one example to illustrate the simplicity of their structures.

BPMV is an icosahedral comovirus. It has a bipartite, single-stranded RNA genome of positive sense. It is a T=3 virus like the picornaviruses. Many of these have evolved from a common ancestor and the proteins have a similar fold.

In a T=3 virus such as poliovirus there are 3 different subunits in each of 60 triangular units. BPMV is subtly different. It is composed of 60 triangular units. Each unit is composed of 3 antiparallel beta-barrel proteins.

60 copies each of coat protein of two types:

S: 22 kDa
L: 42 kDa

These are made as a polyprotein, C-B-A, from which A is cleaved off. S (A) has 1 beta-barrel structure and L (C-B) has 2. This gives 180 subunits if you take the beta-barrel as the subunit. Each domain is 180-190 aa long. The subunit structures have been determined. B and C are covalently linked in the L subunit, and their interface is stabilized by hydrophobic interactions. A helix in the A domain interacts with a helix in B. The quaternary structure is clearly spherical due to polypeptide protrusions. It forms a wedge 50 Å long, 17 Å wide at the narrow end and 30 Å wide at the large end. In the figure, the two shades of green are the large subunit and the small subunit is blue.
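The subunit count for BPMV follows directly from the figures above. The sketch below simply tallies beta-barrel domains and coat protein mass per particle from the copy numbers and molecular weights given in the notes. The dictionary names are illustrative.

```python
COPIES_PER_PARTICLE = 60             # copies of each coat protein (from the text)

BETA_BARRELS = {"S": 1, "L": 2}      # beta-barrel domains per protein
MASS_KDA = {"S": 22, "L": 42}        # molecular weight of each protein, in kDa

total_barrels = COPIES_PER_PARTICLE * sum(BETA_BARRELS.values())
total_mass_kda = COPIES_PER_PARTICLE * sum(MASS_KDA.values())

print(f"beta-barrel domains per particle: {total_barrels}")
print(f"coat protein mass per particle: {total_mass_kda} kDa")
```

Counting beta-barrels rather than polypeptide chains gives the 180 units expected of a T=3 shell, even though only 120 chains are present.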

It is also possible to map some of the RNA chain in this structure, i.e. about 7 bases. It looks like A-form RNA, and its magnitude is such that this must occur some 60 times in the structure. This suggests that the RNA itself has icosahedral symmetry. Contacts between the RNA and protein are visible in a pocket between the C and B domains. The interactions are electrostatic and van der Waals in nature.


The structure of HIV is complex. Its overall shape is thought to be icosahedral. The glycoprotein spikes show a skewed icosahedral symmetry which is thought to reflect the icosahedral structure of the matrix. The triangulation number is 71(!). The capsid is also icosahedral and may contain 1890 subunits with a triangulation number of 63(!). The nucleocapsid may adopt a helical nucleofilament structure or may also be icosahedral (not known).

What have we learned from studying virus structures?

Knowledge of the biomolecular interactions which occur in these large structures is of fundamental importance. It may now be possible to design small compounds which bind to the surface of a virus and prevent it from entering a cell. Most animal viruses initiate infection and entry into host cells by attaching to receptors on the host cell membrane. Clearly the viral receptor attachment site must remain conserved. However, the immune system may react to the surface of a virus and produce neutralizing antibodies. This can result in the continual changing of the antigenic surface of a virus that chronically infects a host. There is an obvious conflict here. It is now known that the surfaces of a number of viruses, e.g. polio, have deep clefts, canyons or pits. Residues in these canyons are often much more conserved than surrounding surface residues. It is therefore thought that these clefts may constitute the receptor binding site and that residues at the bottom of the canyon are inaccessible to antibodies and therefore under no pressure to change. Some evidence supports this hypothesis:

© Dr Shaun Heaphy.