3.0: Prelude to Biological Macromolecules - Biology

Food provides the body with the nutrients it needs to survive. Many of these critical nutrients are biological macromolecules, large molecules necessary for life. These macromolecules (polymers) are built from different combinations of smaller organic molecules (monomers). What specific types of biological macromolecules do living things require? How are these molecules formed? What functions do they serve? In this chapter, these questions will be explored.

Dendritic Encapsulation of Function: Applying Nature's Site Isolation Principle from Biomimetics to Materials Science

The convergence of our understanding of structure–property relationships for selected biological macromolecules and our increased ability to prepare large synthetic macromolecules with a structural precision that approaches that of proteins has spawned a new area of research where chemistry and materials science join with biology. While evolution has enabled nature to perfect processes involving energy transfer or catalysis by incorporating functions such as self-replication and repair, synthetic macromolecules still depend on our synthetic skills and abilities to mesh structure and function in our designs. Clearly, we can take advantage of our understanding of natural systems to mimic the structural features that lead to optimized function. For example, numerous biological systems make use of the concept of site isolation whereby an active center or catalytic site is encapsulated, frequently within a protein, to afford properties that would not be encountered in the bulk state. The ability of the dendritic shell to encapsulate functional core moieties and to create specific site-isolated nanoenvironments, and thereby affect molecular properties, has been explored. By utilizing the distinct properties of the dendrimer architecture, active sites that have either photophysical, photochemical, electrochemical, or catalytic functions have been placed at the core. Applying the general concept of site isolation to problems in materials research is likely to prove extremely fruitful in the long term, with short-term applications in areas such as the construction of improved optoelectronic devices. This review focuses on the evolution of a natural design principle that contributes to bridging the gap between biology and materials science. The recent progress in the synthesis of dendrimer-encapsulated molecules and their study by a variety of techniques is discussed. These investigations have implications that range from the preliminary design of artificial enzymes, catalysts, or light-harvesting systems to the construction of insulated molecular wires, light-emitting diodes, and fiber optics.

Below are the courses available from the BIOPH subject code.

BIOPH 201 - Introduction to Biophysics

Physical principles important to the operation of biological systems. Biological applications of free energy, entropy, random walks, and diffusion; dynamics at low Reynolds number; cooperativity and 2-state systems; structural self-assembly; kinetic modeling; molecular motors and enzymes; membranes and potentials; genetic networks; sequences and evolution. Prerequisites: MATH 100/114/117/134/144, PHYS 124/144 or EN PH131. PHYS 126/130/146 recommended.

BIOPH 401 - Advanced Biophysics

Physical properties of biological macromolecules and macromolecular assemblies; biopolymer folding; ligand binding and allostery; lipid membranes; cellular electricity and nerve conduction; models of molecular motors; stochasticity in biology; numerical and experimental techniques in biophysics; synthetic biology. Prerequisites: MATH 209/215/317, MATH 201/334/336, BIOPH 201, PHYS 234, PHYS 230/281, PHYS 310.

BIOPH 501 - Advanced Biophysics

Physical properties of biological macromolecules and macromolecular assemblies; biopolymer folding; ligand binding and allostery; lipid membranes; cellular electricity and nerve conduction; models of molecular motors; stochasticity in biology; numerical and experimental techniques in biophysics; synthetic biology. Prerequisites: MATH 209/215/317, MATH 201/334/336, BIOPH 201, PHYS 234, PHYS 230/281, PHYS 310.

For Undergraduate and Graduate Students

ZOOL 4155 Vertebrate Paleontology Techniques (0-3)
ZOOL 4157 Advanced Vertebrate Paleontology Techniques (0-3)
ZOOL 4181 Vertebrate Physiology Methods (0-3)
ZOOL 4354 Paleozoic and Mesozoic Vertebrate Paleontology (3-0)
ZOOL 4356 Cenozoic Vertebrate Paleontology (3-0)
ZOOL 4380 Vertebrate Physiology (3-0)
ZOOL 4384 Neurobiology (3-0)
ZOOL 4476 Fish, Amphibians, and Reptiles (3-3)
ZOOL 4478 Birds and Mammals (3-3)

For Graduate Students Only

Introduction: Louis Pasteur and the discovery of molecular chirality

In 1848, a 25-year-old chemist named Louis Pasteur made a startling - and some thought brash - claim to the scientific community. Pasteur was inexperienced, to say the least: he had only earned his doctorate the previous year, and had just started his first job as an assistant to a professor at the École normale supérieure, a university in Paris. Jean-Baptiste Biot, a highly respected physicist who had already made major contributions to scientific fields as diverse as meteorites, magnetism, and optics, was intrigued but unconvinced by Pasteur's claim. He invited the young man to come to his laboratory and reproduce his experiments.

Decades earlier, Biot had discovered that aqueous solutions of some biologically-derived substances, such as tartaric acid, quinine, morphine, and various sugars, were optically active: that is, the plane of polarized light would rotate in either a positive (clockwise, or right-handed) or negative (counter-clockwise, or left-handed) direction when passed through the solutions. Nobody understood the source of this optical property. One of the biological substances known to be optically active was a salt of tartaric acid, a compound found in abundance in grapes and a major by-product of the wine-making industry.

The compound was dextrorotatory in solution; in other words, it rotated plane-polarized light in the positive (right-handed, or clockwise) direction. Curiously, though, chemists had also found that another form of processed tartaric acid was optically inactive, despite the fact that it appeared to be identical to the optically active acid in every other respect. The optically inactive compound was called 'acide racemique', from the Latin racemus, meaning 'bunch of grapes'.

Louis Pasteur's claims had to do with experiments he said he had done with the 'racemic' acid. Jean-Baptiste Biot summoned Pasteur to his laboratory, and presented him with a sample of racemic acid which he himself had already confirmed was optically inactive. With Biot watching over his shoulder, and using Biot's reagents, Pasteur prepared the salt form of the acid, dissolved it in water, and left the aqueous solution in an uncovered flask to allow crystals to slowly form as the water evaporated.

Biot again summoned Pasteur to the lab a few days later when the crystallization was complete. Pasteur placed the crystals under a microscope, and began to painstakingly examine their shape, just as he had done in his original experiments. He had recognized that the crystals, which had a regular shape, were asymmetric: in other words, they could not be superimposed on their mirror image. Scientists referred to asymmetric crystals and other asymmetric objects as being 'chiral', from the Greek word for 'hand'. Your hands are chiral objects, because although your right hand and your left hand are mirror images of one another, they cannot be superimposed. That is why you cannot fit your right hand in a left-handed glove.

More importantly, Pasteur had claimed that the chiral crystals he was seeing under the lens of his microscope were of two different types, and the two types were mirror images of each other: about half were what he termed 'right-handed' and half were 'left-handed'. He carefully separated the right and left-handed crystals from each other, and presented the two samples to Biot. The eminent scientist then took what Pasteur told him were the left-handed crystals, dissolved them in water, and put the aqueous solution in a polarimeter, an instrument that measures optical rotation. Biot knew that the processed tartaric acid he had provided Pasteur had been optically inactive. He also knew that unprocessed tartaric acid from grapes had right-handed optical activity, whereas left-handed tartaric acid was unheard of. Before his eyes, however, he now saw that the solution was rotating light to the left. He turned to his young colleague and exclaimed, "Mon cher enfant, j'ai tant aimé les sciences dans ma vie que cela me fait battre le cœur!" (My dear child, I have loved science so much during my life that this makes my heart pound!)

Biot had good reason to be so profoundly excited. Pasteur had just conclusively demonstrated, for the first time, the concept of molecular chirality: molecules themselves - not just macroscopic objects like crystals - could exhibit chirality, and could be separated into distinct right-handed and left-handed 'stereoisomers'. Tying together ideas from physics, chemistry, and biology, he had shown that nature could be chiral at the molecular level, and in doing so he had introduced to the world a new subfield which came to be known as 'stereochemistry'.

About ten years after his demonstration of molecular chirality, Pasteur went on to make another observation with profound implications for biological chemistry. It was already well known that 'natural' tartaric acid (the right-handed kind from grapes) could be fermented by bacteria. Pasteur discovered that the bacteria were selective with regard to the chirality of tartaric acid: no fermentation occurred when the bacteria were provided with pure left-handed acid, and when provided with racemic acid they specifically fermented the right-handed component, leaving the left-handed acid behind.

Pasteur was not aware, at the time of the discoveries described here, of the details of the structural features of tartaric acid at the molecular level that made the acid chiral, although he made some predictions concerning the bonding patterns of carbon which turned out to be remarkably accurate. In the more than 150 years since Pasteur's initial tartaric acid work, we have greatly expanded our understanding of molecular chirality, and it is this knowledge that makes up the core of this chapter. Put simply, stereochemistry is the study of how bonds are oriented in three-dimensional space. It is difficult to overstate the importance of stereochemistry in nature, and in the fields of biology and medicine in particular. As Pasteur so convincingly demonstrated, life itself is chiral: living things recognize different stereoisomers of organic compounds and process them accordingly.

BIOPH 501 - Advanced Biophysics

★ 3 (fi 6)(SECOND, 3-0-0)

Physical properties of biological macromolecules and macromolecular assemblies; biopolymer folding; ligand binding and allostery; lipid membranes; cellular electricity and nerve conduction; models of molecular motors; stochasticity in biology; numerical and experimental techniques in biophysics; synthetic biology. Prerequisites: MATH 209/215/317, MATH 201/334/336, BIOPH 201, PHYS 234, PHYS 230/281, PHYS 310.

EN PH 131 - Mechanics

★ 3 (fi 6)(EITHER, 3-1S-3/2)

Kinematics and dynamics of particles; gravitation; work and energy; linear momentum; angular momentum; systems of particles; introduction to dynamics of rigid bodies. Prerequisites: MATH 100 or 117, and ENGG 130. Corequisite: MATH 101 or 118. Restricted to Engineering students. Other students who take this course will receive *3.0.


Version 2.8.4 © ATSAS team 1995-2020
Version 3.0 © ATSAS team 2014-2020

Written by D. Svergun¹, C. Barberato, M. Malfois, V. Volkov², P. Konarev¹ & M. Petoukhov¹
¹ European Molecular Biology Laboratory, Hamburg Outstation
Notkestr. 85, Geb. 25a
22607 Hamburg
² Institute of Crystallography
Russian Academy of Sciences
Leninsky pr. 59
119333 Moscow

CRYSOL is a program for evaluating the solution scattering from macromolecules with known atomic structure and fitting it to experimental scattering curves from small angle X-ray scattering (SAXS). As an input one can use a PDB file with an X-ray or NMR structure of a protein or a protein-DNA(RNA) complex.

If you use CRYSOL 2.x in your work, please cite:
Svergun D.I., Barberato C. and Koch M.H.J. (1995) CRYSOL - a Program to Evaluate X-ray Solution Scattering of Biological Macromolecules from Atomic Coordinates. J. Appl. Cryst. 28, 768-773.

CRYSOL 3.0 evaluates the hydration shell and its scattering in a different way: instead of using an envelope function, CRYSOL 3.0 "incrusts" the surfaces and interiors (where possible) of the atomic structure with dummy water beads. This representation of the shell is more suitable for macromolecules with complex non-globular shapes and/or with cavities.


A molecule of high relative molecular mass, the structure of which essentially
comprises the multiple repetition of units derived, actually or conceptually, from
molecules of low relative molecular mass.

1. In many cases, especially for synthetic polymers, a molecule can be regarded
as having a high relative molecular mass if the addition or removal of one or a
few of the units has a negligible effect on the molecular properties. This statement
fails in the case of certain macromolecules for which the properties may be
critically dependent on fine details of the molecular structure.

2. If a part or the whole of the molecule fits into this definition, it may be described
as either macromolecular or polymeric, or by polymer used adjectivally. [4]

The term macromolecule (macro- + molecule) was coined by Nobel laureate Hermann Staudinger in the 1920s, although his first relevant publication on this field only mentions high molecular compounds (in excess of 1,000 atoms). [5] At that time the term polymer, as introduced by Berzelius in 1832, had a different meaning from that of today: it simply was another form of isomerism, for example with benzene and acetylene, and had little to do with size. [6]

Usage of the term to describe large molecules varies among the disciplines. For example, while biology refers to macromolecules as the four large molecules comprising living things, in chemistry, the term may refer to aggregates of two or more molecules held together by intermolecular forces rather than covalent bonds but which do not readily dissociate. [7]

According to the standard IUPAC definition, the term macromolecule as used in polymer science refers only to a single molecule. For example, a single polymeric molecule is appropriately described as a "macromolecule" or "polymer molecule" rather than a "polymer," which suggests a substance composed of macromolecules. [8]

Because of their size, macromolecules are not conveniently described in terms of stoichiometry alone. The structure of simple macromolecules, such as homopolymers, may be described in terms of the individual monomer subunit and total molecular mass. Complicated biomacromolecules, on the other hand, require multi-faceted structural description such as the hierarchy of structures used to describe proteins. In British English, the word "macromolecule" tends to be called "high polymer".

Macromolecules often have unusual physical properties that do not occur for smaller molecules.

Another common macromolecular property that does not characterize smaller molecules is their relative insolubility in water and similar solvents, instead forming colloids. Many require salts or particular ions to dissolve in water. Similarly, many proteins will denature if the solute concentration of their solution is too high or too low.

High concentrations of macromolecules in a solution can alter the rates and equilibrium constants of the reactions of other macromolecules, through an effect known as macromolecular crowding. [9] This comes from macromolecules excluding other molecules from a large part of the volume of the solution, thereby increasing the effective concentrations of these molecules.
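The excluded-volume argument can be illustrated with a deliberately crude calculation. The sketch below assumes that a fixed fraction of the solution volume is simply unavailable to a solute, so its effective concentration is the nominal concentration divided by the accessible fraction. The function name and the example numbers are illustrative assumptions, not values from the text, and real crowding effects depend on molecular size and shape in ways this model ignores.

```python
def effective_concentration(nominal_conc: float, excluded_fraction: float) -> float:
    """Concentration of a solute within the volume still accessible to it.

    nominal_conc      -- concentration computed over the whole solution volume
    excluded_fraction -- assumed fraction of the volume occupied by crowders
                         (0 <= excluded_fraction < 1)
    """
    if not 0.0 <= excluded_fraction < 1.0:
        raise ValueError("excluded_fraction must be in the range [0, 1)")
    # Same number of molecules, smaller accessible volume.
    return nominal_conc / (1.0 - excluded_fraction)


if __name__ == "__main__":
    # Hypothetical case: 30% of the volume is excluded by other macromolecules.
    print(effective_concentration(1.0, 0.3))
```

With 30% of the volume excluded, the effective concentration in this toy model rises by roughly 43%.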

All living organisms are dependent on three essential biopolymers for their biological functions: DNA, RNA and proteins. [10] Each of these molecules is required for life since each plays a distinct, indispensable role in the cell. [11] The simple summary is that DNA makes RNA, and then RNA makes proteins.

DNA, RNA, and proteins all consist of a repeating structure of related building blocks (nucleotides in the case of DNA and RNA, amino acids in the case of proteins). In general, they are all unbranched polymers, and so can be represented in the form of a string. Indeed, they can be viewed as a string of beads, with each bead representing a single nucleotide or amino acid monomer linked together through covalent chemical bonds into a very long chain.

In most cases, the monomers within the chain have a strong propensity to interact with other amino acids or nucleotides. In DNA and RNA, this can take the form of Watson-Crick base pairs (G-C and A-T or A-U), although many more complicated interactions can and do occur.
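Because these polymers are unbranched, a sequence can be held in an ordinary string, and the Watson-Crick pairing rules become a simple lookup. The following is a minimal sketch under that assumption; the function and dictionary names are illustrative, and only the canonical G-C, A-T and A-U pairs mentioned above are handled.

```python
# Canonical Watson-Crick partners for each base.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the strand that would pair with `seq`, read in its own 5'->3' direction."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    # The partner strand runs antiparallel, so walk the input backwards.
    return "".join(pairs[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))          # DNA example
    print(reverse_complement("AUGC", rna=True))   # RNA example
```

Non-canonical interactions, which the text notes do occur, would need a richer representation than a one-to-one lookup table.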

Structural features

| | DNA | RNA | Proteins |
|---|---|---|---|
| Encodes genetic information | Yes | Yes | No |
| Catalyzes biological reactions | No | Yes | Yes |
| Building blocks (type) | Nucleotides | Nucleotides | Amino acids |
| Building blocks (number) | 4 | 4 | 20 |
| Strandedness | Double | Single | Single |
| Structure | Double helix | Complex | Complex |
| Stability to degradation | High | Variable | Variable |
| Repair systems | Yes | No | No |

Because of the double-stranded nature of DNA, essentially all of the nucleotides take the form of Watson-Crick base pairs between nucleotides on the two complementary strands of the double-helix.

In contrast, both RNA and proteins are normally single-stranded. Therefore, they are not constrained by the regular geometry of the DNA double helix, and so fold into complex three-dimensional shapes dependent on their sequence. These different shapes are responsible for many of the common properties of RNA and proteins, including the formation of specific binding pockets, and the ability to catalyse biochemical reactions.

DNA is optimised for encoding information

DNA is an information storage macromolecule that encodes the complete set of instructions (the genome) that are required to assemble, maintain, and reproduce every living organism. [12]

DNA and RNA are both capable of encoding genetic information, because there are biochemical mechanisms which read the information coded within a DNA or RNA sequence and use it to generate a specified protein. On the other hand, the sequence information of a protein molecule is not used by cells to functionally encode genetic information. [1] : 5

DNA has three primary attributes that allow it to be far better than RNA at encoding genetic information. First, it is normally double-stranded, so that there are a minimum of two copies of the information encoding each gene in every cell. Second, DNA has a much greater stability against breakdown than does RNA, an attribute primarily associated with the absence of the 2'-hydroxyl group within every nucleotide of DNA. Third, highly sophisticated DNA surveillance and repair systems are present which monitor damage to the DNA and repair the sequence when necessary. Analogous systems have not evolved for repairing damaged RNA molecules. Consequently, chromosomes can contain many billions of atoms, arranged in a specific chemical structure.

Proteins are optimised for catalysis

Proteins are functional macromolecules responsible for catalysing the biochemical reactions that sustain life. [1] : 3 Proteins carry out all functions of an organism, for example photosynthesis, neural function, vision, and movement. [13]

The single-stranded nature of protein molecules, together with their composition of 20 or more different amino acid building blocks, allows them to fold into a vast number of different three-dimensional shapes, while providing binding pockets through which they can specifically interact with all manner of molecules. In addition, the chemical diversity of the different amino acids, together with different chemical environments afforded by local 3D structure, enables many proteins to act as enzymes, catalyzing a wide range of specific biochemical transformations within cells. In addition, proteins have evolved the ability to bind a wide range of cofactors and coenzymes, smaller molecules that can endow the protein with specific activities beyond those associated with the polypeptide chain alone.

RNA is multifunctional

RNA is multifunctional; its primary function is to encode proteins, according to the instructions within a cell’s DNA. [1] : 5 RNA molecules also control and regulate many aspects of protein synthesis in eukaryotes.

RNA encodes genetic information that can be translated into the amino acid sequence of proteins, as evidenced by the messenger RNA molecules present within every cell, and the RNA genomes of a large number of viruses. The single-stranded nature of RNA, together with its tendency for rapid breakdown and a lack of repair systems, means that RNA is not as well suited for the long-term storage of genetic information as is DNA.
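The flow from DNA to messenger RNA to protein can be sketched with string operations. The example below is a simplified illustration, not a complete implementation: the codon table is a small, hand-picked subset of the 64 codons, the function names are assumptions made for this sketch, and the input is taken to be the coding strand of the DNA, so transcription reduces to replacing T with U.

```python
# Partial codon table: a handful of the 64 codons, chosen for illustration only.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,  # None marks a stop codon
}


def transcribe(coding_dna: str) -> str:
    """mRNA carries the coding-strand sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read codons from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this partial table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon reached
            break
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    mrna = transcribe("ATGTTTGGCTAA")
    print(mrna, translate(mrna))
```

A full translation routine would need the complete genetic code and would have to handle alternative start sites and reading frames.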

In addition, RNA is a single-stranded polymer that can, like proteins, fold into a very large number of three-dimensional structures. Some of these structures provide binding sites for other molecules and chemically-active centers that can catalyze specific chemical reactions on those bound molecules. The limited number of different building blocks of RNA (4 nucleotides vs >20 amino acids in proteins), together with their lack of chemical diversity, results in catalytic RNA (ribozymes) being generally less effective catalysts than proteins for most biological reactions.

Introduction to Biological Assemblies and the PDB Archive

When exploring Structure Summary pages on the RCSB PDB website, you will notice images and coordinate files for the "Biological Assembly" and the "Asymmetric Unit". In many PDB entries, these are the same. However, for some entries (mostly those solved by X-ray crystallography), you may notice a difference between the asymmetric unit and the biological assembly. If you have wondered whether the coordinates for the given structure represent the biologically-relevant assembly, read on to find out more about the meaning of these terms and how the corresponding data are archived in the files.

The primary coordinate file of a crystal structure typically contains just one crystal asymmetric unit and may or may not be the same as the biological assembly. This introduction describes the terms asymmetric unit and biological assembly, lists where information about these can be found in various file formats (PDB and mmCIF), and explains how biological assembly files in the PDB archive are derived. Since the PDBML format is derived from the mmCIF format file, a separate discussion of this format is not included here.

Asymmetric Unit

The asymmetric unit is the smallest portion of a crystal structure to which symmetry operations can be applied in order to generate the complete unit cell (the crystal repeating unit). Symmetry operations most common to crystals of biological macromolecules are rotations, translations and screw axes (combinations of rotation and translation).

Application of crystallographic symmetry operations to an asymmetric unit yields one unit cell that when translated in three dimensions makes up the entire crystal.

Below is a simple example. The asymmetric unit (green upward arrow) is rotated 180 degrees about a two-fold crystallographic symmetry axis (black oval) to produce a second copy (purple downward arrow). Together the two arrows comprise the unit cell. The unit cell is then translationally repeated in three directions to make a 3-dimensional crystal.
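The arrow example can also be written out numerically. The sketch below assumes the two-fold axis lies along z, so a 180 degree rotation maps (x, y, z) to (-x, -y, z); the coordinates, the cell translation, and the function names are made-up values for illustration and do not come from any PDB entry.

```python
# Rotation matrix for a 180 degree turn about the z axis (assumed axis direction).
TWO_FOLD_Z = ((-1, 0, 0),
              (0, -1, 0),
              (0, 0, 1))


def apply_rotation(matrix, point):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(matrix[r][c] * point[c] for c in range(3)) for r in range(3))


def build_unit_cell(asym_unit):
    """Asymmetric unit plus its symmetry-related copy."""
    return list(asym_unit) + [apply_rotation(TWO_FOLD_Z, p) for p in asym_unit]


def translate_cell(cell, shift):
    """Copy of the unit cell displaced by one lattice translation."""
    return [tuple(p[i] + shift[i] for i in range(3)) for p in cell]


if __name__ == "__main__":
    asym_unit = [(1.0, 2.0, 3.0)]            # hypothetical "green arrow" atom
    cell = build_unit_cell(asym_unit)         # adds the "purple arrow" copy
    neighbour = translate_cell(cell, (10.0, 0.0, 0.0))  # hypothetical cell edge
    print(cell)
    print(neighbour)
```

Repeating the translation step along all three cell axes is what builds up the full crystal from a single unit cell.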

The asymmetric unit contains the unique part of a crystal structure. It is used by the crystallographer to refine the coordinates of the structure against the experimental data and may not necessarily represent a whole biologically functional assembly.

A crystal asymmetric unit may contain:

  • one biological assembly
  • a portion of a biological assembly
  • multiple biological assemblies

The content of the asymmetric unit depends on the crystallized molecule's position(s) and its conformations within the unit cell. Depending on the crystallization conditions and local packing two distinct scenarios may occur:

  • Copies of the macromolecule or complex within a crystal unit cell have identical conformations and occupy symmetry-related positions. As a result, the biological assembly may either be composed of one copy of the macromolecule/complex or it may be composed of two or more symmetry related molecules/complexes coming together to form a larger assembly.
  • Copies of the macromolecule or complex take on slightly different conformations and occupy unique positions in the crystal asymmetric unit. As a result, each of the different positions of the macromolecule/complex may correspond to structurally similar but not identical biological assemblies.

Hemoglobin, a molecule with four protein chains (two alpha-beta dimers), provides good examples from PDB entries for each of these cases:

  • Asymmetric unit with one biological assembly: Entry 2hhb contains one hemoglobin molecule (4 chains) in the asymmetric unit.
  • Asymmetric unit with a portion of a biological assembly: Entry 1out contains half a hemoglobin molecule (2 chains) in the asymmetric unit. A crystallographic two-fold axis generates the other 2 chains of the hemoglobin molecule.
  • Asymmetric unit with multiple biological assemblies: Entry 1hv4 contains two hemoglobin molecules (8 chains) in the asymmetric unit.

Biological Assembly

The biological assembly (also sometimes referred to as the biological unit) is the macromolecular assembly that has either been shown to be or is believed to be the functional form of the molecule. For example, the functional form of hemoglobin has four chains.

Depending on the particular crystal structure, symmetry operations consisting of rotations, translations or their combinations may need to be performed in order to obtain the complete biological assembly. Alternatively, a subset of the deposited coordinates may need to be selected to represent the biological assembly. Thus, a biological assembly may be built from:

  • one copy of the asymmetric unit
  • multiple copies of the asymmetric unit
  • a portion of the asymmetric unit

Hemoglobin is used again to demonstrate each of these cases:

  • Biological assembly composed of one copy of the asymmetric unit: In entry 2hhb, the biological assembly is equivalent to the asymmetric unit. No operations are necessary.
  • Biological assembly composed of multiple copies of the asymmetric unit: In entry 1out, the biological assembly includes two asymmetric units. Application of a crystallographic symmetry operation (a 180° rotation around a crystallographic two-fold axis) produces the complete biological assembly.
  • Multiple biological assemblies in the asymmetric unit: In entry 1hv4, the biological assembly is one-half of the asymmetric unit. The entry contains two structurally similar, but not entirely identical, copies of the biological assembly within the crystal asymmetric unit.

A biological assembly is not always a multi-chain grouping.

For example, the functional unit of dihydrofolate reductase (shown here from entry 7dfr) is a monomer and the biological assembly also contains only one chain.

A molecule may occasionally appear to be multimeric within a crystal based on crystal packing. However, there may be no evidence or biological relevance in support of a multimeric state in solution. When the entry is processed, all probable assemblies are computed based on the buried surface area and interaction energies. These predicted assemblies may or may not coincide with what the author considers to be the biologically relevant assembly for the molecule. Each biological assembly reported in the entry includes a remark to explain whether it is "author provided", "software determined" or both.

For example, the T4 lysozyme structure presented in entry 3fad has a single chain in the asymmetric unit. Normally, lysozyme functions as a monomer. The "author provided" and also the "software determined" biological assembly for this entry is a monomer. Based on crystal packing, buried surface area and interaction energies, the software (PISA¹) predicts that this specific mutant/crystal form of T4 lysozyme may form a dimer. The assemblies defined for PDB entry 3fad are shown below:

  • Asymmetric unit (monomer): The asymmetric unit is a monomer. These are the deposited coordinates.
  • Author & Software Determined Biological Assembly (monomer): The "author provided" and "software determined" biological assemblies are both monomers.
  • Software Determined Biological Assembly (dimer): The software, PISA, predicts that this molecule may also form a dimer. Hence the second biological assembly is only "software determined".

In the web file download options, various versions of the biological assembly files are marked as (A) for author provided and (S) for software determined.

Entries for viral capsid crystal structures often contain only part of the crystal asymmetric unit. These entries require non-crystallographic symmetry operators to be applied to the deposited coordinates in order to generate the crystal asymmetric unit.

Icosahedral virus capsids have a complex symmetry with 60 equivalent positions generated by 5-fold, 3-fold, and 2-fold rotation operations that intersect at a single central point. The deposited coordinates for an icosahedral virus crystal structure most often consist of the unique chain(s) for the icosahedral asymmetric unit and a set of non-crystallographic symmetry operators to generate the crystal asymmetric unit. Additional crystallographic symmetry operators may be needed to generate the biological assembly and/or the crystallographic unit cell. The various assemblies for an icosahedral virus crystal structure are illustrated for the case of PDB entry 1qqp below:

  • Icosahedral asymmetric unit: The deposited coordinates represent 1 icosahedral asymmetric unit. This unit is represented by ribbons in all views.
  • Crystal asymmetric unit: The crystal asymmetric unit is pentameric.
  • Biological Assembly: The biological assembly is an icosahedron (as shown above).
  • Crystallographic unit cell: The complete crystal unit cell contains 2 icosahedral virus particles.

In addition to crystal structures of virus capsids, the PDB archive holds virus structures determined by electron microscopy, fiber diffraction and solid state NMR. In all cases of assemblies with regular point or helical symmetry, the PDB entry includes the coordinates of the repeating unit and the appropriate crystallographic and/or non-crystallographic symmetry operators required to generate the biological assembly.

For example, in the fiber diffraction structure of filamentous bacteriophage PF1, in entry 1ql2, the asymmetric unit contains 3 helices while the biological assembly is a helical virus, generated by applying matrices that represent the helical rotation and translation.

Biological Assembly Description in mmCIF and PDB Format Files

Instructions for Generating Biological Assemblies in mmCIF Format Files

In mmCIF format files, details about the structural elements that form each biological assembly are found in the pdbx_struct_assembly, pdbx_struct_assembly_gen and pdbx_struct_oper_list categories. The first two categories describe the generation of each biological assembly for the structure and present details about it, while the third lists the transformations required for generating the biological assembly. The category pdbx_struct_assembly_gen links the transformations in pdbx_struct_oper_list with the chains to which they apply (note that the chain identifiers are the asym_ids used throughout the mmCIF file). Any remarks from the authors that relate specifically to the biological assembly are stored in the struct_biol category.

A Simple Example - Entry 3c70

In the pdbx_struct_oper_list category, the 1_555 notation is crystallographic shorthand that describes a particular symmetry operator (the number before the underscore) and any required translation (the three digits following the underscore). Symmetry operators are defined by the space group, and the translations are given along the three unit cell axes (a, b, and c), where 5 indicates no translation and higher or lower digits signify the number of unit cell translations in the positive or negative direction. For example, 4_565 indicates the use of symmetry operator 4 followed by a translation of one unit cell in the positive b direction.
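The notation can be decoded mechanically. The helper below is a hypothetical sketch, not part of any mmCIF library; it assumes the common form in which each translation is a single digit, so shifts are limited to the range -5 to +4 unit cells.

```python
def parse_symmetry_code(code):
    """Split a code such as '4_565' into (operator number, (ta, tb, tc)).

    The translations are returned in whole unit cells along a, b and c,
    with 5 in the code meaning no translation.
    """
    operator, _, shifts = code.partition("_")
    if not operator.isdigit() or len(shifts) != 3 or not shifts.isdigit():
        raise ValueError(f"unrecognized symmetry code: {code!r}")
    translation = tuple(int(digit) - 5 for digit in shifts)
    return int(operator), translation


for example in ("1_555", "4_565"):
    print(example, parse_symmetry_code(example))
```

For 1_555 the function returns operator 1 with no translation, and for 4_565 it returns operator 4 with a translation of one unit cell along b, matching the description above.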

Example of a Viral Capsid -- Entry 2bfu

In the case of viruses and other complex assemblies with non-crystallographic symmetry, the biological assembly is more complex and may also be composed of many sub-assemblies. The data items in pdbx_struct_assembly list all the possible sub-assemblies, while those in pdbx_struct_assembly_gen describe how these assemblies are generated. The pdbx_struct_oper_list category gives a list of matrices (both crystallographic and non-crystallographic operators) required to create the various biological assemblies from the given coordinate file. This list also includes two further matrices: "P", which transforms the deposited coordinates to a standard point frame, and "X0", which moves the deposited coordinates into the crystal frame (reference 2). Thus, the deposited coordinates may be transferred to either the standard or crystal frames using these matrices.

The data category _pdbx_struct_oper_list is used for all viruses and holds the matrices for the BIOMT records that appear in REMARK 350 of the PDB format file. In cases where the assembly definition listed in struct_oper_list requires sequential multiplication of matrices (for example, entry 1m4x), the pdbx_struct_oper provides the final list of matrices that are applied to the deposited coordinates. In all data blocks shown below, matrices 5-58 were edited out for brevity. In addition to these categories, non-crystallographic symmetry (NCS) operators are listed in the _struct_ncs_oper category.
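Sequential multiplication means that two transformations are combined into a single matrix before being applied to the coordinates. The sketch below illustrates the idea with 4x4 homogeneous matrices; the function names and the two sample operations are hypothetical, and the multiplication order used by a particular entry should be taken from that entry rather than from this example.

```python
def to_homogeneous(rotation, translation):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 matrix."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def compose(first, second):
    """Return the single matrix that applies `first`, then `second`."""
    return [[sum(second[i][k] * first[k][j] for k in range(4))
             for j in range(4)] for i in range(4)]


def transform(matrix, point):
    """Apply a 4x4 homogeneous matrix to an (x, y, z) point."""
    v = [point[0], point[1], point[2], 1.0]
    return [sum(matrix[i][j] * v[j] for j in range(4)) for i in range(3)]


# Two illustrative operations: a 90 degree rotation about z and a shift along x
rot90z = to_homogeneous([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                        [0.0, 0.0, 0.0])
shift_x = to_homogeneous([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                         [1.0, 0.0, 0.0])

point = [1.0, 0.0, 0.0]
rotate_then_shift = transform(compose(rot90z, shift_x), point)
shift_then_rotate = transform(compose(shift_x, rot90z), point)
print(rotate_then_shift, shift_then_rotate)
```

The two results differ, which is why the order of the matrices matters whenever an assembly is defined as a product of operators.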

Please see the mmCIF dictionary for additional details and further information on the mmCIF format.

Instructions for Generating Biological Assemblies in PDB Format Files

In PDB format files, information about the biological assembly is given in REMARKs 300 and 350. REMARK 300 provides a free text remark regarding the biological assembly and may include specific comments provided by the author. REMARK 350, on the other hand, presents all transformations (rotational and translational), both crystallographic and non-crystallographic, that are needed to generate the biological assembly. In addition to transformation information provided by the author, descriptions of potential assemblies that can be computationally determined are also provided when available. Author-provided and software-determined biological assemblies are marked appropriately.
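The BIOMT rows of REMARK 350 can be collected into rotation matrices and translation vectors with a few lines of code. This is a minimal sketch, assuming the rows are whitespace-separated as in the sample text below; the sample values are illustrative and are not copied from any entry, and `parse_biomt` is a hypothetical helper rather than a function from an existing library.

```python
SAMPLE = """\
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000
REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000
REMARK 350   BIOMT1   2 -1.000000  0.000000  0.000000       50.00000
REMARK 350   BIOMT2   2  0.000000 -1.000000  0.000000        0.00000
REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        0.00000
""".splitlines()


def parse_biomt(lines):
    """Collect BIOMT rows into {serial: (3x3 rotation, 3-vector translation)}."""
    operations = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 8 or fields[:2] != ["REMARK", "350"]:
            continue
        if not fields[2].startswith("BIOMT"):
            continue  # e.g. the "APPLY THE FOLLOWING TO CHAINS" line
        row = int(fields[2][5]) - 1      # BIOMT1/2/3 -> matrix row 0/1/2
        serial = int(fields[3])          # operation number
        values = [float(v) for v in fields[4:]]
        rotation, translation = operations.setdefault(
            serial, ([[0.0] * 3 for _ in range(3)], [0.0] * 3))
        rotation[row] = values[:3]
        translation[row] = values[3]
    return operations


operations = parse_biomt(SAMPLE)
for serial, (rotation, translation) in sorted(operations.items()):
    print(serial, rotation, translation)
```

Splitting on whitespace keeps the sketch short; a stricter reader would use the fixed column positions defined in the PDB file format documentation.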

A Simple Example - Entry 3c70

In the entry 3c70, REMARK 300 is a free text remark followed by REMARK 350 which includes the transformations required to generate the biological dimer from the deposited coordinates.

In this example, the asymmetric unit is composed of a single chain (chain A). The biological dimer is generated from two copies of the asymmetric unit. The first copy is identical to the deposited asymmetric unit (note the identity operation in green). The second copy is generated by applying a crystallographic symmetry operation consisting of a rotation matrix (red) and a translation vector (blue). Note that this biological assembly is both author-provided and software-predicted (PISA).
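Generating each copy amounts to computing x' = R x + t for every atom, where R is the rotation matrix and t is the translation vector of the operation. The sketch below shows this for a dimer; the coordinates and the second operation are hypothetical values chosen for illustration, not those of entry 3c70.

```python
def apply_operation(rotation, translation, coords):
    """Return new coordinates x' = R x + t for every (x, y, z) in `coords`."""
    return [
        tuple(sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
              for i in range(3))
        for p in coords
    ]


# Hypothetical deposited coordinates for chain A (two atoms only)
chain_a = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]

# Operation 1: identity, which reproduces the deposited coordinates
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
            [0.0, 0.0, 0.0])
# Operation 2: an illustrative 2-fold rotation about z plus a translation
two_fold = ([[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]],
            [50.0, 0.0, 0.0])

dimer = [apply_operation(rotation, translation, chain_a)
         for rotation, translation in (identity, two_fold)]
for copy in dimer:
    print(copy)
```

The first list reproduces the input unchanged and the second holds the symmetry-related copy; together they form the two chains of the assembly.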

An Example from a Viral Capsid -- Entry 2bfu

In this example the deposited coordinates include two chains (L and S) that comprise the icosahedral asymmetric unit (1/60th of the complete virus capsid). REMARK 300 is a free text remark while REMARK 350 provides the transformations required for generating the icosahedral virus. Note: matrices 5 through 58 in REMARK 350 have been omitted here for brevity.

The crystallographic asymmetric unit of entry 2bfu is composed of 10 chains (chains L, S and four other copies of each chain generated by the following matrices):

The first matrix is a unit matrix and corresponds to the deposited coordinates. Since these are already given in the PDB format file, they are flagged with "1" on the right hand side of the matrix. The other four matrices generate a five-fold symmetric sub-assembly of the virus.

Note: Not all PDB or mmCIF coordinate files contain information regarding generation of the assumed biological assembly.

Displaying and Downloading Biological Assembly Coordinate Files

wwPDB-created coordinate files for the biological assemblies (or biological units) are archived in the directory data/biounit/coordinates.

These files can also be accessed from the RCSB PDB website. For any given entry, the default view on the Structure Summary page shows the biological assembly. The forward and backward arrows at the top of the visualization box allow toggling between the asymmetric unit and biological assembly images. If an entry has multiple biological assemblies, the forward arrow can be used to browse through all of them. The biological assembly files can be downloaded from the "Download Files" menu in the top right corner. For an example, see entry 2bfu.

Specific databases, such as PISA (reference 1), may also be used to study the biological assemblies of PDB entries.


Shuchismita Dutta, Rachel Kramer Green, and Catherine L. Lawson


1 E. Krissinel and K. Henrick (2007) Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372: 774-797.

2 C.L. Lawson, S. Dutta, J.D. Westbrook, K. Henrick, H.M. Berman (2008) Representation of viruses in the remediated PDB archive. Acta Cryst. D64: 874-882.


Isothermal Titration Calorimetry (ITC)

Location: CfSB (Center for Structural Biology) Room 113

Description: The MicroCal VP-ITC is useful for characterizing the thermodynamics of a binding reaction in solution. In an ITC experiment, aliquots of a titrant (typically a protein, peptide or small molecule) are injected into the cell containing a macromolecule solution. With each injection, the molecules interact and heat is either generated or absorbed. The VP-ITC measures this heat of binding to determine the binding constant (K), reaction stoichiometry (n), enthalpy (ΔH) and entropy (ΔS). In addition, varying the temperature of the experiment allows the determination of the change in heat capacity (ΔCp) for the reaction. Since the heat of binding is a naturally occurring event, the ITC does not require immobilization and/or modification of the reactants. There are no limits on the protein or ligand size, nor is the system dependent on the optical properties of the samples. The major limitation of the ITC is that it requires relatively high sample concentrations.

  • To monitor binding interactions, such as but not limited to: antigen-antibody, DNA-drug, receptor-target, protein-ligand or protein-protein
  • Determination of reaction stoichiometry
  • Measurement of binding constants
  • To measure the thermodynamic properties of binding – enthalpy, entropy and Gibbs free energy
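The quantities listed above are linked by two standard thermodynamic relations that the description does not spell out: ΔG = -RT ln K and ΔG = ΔH - TΔS. The sketch below shows how Gibbs free energy and entropy would follow from a fitted binding constant and enthalpy, and how ΔCp could be estimated from enthalpies at two temperatures. It assumes K is an association constant expressed relative to a 1 M standard state; all numerical inputs are hypothetical and the function names are illustrative.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def binding_thermodynamics(k_assoc, delta_h, temperature):
    """Return (delta_g, delta_s) from K, delta_h (J/mol) and T (K).

    delta_g = -R * T * ln(K)
    delta_s = (delta_h - delta_g) / T
    """
    delta_g = -GAS_CONSTANT * temperature * math.log(k_assoc)
    delta_s = (delta_h - delta_g) / temperature
    return delta_g, delta_s


def heat_capacity_change(delta_h_1, temp_1, delta_h_2, temp_2):
    """Estimate delta_Cp as the slope of delta_h between two temperatures."""
    return (delta_h_2 - delta_h_1) / (temp_2 - temp_1)


# Hypothetical fitted values, for illustration only
K_ASSOC = 1.0e6        # association constant
DELTA_H = -40000.0     # J/mol
TEMPERATURE = 298.15   # K

delta_g, delta_s = binding_thermodynamics(K_ASSOC, DELTA_H, TEMPERATURE)
delta_cp = heat_capacity_change(-40000.0, 298.15, -44000.0, 308.15)
print(f"dG = {delta_g / 1000:.1f} kJ/mol, dS = {delta_s:.1f} J/(mol*K)")
print(f"dCp = {delta_cp:.0f} J/(mol*K)")
```

A two-point slope is the simplest possible estimate of ΔCp; in practice one would typically fit ΔH measured at several temperatures.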
