How does the molecular machinery choose where to cut a chromosome for recombination?

I'm wondering about a few technicalities of crossover in meiosis. The point of crossover is to create new chromosomes that don't have the same allele combinations as the original two chromosomes. Usually, the chromosomes are cut at the same place on both chromosomes, and each piece is then stitched to that place on the other. This is to avoid unequal recombination, a scenario in which one chromosome has several instances of a gene and the other no instance at all. I'm wondering how the molecular machinery knows where to cut.

So here's my question: How does the molecular machinery choose where to cut a chromosome for recombination?

This question has two parts: At what type of place does it occur (does the machinery choose a completely random place, regardless of where genes start and end, does it just cut at the start of genes, or does it do something else)? Given that it happens at this type of place (e.g. start of a gene), how does it decide that it will cut here (the start of this gene) and not there (the start of that gene)?

The question is very broad and complicated, since the situation may differ in prokaryotes and eukaryotes. Nevertheless, I'm citing a good paper that is closely related to your question:

Studies in yeast show that initiation of recombination, which occurs by the formation of DNA double-strand breaks, determines the distribution of gene conversion and crossover events that take place in nearby intervals. Recent data in humans and mice also indicate the presence of highly localized initiation sites that promote crossovers clustered around the region of initiation and seem to share common features with sites in yeast. On a larger scale, chromosomal domains with various recombination rates have been identified from yeast to mammals. This indicates a higher level of regulation of recombination in the genome with potential consequences on genome structure… DSBs (Double Strand Breaks) occur in highly localized regions and spread over 70-250 bp. DNA sequence analysis reveals no unique conserved consensus sequences, although a degenerate 50-bp motif partly correlates with DSB sites. However, one common feature is that DSBs are located in accessible regions of the chromatin next to either promoters or binding sites for transcription factors. Based on two studies, DSB activity does not correlate with local transcriptional activity, but depends on transcription-factor binding (HIS4 in S. cerevisiae and ade6-M26 in S. pombe).

Bernard de Massy, Distribution of meiotic recombination sites. TRENDS in Genetics Vol.19 No.9 September 2003

I'll sum that up in a more answer-like form. The process does not appear to be random, because double-strand break events are clearly not evenly distributed across a genome. As the paper says, no specific consensus motif has been found as of yet, though there is clearly something special about the regions next to promoters and TF binding sites that makes them more likely to be break sites. How does the machinery choose the place? Once again, as the paper says, the breaking event depends on TF binding. But that is for yeast. There are 17 hot spots found in the human and mouse genomes, some of which lie outside coding sequence (they occupy introns or 5'/3' flanking regions).

Here is the distribution of recombination frequencies across one chromosome (the figure is taken from the paper).

Here is a list of recombination sites in humans and mice

In humans and mice, at least, a lot of it boils down to PRDM9 recognising a specific sequence that marks recombination hotspots.

Edit - I'm expanding in response to the comment below…

Meiotic recombination occurs at vastly greater frequencies in some locations in the genome than in others, and these locations are called recombination hotspots. For example, the figure here shows the recombination rate at a hotspot and further away from it for chimps and humans.

These hotspots are recognised by the cutting machinery thanks to the binding of PRDM9, a zinc finger protein, to a specific DNA sequence that is present at hotspots. The sequence varies from species to species (as do the sequence and the function of PRDM9), but in humans it is a well-characterised 13-base-pair motif, CCNCCNTNNCCNC (where N is any one of the 4 bases in DNA), which accounts for the activity of nearly 40% of known hotspots.
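To make the motif concrete, here is a minimal sketch of how one might scan a sequence for the 13-mer quoted above. The function names (`motif_to_regex`, `find_motif`) and the choice to report both strands are my own assumptions for illustration; a motif match only marks a candidate site and says nothing about whether PRDM9 actually binds there.

```python
from __future__ import annotations

import re

# The 13-mer quoted in the text; N stands for any of A, C, G, T.
PRDM9_MOTIF = "CCNCCNTNNCCNC"


def motif_to_regex(motif: str) -> re.Pattern[str]:
    """Translate a motif with N wildcards into a compiled regex."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    # A lookahead lets overlapping matches all be reported.
    return re.compile(f"(?=({body}))")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq: str, motif: str = PRDM9_MOTIF) -> list[tuple[int, str]]:
    """Return (0-based start on the given strand, strand) for each match."""
    seq = seq.upper()
    pattern = motif_to_regex(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    # Convert positions found on the reverse complement back to
    # coordinates on the sequence as it was given.
    hits += [(len(seq) - m.start() - len(motif), "-") for m in pattern.finditer(rc)]
    return sorted(hits)


if __name__ == "__main__":
    example = "AAAA" + "CCTCCCTAGCCAC" + "GGGG"
    print(find_motif(example))
```

Scanning both strands matters because the motif is not its own reverse complement, so a hotspot on the opposite strand would be missed by a forward-only search.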

The first link I posted has evidence to suggest that variation in the composition of PRDM9 is a determinant of which hotspots get used. PRDM9, it appears, stimulates the formation of a specific histone modification, H3K4me3. (DNA is wrapped around histones. There are five histones, H1, H2A, H2B, H3 and H4; each has a tail that can be chemically modified, and eight of them, two each of H2A, H2B, H3 and H4, form a nucleosome.) In this case the fourth lysine residue of the tail of histone H3 is trimethylated under the control of PRDM9, and this facilitates recombination initiation and crossing over.

In meiosis, homologous recombination occurs. Two Spo11 proteins use tyrosines to induce a double-stranded break in the DNA. Spo11 has no specific cleavage site. However, cleavage by Spo11 led to the discovery of Spo11-oligonucleotide complexes (Spo11 with an oligonucleotide still attached after cleavage), which can be mapped to 'hotspots'. From these, the linked study finds a number of factors that influence where Spo11 cleaves in meiosis:

From the precise locations of the 2.2 million Spo11-oligonucleotide sequences, Pan et al. [1] showed that local DNA composition also influences Spo11 cleavage sites. As expected from previous studies, Spo11 does not have a specific recognition or cleavage site. However, sequence biases were detected: the 10 to 12 bp surrounding the cleavage site and predicted to be bound directly by Spo11 is relatively AT rich, predicting relatively narrow and deep helix grooves facing the bound Spo11 dimer. Cleavage favors sites immediately 3' of a C and disfavors G in the same position. Also, within a 32-bp core surrounding the Spo11 cleavage sites, a twofold rotational symmetry for complementary dinucleotide composition can be discerned, suggesting separate contributions of the flanking 'half sites' to Spo11 binding and/or cleavage. In addition, cleavage sites are negatively correlated with positioned nucleosomes. Similarly, Spo11 is generally occluded from cleaving sites where transcription factors are bound, even though the binding sites of several different transcription factors positively correlate with hotspot sites.

While all the answers above are correct and insightful about the process of DSB formation, recombination has another layer of complexity.

Mediating proper chromosomal segregation is more likely the real point of crossovers, since crossovers still happen in inbred genomes, where they do not produce novel haplotypes. COs create a heteroduplex structure between DNA strands, the Holliday junction. This structure creates the tension forces that are needed to pass the spindle assembly checkpoint before the meiotic cell can progress into anaphase (see Nicklas's work on grasshopper chromosomes and Hirose et al. 2011). In most* organisms, at least one CO is needed per chromosome arm to pass the spindle assembly checkpoint and mediate proper disjunction. Missing or misplaced COs can result in non-disjunction and an increased risk of aneuploidy (Hassold, Hall, Hunt 2007). (*An exception is male Drosophila melanogaster, which has a CO-independent chromosome segregation pathway.)

Additionally, it should be noted that DSBs ≠ COs; that is, not all DSBs are converted into COs. In all organisms, DSBs outnumber the resulting COs. The majority of DSBs are resolved into non-crossovers (NCOs). There is new evidence (in mice at least) that these numbers are not proportional: if the total number of DSBs in the genome is reduced, the total CO number is unaffected (Cole et al. 2012).

I think a more interesting phrasing of your question would be: how does the genome choose which DSBs to resolve into COs vs. NCOs?

Dernburg Lab

Our group investigates chromosome organization and dynamics. We focus on meiosis, the specialized cell division process that gives rise to reproductive cells such as sperm, eggs, pollen, and spores. Meiotic errors underlie many human birth defects such as Down Syndrome, and also contribute to human infertility, especially in older women. Successful meiosis requires a unique series of chromosome interactions: each chromosome must pair with its homologous partner, and these paired chromosomes then exchange genetic information through homologous recombination. Crossover recombination gives rise to genetic diversity, and also creates physical links between chromosomes that enable them to segregate away from each other. We investigate these mechanisms using the nematode Caenorhabditis elegans as our primary model organism. This experimental system has enormous experimental advantages, including rapid and powerful genetics, robust genome editing, outstanding cytology, and the opportunity to directly observe meiosis through in vivo imaging. We are also studying the evolution and plasticity of meiosis by comparing these events in C. elegans to other nematodes, such as Pristionchus pacificus.

During meiosis chromosomes undergo remarkable and dynamic reorganization. A hallmark of meiotic entry is the reorganization of each replicated chromosome into a linear array of loops anchored to a central axis, which regulates many aspects of meiosis. We have characterized the molecular organization and function of chromosome axes through diverse biochemical and structural approaches, including crystallography (in collaboration with the laboratory of Kevin Corbett, UCSD/LICR) and superresolution microscopy (with Michal Wojcik and Ke Xu, UC Berkeley). Meiotic chromosomes also interact with molecular motors in the cytoplasm via a "LINC" complex, which enables them to move rapidly along the nuclear surface. By directly imaging these movements in living animals, we learned how they accelerate chromosomes' ability to find their partners and regulate their interactions with other chromosomes. In many organisms this movement is mediated by telomeres, the special repetitive DNA sequences at the ends of chromosomes, but in C. elegans this role has been acquired by "pairing centers," broad regions near one end of each chromosome that span hundreds of kilobases. We defined the molecular requirements for pairing center function and continue to investigate their roles in meiotic dynamics and regulation.

A longstanding mystery about meiosis is how each pair of chromosomes undergoes at least one crossover, while the total number of crossovers is usually quite low. Indeed, in C. elegans, as in many other species, one and only one crossover occurs between each pair of homologs. Recent evidence has indicated that the synaptonemal complex (SC), a special polymer that assembles between paired chromosomes, plays an important role in this crossover control. Through live imaging in C. elegans, we discovered that the SC behaves as a unique, liquid crystalline compartment. We are exploring how this unusual material self-assembles through phase separation, and how it regulates meiotic recombination. We have identified biochemical signals and a conformational switch within this compartment that regulate crossover formation. Insights into this compartmentalized signaling mechanism are illuminating how cells make a variety of switch-like, spatially localized decisions.

Meiosis underlies the evolution of eukaryotes, and is also shaped by evolution, showing fascinating variation in the details of its execution. For example, in many organisms the process of meiotic homolog pairing depends on recombination machinery, but C. elegans is among the known exceptions to this rule. To better understand how evolution can rewire meiotic regulation, we are developing another nematode known as Pristionchus pacificus, an emerging satellite model organism, for molecular analysis of meiosis. By comparing how this process works in two different nematodes (C. elegans and P. pacificus), we are uncovering core aspects of meiotic regulation that are shared across eukaryotes, and exploring how novel mechanisms such as recombination-independent pairing have emerged in specific lineages.

New insights in biology are driven by novel technologies. In particular, advances in fluorescence microscopy, DNA sequencing, genome editing, and mass spectrometry have enabled our lab to answer longstanding questions. Future advances will continue to influence our research directions. While our research group focuses on hypothesis-driven projects rather than methods development, we also work to expand our experimental toolbox through collaboration and innovation. For example, we developed methods for long-term in vivo imaging of adult C. elegans, which has enabled us to probe the dynamics of chromosome movement and synaptonemal complex assembly. We also adapted the auxin-inducible degradation (AID) system for use in C. elegans. This method enables rapid, robust protein depletion in response to an inexpensive, nontoxic small molecule, and also makes it possible to create more complex strains than we could using conventional loss-of-function alleles. We welcome potential researchers who are interested in applying cutting-edge tools to the study of chromosome organization.

What is CRISPR?

Viruses are infectious particles of genetic material that affect virtually every type of organism, and therefore defense mechanisms against viruses have evolved. In the case of prokaryotes, the CRISPR locus is a region of genomic DNA to which viral genome sequences can be added to serve as a "memory" of previous infections for future defense against the same virus. The name CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. The short regions (20-50 nucleotides) of viral DNA are the "spacers" and are used to transcribe CRISPR RNAs (crRNAs). The crRNAs interact with an additional RNA (tracrRNA) and a CRISPR-associated (Cas) protein, which cleaves complementary DNA that enters the cell, thus breaking the viral replication cycle.

EXPERIMENT: Do spacer sequences confer bacteriophage resistance?

A wild-type strain of Streptococcus thermophilus was exposed to two types of bacteriophages (phage 858 and phage 2972) and mutant bacteria resistant to each phage were identified. The spacer sequences in the CRISPR locus in the mutants were then determined and compared to the two bacteriophage genome sequences. Mutants with spacer sequences that corresponded to a sequence in a bacteriophage genome were resistant specifically to that bacteriophage (Barrangou et al. 2007). Spacers with even a few mismatched bases did not confer resistance.

Modifying CRISPR/Cas systems for genome editing

The discovery that an RNA could guide an endonuclease to a specific genomic position to cut DNA presented an opportunity to modify this system to edit genomes. Scientists discovered that they could combine the two RNAs (tracrRNA and crRNA) into a single small guide RNA (sgRNA). The Cas enzyme can be provided as DNA via a plasmid or as RNA encoding the protein, which is then produced by the cellular gene expression machinery, or directly as purified protein. The Cas9 protein does require a consensus sequence to interact with DNA (5' NGG 3'), but this motif is relatively common, allowing targeting of most loci.
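As a rough illustration of why the NGG requirement still leaves most loci targetable, the sketch below lists every position in a sequence where a guide-length stretch is immediately followed by NGG. The function name `find_cas9_sites` and the default guide length of 20 nucleotides are assumptions made for this example, and only the strand as given is scanned; a real design tool would also check the reverse strand and score off-target matches.

```python
import re


def find_cas9_sites(seq, guide_len=20):
    """List candidate target sites on the given strand.

    Returns tuples of (0-based start, protospacer, PAM), where the PAM is the
    three bases 5'-NGG-3' directly 3' of the protospacer.
    guide_len=20 is an assumed default, not a value taken from the text.
    """
    seq = seq.upper()
    # The lookahead reports overlapping candidate sites as well.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    demo = "ATGC" * 5 + "TGG" + "AAAA"
    for start, protospacer, pam in find_cas9_sites(demo):
        print(start, protospacer, pam)
```

The protospacer returned here is the genomic sequence that the guide RNA would be designed to match; the PAM itself is not part of the guide.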

But how can creating double-stranded breaks in a genome benefit an organism? The answer lies in cellular DNA repair. Double-stranded breaks are typically repaired by non-homologous end-joining (NHEJ) or homologous recombination. NHEJ frequently makes small insertions or deletions, which could, for example, disrupt the coding sequence of a protein and inactivate it. Homologous recombination-mediated repair can occur either by using the homologous chromosome as a template for repair or by providing exogenous DNA with the desired edit, effectively "tricking" the cell into making the repair based on the donor DNA provided. Importantly, at this time, the efficiency of CRISPR gene editing is not 100%, so correctly edited cells must be identified. Additionally, scientists are continuing to study the incidence of off-target (non-intended) edits when using CRISPR-based systems.
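A small worked example may help show why a short NHEJ indel can inactivate a protein: any insertion or deletion whose length is not a multiple of three shifts the reading frame of everything downstream. The helper names (`apply_deletion`, `shifts_frame`, `codons`) and the toy coding sequence are hypothetical, chosen only for illustration.

```python
def apply_deletion(cds, start, length):
    """Remove `length` bases from the coding sequence, beginning at `start`."""
    return cds[:start] + cds[start + length:]


def shifts_frame(original, edited):
    """True if the length change is not a multiple of 3 (a frameshift)."""
    return (len(edited) - len(original)) % 3 != 0


def codons(seq):
    """Split a sequence into complete codons, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    cds = "ATGGCTGAAACCTGA"  # toy sequence: ATG GCT GAA ACC TGA
    one_base = apply_deletion(cds, 4, 1)
    three_bases = apply_deletion(cds, 3, 3)
    print(codons(cds))
    print(codons(one_base), shifts_frame(cds, one_base))
    print(codons(three_bases), shifts_frame(cds, three_bases))
```

In the toy sequence, deleting one base changes every downstream codon and loses the original stop codon, whereas deleting three bases removes a single codon and leaves the rest of the frame intact.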

Order the steps in CRISPR gene editing.

  1. Cellular DNA repair
  2. Cas endonuclease binds genomic DNA at location determined by the complementary small guide RNA
  3. Identification of edited cells
  4. Cleavage of genomic DNA
  5. Cas endonuclease and small guide RNA are introduced into cells
  6. Select a guide RNA sequence

For help, use this link for a review of the steps (Click-N-Learn by HHMI Biointeractive).

3 Three mechanisms of DNA cut-and-paste

The establishment of cell-free systems for different types of specialised recombination has allowed elucidation of the biochemical steps of the DNA strand exchange reactions. These in vitro studies have identified three different mechanisms for cutting and rejoining DNA molecules, corresponding to the three major classes of proteins described above: the DDE transposases, the resolvase/invertase, and the λ Int families of site-specific recombinases. A common feature of the three mechanisms is that they proceed by transesterification reactions without requiring high-energy cofactors such as ATP (Fig. 3) [3, 31, 49]. The energy of the broken phosphodiester bonds is conserved for the formation of new bonds. The DDE transposases use a one-step transesterification mechanism, whereas the two distinct families of site-specific recombinases use contrasting two-step transesterification mechanisms involving different amino acid residues in the formation of a covalent DNA-enzyme intermediate (Fig. 3). Understanding of these mechanisms has now reached the atomic level of resolution, with crystal structure data obtained for each of the three families of proteins.

Fig. 3. Chemistry of transposition and site-specific recombination reactions. DNA strand breakage and joining in recombination occur by transesterification reactions in which the phosphate of the scissile phosphodiester bond is subject to nucleophilic attack by a hydroxyl group (arrows). Endonucleolytic cleavage at the transposon ends (A) and the strand-transfer reaction that joins the ends to the target DNA (B) are one-step transesterifications in which the nucleophile is a water molecule and the 3′-OH end of the element, respectively. Strand exchange catalysed by site-specific recombinases (C and D) occurs by two steps of transesterification (cleavage and rejoining) involving a covalent protein-DNA intermediate. The nature of the catalytic residue and the line of entry of the nucleophile differ between the two recombinase families. For cleavage catalysed by the invertase/resolvase family (C), the nucleophile hydroxyl is derived from a serine and the leaving group is the 3′-OH of the deoxyribose. For the λ integrase family (D), the catalytic residue is a tyrosine and the leaving group is the 5′-OH. For both recombinase families, the rejoining step is the reverse of the cleavage step. Phosphate backbones are drawn in thick and thin lines to distinguish the donor and target DNA (panel B) or the two recombination partner DNA strands (panels C and D).


3.1 The DDE transposases: a universal one-step transesterification reaction

The biochemistry of reactions catalysed by the DDE recombinases has been examined in detail for several different systems including three bacterial transposable elements: the bacteriophage Mu, IS10 and Tn7 (for recent reviews, see [6, 11, 15, 50]). In all cases, the transposase executes a set of critical reactions that join the 3′ ends of the element to the target DNA, whereas connection of the 5′ ends and completion of transposition require processing events performed by host repair and replication functions (Fig. 4). An important variation between the three bacterial systems is in the number and nature of cuts that sever the transposon from the flanking donor DNA, resulting in different transposition end-products. For the three transposons, transposition initiates with a pair of specific single-strand cleavages exposing the 3′-OH ends of the element (Fig. 4; see also Fig. 3A). For IS10 and Tn7, which use a non-replicative cut-and-paste mechanism, the other strand is also cleaved to reveal the transposon 5′ ends and to excise the element from its initial genomic locus. IS10 excision yields flush transposon ends [51], whereas the double-strand breaks at the ends of Tn7 are staggered by three nucleotides, with the 5′ end cleavages occurring in the adjacent donor backbone [52]. In sharp contrast, the second strand cleavage does not occur at this stage of bacteriophage Mu transposition, and the element remains connected at its 5′ ends to the flanking DNA sequences. In a second step, the 3′-OH ends of the excised transposon (IS10 and Tn7) or the nicks in the donor molecule (Mu) participate in a concerted strand transfer reaction that joins both ends of the element to staggered phosphates of the two target DNA strands (Fig. 4; see also Fig. 3B).

Fig. 4. Biochemical steps underlying the non-replicative or replicative transposition of three bacterial elements. The shaded rectangles represent the DNA strands of the transposable element ends. For IS10 and Tn7, reactions occurring at a single end are shown. Black and white rectangles are the flanking donor sequences and the target DNA, respectively. The black and white arrowheads show the 3′-end and 5′-end cleavage, respectively. Curved arrows indicate the nucleophilic attack transferring the 3′-OH ends onto staggered phosphates of the target DNA (black dots). Crenellated lines represent the few target nucleotides that are duplicated during the transposition process. For the three elements, the biochemical steps are catalysed by the transposase in a complex where the transposon ends are in synapsis. For IS10, the target is captured after completion of the double-strand breaks at the transposon ends, whereas for Mu and Tn7, the presence of the target within the complex is required to activate the cleavage reactions. The cross-hatching represents replication events that complete transposition after complex dissociation.


In the case of IS10 and Tn7, host processing of the resulting strand transfer intermediate results in the filling-in of the short single-strand gaps lying at each transposon-target DNA junction. This repair, which in the case of Tn7 is also presumed to remove the overhanging three nucleotides at the 5′ ends, generates the short target duplications that flank the element in its new locus (Fig. 4). After eventual cleavage of the donor backbone by an undetermined mechanism, the strand transfer product of bacteriophage Mu can be processed in a similar way during the non-replicative integration of the phage. Alternatively, in the absence of donor backbone dissociation, complete replication of the element gives rise to a cointegrate, which is the final product of replicative transposition (Fig. 4).
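The origin of the flanking target duplications can be sketched with plain strings: because the two 3′-OH ends attack phosphates that are staggered on the two target strands, the bases lying between the staggered positions are left single-stranded on each side and are copied by gap repair, so they end up on both flanks of the element. The function name `integrate`, the top-strand-only representation and the stagger value used in the example are assumptions for illustration, not values from the text.

```python
from __future__ import annotations


def integrate(target: str, transposon: str, pos: int, stagger: int) -> tuple[str, str]:
    """Model cut-and-paste insertion followed by host gap repair.

    `target` and `transposon` are top-strand sequences. The staggered attack
    points are `stagger` bases apart, starting at index `pos`; after repair,
    the bases between them flank the element on both sides.
    Returns (repaired product, duplicated target sequence).
    """
    if stagger < 0 or pos < 0 or pos + stagger > len(target):
        raise ValueError("staggered cut positions fall outside the target")
    duplicated = target[pos:pos + stagger]
    product = (
        target[:pos]
        + duplicated      # left copy, filled in by repair on one strand
        + transposon
        + duplicated      # right copy, filled in by repair on the other strand
        + target[pos + stagger:]
    )
    return product, duplicated


if __name__ == "__main__":
    # Lower-case marks the element; the 3-base stagger is illustrative only.
    product, duplicated = integrate("AAACCCGGGTTT", "xxxx", pos=3, stagger=3)
    print(product, duplicated)
```

With a stagger of zero the function degenerates to a blunt insertion with no duplication, which is a quick way to see that the duplication length is set by the spacing of the two attack points rather than by the element itself.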

Variations of the transposition pathway have been reported for both prokaryotic and eukaryotic transposons. In particular, the transposase of IS911, a member of the IS3 family, has been found to carry out a distinctive single-strand circularisation reaction in which one transposon end is transferred to a target site three nucleotides distant from the other end [53, 54]. In vivo, the resulting ‘figure-eight’ molecule is processed, giving rise to a circular excised form of the transposon thought to be a transposition intermediate that can efficiently insert in a new locus. Although mechanistically distinct, this ‘site-specific’ circularisation of IS911 is reminiscent of the excision reaction performed by site-specific recombinases. Interestingly, the formation of excised transposon circles has been observed for other members of the IS3 family [54, 55], and also for IS1 [56] and for diverse eukaryotic transposons, suggesting that it may represent a major mode of transposition [54].

As mentioned above, the initial cleavage that generates the 3′-OH ends and the strand transfer reaction catalysed by bacterial DDE transposases are chemically equivalent to the reactions catalysed by the retroviral integrase proteins (IN) for the integration of the retrovirus cDNA into the genome of an infected cell [6, 7, 57]. IN-mediated cleavages at the ends of HIV cDNA and the subsequent strand transfer step proceed with inversion of chirality at the target DNA phosphates. The same result was obtained for the strand transfer reaction catalysed by the bacteriophage MuA transposase [58, 59]. This analysis indicates that the reactions catalysed by the DDE recombinases occur by a one-step transesterification mechanism in which the scissile phosphodiester bond is directly attacked, either by a water molecule (end cleavage), or by the 3′-OH end of the element (strand transfer) without the formation of a protein-DNA covalent intermediate (Fig. 3). Both transesterification steps require divalent cations (Mg²⁺ or Mn²⁺) and it was postulated that the role of the DDE residues was to form a metal ion binding pocket in the active site [49]. This suggestion is strongly supported by the recent finding that the catalytic domains of IN, MuA and other enzymes involved in metal ion-dependent phosphoryl transfers exhibit very similar structures.

The crystal structure of the DDE-containing catalytic core of MuA shows that this region of the protein contains two sub-domains [60]. The 70 C-terminal residues form a β-barrel, on one face of which is a large region of positive potential that could play a role in the non specific DNA binding activity associated with the whole catalytic core. It was proposed that this activity might be important for interactions either with the DNA sequences surrounding the cleavage points at the Mu ends or with the target DNA, in order to position the substrate into the enzyme active site [60]. Remarkably, in spite of a very low level of sequence homology, the structure of the N-terminal sub-domain containing the DDE triad shows striking similarities, not only with the catalytic domain of the IN proteins of HIV and ASV retroviruses, but also with the structure of more functionally distant enzymes such as the E. coli and HIV ribonucleases H (RNase H) and the E. coli Holliday junction resolving enzyme RuvC (reviewed in Refs. [61–63]). In the structure shared by these different proteins (a central five-stranded β-sheet surrounded by α-helices on either side) three or four catalytically important acidic residues, i.e., the DDE motif in the case of the MuA and IN proteins, are clustered to form a possible two metal ion binding pocket as actually observed in the structure of HIV RNase H active site [64]. This structural similarity between unrelated proteins strongly supports the view that the DDE recombinases belong to a larger group of enzymes, including polymerases and ribozymes, that catalyse transesterification reactions using a common ‘carboxylate-chelated two-metal ion’ catalytic mechanism [61, 63]. In this mechanism, initially proposed for the 3′-5′ exonuclease activity of DNA polymerase I, the two chelated metal ions participate in the activation of the nucleophile hydroxyl and the scissile phosphodiester bond by stabilising the phosphate in a penta-coordinated form [65].

However, DDE recombinases are not simple nucleases or polynucleotidyl transferases. These enzymes are distinguished by the fact that they consecutively catalyse two (or three) transesterification reactions at both transposon ends. Successful transposition requires that these two sets of reactions be temporally and spatially coordinated in order to prevent incomplete recombination events which are likely to be deleterious to the host and to the transposable element itself. Recent work from different laboratories has provided interesting new insight into how this control is effected in the case of Mu (reviewed in Refs. [66, 67]). A key step in Mu transposition is the assembly of a synaptic complex in which catalytically inert monomers of MuA become activated by the formation of a tetrameric core tightly bound to the two transposon ends. Assembly of the tetramer thus implies a structural transition by which the enzyme and the DNA cleavage sites become engaged for catalysis ([6, 15]; see below). Complementation experiments performed by mixing distinct catalytic mutants of MuA have shown that the tetramer active sites are built by interlocking separate domains of distinct transposase subunits. In each active site, one monomer provides the DDE catalytic core (domain II), whereas a different monomer donates another catalytic region located in the C-terminal part of the protein [68]. Although the exact function of this second catalytic domain (domain IIIA) is not known, it appears to play a role in the activation and/or positioning of the DNA [69]. For the strand transfer reactions, the active site domain IIIA is provided by MuA monomers occupying the inner part of the Mu ends whereas the DDE-containing domain II is provided by transposase subunits bound on the external sites [70, 71]. By reciprocality of domain sharing, this spatial arrangement appears to be reversed for the end cleavage reactions [71]. Furthermore, in both reaction steps, the DDE domain II operates in trans, i.e., the subunit bound to one end catalyses the cleavage and joining of the opposite end [70, 72]. This intricate architecture of the MuA enzymatic complex is viewed as a mechanism which ensures that substrate synapsis and catalysis are tightly coupled. The involvement of a structural transition upon tetramerisation and substrate binding is also consistent with the fact that the DDE residues of the MuA catalytic core appear to be in an inactive configuration [60]. Taking these data into account, Yang et al. propose a pathway in which the Mu ends are first nicked within the two cleavage active sites and then swing onto two other catalytic pockets to be transferred onto the target DNA [71].

As for Mu, a single transposase protein carries out the three reaction steps underlying IS10 transposition and the assembly of a precleaved synaptic complex is required to ensure the coordination of the processing events occurring at both ends [50, 73, 74]. The IS10 transposition pathway is also highly ordered. Cleavage of the transferred strand (3′ end) always precedes cleavage of the non-transferred strand (5′ end), and complete excision of the transposon is required before a target DNA molecule is captured by the complex for the strand transfer reactions [75, 76]. In sharp contrast to Mu, a single transposase monomer catalyses the three chemical steps of IS10 transposition at each of the two transposon ends [77]. Thus, two monomers of transposase appear to carry out the entire transposition reaction without sharing domains. The mechanism which allows the repeated use of a single DDE-containing catalytic pocket is not known. Current models propose that consecutive structural rearrangements must occur between successive reaction steps to re-adjust the active site to novel substrate configurations [50, 77].

A third and distinctive example of transposase architecture is provided by Tn7 [11]. In contrast to Mu and IS10, Tn7 transposition reactions are catalysed by two DDE-containing enzymes with clearly separate activities. TnsA mediates cleavage at the 5′ ends of the transposon whereas TnsB, which is also responsible for the recognition and binding of Tn7 ends, promotes the 3′ end cleavage and carries out the strand transfer reaction [22]. Although the respective functions of TnsA and TnsB can be selectively blocked by mutation of their DDE residues, the presence of both proteins is required for the formation of an active complex. Tn7 transposase is thus a heteromeric enzyme containing two different catalytic subunits [22]. The possibility that the active sites of the TnsA+B transposase core could be assembled by contribution of distinct TnsA or TnsB protomers has not been examined thus far. As with IS10, there is currently no evidence for cis or trans activity by TnsA and/or TnsB. Nevertheless, the biochemical separation of the 5′ and 3′ ends processing reactions has a remarkable implication. Inactivation of TnsA converts the normal cut-and-paste transposition pathway of Tn7 into a Mu-like replicative mechanism, the 3′ ends breakage and joining reactions catalysed by TnsB within the transposase core being functionally equivalent to those performed by the MuA tetramer in Mu transposition [78].

These three different examples of ‘division of labour’ within a transposition complex illustrate the remarkable flexibility by which a common chemical mechanism can be adapted in different ways to accomplish complex and highly controlled recombination reactions.

3.2 Site-specific recombinases: two-step transesterifications by distinct mechanisms

Unlike transposases, site-specific recombinases, at least in principle, execute all recombination DNA breakage and joining reactions without involving host repair or replication functions [3–5, 31]. These reactions are biochemically related to those catalysed by the topoisomerase enzymes that regulate the intracellular level of DNA supercoiling [79]. Indeed, site-specific recombinases often exhibit type I topoisomerase activity by which they can relax supercoiled DNA substrates. However, in a recombination reaction between two sites, DNA strands are not only broken and rejoined, they are also exchanged. Therefore, as discussed above for the DDE recombinases, site-specific recombination requires the assembly of a synaptic complex containing multiple recombinase subunits and the relevant DNA recombining partners.

For most of the systems of the two major families (i.e., the resolvase/invertase and the λ Int families), recombination takes place within a short (∼30 bp) DNA segment called the ‘core’ or ‘crossover’ site onto which two recombinase subunits bind, usually by recognising specific sequences with dyad symmetry [3, 80, 81]. In the synaptic complex, the two core sites are brought into close proximity. It is generally accepted that the four recombinase subunits bound on the two duplexes participate in the recombination reaction (Figs. 5 and 6). A few systems, all belonging to the λ Int family, involve two distinct recombinase proteins. In the E. coli inversion system Fim, the two recombinases, FimB and FimE, act independently on the same recombination sites to control the ‘on’ or ‘off’ position of this particular genetic switch [82]. By contrast, the XerC and XerD recombinases cooperate in all Xer-mediated DNA rearrangements and the two proteins bind with distinct specificities to separate regions of the different recombination sites of this system [83, 84].

Fig. 5. Concerted DNA breakage and rejoining reactions catalysed by resolvase/invertase family enzymes. The subunit rotation model is shown. The ovals represent recombinase subunits with the conserved catalytic serine ‘S’. Thick and thin lines are the top and bottom strands of the recombination sites, respectively. The short vertical bars are the 2 bp of the overlap region between the two cleavage points. Black arrows represent the nucleophilic attacks of phosphates (black dots) by hydroxyl groups (arrowheads). The four DNA strands are cleaved (a), exchanged by 180° rotation of the half-site bound subunits (b) and religated in the recombinant configuration (c).

Fig. 6. Sequential strand exchange by the λ Int family site-specific recombinases. The DNA strand swapping/isomerisation model is presented. The letter ‘Y’ refers to the conserved catalytic tyrosine. Other symbols are as in Fig. 5. The top strands (thick lines) are cleaved first (a), swapped between the two partners (b), and then religated (c). The branch point of the generated Holliday junction intermediate is positioned at the middle of the (6-bp) overlap region and the top strands are crossed. Isomerisation of the Holliday junction to a recombinant configuration in which the bottom strands are crossed requires the reorganisation of the DNA helices and the four half-site-bound recombinase subunits within the complex (d). The resulting Holliday junction isoform is resolved by repeating steps a to c in order to exchange the bottom strands (e).

The recombination locus of all systems, even those working with a single recombinase, contains a certain degree of asymmetry so that ‘top’ and ‘bottom’ strands can be distinguished. In the simplest cases, this asymmetry is entirely encoded within the core site whereas in other systems, external elements may contribute to the recombination site polarity by imposing a specific geometry on the synaptic complex ([80, 81]; see also below). The catalytic mechanism used by the two families of site-specific recombinases is different, as is the structural organisation of the enzymes.

3.2.1 The resolvase/invertase family: concerted breakage and rejoining of four DNA strands

The best characterised recombinases of this family are the invertases Gin from bacteriophage Mu and Hin from Salmonella sp. and the resolvases of Tn3 and γδ transposons [3, 32, 36, 80, 81, 85]. Although the chemistry of strand exchange used by these recombinases appears to be highly conserved, important variation is found in the assembly of the recombination synapse determining the selectivity of the reaction, i.e., inversion or resolution (see below).

In a recombination reaction catalysed by resolvases or invertases, double strand breaks staggered by 2 bp occur at the middle of the two paired core sites, giving rise to recessed 5′ ends and 3′-OH overhangs (Fig. 5; see also Fig. 3C). One recombinase subunit is linked to each of the 5′ ends through the conserved serine residue of the family [86, 87]. This serine presumably provides the primary nucleophilic hydroxyl group in the cleavage reaction [45]. The ligation step that follows strand exchange can be viewed as the converse of the cleavage: the protein-DNA phosphoseryl bond of one strand is attacked by the 3′-OH end of the partner to release the enzyme and reseal the DNA backbone in the recombinant configuration (Fig. 5; see also Fig. 3C).

Although cleavages of the four DNA strands or the religation steps can be experimentally uncoupled by using recombination site variants or mutated recombinases, both types of reaction are normally highly coordinated [88, 89]. The cleaved complex in which four enzyme-linked recombination half-sites are held together by recombinase subunit interactions seems to be an obligate intermediate [88]. Thus, recombination by the resolvase/invertase family occurs by a mechanism in which four DNA strands are broken and rejoined in a concerted manner.

The standard substrates for invertases and resolvases are supercoiled molecules (see below) and the topological change induced by recombination has been found to be equivalent to a right-handed 180° rotation of one pair of cleaved half-sites relative to the other prior to the rejoining step (Fig. 5) [90–92]. In this mechanism, the 2 bp ‘overlap’ regions that separate the top and bottom strand cleavage positions need to be identical in the two core sites to stably reconnect the recombinant DNA duplexes [93, 94]. It has been shown recently that in reactions involving non-homologous overlap regions, strand exchange mediated by Tn3 resolvase proceeds through apparent 360° (2×180°) rotation of the half-sites without rejoining the mis-paired strands in the recombinant (180° rotation) configuration [89].
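The overlap requirement described above can be pictured with a toy model. The sketch below is illustrative only: `CoreSite`, `exchange` and the placeholder half-site labels are hypothetical names, and the model reduces the reaction to the single rule stated in the text, namely that identical 2-bp overlaps are rejoined after a 180° rotation, whereas non-identical overlaps are assumed to return to the parental arrangement after an apparent 360° rotation.

```python
from typing import NamedTuple

class CoreSite(NamedTuple):
    left: str      # label for the left half-site (placeholder, not a real sequence)
    overlap: str   # 2-bp region between the staggered cleavage points
    right: str     # label for the right half-site (placeholder)

def exchange(site_a, site_b):
    """Toy model of concerted strand exchange between two paired core sites.

    Returns the two product sites and the apparent half-site rotation in degrees.
    """
    if site_a.overlap == site_b.overlap:
        # Overhangs can pair with the partner half-site: rejoin after 180 degrees.
        product_1 = CoreSite(site_a.left, site_a.overlap, site_b.right)
        product_2 = CoreSite(site_b.left, site_b.overlap, site_a.right)
        return (product_1, product_2), 180
    # Mismatched overlaps: no rejoining in the recombinant configuration;
    # a second 180 degree rotation restores the parental arrangement.
    return (site_a, site_b), 360

site_1 = CoreSite("LEFT1", "AT", "RIGHT1")
site_2 = CoreSite("LEFT2", "AT", "RIGHT2")
site_3 = CoreSite("LEFT3", "GC", "RIGHT3")

recombinants, rotation = exchange(site_1, site_2)
parentals, rotation_mismatch = exchange(site_1, site_3)
```

With matching overlaps the right half-sites are swapped between the two partners; with mismatched overlaps the function returns the input sites unchanged, mirroring the 2×180° outcome reported for Tn3 resolvase.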

The crystal structure of the γδ resolvase dimer complexed to the core recombination site has been determined recently [95]. The DNA-bound resolvase monomer contains two globular domains lying on opposite faces of the DNA helix and an extended arm region that connects the two domains. The small C-terminal DNA binding domain is involved in the recognition of the outer part of the core site consensus sequence by making specific contacts in both the major and minor grooves. Its structure, containing a helix-turn-helix DNA binding motif, is similar to that previously found for the DNA binding domain of Hin invertase [96]. The arm region that joins the two globular domains also contributes to DNA binding through interactions in the minor groove. The large N-terminal catalytic domain contains the active site serine and other catalytic residues, as well as a set of residues forming a hydrophobic core at the dimer interface. This domain also appears to be important for higher order interactions between resolvase dimers in the recombination complex [97–99]. The DNA in the co-crystal is bent by 60° away from the enzyme catalytic domain.

The γδ resolvase-DNA complex appears to be in an inactive configuration, the catalytic serines of the two resolvase subunits being too far away from the scissile phosphates to cleave the DNA [95]. However, the active serine of either monomer is closest to the DNA cleavage position proximal to the half-site bound by the recombinase subunit. This correlates with the fact that the active serine and several other catalytic residues of γδ resolvase act in cis on the nearest scissile phosphodiester bond [88]. The crystal structure also suggests that, as in MuA, catalytic residues may be shared between the two monomers to form the active site. While there is currently no experimental evidence to support the possibility for a composite catalytic site, mutations in one resolvase subunit seem to activate the adjacent subunit in trans, presumably by altering inter-subunit interactions at the dimer interface ( [88] M.R. Boocock and N.D.F. Grindley, unpublished). Likewise, mutations in the equivalent dimerisation domain of the DNA invertases Gin, Hin and Cin, make the recombinases more reactive and independent of structural elements controlling the formation of the recombination synaptic complex ( [92, 100–103] see also below). These observations indicate that activation of the active site serine recombinases appears to require conformational changes during the assembly of an enzymatically competent recombination structure as discussed above for the DDE transposases. Indeed, some level of structural flexibility as revealed by the γδ resolvase-DNA co-complex could be compatible with limited distortions [95].

The conformational changes that take place during strand exchange appear less straightforward. Since resolvase cleaves the DNA in cis and seems not to dissociate from its binding site during recombination [104], two models have been proposed to account for the apparent 180° half-site rotation in the reaction. In the ‘subunit rotation’ model, strand exchange is coupled to a rotational rearrangement of the DNA-linked recombinase subunits within the tetramer [91]. A major difficulty with this mechanism is that the recombinase dimer interface holding the cleaved ends in the complex must be disrupted transiently during the dissociation/reassociation process ( Fig. 5). An alternative ‘static subunit’ model postulates that the recombinase tetramer does not dissociate and that recombination occurs by localised conformational and topological rearrangement of the DNA molecules within the complex [95, 99]. This second model is difficult to reconcile with the recent observation that Tn3 resolvase brings about multiple rounds of rotations without rejoining the single 180° rotation intermediate. It is argued that in a static subunit mechanism, such an iteration of the reaction would entangle the DNA within the catalytic complex to an unacceptable level [89, 105].

3.2.2 The λ Int family: sequential pairs of DNA strand exchange

Unlike the recombinases of the resolvase/invertase family, site-specific recombinases related to λ Int, such as the Cre recombinase of phage P1, E. coli XerC and XerD and the Flp protein from the yeast 2μ plasmid, exchange the two pairs of DNA strands separately and sequentially (Fig. 6) [3, 4, 31, 106].

To initiate the first strand exchange, the tyrosine residue of the conserved catalytic motif RHRY attacks a specific scissile phosphate in one strand (defined hereafter as the top strand) of each recombination core site, thereby forming a 3′ phosphotyrosyl-linked recombinase-DNA complex and generating a free 5′-OH end (Fig. 6). The polarity of this cleavage reaction is thus reversed when compared to that of the resolvase/invertase-mediated cleavages (compare also Fig. 3C and Fig. 3D). In a second step, the recombinase-DNA phosphotyrosyl bond is attacked by the 5′-OH end from the partner duplex to generate a four-way branched structure, or ‘Holliday junction’ intermediate, in which only two DNA strands have recombined. To resolve this intermediate and complete the recombination reaction, the two other (bottom) strands are exchanged by repeating the cleavage/religation process 6–8 bp downstream of the first strand cleavage position (Fig. 6).

As in resolvase/invertase-mediated recombination, sequence homology in the 6–8 bp overlap region that separates the top and bottom strand cleavage positions in the two partner recombination core sites is also essential for most (but not all) of the systems belonging to the λ Int family. Although the homology appears to play a role after the synapsis of the two core sites, the exact mechanism by which it influences the reaction remains unclear. The classical model supposes that sequence identity between recombination sites is required for a reversible process called ‘branch migration’ that moves the branch point of the Holliday junction from its site of formation at one end of the overlap region to the site of resolution at the opposite end. This is achieved by stepwise melting and reannealing of the complementary strands of the parental and partner DNA duplexes [5, 107]. This view is now challenged by an alternative ‘strand swapping/isomerisation’ mechanism (Fig. 6) [108]. The model proposes that, after cleavage, two or three nucleotides from the parental overlap sequences are melted and then swapped between the partner duplexes. In this mechanism, sequence homology is required in the reannealing reaction that orients the 5′-OH end of the invading strand for ligation. Movement of the Holliday junction is limited to the 1–3 central bp of the overlap region. This movement would simply consist of unstacking-restacking events whereby the Holliday junction DNA helices are reorganised from a ‘parental’ (top strands crossed) configuration to a ‘recombinant’ (bottom strands crossed) configuration in which the bottom strand cleavage sites are adequately positioned for strand exchange (Fig. 6) [108–110]. This view is supported by recent work showing that one Holliday junction isomer preferentially undergoes top strand exchange, whereas resolution of the other isoform predominantly occurs by bottom strand exchange [111, 112].
Also, in the recombinase-Holliday junction complex, the centre of the overlap region is partially unstacked [113].

Although variations of both models can be envisioned, the strand swapping mechanism seems more suitable for other recombination systems of the λ Int family, such as conjugative transposons and integrons, which are less strict in their requirement for sequence homology between partner recombination sites [23, 43]. These systems may be less sensitive to DNA mispairing by catalysing both strand exchanges without annealing recombinant strands. Alternatively, the Holliday junction isoform generated by a single strand exchange may be a stable recombination product that is not processed back by the recombinases but could well be resolved by some other host function (as discussed in Ref. [113]).

Another matter of divergence within the λ Int family concerns the catalytic role of the different recombinase subunits within the synaptic complex. A wealth of data indicates that the yeast recombinase Flp uses a trans cleavage mechanism in which the catalytic RHR triad of one monomer activates the scissile phosphodiester adjacent to its binding site, while the tyrosine nucleophile is contributed by a different Flp monomer bound on a different half-site. As for MuA, this active site sharing observed for Flp was proposed as a mechanism for coupling catalysis and recombination site pairing [31, 114]. However, this view was weakened by experiments showing that the two collaborating monomers are bound on the same core site and that cleavage may actually precede synapsis [115, 116]. An alternative model based on an asymmetrical (head to tail) assembly of Flp monomers has been proposed, again suggesting that active site-dependent molecular bridges may be required [117]. In contrast, all current data indicate that XerC and XerD recombinases act on the closest cleavage site by providing all catalytic residues in cis [118]. For λ Int, experiments supporting both cis and trans mechanisms have been reported [119, 120]. These apparently conflicting results have raised a number of questions that have been debated in the recent literature [48, 121–123].

Although the details of the molecular interactions between recombinase subunits are not known, some new insight has been provided by the recent acquisition of structural data for the C-terminal catalytic moieties of λ Int and the Haemophilus phage 1 integrase (HP1 Int) and for the complete 298 aa XerD recombinase [124–126]. The C-terminal catalytic domain of these three recombinases exhibits a similar fold where the conserved residues RHR are exposed in a basic groove likely to contact the DNA for the activation of the scissile phosphodiester bond. However, the structure of the region containing the active tyrosine (the extreme C-terminal part of the domain) is very different in the three structures. This region also seems to be involved in recombinase inter-subunit interactions. In λ Int, the tyrosine is positioned on a disordered flexible loop in a configuration which, by allowing conformational changes, could be consistent with either a cis or trans cleavage [124]. In contrast, the structures of both HP1 Int and XerD support a cis mechanism, the tyrosine being docked at a fixed position within the catalytic pocket. In the HP1 Int catalytic domain, which crystallised as a dimer, the tyrosine is in a position ready for in-line nucleophilic attack of the scissile phosphate [125], whereas in the model proposed for the XerD-DNA complex, some local alteration of the structure would be required to bring the residue into an active configuration [126]. As for γδ resolvase, it is thought that this conformational activation is induced by protein interactions between XerD and the partner recombinase XerC (our unpublished results). Another major conformational change would be required to move the globular N-terminal domain of XerD (absent from the λ and HP1 Int structures) that blocks the access to the active site. The fact that the peptide connecting the two XerD domains is disordered in the crystal is again indicative of structural flexibility [126].

Examples of Molecular Markers | Plant Genetics

Molecular markers are DNA sequences whose inheritance pattern can be established. Classic examples of molecular markers are: 1. Restriction Fragment Length Polymorphism (RFLP) 2. Randomly Amplified Polymorphic DNA (RAPD) 3. Amplified Fragment Length Polymorphism (AFLP) 4. Sequence Characterized Amplified Region (SCAR).

Example # 1. Restriction Fragment Length Polymorphism (RFLP):

This non-PCR-based approach relies on hybridization of a probe to fragments of genomic DNA following digestion with restriction enzymes. The polymorphic nature of DNA was first elucidated by RFLP technology. Genetic diversity among plant species is mainly due to variation in the DNA sequence, in which the genetic information is stored.

The genomic DNA from the sample being tested is digested with restriction endonucleases. These enzymes cut DNA at specific sites known as restriction enzyme recognition sequences. The DNA fragments resulting from the digestion are separated by electrophoresis on an agarose gel and then subjected to Southern blotting.

Southern blotting transfers the DNA from the gel to a nylon membrane to allow hybridization with a probe. A pre-labelled probe is then hybridized with the blotted DNA. A suitable DNA probe is usually derived from the species being studied. Labelled probes are mainly genomic clones or cDNA. Sometimes ribosomal RNA genes (multiple-copy genes) are useful for analysis.

Both radiolabelled and non-radiolabelled probes can be used in this process. Probes were traditionally labelled with radioisotopes. Non-radioactive labels include biotin, digoxigenin and fluorescent materials. The hybridization pattern reveals polymorphism in the DNA sample and shows sequence differences between individuals.

Although DNA replication proceeds accurately, several agents within the cell are responsible for alterations in the sequence. Single base-pair changes in the DNA can alter the sequence, while chromosomal aberrations such as translocations, inversions and deletions account for much of the larger-scale variation.

Changes in the DNA sequence, whether small or large, can result in the loss or gain of a recognition site. A restriction enzyme that would normally cut at a given site fails to do so when the recognition sequence is altered, which in turn leads to restriction fragments of different lengths. Thus, a fragment-length difference detected with one specific restriction enzyme is considered one RFLP.

Altered nucleotides distributed throughout the genome result in restriction fragments of different lengths between genotypes, which can be detected on a Southern blot by hybridization with suitable probes. For example, the restriction enzyme EcoRI cuts DNA at a specific recognition sequence.

Loss of one or more nucleotides in the restriction recognition sequence results in a lack of recognition, and the EcoRI enzyme fails to cleave the DNA strands at the altered site. As a result, EcoRI generates larger DNA fragments. In this fashion, an RFLP between two homologous chromosomes can be detected: the presence of the restriction site on one chromosome produces shorter fragments, whereas the lack of the site on the other homologous chromosome produces longer fragments. Thus, it is possible to distinguish the two chromosomes in such individuals based on this RFLP.
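The effect of a lost restriction site on fragment length can be illustrated with a small in silico digest. This is a minimal sketch under stated assumptions: the two allele sequences are invented for illustration, the `digest` helper is a hypothetical name, and the DNA is treated as a single linear top strand cut by EcoRI (recognition sequence GAATTC, cut after the first G).

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1         # EcoRI cuts between G and AATTC on the top strand

def digest(sequence, site=ECORI_SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths after cutting a linear sequence at every site."""
    cuts = [m.start() + cut_offset for m in re.finditer(re.escape(site), sequence)]
    boundaries = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]

# Hypothetical alleles: allele_b carries a point mutation (A -> G) that
# destroys the first EcoRI site of allele_a.
allele_a = "ATGC" * 5 + "GAATTC" + "CCGG" * 10 + "GAATTC" + "TTAA" * 3
allele_b = allele_a.replace("GAATTC", "GAGTTC", 1)

fragments_a = digest(allele_a)  # both sites cut
fragments_b = digest(allele_b)  # first site lost, so two fragments fuse

print(fragments_a)
print(fragments_b)
```

The allele that keeps both sites yields three fragments, whereas the mutated allele yields a single longer fragment in place of the first two; a probe hybridizing to that region would therefore reveal bands of different size on the blot.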

In addition to chromosomal alterations, repeated sequences 9 to 65 base pairs long that occur a variable number of times (10-300 times) may also provide a valuable level of polymorphism in higher organisms, which can be easily detected by hybridization. This special case of RFLP analysis is referred to as VNTR (variable number of tandem repeats) analysis.

These VNTR sequences, otherwise known as minisatellites, are considered valuable markers and serve as a useful tool for distinguishing the genotypes of plants such as rice (Oryza sativa) and bean (Phaseolus vulgaris). These minisatellite markers can be studied using probes generated by PCR.

RFLP requires large quantities of high-quality DNA for detection of single-copy loci and detects only a fraction of the existing sequence variability. The polymerase chain reaction (PCR) provides an additional approach to the analysis, which reduces the quantity of DNA required. PCR can amplify a specific region of the DNA obtained. The amplified sequence can then be analysed for RFLPs directly on a stained agarose gel, without Southern hybridization.

Fig. 24.1A. Overview of RFLP technique.

Example # 2. Randomly Amplified Polymorphic DNA (RAPD):

These are dominant markers, introduced in 1990. RAPD is one of the most widely used techniques for characterising DNA from plants and other organisms. The technique is based on PCR with short oligonucleotide primers of random sequence, which is how Randomly Amplified Polymorphic DNA (RAPD) markers are generated.

This technique is also the basis for the arbitrarily primed polymerase chain reaction (AP-PCR) and DNA amplification fingerprinting (DAF). The principle behind RAPD is that amplification uses short primers (9-10 mer), so that many sites in a genomic sample are potential templates for the primers. Variation in the concentration of primers or template and in the PCR conditions results in the amplification of different products, and hence in different RAPD profiles.

The choice of standard primers, nucleotides, DNA polymerase type and magnesium concentration all contribute to reproducibility. When genomic DNA from two plant species is subjected to RAPD, it often produces different amplification patterns. A particular fragment generated for one species but not for the other represents a DNA polymorphism in the RAPD profiles. This difference can be used as a genetic marker, or RAPD marker.

An advantage of RAPD over RFLP is that it can be performed on crude DNA extracts, whereas RFLP requires relatively pure DNA. Even a small amount of DNA at the nanogram level (5-20 ng) is sufficient for RAPD. In addition, radioisotopes are not essential, and the whole genome can be surveyed using random primers.

Automation is comparatively easy, and the technique is of intermediate reliability. RAPD is efficient for the analysis of less well-known species because it can be applied without prior knowledge of gene sequences. RAPD is performed like a standard polymerase chain reaction, using genomic DNA from the species of interest and random primers.

Random primers are commercially available and can be prepared from different combinations of nucleotides. These primers are synthesised with randomly chosen sequences. Each random primer anneals to various regions of the DNA, so that many different loci can be analysed.

The RAPD method uses an agarose gel to analyse the PCR products. For example, the DNA of two plant varieties or individuals can be distinguished by their primer binding sites. A primer that binds to the DNA of one variety may fail to bind to the other because the binding site is missing, which results in the absence of a particular band among the PCR-amplified products.
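The presence or absence of a band can be sketched as a simple string search. The following is a simplified illustration, not a model of real PCR: the primer, the two template sequences and the helper names (`revcomp`, `find_all`, `rapd_bands`) are all hypothetical, and annealing is assumed to require an exact match, whereas real short primers tolerate some mismatches.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_all(seq, pattern):
    """Start positions of every (possibly overlapping) occurrence of pattern."""
    positions, i = [], seq.find(pattern)
    while i != -1:
        positions.append(i)
        i = seq.find(pattern, i + 1)
    return positions

def rapd_bands(template, primer, max_len=2000):
    """Sizes of products expected when one primer anneals to both strands
    at two sites that face each other within max_len base pairs."""
    forward = find_all(template, primer)           # primes synthesis rightward
    reverse = find_all(template, revcomp(primer))  # primes synthesis leftward
    sizes = set()
    for f in forward:
        for r in reverse:
            size = r + len(primer) - f
            if r > f and size <= max_len:
                sizes.add(size)
    return sorted(sizes)

primer = "GATCCAGTCA"  # hypothetical 10-mer
variety_1 = "TT" + primer + "A" * 30 + revcomp(primer) + "TT"
variety_2 = variety_1.replace(primer, "GATCCAGTTA", 1)  # one base changed in the site

bands_1 = rapd_bands(variety_1, primer)
bands_2 = rapd_bands(variety_2, primer)
print(bands_1, bands_2)
```

In this toy case the first variety gives one product and the second gives none, because a single base change removes one of the two facing primer sites. This is also why RAPD markers are dominant: a heterozygote carrying one amplifiable allele shows the same band as a homozygote.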

The RAPD method can be refined to reveal more polymorphism if combined with restriction digestion. For example, certain economically important cereal crops such as wheat exhibit very little genetic variation. DNA extracted from a wheat sample is first digested with a restriction enzyme before being subjected to RAPD, which reveals finer DNA polymorphism.

Example # 3. Amplified Fragment Length Polymorphism (AFLP):

Amplified fragment length polymorphism (AFLP) is a PCR-based technique that involves restriction digestion of genomic DNA, followed by ligation of adaptors to the resulting DNA fragments and then selective PCR amplification of these fragments.

DNA from the plant to be analysed is digested with restriction nucleases, and short double-stranded adaptors (short oligonucleotides) are ligated to the ends of the DNA fragments. Adaptors are added because the sequences of the adaptors and the adjacent restriction sites serve as primer binding sites for subsequent amplification.

These restriction fragments are then amplified by PCR using primers complementary to the added adaptors and restriction sites. Additional specificity can be provided (Fig. 24.1B) by a few specific nucleotides attached to the 3′ end of the PCR primer. The choice of enzymes and primer length is crucial for obtaining the best results in difficult applications.
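How the extra 3′ nucleotides thin out the set of amplified fragments can be shown with a short enumeration. This is a rough sketch under explicit assumptions: the function names are hypothetical, each fragment is represented only by the sequence lying between the two adaptors, all four bases are taken to be equally likely, and a primer is assumed to extend only if its selective bases match perfectly.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def is_amplified(insert, ext_fwd, ext_rev):
    """True if the selective 3' bases of both primers match the fragment.

    insert  : sequence between the two adaptors (top strand, 5'->3')
    ext_fwd : selective bases of the forward primer, read along the top strand
    ext_rev : selective bases of the reverse primer, read along the bottom strand
    """
    return insert.startswith(ext_fwd) and revcomp(insert).startswith(ext_rev)

def expected_fraction(ext_fwd, ext_rev):
    """Expected amplified fraction if all four bases are equally likely."""
    return 0.25 ** (len(ext_fwd) + len(ext_rev))

# Every possible 4-base insert stands in for a pool of random fragments.
pool = ["".join(bases) for bases in product("ACGT", repeat=4)]
amplified = [frag for frag in pool if is_amplified(frag, "A", "C")]
observed_fraction = len(amplified) / len(pool)

print(len(amplified), observed_fraction, expected_fraction("A", "C"))
```

Under these assumptions each selective nucleotide cuts the amplified subset to about a quarter, so one selective base on each primer leaves roughly one fragment in sixteen. Real genomes are not random sequence, so the actual reduction will differ.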

Fig. 24.1B. Overview of AFLP technique.

The amplified fragments are separated on an agarose gel, subjected to Southern blotting, and visualised by autoradiography or with fluorescent sequencing equipment. The level of polymorphism detected per locus by AFLP is lower than with other techniques such as microsatellites.

The AFLP technique can, however, analyse a large number of polymorphic loci simultaneously with a single primer combination on one gel, which gives it an advantage over other mapping methods. Research kits for labelling AFLP primers are manufactured under licence from Keygene and Life Technologies (USA).

Example # 4. Sequence Characterized Amplified Region (SCAR) and Sequence Tagged Sites (STS):

A sequence characterized amplified region (SCAR) is a PCR-based marker developed by sequencing markers amplified in arbitrary-primer experiments; RAPD markers can be converted into SCARs in this way. The amplified RAPD products are cloned and sequenced, and primer sequences are derived from the ends of the band identified as the RAPD marker.

In the SCAR method, longer primers are designed to offer a greater degree of specificity. The sequence of the amplified product is used to design these longer primers (Fig. 24.3). Longer primers with increased specificity can thus amplify a single reproducible band. Converting a dominant RAPD marker into a codominant SCAR allows better assessment of F2 individuals.

The reproducibility of SCARs with longer primers is an advantage over the short primers used for RAPD analysis. Similarly, an RFLP can be converted into a SCAR by sequencing the two ends of the genomic DNA clone and designing primers based on the end sequences. The polymorphic regions can then be amplified by PCR using these primers directly on genomic DNA. If no amplified fragment length polymorphism is observed, the PCR fragments are digested with restriction enzymes to detect RFLPs within the amplified fragments.

STSs are short stretches of unique DNA sequence that can be amplified by PCR from a genomic library or genomic DNA using specific oligonucleotide primers. STS primers can be designed from the sequence of genomic RFLP markers.


There are several different approaches to clone and you will need to find the right approach for your research. Below are some examples of popular cloning methods to generate a recombinant DNA construct:

Restriction Enzyme Based Cloning

Restriction enzymes are enzymes that cut DNA at or near a specific short nucleotide sequence called a restriction site. The restriction enzyme based cloning method depends on restriction enzymes to ‘cut’ both a vector and a DNA insert, and on a DNA ligase to ‘paste’ the DNA fragment into the vector. This method is useful when you have one DNA insert to incorporate into the plasmid.

By using PCR, you can add restriction enzyme sites to your DNA insert to accommodate this method. Your DNA insert must not contain an internal site for the restriction enzyme you use on your plasmid. Otherwise, the enzyme will also cut your DNA insert at this internal site and produce unwanted smaller DNA fragments.

You can choose to use one restriction enzyme or two enzymes to cut your DNA fragment and vector. When using two enzymes, both enzymes must be compatible or work well in the same restriction enzyme buffer.

Restriction Enzyme Based Cloning. 1. Short sequences containing restriction sites are added into the 5’ ends of primers for DNA amplification by PCR. 2. Both the vector and DNA fragment are digested with restriction enzymes to create cohesive ends. 3. The vector and DNA fragment are ligated. 4. The recombinant DNA enters the host cell during transformation.
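The two checks described above (looking for internal restriction sites and adding a site to the 5’ end of a primer) can be sketched as follows. The enzyme table is limited to three well-known recognition sequences, the function names are hypothetical, and the four extra ‘clamp’ bases in front of the site are an assumption reflecting the common practice of giving the enzyme a few bases to sit on at the end of a PCR product.

```python
# Recognition sequences of three common enzymes (all palindromic, so
# scanning one strand is enough).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def enzymes_cutting(insert, enzymes):
    """Return the enzymes from `enzymes` that have a site inside the insert."""
    insert = insert.upper()
    return [e for e in enzymes if SITES[e] in insert]


def add_site(primer, enzyme, clamp="GCGC"):
    """Prepend clamp bases and a restriction site to a primer's 5' end."""
    return clamp + SITES[enzyme] + primer


insert = "ATGGCTGAATTCGGTTAA"  # toy insert with an internal EcoRI site
print(enzymes_cutting(insert, ["EcoRI", "BamHI", "HindIII"]))  # ['EcoRI']
print(add_site("ATGGCTGAA", "BamHI"))  # GCGCGGATCCATGGCTGAA
```

Here EcoRI would be a poor choice for cloning this insert, while BamHI and HindIII do not cut inside it.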

PCR Cloning

PCR cloning relies on ligation, in which DNA ligase joins a DNA fragment to a vector. Here, ligation inserts the PCR product directly into a ‘T-tailed’ plasmid.

PCR products amplified with a non-proofreading polymerase such as Taq carry a single adenine residue at the 3’ ends of the DNA fragment (‘A-tailed’ ends). A ‘T-tailed’ plasmid vector has a single 3’ deoxythymidine (T) overhang at each end of the linearized plasmid. These PCR products can therefore be ligated into ‘T-tailed’ vectors using DNA ligase, and this step is followed by transformation.

You can choose this method when your restriction enzymes are not compatible or you find an internal restriction enzyme site in your DNA insert.

One disadvantage of this method is that you need a specific ‘T-tailed’ vector to perform PCR cloning, and ‘T-tailed’ vectors may lack elements that support your protein research, such as a promoter region or protein tag.

PCR Cloning. 1. PCR Product with A-tailed ends is combined with T-tailed vector. 2. During ligation, PCR product is inserted into the vector.

Ligation Independent Cloning (LIC)

Ligation independent cloning (LIC) is performed by generating short sequences at the ends of a DNA insert that match short sequences on a plasmid vector. Enzymes with 3’ to 5’ exonuclease activity chew back the 3’ ends and generate complementary cohesive ends on the DNA fragment and the linearized vector. The two are then combined for the annealing step. After transformation, the host organism repairs the nicks in the recombinant DNA.

The advantage of this method is that it does not create any new restriction sites or unwanted sequences in the final DNA construct.

LIC Cloning. 1. Short sequences that match sequences on the plasmid are added to the 5’ ends of primers for DNA amplification by PCR. 2. Plasmid is linearized by using restriction enzyme. 3. Both DNA insert and vector are treated with 3’ to 5’ exonuclease to create cohesive overhangs. 4. Both DNA and vector are annealed. 5. After transformation, the host cell repairs the nicks on the recombinant DNA.

Seamless Cloning (SC)

The seamless cloning (SC) technique (similar to LIC) depends on matching short sequences at the ends of a DNA fragment to short sequences on a plasmid vector. The SC method requires an enzyme with 5’ to 3’ exonuclease activity to create 3’ overhangs, a DNA polymerase to fill in gaps, and a DNA ligase to seal the nicks.

The advantage of LIC and SC over restriction enzyme based cloning is that they allow insertion of more than one DNA fragment into a vector. In addition, when you find an internal restriction enzyme site in your DNA fragment, you can use LIC or SC as an alternative cloning method.

Seamless Cloning. 1. Short sequences are added into the 5’ ends of primers for DNA amplification by PCR. 2. Vector is digested by a restriction enzyme. 3. Both DNA fragment and vector are treated with an enzyme with 5’ to 3’ exonuclease activity to create cohesive overhangs. 4. During ligation, the DNA fragment is inserted into the vector.
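Both LIC and SC start by adding vector-matching sequences to the 5’ ends of the PCR primers. A minimal sketch of that primer design step is shown below. It assumes the vector sequences on either side of the cut are known as top-strand strings, and the function name and the default lengths (15 bases of overlap, 18 bases annealing to the insert) are illustrative choices rather than values from any particular kit.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlap_primers(vector_left, vector_right, insert, overlap=15, anneal=18):
    """Primers that add vector-matching ends to an insert.

    vector_left  : top-strand vector sequence ending at the cut site
    vector_right : top-strand vector sequence starting after the cut site
    insert       : top-strand sequence of the fragment to be cloned
    """
    # Forward primer: vector overlap (5' tail) + start of the insert.
    fwd = vector_left[-overlap:] + insert[:anneal]
    # Reverse primer: written 5'->3' on the bottom strand.
    rev = revcomp(insert[-anneal:] + vector_right[:overlap])
    return fwd, rev


fwd, rev = overlap_primers("AAAACCCC", "GGGGTTTT", "ATGAAACCGTAA",
                           overlap=4, anneal=6)
print(fwd)  # CCCCATGAAA
print(rev)  # CCCCTTACGG
```

The short lengths in the usage example are only there to keep the output readable.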

Recombinational Cloning

This method requires site-specific DNA recombinase enzymes, which exchange and recombine DNA pieces with particular recombination sites.

The first step in this method is to insert a DNA fragment into an entry vector generating an entry clone. Another way to create an entry clone is by swapping and recombining a donor vector into an entry clone.

After creating an entry clone, the next step is to swap and recombine the entry clone into a destination clone. The benefit of this approach is that it can be used to place more than five elements into a single vector. It is commonly used to identify protein-binding interactions or to optimize protein expression, purification and solubility. To perform this method, you will need a particular plasmid that has recombination sites.

Recombinational Cloning. 1. DNA fragment is inserted into an entry vector to create an entry clone. 2. Entry clone and destination vector are combined by a recombinase enzyme to create a destination clone.

In this article, we briefly explained five molecular cloning methods that are commonly used by researchers. Molecular cloning has been used for many different purposes in a variety of research fields. In agriculture, cloning can shorten the time required to insert and develop a new beneficial trait in crops, such as a drought tolerant trait or a pest resistant trait. In medicine, molecular cloning can be used to produce therapeutic proteins, such as a clot dissolving protein and interferon, or to synthesize other useful proteins, such as insulin, growth hormones, and monoclonal antibodies. Overall, the application of molecular cloning continues to improve and create more remarkable developments in the fields of agriculture, pharmaceutical industry, and biomedical research.

Making safe choices: DNA repair mechanisms avoid chromosomal combinations predisposed to disease

Two pathways of DNA damage repair. Rad51 faithfully repairs DNA damage through DNA strand exchange with intact DNA (left). Rad52 causes gross chromosomal rearrangements (GCRs) through single-strand annealing (SSA) (right). Credit: Osaka University

Homologous recombination is an essential process of DNA repair to maintain genomic integrity of the organism. Now, researchers from Japan have identified mechanisms that choose between alternate pathways of DNA repair to limit anomalous and deleterious chromosomal combinations that may be predisposed to cancer and genetic diseases.

In a recent study, researchers from Osaka University show that Rad52-dependent single-strand annealing (SSA) is the mechanism of homologous pairing that leads to gross chromosomal rearrangements (GCRs) at the centromere. They also identify mutations that allow this pathway to predominate in preference to the error-free Rad51 pathway.

Cells are under constant genotoxic pressure from exogenous and endogenous factors. Genome instability underlies several diseases including cancer. Consequently, DNA repair necessarily occurs thousands of times per day in each human cell to correct detrimental mutations, blockage of replication and transcription, and chromosomal breakage. Paradoxically, this recombination may occasionally cause dysfunctional GCRs.

The centromere of a chromosome is the specialized DNA sequence that links a pair of sister chromatids. Many organisms, including humans and fission yeast (Schizosaccharomyces pombe), have repetitive sequences at centromeres. This renders them vulnerable to the formation of isochromosomes (dysfunctional mirror-image chromosomes) through a specific type of recombination between inverted repeats. In experiments conducted on fission yeast, the researchers demonstrated that DNA replication mechanisms reduce the occurrence of GCRs by inhibiting SSA activity at centromeres.

An in vitro single-strand annealing (SSA) reaction. Compared to the wild-type Rad52, the mutant Rad52-R45K produced limited amounts of the SSA product (upper bands). *, ³²P label. Credit: Osaka University

"At centromeres, Rad51-dependent recombination predominates and other recombination pathways appear to be inhibited," explains Atsushi T. Onaka, lead author. "As Rad51 promotes conservative non-crossover recombination, the choice of recombination pathways is important for suppressing GCRs. This recombination predominantly occurs between inverted repeats, thereby suppressing formation of isochromosomes. However, how Rad51-dependent recombination predominates at centromeres is unknown."

Jie Su, co-lead author, explains further. "We showed that Rad52-dependent SSA is the mechanism of homologous pairing that leads to centromeric GCRs. The rad52-R45K mutation impairs SSA activity of the Rad52 protein and reduces isochromosome formation in rad51 mutant cells. To better understand how Rad52-dependent SSA is suppressed at centromeres, we performed a genetic screen and found that specific mutations in replication fork proteins and a fork protection complex increase Rad52-dependent SSA at centromeres and isochromosome formation."

"Our research implicates DNA replication machinery in the recombination pathway choice at centromeres, preventing Rad52-dependent SSA that results in GCRs," explains senior author Takuro Nakagawa. "This knowledge identifies Rad52 as a promising target in treating cancer."

The article, "DNA replication machinery prevents Rad52-dependent single-strand annealing that leads to gross chromosomal rearrangements at centromeres," was published in Communications Biology.


Recombineering and Gene Targeting

One of the most widely used tools in modern biology is molecular cloning with restriction enzymes, which create compatible ends between DNA fragments that allow them to be joined together. However, this technique has certain restrictions that limit its applicability for large or complex DNA construct generation. A newer technique that addresses some of these shortcomings is recombineering, which modifies DNA using homologous recombination (HR), the exchange between different DNA molecules based on stretches of similar or identical sequences. Together with gene targeting, which takes advantage of endogenous HR to alter an organism’s genome at a specific locus, HR-based cloning techniques have greatly improved the speed and efficacy of high-throughput genetic engineering.

In this video, we introduce the principles of HR, as well as the basic components required to perform a recombineering experiment, including recombination-competent organisms and genomic libraries such as bacterial artificial chromosomes (BAC). We then walk through a protocol that uses recombineering to generate a gene-targeting vector that can ultimately be transfected into embryonic stem cells to generate a transgenic animal. Finally, several applications that highlight the utility and variety of recombineering techniques will be presented.


Cloning by homologous recombination, or recombineering, has greatly improved researchers’ ability to conduct high-throughput genetic engineering. Classical molecular cloning requires digestion of vectors and inserts with the same restriction enzymes to generate compatible ends for “recombination,” but especially when trying to isolate a region from a longer sequence, such as a stretch of genomic DNA, there are not always restriction enzymes that will cut uniquely around the region of interest, and not within the region or elsewhere in the sequence. By avoiding the need for these restriction sites, recombineering provides a much more efficient and cost-effective way to make genetic engineering constructs.

In this video, we will introduce the principles of genetic recombination, provide a general protocol for recombineering a gene-targeting vector, and discuss several applications of these technologies.

First, let’s discuss what recombination is and how it can be applied.

Genetic recombination entails the exchange of information between DNA molecules. In “homologous recombination,” relatively large stretches of similar or identical DNA sequences are exchanged. Cells use this process to repair double strand breaks, and sexually reproducing eukaryotes also carry this out between homologous chromosomes during meiosis to increase genetic diversity.

Homologous recombination has been harnessed for a number of experimental purposes, including the modification of specific endogenous loci in cells—called “gene targeting.” It is also applied to the engineering of genetic constructs, a process called “recombineering,” which is especially useful for making modifications to large stretches of DNA for targeting to an organism’s genome. To do this, researchers can make use of “genomic libraries,” or sets of vectors each containing long genomic DNA fragments. Various types of libraries with different capacities, usually propagated in bacterial clones, have been created and can be ordered from commercial or nonprofit repositories. One of the most commonly used types of genomic library is called a bacterial artificial chromosome or BAC, which can carry an insert size between 150-350 kilobases.

In order to recombineer a construct, scientists need an organism in which homologous recombination occurs. The yeast Saccharomyces cerevisiae is naturally “recombination-competent” and can readily carry out homologous recombination between a vector and insert. Another commonly applied system uses the “Red” proteins from bacteriophage lambda, which are reconstituted in the bacterium E. coli. Vectors can be recombineered either in previously established Red-expressing bacterial strains, or a plasmid encoding Red system components can be co-transformed with the recombineering substrates into the bacteria.

Now, let’s walk through a protocol to create a gene-targeting vector by recombineering.

The aim of this protocol is to create a genetically modified mouse with targeted changes to a specific gene. Briefly, desired modifications to the gene of interest are made in a BAC, the modified sequence is “retrieved” into a targeting vector, and this vector is transfected into embryonic stem cells for gene targeting. The stem cells can then be used to generate the genetically modified animal. For instructions on this final procedure, please refer to the JoVE Science Education video “Genetic Engineering of Model Organisms.”

To begin, a BAC clone containing the genomic sequence of interest is identified and ordered. Next, the homology arms, or oligonucleotides containing the 30-50 basepair homologous regions, are designed. These oligonucleotides are used as primers in PCR reactions to generate the DNA “cassettes” used to modify the gene of interest. These cassettes typically contain, for example, antibiotic resistance genes that can act as selection markers, which can be readily amplified from many publicly available plasmids. The BAC bacterial clone is then transformed, for example by electroporation, with a plasmid encoding the Red system components, followed by the homology arm-containing cassettes. The bacteria are then incubated at the appropriate temperature to induce recombination.
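The primer design step in this protocol can be sketched in Python. The example below assumes the locus and cassette are available as top-strand strings and that the region `locus[start:end]` is to be replaced by the cassette; the function name, the 20-base priming length and the toy sequences are hypothetical, while the 50-base default arm length follows the 30-50 basepair range mentioned above.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def targeting_primers(locus, start, end, cassette, arm=50, prime=20):
    """Primers that put homology arms on a cassette.

    The PCR product replaces locus[start:end] once recombination occurs
    between the arms and the matching sequences in the BAC.
    """
    if start < arm or end + arm > len(locus):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = locus[start - arm:start]
    downstream = locus[end:end + arm]
    # 5' homology arm followed by the part that anneals to the cassette.
    fwd = upstream + cassette[:prime]
    rev = revcomp(cassette[-prime:] + downstream)
    return fwd, rev


locus = "A" * 10 + "CCCCC" + "G" * 10       # toy locus, target = the C run
fwd, rev = targeting_primers(locus, 10, 15, "ATGCCGTAA", arm=4, prime=3)
print(fwd)  # AAAAATG
print(rev)  # CCCCTTA
```

In practice the arms would also be checked for uniqueness within the BAC, which this sketch does not attempt.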

To recover the modified sequence from the BAC, a “retrieval” vector is linearized and amplified using primers containing homology arms that cover several kilobases around the site of modification. This vector backbone is transformed into the BAC bacterial clone, and recombineering is induced again. The vector will be “gap-repaired” by “retrieving” the appropriate sequence from the BAC. This results in a gene-targeting vector containing the modified sequence of interest, as well as regions that are homologous to genomic DNA surrounding the modified sequence. This vector can then be purified, linearized and introduced into embryonic stem cells where the endogenous homologous recombination machinery will target the modified sequence to the corresponding locus, thereby altering the host genome.

Let’s now look at some applications of recombination-based engineering techniques.

First, researchers can use recombineering to quickly and easily engineer the bacterial genome, for example, to study protein complexes. Here, a cassette encoding a sequential peptide affinity tag as well as a kanamycin resistance selection marker was designed to be recombineered into the 3′ end of the gene of interest. This vector was then introduced into Red “recombination-competent” E. coli, and successful recombinants were isolated using selective media. The tagged protein complexes were biochemically purified using proteins that bind strongly to the tags, and analyzed by mass spectrometry to identify the interaction partners of the protein of interest.

Recombineering and gene targeting can also be used to modify the genome of parasitic organisms, such as the obligate intracellular parasite Toxoplasma gondii, which causes the disease toxoplasmosis. In this study, a targeting vector was recombineered in yeast, where genomic sequences flanking the gene of interest were placed around a selection marker. The vector was then linearized and electroporated into the parasites. Limiting dilution plating in selective media was used to isolate successful recombinants, in which the parasite’s homologous recombination machinery replaced the genomic sequence with the engineered sequence, thus deleting the gene of interest. PCR of the targeted region confirmed positive recombination events.

Finally, a procedure known as subcloning plus insertion, or SPI, has been devised in which gap repair and cassette insertion occur simultaneously, making the recombineering process simpler and more flexible. Critical for the success of this process is the proper design of the multiple homology arms, which, at up to 180 bp, are longer than in conventional recombineering so as to increase recombination efficiency. After incorporating the homology arms into the subcloning plasmid and insertion cassettes by PCR, they were transformed together into a Red-expressing BAC clone, and recombineering was induced. Successfully recombineered constructs were then identified by PCR or restriction digest analysis.

You’ve just watched JoVE’s video on recombineering. In this video, we have discussed the principles of recombination and how they can be used in genetic engineering, a protocol for a recombineering project, and finally several applications of these techniques. As always, thanks for watching!


All commonly used cloning vectors in molecular biology have key features necessary for their function, such as a suitable cloning site and selectable marker. Others may have additional features specific to their use. For reasons of ease and convenience, cloning is often performed using E. coli. Thus, the cloning vectors used often have elements necessary for their propagation and maintenance in E. coli, such as a functional origin of replication (ori). The ColE1 origin of replication is found in many plasmids. Some vectors also include elements that allow them to be maintained in another organism in addition to E. coli, and these vectors are called shuttle vectors.

Cloning site

All cloning vectors have features that allow a gene to be conveniently inserted into the vector or removed from it. This may be a multiple cloning site (MCS) or polylinker, which contains many unique restriction sites. The restriction sites in the MCS are first cleaved by restriction enzymes, then a PCR-amplified target gene also digested with the same enzymes is ligated into the vectors using DNA ligase. The target DNA sequence can be inserted into the vector in a specific direction if so desired. The restriction sites may be further used for sub-cloning into another vector if necessary. [2]

Other cloning vectors may use topoisomerase instead of ligase and cloning may be done more rapidly without the need for restriction digest of the vector or insert. In this TOPO cloning method a linearized vector is activated by attaching topoisomerase I to its ends, and this "TOPO-activated" vector may then accept a PCR product by ligating both the 5' ends of the PCR product, releasing the topoisomerase and forming a circular vector in the process. [3] Another method of cloning without the use of DNA digest and ligase is by DNA recombination, for example as used in the Gateway cloning system. [4] [5] The gene, once cloned into the cloning vector (called entry clone in this method), may be conveniently introduced into a variety of expression vectors by recombination. [6]

Selectable marker

A selectable marker is carried by the vector to allow the selection of positively transformed cells. Antibiotic resistance is often used as a marker, an example being the beta-lactamase gene, which confers resistance to the penicillin group of beta-lactam antibiotics such as ampicillin. Some vectors contain two selectable markers; for example, the plasmid pACYC177 has both ampicillin and kanamycin resistance genes. [7] Shuttle vectors, which are designed to be maintained in two different organisms, may also require two selectable markers, although some selectable markers such as resistance to zeocin and hygromycin B are effective in different cell types. Auxotrophic selection markers that allow an auxotrophic organism to grow in minimal growth medium may also be used; examples are LEU2 and URA3, which are used with their corresponding auxotrophic strains of yeast. [8]

Another kind of selectable marker allows for the positive selection of plasmids carrying a cloned gene. This may involve the use of a gene lethal to the host cells, such as barnase, [9] CcdB, [10] and the parD/parE toxins. [11] [12] This typically works by disrupting or removing the lethal gene during the cloning process; unsuccessful clones in which the lethal gene remains intact kill the host cells, so only successful clones are selected.

Reporter gene

Reporter genes are used in some cloning vectors to facilitate the screening of successful clones by using features of these genes that allow successful clones to be easily identified. Such features present in cloning vectors may be the lacZα fragment for α complementation in blue-white selection, and/or marker or reporter genes in frame with and flanking the MCS to facilitate the production of fusion proteins. Examples of fusion partners that may be used for screening are the green fluorescent protein (GFP) and luciferase.

Elements for expression

A cloning vector need not contain suitable elements for the expression of a cloned target gene, such as a promoter and ribosomal binding site (RBS). Many do, however, and may then work as expression vectors. The target DNA may be inserted into a site that is under the control of a particular promoter necessary for the expression of the target gene in the chosen host. Where the promoter is present, the expression of the gene is preferably tightly controlled and inducible so that proteins are only produced when required. Some commonly used promoters are the T7 and lac promoters. The presence of a promoter is necessary when screening techniques such as blue-white selection are used.

Cloning vectors without promoter and RBS for the cloned DNA sequence are sometimes used, for example when cloning genes whose products are toxic to E. coli cells. Promoter and RBS for the cloned DNA sequence are also unnecessary when first making a genomic or cDNA library of clones since the cloned genes are normally subcloned into a more appropriate expression vector if their expression is required.

Some vectors are designed for transcription only with no heterologous protein expressed, for example for in vitro mRNA production. These vectors are called transcription vectors. They may lack the sequences necessary for polyadenylation and termination, therefore may not be used for protein production.

A large number of cloning vectors are available, and the choice of vector may depend upon a number of factors, such as the size of the insert, copy number and cloning method. Large inserts may not be stably maintained in a general cloning vector, especially one with a high copy number, so cloning large fragments may require a more specialised cloning vector. [13]

Plasmid

Plasmids are autonomously replicating circular extra-chromosomal DNA. They are the standard cloning vectors and the ones most commonly used. Most general plasmids may be used to clone DNA inserts of up to 15 kb in size. One of the earliest commonly used cloning vectors is the pBR322 plasmid. Other cloning vectors include the pUC series of plasmids, and a large number of different cloning plasmid vectors are available. Many plasmids have a high copy number, for example pUC19, which has a copy number of 500-700 copies per cell, [14] and a high copy number is useful as it produces a greater yield of recombinant plasmid for subsequent manipulation. However, low-copy-number plasmids may be preferred in certain circumstances, for example when the protein from the cloned gene is toxic to the cells. [15]

Some plasmids contain an M13 bacteriophage origin of replication and may be used to generate single-stranded DNA. These are called phagemids, and examples are the pBluescript series of cloning vectors.

Bacteriophage

The bacteriophages used for cloning are the λ phage and M13 phage. There is an upper limit on the amount of DNA that can be packed into a phage (a maximum of 53 kb), therefore to allow foreign DNA to be inserted into phage DNA, phage cloning vectors may need to have some non-essential genes deleted, for example the genes for lysogeny since using phage λ as a cloning vector involves only the lytic cycle. [16] There are two kinds of λ phage vectors - insertion vector and replacement vector. Insertion vectors contain a unique cleavage site whereby foreign DNA with size of 5–11 kb may be inserted. In replacement vectors, the cleavage sites flank a region containing genes not essential for the lytic cycle, and this region may be deleted and replaced by the DNA insert in the cloning process, and a larger sized DNA of 8–24 kb may be inserted. [17]

There is also a lower size limit for DNA that can be packed into a phage, and vector DNA that is too small cannot be properly packaged into the phage. This property can be used for selection - vector without insert may be too small, therefore only vectors with insert may be selected for propagation. [18]

Cosmid

Cosmids are plasmids that incorporate a segment of bacteriophage λ DNA that has the cohesive end site (cos), which contains elements required for packaging DNA into λ particles. They are normally used to clone large DNA fragments of between 28 and 45 kb. [13]

Bacterial artificial chromosome

Inserts of up to 350 kb can be cloned in a bacterial artificial chromosome (BAC). BACs are maintained in E. coli with a copy number of only 1 per cell. [17] BACs are based on the F plasmid; another artificial chromosome, the PAC, is based on the P1 phage.

Yeast artificial chromosome

Yeast artificial chromosomes (YACs) are used as vectors to clone DNA fragments of more than 1 megabase (1 Mb = 1000 kb) in size. They are useful for cloning the larger DNA fragments required in genome mapping, as in the Human Genome Project. A YAC contains a telomeric sequence and an autonomously replicating sequence (features required to replicate linear chromosomes in yeast cells). These vectors also contain suitable restriction sites for cloning foreign DNA, as well as genes to be used as selectable markers.

Human artificial chromosome

Human artificial chromosomes may be useful as gene transfer vectors for gene delivery into human cells, and as tools for expression studies and for determining human chromosome function. They can carry very large DNA fragments (there is no upper limit on size for practical purposes), so they do not have the limited cloning capacity of other vectors, and they also avoid the possible insertional mutagenesis caused by integration of viral vectors into host chromosomes. [19] [20]
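The insert-size figures quoted in the sections above can be collected into a small lookup, sketched here in Python. The table contains only the numbers stated in the text; where the text gives no bound the entry is `None`, and the function name `candidate_vectors` is hypothetical. It lists vectors whose stated capacity range includes the insert size and ignores the other factors mentioned above, such as copy number and cloning method.

```python
# (min_kb, max_kb) insert sizes as quoted in the text; None = no bound stated.
CAPACITY_KB = {
    "plasmid": (None, 15),
    "lambda insertion vector": (5, 11),
    "lambda replacement vector": (8, 24),
    "cosmid": (28, 45),
    "BAC": (None, 350),
    "YAC": (None, None),  # text: fragments of more than 1 Mb can be cloned
    "HAC": (None, None),  # text: no practical upper limit
}


def candidate_vectors(insert_kb):
    """Vectors whose quoted capacity range includes the insert size."""
    matches = []
    for name, (low, high) in CAPACITY_KB.items():
        if (low is None or insert_kb >= low) and (high is None or insert_kb <= high):
            matches.append(name)
    return matches


print(candidate_vectors(40))   # ['cosmid', 'BAC', 'YAC', 'HAC']
print(candidate_vectors(500))  # ['YAC', 'HAC']
```

A real choice would usually favour the smallest vector that fits, since the larger systems are harder to handle.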

Animal and plant viral vectors

Viruses that infect plant and animal cells have also been manipulated to introduce foreign genes into plant and animal cells. The natural ability of viruses to adsorb to cells, introduce their DNA and replicate has made them ideal vehicles for transferring foreign DNA into eukaryotic cells in culture. A vector based on simian virus 40 (SV40) was used in the first cloning experiment involving mammalian cells. A number of vectors based on other types of viruses, such as adenoviruses and papillomavirus, have been used to clone genes in mammals. At present, retroviral vectors are popular for cloning genes in mammalian cells. In plants, viruses such as cauliflower mosaic virus, tobacco mosaic virus and geminiviruses have been used with limited success.

Many general purpose vectors such as pUC19 usually include a system for detecting the presence of a cloned DNA fragment, based on the loss of an easily scored phenotype. The most widely used is the gene coding for E. coli β-galactosidase, whose activity can easily be detected by the ability of the enzyme it encodes to hydrolyze the soluble, colourless substrate X-gal (5-bromo-4-chloro-3-indolyl-beta-d-galactoside) into an insoluble, blue product (5,5'-dibromo-4,4'-dichloro indigo). Cloning a fragment of DNA within the vector-based lacZα sequence of the β-galactosidase prevents the production of an active enzyme. If X-gal is included in the selective agar plates, transformant colonies are generally blue in the case of a vector with no inserted DNA and white in the case of a vector containing a fragment of cloned DNA.
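The screening logic above reduces to a small decision rule, sketched here in Python. It is an idealised model of the outcome the text describes as "generally" observed; the function name and the boolean inputs are illustrative assumptions.

```python
def colony_phenotype(has_vector, has_insert):
    """Expected outcome on selective agar containing X-gal (idealised)."""
    if not has_vector:
        return "no colony"  # no selectable marker, so the cell does not grow
    if has_insert:
        return "white"      # lacZα interrupted, X-gal is not hydrolysed
    return "blue"           # intact lacZα, X-gal is hydrolysed to the blue product


for cell in [(False, False), (True, False), (True, True)]:
    print(cell, colony_phenotype(*cell))
```

White colonies are therefore the ones picked for further analysis.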
