Was there originally a non-ribosomal way of synthesizing proteins?

Proteins are synthesized on ribosomes from mRNA copies of regions of the DNA. But ribosomes themselves are made up of proteins (and RNA). So how could the first ribosomes have arisen? Was there previously some other way of making proteins other than by ribosomes and mRNA?

Although I am of the same opinion as fpdx and the majority of scientists - that there was an RNA world in which the genetic material and the catalytic molecules were RNA (rather than DNA and protein, respectively) - there are some quite respectable scientists who hold a different view. As there is no proof for either viewpoint (and hence no correct answer to the question), it is important to present this alternative.

In brief, the alternative viewpoint is of a "proteins-first" world: i.e. the first proteins were made without ribosomes from chemical reactions in the "primeval soup". Only later did the contemporary and template-driven ribonucleoprotein machinery for making proteins evolve.

I do not know of a freely accessible online source of arguments in favour of the protein-first viewpoint: an article by C. Kurland in Bioessays 32: 866-871 (2010) requires library access or purchase. However, there is an online article discussing the objections to the RNA-world viewpoint which, even though it argues against these objections, gives one an idea of why the question cannot be considered settled. This is available at

(A quick answer to a question that will likely be closed for not being well focused on the physics of all this.)

First: the ribosome translates mRNA into the chain of amino acids that will eventually become a protein. The ribosome is a huge complex made of both proteins and RNA. RNA can have enzymatic activity, like a protein: the ribosome is a ribozyme.

This chicken-or-egg problem has puzzled biologists, chemists and biophysicists for a while. The structure of the ribosome itself was the "smoking gun" for the model we have today: the RNA world. In a few words: because RNA can act as an enzyme, as proteins do, and is so similar to DNA, our DNA world (i.e. information stored in DNA, work done by proteins, RNA between the two) probably emerged from a fully RNA world, in which RNA was responsible for both catalytic activity and information storage. How all this happened in detail through evolution is probably the subject of several Nobel Prizes to come.

Early evolution of life: Study of ribosome evolution challenges 'RNA World' hypothesis

In the beginning -- of the ribosome, the cell's protein-building workbench -- there were ribonucleic acids, the molecules we call RNA that today perform a host of vital functions in cells. And according to a new analysis, even before the ribosome's many working parts were recruited for protein synthesis, proteins also were on the scene and interacting with RNA. This finding challenges a long-held hypothesis about the early evolution of life.

The study appears in the journal PLoS ONE.

The "RNA world" hypothesis, first promoted in 1986 in a paper in the journal Nature and defended and elaborated on for more than 25 years, posits that the first stages of molecular evolution involved RNA and not proteins, and that proteins (and DNA) emerged later, said University of Illinois crop sciences and Institute for Genomic Biology professor Gustavo Caetano-Anollés, who led the new study. "I'm convinced that the RNA world (hypothesis) is not correct," Caetano-Anollés said. "That world of nucleic acids could not have existed if not tethered to proteins."

The ribosome is a "ribonucleoprotein machine," a complex that can have as many as 80 proteins interacting with multiple RNA molecules, so it makes sense that this assemblage is the result of a long and complicated process of gradual co-evolution, Caetano-Anollés said. Furthermore, "you can't get RNA to perform the molecular function of protein synthesis that is necessary for the cell by itself."

Proponents of the RNA world hypothesis make basic assumptions about the evolutionary origins of the ribosome without proper scientific support, Caetano-Anollés said. The most fundamental of these assumptions is that the part of the ribosome that is responsible for protein synthesis, the peptidyl transferase center (PTC) active site, is the most ancient.

In the new analysis, Caetano-Anollés and graduate student Ajith Harish (now a postdoctoral researcher at Lund University in Sweden) subjected the universal protein and RNA components of the ribosome to rigorous molecular analyses -- mining them for evolutionary information embedded in their structures. (They also analyzed the thermodynamic properties of the ribosomal RNAs.) They used this information to generate timelines of the evolutionary history of the ribosomal RNAs and proteins.

These two, independently generated "family trees" of ribosomal proteins and ribosomal RNAs showed "great congruence" with one another, Caetano-Anollés said. Proteins surrounding the PTC, for example, were as old as the ribosomal RNAs that form that site. In fact, the PTC appeared in evolution just after the two primary subunits that make up the ribosome came together, with RNA bridges forming between them to stabilize the association.

The timelines suggest that the PTC appeared well after other regions of the protein-RNA complex, Caetano-Anollés said. This strongly suggests, first, that proteins were around before ribosomal RNAs were recruited to help build them, and second, that the ribosomal RNAs were engaged in some other task before they picked up the role of aiding in protein synthesis, he said.

"This is the crucial piece of the puzzle," Caetano-Anollés said. "If the evolutionary build-up of ribosomal proteins and RNA and the interactions between them occurred gradually, step-by-step, the origin of the ribosome cannot be the product of an RNA world. Instead, it must be the product of a ribonucleoprotein world, an ancient world that resembles our own. It appears the basic building blocks of the machinery of the cell have always been the same from the beginning of life to the present: evolving and interacting proteins and RNA molecules."

"This is a very engaging and provocative article by one of the most innovative and productive researchers in the field of protein evolution," said University of California at San Diego research professor Russell Doolittle, who was not involved in the study. Doolittle remains puzzled, however, by "the notion that some early proteins were made before the evolution of the ribosome as a protein-manufacturing system." He wondered how -- if proteins were more ancient than the ribosomal machinery that today produces most of them -- "the amino acid sequences of those early proteins were 'remembered' and incorporated into the new system."

Caetano-Anollés agreed that this is "a central, foundational question" that must be answered. "It requires understanding the boundaries of emergent biological functions during the very early stages of protein evolution," he said. However, he said, "the proteins that catalyze non-ribosomal protein synthesis -- a complex and apparently universal assembly-line process of the cell that does not involve RNA molecules and can still retain high levels of specificity -- are more ancient than ribosomal proteins. It is therefore likely that the ribosomes were not the first biological machines to synthesize proteins."

Caetano-Anollés also noted that the specificity of the ribosomal system "depends on the supply of amino acids appropriately tagged with RNA for faithful translation of the genetic code. This tagging is solely based on proteins, not RNAs," he said. This suggests, he said, that the RNA molecules began as co-factors that aided in protein synthesis and fine-tuned it, resulting in the elaborate machinery of the ribosome that exists today.

The National Science Foundation and the United Soybean Board supported this research.

Protein Synthesis of Prokaryotic and Eukaryotic Systems

The following points highlight the six main stages involved in protein synthesis of prokaryotic and eukaryotic systems. The stages are: 1. Amino Acid Activation and Formation of Amino Acyl-t-RNA 2. Binding of m-RNA to the Ribosome 3. Formation of Initiation Complex 4. Polypeptide Chain Elongation 5. Polysomes 6. Co Transcriptional Translation.

Stage # 1. Amino Acid Activation and Formation of Amino Acyl-t-RNA:

Amino acids must be raised to a higher energy level before they can be transferred to the t-RNAs. Activation occurs by addition of AMP from ATP, catalyzed by the enzyme amino acyl synthetase (amino acyl-t-RNA synthetase); the pyrophosphate group of ATP is released in the process. The same enzyme also catalyses the transfer of the amino acyl group from amino acyl-AMP to its specific t-RNA. All protein amino acids are α-amino acids having the general structure H2N-CH(R)-COOH, where R is the side chain.

The two-step reaction catalyzed by amino acyl synthetase is:

(1) Amino acid + ATP → amino acyl-AMP + PPi

(2) Amino acyl-AMP + t-RNA → amino acyl-t-RNA + AMP

The synthetase-bound amino acyl-AMP next reacts with its appropriate t-RNA producing amino acyl-t-RNA and AMP is set free. All t-RNA molecules possess a CCA sequence at the 3′-(OH) end. The amino acyl group of the amino acyl-AMP is transferred to the terminal adenylic acid of CCA sequence with the formation of a covalent linkage with either the 2′ or 3′ (OH) group of the ribose of adenylic acid as shown below (Fig. 9.41).

It should be specially mentioned that just as each amino acid is selected by its specific t-RNA, so also the formation of an amino acyl-t-RNA is catalysed by a specific amino acyl synthetase. A synthetase molecule can recognize its specific t-RNA with the help of the three-dimensional conformation of the protein molecule and that of the t-RNA.

It must also recognize the specific amino acid. Thus, in the intracellular pool, there are many different synthetase molecules, each of which is specific for both an amino acid and its cognate t-RNA. This one-to-one relationship is complicated by the presence of more than one t-RNA for most of the amino acids. To match the different codons of the same amino acid, there are different t-RNAs carrying complementary anticodons.

Once amino acyl-t-RNA has been formed, the amino acid plays no active role in selecting the site where it is to be inserted in the polypeptide chain, because it has no means to recognize the codon. It is carried passively by the t-RNA to its appropriate site. The t-RNA recognizes the m-RNA codon with its anticodon and brings the amino acid to its proper site.
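The codon-anticodon matching described above can be illustrated with a minimal Python sketch. The function name `anticodon_for` is hypothetical, and the sketch assumes strict Watson-Crick pairing only; it ignores wobble pairing and modified bases, which real t-RNAs use.

```python
# Watson-Crick partners for RNA bases (assumption: wobble pairing is ignored).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an m-RNA codon (5'->3')."""
    # Pairing is antiparallel, so complement each base and reverse the order.
    return "".join(PAIR[base] for base in reversed(codon))


print(anticodon_for("AUG"))  # CAU
```

Because the amino acid itself is never consulted, the sketch reflects the point made in the text: placement is decided entirely by the t-RNA anticodon.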

Stage # 2. Binding of m-RNA to the Ribosome:

Protein synthesis does not take place on free m-RNA, but only on m-RNA bound to ribosomes. That is why ribosomes are sometimes referred to as ‘work-benches’ of protein synthesis. In both prokaryotes and eukaryotes, the m-RNA binds first to the small subunit of the ribosome i.e. the 30S subunit in prokaryotes and the 40S subunit in eukaryotes. The large subunit of the ribosomes is attached later to form the initiation complex.

In prokaryotes, the m-RNA binds to the 30S ribosomal subunit before the first amino acid is carried in by the t-RNA. The first codon to initiate protein synthesis is AUG, but this initiator codon is preceded by a 20-30 nucleotide long sequence at the 5′-end of the m-RNA. This means that the initiator codon (AUG) is situated 20-30 nucleotides downstream from the 5′-end. Within this preceding sequence, there is a short consensus sequence, 5′-AGGAGGU-3′, situated 4 to 7 nucleotides ahead of the initiator codon AUG.

This consensus sequence is known as the Shine-Dalgarno sequence. This sequence helps in binding of m-RNA to the 30S subunit by forming base-pairs with its 16S r-RNA. At the same time, this sequence also acts as a signal for initiating protein synthesis at the next AUG sequence.
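As a rough illustration of how the Shine-Dalgarno sequence marks the initiator codon, the following Python sketch looks for the consensus and then for an AUG 4 to 7 nucleotides further downstream. The function name `find_sd_start` and the example m-RNA are made up, and the sketch assumes an exact match to the consensus and counts the gap from the last base of the consensus to the A of AUG; real messages usually match the consensus only partially.

```python
def find_sd_start(mrna: str, sd: str = "AGGAGGU", min_gap: int = 4, max_gap: int = 7):
    """Return (sd_index, start_index) for the first consensus followed by AUG, else None."""
    pos = mrna.find(sd)
    while pos != -1:
        sd_end = pos + len(sd)
        # Try every allowed spacer length between the consensus and the AUG.
        for gap in range(min_gap, max_gap + 1):
            start = sd_end + gap
            if mrna[start:start + 3] == "AUG":
                return pos, start
        # No AUG at a valid distance: look for a later copy of the consensus.
        pos = mrna.find(sd, pos + 1)
    return None


# Hypothetical leader: 3 nt, the consensus, a 5-nt spacer, then the coding region.
example = "GCU" + "AGGAGGU" + "AACAG" + "AUGGCU"
print(find_sd_start(example))  # (3, 15)
```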

This is diagrammatically shown in Fig. 9.42:

The initiator codon codes for methionine, but all prokaryotic proteins are synthesized with formyl-methionine (fmet) as the first amino acid at the amino terminus. In prokaryotes, t-RNA fmet picks up methionine with the help of methionyl t-RNA synthetase, and the methionine is then formylated by another enzyme, transformylase, the donor of the formyl group being N10-formyl-tetrahydrofolate (formyl-THF). The formyl group is attached to the amino group of methionine.

The transfer reaction is:

Methionyl-t-RNA fmet + N10-formyl-THF → formyl-methionyl-t-RNA fmet + THF

Formylation of methionine can occur only when methionine is carried by t-RNA fmet and not when methionine is charged on t-RNA met. Thus, t-RNA fmet and t-RNA met are two distinct species, although both are amino acylated by the same synthetase. The latter, i.e. t-RNA met, carries methionine to AUG codons situated inside the m-RNA but not to the initiator AUG codon.

In eukaryotic proteins, the first amino acid at the amino terminus is methionine and not formyl- methionine as in prokaryotes and the initiator codon is the same, i.e. AUG. However, in eukaryotes also, the initiator t-RNA met and t-RNA met for internal methionine are two distinct species. The former recognizes only the initiator AUG codon and no other AUG codons.

The initiator codon in eukaryotic m-RNA is situated 50 to 100 nucleotides downstream from the 5′ end. The 5′ end of eukaryotic m-RNAs is always capped by methyl guanosine. This capped end binds to the 40S subunit of the ribosome, and the ribosomal subunit then moves along the m-RNA in the 5′ → 3′ direction and scans the m-RNA triplets until it reaches an AUG sequence. At this point the initiation complex is formed, thereby also fixing the reading frame.
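The scanning step can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the hypothetical function `scan_for_start` always accepts the first AUG it meets from the 5′ end and ignores the cap, the initiation factors and any sequence context around the AUG.

```python
def scan_for_start(mrna: str):
    """Scan 5'->3' for the first AUG; return (start_index, codons) or None."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    # The position of the AUG fixes the reading frame: read complete triplets from it.
    codons = [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]
    return start, codons


print(scan_for_start("GGCAAUGGCUUAAG"))  # (4, ['AUG', 'GCU', 'UAA'])
```

Shifting the start by one or two nucleotides would change every downstream triplet, which is why locating the AUG also fixes the reading frame.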

Stage # 3. Formation of Initiation Complex:

An initiation complex is an m-RNA-bound complete ribosome in which the initiator t-RNA carrying the first amino acid is attached and ready to receive the next incoming amino acid-charged t-RNA. In prokaryotes, formation of an initiation complex is preceded by the formation of a pre-initiation complex, which is composed of the 30S subunit of the ribosome, the m-RNA molecule, a charged t-RNA fmet, three non-ribosomal proteins (initiation factors, IF) IF1, IF2 and IF3, and a molecule of GTP. The initiation factors IF1 and IF3 help to dissociate the 70S ribosomes into 30S and 50S subunits, so that m-RNA can bind to the 30S subunit to form a pre-initiation complex.

The initiation factor IF2 mediates binding of GTP and the charged t-RNA fmet to the pre-initiation complex. Two ribosomal proteins, S1 and S12, are necessary for binding the m-RNA (through its Shine-Dalgarno sequence) to the 16S r-RNA of the 30S subunit.

After the formation of the pre-initiation complex has been completed, the 50S subunit of the ribosome binds to it resulting in liberation of IF1. Next, IF2 is released by hydrolysis of GTP to GDP+Pi. With the attachment of the 50S subunit, the formation of initiation complex is completed.

The stepwise formation is shown in Fig. 9.43:

The binding of the 30S and 50S subunits creates two sites for binding of two t-RNA molecules. These are called the amino acyl site (A site) and the peptidyl site (P site). These sites overlap both the subunits of the ribosome. The initiator t-RNA fmet carrying formyl-methionine, which was bound to the pre-initiation complex, is transferred to the P site of the initiation complex, where its anticodon pairs with the initiator codon AUG of the m-RNA. The initiation complex is now ready for chain elongation.

So far as it is known, the sequence of events in the formation of the initiation complex in eukaryotes does not differ essentially from that of the prokaryotes, except that the number of initiation factors is more (at least nine) and binding of the m-RNA requires hydrolysis of ATP, a feature not known to occur in prokaryotes. In eukaryotes the initiator t-RNA carries methionine and not formyl-methionine as in prokaryotes. The initiator codon is the same i.e. AUG in both prokaryotes and eukaryotes.

Stage # 4. Polypeptide Chain Elongation:

In the initiation complex, the P site is occupied by the t-RNA fmet and the A site is vacant. It is now ready to be occupied by the incoming t-RNA carrying an appropriate amino acid. The anticodon of this t-RNA must match with the triplet of m-RNA positioned at the A site. The binding of the t-RNA to the A site requires a non-ribosomal cytoplasmic protein factor, called the elongation factor Tu (EF-Tu) which is activated by GTP.

The three components, viz. t-RNA, EF-Tu and GTP, form a ternary complex which binds to the A site of the ribosome. In this binding process, the TψC arm of t-RNA (see structure of t-RNA) is thought to interact with the 5S r-RNA of the 50S subunit of the ribosome. Once the incoming charged t-RNA binds firmly to the A site of the ribosome, GTP is hydrolysed by a ribosomal protein enzyme to release GDP-EF-Tu and inorganic phosphate.

The A site is now occupied by the incoming t-RNA carrying the second amino acid. From the GDP-EF-Tu binary complex, GTP-EF-Tu is regenerated by another protein factor, called EF-Ts, which exchanges the bound GDP for GTP. GTP-EF-Tu can then combine with the next incoming aminoacyl t-RNA to form the ternary complex.

The sequence of events is diagrammatically represented in Fig. 9.44:

The second step in the chain elongation process involves the formation of a peptide bond between the α-amino group of the amino acid occupying the A site and the α-carboxyl group of the amino acid occupying the P site.

However, the peptide bond formation takes place by a complex reaction, because the carboxyl group of the amino acid in amino acyl-t-RNA at the P site is not free, but linked to the adenylic acid residue of CCA sequence of t-RNA (see Fig. 9.41).

The reaction is called the peptidyl transferase reaction. It takes place in the 50S subunit of the ribosome. As a result of the reaction, the P site is now occupied by an uncharged t-RNA (without amino acid) and the A site is occupied by a t-RNA with a dipeptide, as shown in Fig. 9.45.

In the next step, the t-RNA with the bound dipeptide moves from the A site to the P site, displacing the empty t-RNA fmet. The movement is known as translocation and it is catalysed by another elongation factor, EF-G, which binds to the ribosome. In the process, hydrolysis of GTP catalysed by a ribosomal protein is known to occur.

Concurrent with translocation from the A site to the P site, the m-RNA also moves through one coding unit in the 3′ → 5′ direction. Thereby, the next codon is positioned at the A site. These processes of peptidyl transferase reaction and translocation take place with the addition of each amino acid carried by the respective t-RNAs. As a result, the polypeptide chain elongates at the carboxyl-terminal end by sequential addition of amino acids, one by one, as determined by the matching of t-RNA anticodon and m-RNA codon.

The final step of protein synthesis is reached when, at the A site, one of the three termination codons (UAA, UAG or UGA) of m-RNA appears. As these codons do not code for any amino acid (that is why they are called nonsense codons), the A site remains vacant. The polypeptide chain is released from the ribosome through the action of a release factor (RF).

In prokaryotes, there are three release factors, RF1, RF2, and RF3. They bind to the ribosome and catalyse hydrolysis of the ester linkage between the t-RNA occupying the P site and the carboxyl group of the last amino acid. Thereby the polypeptide chain becomes free to be released from the ribosome. The 70S ribosome then falls off from the m-RNA.
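The elongation and termination cycle described above can be summarized as a short Python sketch. It is a deliberately reduced model: the hypothetical function `translate` starts at the first AUG, adds one residue per codon and stops at the first termination codon, while the A and P sites, the elongation factors and the release factors are not modelled. The codon table is the standard genetic code written in one-letter amino acid symbols, with `*` marking the three termination codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order (UUU, UUC, UUA, UUG, UCU, ...).
BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first termination codon (UAA, UAG or UGA)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break  # termination codon: no amino acid is added and the chain is released
        peptide.append(residue)  # each new residue joins the carboxyl-terminal end
    return "".join(peptide)


print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```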

Stage # 5. Polysomes:

In considering protein synthesis above, the mutual relation between an m-RNA and a single 70S ribosome has been dealt with. However, in practice, a single m-RNA molecule is used for multiple rounds of translation in both prokaryotes and eukaryotes. Such a practice is economical for the cell, because several copies of the same protein can be produced in a relatively short time without repeating the process of transcription.

When the chain elongation phase of polypeptide synthesis has advanced to a stage, so that the chain is 25-30 amino acid long, the initiator AUG codon of the m-RNA becomes free to form another initiation complex in the same way as it formed the first one. So, a second 70S ribosome initiates synthesis of a second copy of the same polypeptide.

In this way initiation may be repeated several times on different ribosomes, thereby forming a string of ribosomes bound by a common m-RNA molecule. Such a structure is called a polyribosome or polysome. The size of a polysome varies according to the length of the m-RNA molecule.

There may be as few as 3 to 4 or as many as 100 ribosomes in a polysome. Polypeptide synthesis terminates first in the first ribosome, then in the second, and so on. At any given time, the length of the polypeptide chains varies depending on the progress of chain elongation in the ribosomes of a polysome complex.
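The staggered chain lengths on a polysome can be pictured with a small Python sketch. It rests on simplifying assumptions that are not in the text: all ribosomes move at the same speed, and a new ribosome initiates each time the preceding one has made a chain of 30 residues (the upper end of the 25-30 range quoted above). The function name `nascent_chain_lengths` is hypothetical.

```python
def nascent_chain_lengths(lead_length: int, spacing: int = 30) -> list:
    """Chain length on each ribosome, from the leading ribosome to the most recent one."""
    lengths = []
    length = lead_length
    while length > 0:
        lengths.append(length)
        length -= spacing  # each later ribosome lags by one initiation interval
    return lengths


print(nascent_chain_lengths(100))  # [100, 70, 40, 10]
```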

A diagrammatic representation of polypeptide synthesis in a polysome is shown in Fig. 9.46:

Stage # 6. Co Transcriptional Translation:

In eukaryotes, the primary transcript, known as hn-RNA, has to be processed by removing the introns, capping the 5′-end and adding a poly A-tail at the 3′ end. The processed product, the m-RNA, is then translocated from the nucleus into the cytoplasm through pores in the nuclear membrane. In contrast, the primary transcript in prokaryotes is used directly as the m-RNA.

Also, the nuclear material is not separated by a membrane from the cytoplasm. A characteristic feature of the prokaryotic protein synthesis is, therefore, that translation of the m-RNA can begin while the m-RNA is still being transcribed from the template strand of DNA. This is known as simultaneous transcription and translation, or co-transcriptional translation.

As the RNA polymerase moves along the DNA template strand, the 5′ end of the m-RNA comes out and binds to a 30S subunit through base-pairing of its Shine-Dalgarno sequence with the 16S r-RNA. An initiation complex is formed in the usual way and polypeptide synthesis begins. With progress of transcription, the m-RNA grows in length and it can bind more ribosomes, forming a polysome (Fig. 9.47).

The phenomenon of co-transcriptional translation in prokaryotes assumes special significance in view of the fact that prokaryotic messengers are very short-lived, having an average half-life of only 1.3 to 1.8 minutes. Simultaneous transcription and translation can, therefore, make the best use of a messenger molecule by coupling the two processes and so shortening the total time they take.
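To see what such a short half-life implies, the Python sketch below computes the fraction of messenger molecules still intact after a given time. It assumes simple first-order (exponential) decay, which the text does not state explicitly, and the function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(minutes: float, half_life: float) -> float:
    """Fraction of m-RNA molecules left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life)


# Fraction left after 5 minutes for the two half-lives quoted in the text.
for half_life in (1.3, 1.8):
    print(half_life, round(fraction_remaining(5.0, half_life), 3))
```

Under this assumption only a small fraction of the messengers survives for five minutes, which is why starting translation before transcription has finished is advantageous.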

In E. coli the rate of transcription at 37°C is 55 nucleotides/sec and that of translation is 17 amino acids per second. Therefore, transcription which is faster can run concurrently with the slower process of translation. Thereby the total time for protein synthesis can be considerably reduced.
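The comparison of the two rates is easier to follow when both are expressed per codon. The Python sketch below does this arithmetic with the figures quoted above; the function name `synthesis_times` is hypothetical, and the calculation ignores untranslated regions, the stop codon and the time needed for initiation.

```python
TRANSCRIPTION_NT_PER_S = 55  # E. coli at 37°C, value from the text
TRANSLATION_AA_PER_S = 17    # value from the text


def synthesis_times(n_residues: int):
    """Seconds needed to transcribe and to translate the coding region of a protein."""
    transcription_s = 3 * n_residues / TRANSCRIPTION_NT_PER_S  # three nucleotides per codon
    translation_s = n_residues / TRANSLATION_AA_PER_S
    return transcription_s, translation_s


# Transcription covers 55 / 3, i.e. about 18.3 codons per second, slightly more than 17.
transcribe_s, translate_s = synthesis_times(300)
print(round(transcribe_s, 1), round(translate_s, 1))  # 16.4 17.6
```

For a 300-residue protein the two processes take roughly the same time, so running them concurrently instead of one after the other nearly halves the total.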

Results and discussion

Sequence analysis of the CDPSs shows that it is a HUP clade Rossmannoid domain related to the class-I AAtRS catalytic domain

To understand the origins of the CDPSs we initiated iterative PSI-BLAST searches [11] with different representatives of the family such as AlbC from S. noursei [9]. In addition to the previously characterized representatives from firmicutes, actinobacteria and Photorhabdus, we also recovered several divergent versions (e < 10^-3 at the time of first detection) from other bacteria such as Parachlamydia, Pseudomonas fluorescens, Legionella, Sphingobium and Rickettsiella grylli prior to convergence (for details on Material and Methods refer to Additional file 1). These searches also detected homologous proteins in eukaryotes, such as the fungus Gibberella, the annelid worm Platynereis and the sea anemone Nematostella. All these versions were standalone proteins with no fusions to any other domains. A multiple alignment of the recovered representatives followed by secondary structure prediction with the JPRED program [12] revealed an α/β fold with five strand-helix units comprising the core of the fold, with helical inserts after the second strand-helix unit and after the third strand (Figure 1). The sequence conservation pattern revealed a GxSxxp motif (where p is a polar residue, usually an asparagine) between the first strand and helix (Figure 1). A further conserved polar residue was found at the C-terminus of the 3rd predicted strand and a conserved glutamate in the helical region between strand-3 and strand-4 (Figure 1). These are predicted to comprise the active site of the CDPSs. The presence of five strand-helix units along with an active site loop after the first strand and a possible active site residue after the 3rd strand is reminiscent of the Rossmannoid domains and suggested that the CDPSs could adopt such a fold [13].

Figure 1. Alignment of the CDPS and class-I AAtRS catalytic domains. Sequences are labeled by their gene names, species abbreviations and Genbank index numbers separated by underscores. PDB ids, if available, are also shown. Sequences are colored based on 85% consensus derived from an alignment of the cyclopeptide ligases. A key for the coloring scheme, consensus abbreviations and secondary structure labels is shown in the box below the alignment. Familial affiliations of the sequences are shown to the right. Species names are expanded in Abbreviations.
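The GxSxxp signature can be expressed as a regular expression for a quick scan of a protein sequence. The Python sketch below is only an illustration: the set of residues counted as polar (`POLAR`) is an assumption, since the text defines p only as "a polar residue, usually an asparagine", and the function name `find_motif` and the test sequence are made up.

```python
import re

# Assumption: residues treated as polar for the 'p' position of the signature.
POLAR = "CDEHKNQRST"
MOTIF = re.compile(f"G.S..[{POLAR}]")  # G, any residue, S, any two residues, polar residue


def find_motif(sequence: str) -> list:
    """Return the 0-based start positions of non-overlapping GxSxxp matches."""
    return [match.start() for match in MOTIF.finditer(sequence)]


print(find_motif("MKLGASVVNAA"))  # [3]
```

A pattern this short will match many unrelated proteins by chance, so it is useful only for locating the signature within sequences already identified as homologs by the profile searches.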

To test this conjecture we used a Hidden Markov Model (HMM) derived from the multiple alignment of the CDPSs in a profile-profile comparison against a library of HMMs generated from all known domains with structural representatives in PDB using the HHpred program [14]. This search recovered catalytic domains of the class-I AAtRSs, namely tyrosyl-tRS (PDB: 2cyc; p = 7 × 10^-6) and tryptophanyl-tRS (PDB: 3foc; p = 4 × 10^-5), and the HIGH-motif nucleotidyltransferase, phosphopantetheine adenylyltransferase (PDB: 1od6; p = 3 × 10^-4), as the best hits. The class-I AAtRSs and HIGH-motif NTases belong to the HUP (HIGH, UspA, Photolyase/PP-loop) superclass of Rossmannoid domains that, just as predicted for the CDPSs, contain a core sheet with 5 strands [13]. In all members of the HUP superclass the substrate-binding site is found in the loop between strand-1 and helix-1, consistent with the conservation pattern observed in the CDPSs. Indeed, profile-profile matches align the above-mentioned predicted active site loop of the CDPSs with the corresponding loop of the class-I AAtRSs and HIGH NTases (Figure 1). Furthermore, most class-I AAtRSs contain a major insert, typically helical, between strand-3 of the core Rossmannoid fold and the helix prior to strand-4 that forms a "cap" over the active site [13,15]. This is also the point of insertion of the helical insert observed in the CDPSs (Figure 1). Hence, this insert might form a cap over the core substrate-binding site in the CDPSs. Together these observations indicate that the CDPSs are novel members of the HUP superclass of Rossmannoid domains. However, given their restricted distribution in a relatively small set of bacteria and eukaryotes it is likely that they were derived later in evolution from a class-I AAtRS precursor like the YtRS or the WtRS (Figure 1). This would also explain how the CDPSs could use aminoacyl tRNAs (AAtRNAs) as substrates in peptide ligation--they are predicted to bind them at the active site, similarly to the class-I AAtRSs. However, in the CDPSs the HIGH motif was lost and a novel signature with a conserved serine was acquired, reminiscent of another superfamily of the HUP clade, namely the PP-loop ATPases [13] (Figure 1). These changes are likely to be essential features of the CDPSs required to form the amide linkage from AAtRNAs, as opposed to the adenylation followed by ester formation seen in the ancestral AAtRS.

Previous studies on CDPSs have shown that they are typically encoded in a conserved operon with a gene for an enzyme of the cytochrome P450 family [16] (Figure 2). Studies in B. subtilis and M. tuberculosis indicate that the cytochrome P450 is required for further oxidative modification of the CDP (Figure 3). In M. tuberculosis it catalyzes the cross-linking of the two tyrosine rings of cYY [16]. In synthesis of B. subtilis pulcherriminic acid it catalyzes the addition of oxygens to the nitrogens of the diketopiperazine ring of CDP [9]. In contrast, the albonoursin-like operons found in certain actinomycetes link the CDPS with an oxidoreductase of the nitroreductase family (a Rossmann fold dehydrogenase), which is likely to catalyze the α-β desaturation of the two amino acid side chains in the dipeptide (Figures 2, 3). Among the newly detected versions in this study we found several conserved gene neighborhood associations for AlbC-like CDPSs that might be indicative of alternative modifications and synthetic mechanisms for the dipeptides generated by them. For example, in Burkholderia sp. 383 and Pseudomonas fluorescens we observed novel associations with genes encoding 2-oxoglutarate-dependent dioxygenases that could potentially catalyze hydroxylations of the amino acid side chains of the dipeptide (Figure 2) [17]. Interestingly, in some actinomycetes, like Actinosynnema mirum and Streptomyces sp. AA4, the AlbC-like CDPS shows a neighborhood association with genes encoding a methyltransferase and an acyl-CoA ligase. The former enzyme is related to ubiquinone methylases of the UbiE type and could modify CDPs through methylation. The latter superfamily of enzymes adenylates carboxylate groups and subsequently ligates them to CoA via a thiocarboxylate linkage to form an acyl-CoA [18]. Hence, these CDPSs could in principle use aminoacyl-CoAs generated by the action of the above enzymes as a substrate, rather than AAtRNAs. Indeed, enzymes of the acyl-CoA ligase superfamily are part of the large condensation domain-dependent non-ribosomal peptide and polyketide synthesis systems [19]. In some cases the CDPS gene co-occurs in predicted operons with an additional peptide ligase (Figure 2) such as a Mur ligase of the P-loop kinase superfamily (e.g. Desulfovibrio aespoeensis) or a GNAT fold ligase (e.g. Rickettsiella grylli) that could mediate formation of further peptide linkages. These operons might also contain a SelA-like pyridoxal phosphate dependent enzyme (PLPDE) that could synthesize a modified amino acid that is incorporated into the peptide (Figure 2).

Figure 2. Examples of predicted operons of novel peptide biosynthetic systems. Genes are shown as arrows pointing from the 5' to the 3' end of the coding frame. Operons are labeled with the gi and species name of the primary AlbC, MtRS or cupin genes in that context. Gene identifiers are derived from the genome annotation provided by NCBI. Other than the standard domain names the remaining identifiers are provided in Abbreviations. In the AlbC operon AlbA encodes a nitroreductase family enzyme and AlbD a transmembrane protein.

Figure 3. Simplified scheme showing the core reactions catalyzed by the enzymatic systems described in this work. The figure shows reaction schemes for the biosynthesis of the cyclopeptide albonoursin, the siderophore aerobactin, and possible substrates of enzymes encoded by the systems based on the MtRS paralogs and reactions they could potentially catalyze.

Beyond free-living bacteria with active secondary metabolism systems we also found CDPSs in several phylogenetically distant intracellular parasitic bacteria, such as Legionella, Rickettsiella and certain chlamydiae (Figure 1). Typically, these bacteria do not encode biosynthetic systems for secondary metabolites that are common in strongly competing slow-growing forms like actinomycetes and myxobacteria. Hence, the spread of such CDPSs across diverse intracellular bacterial pathogens/symbionts suggests a potential role for particular dipeptides in surviving in or manipulating host cells. Detection of a CDPS in the plant pathogenic fungus Gibberella zeae is consistent with early reports of the generation of CDPs by such fungi with a bioactive effect on their hosts [20]. Interestingly, the CDPS gene encoded by the annelid Platynereis is induced as part of the antibacterial innate immune response [21], suggesting that CDPs might play a role as endogenously encoded antibiotics in the immune response of certain animals.

Identification of a widespread methionyl-tRNA synthetase paralog that might be involved in novel peptide synthetic pathways

The above finding that the CDPSs are highly derived offshoots of a class-I AAtRS-like domain, together with the identification of the paralogous CtRS as the amino acid ligase in mycothiol biosynthesis [10], indicated that AAtRS homologs could also directly function as peptide ligases. To investigate if other AAtRSs could function in such a capacity we hoped to use contextual information from predicted operons and domain fusions to identify potential candidates. Recently we identified a paralogous version of the MtRS in several bacterial genomes that is fused to a double-stranded β-helix domain of the metal-binding cupin class [17]. Further analysis showed that related paralogous MtRSs are encoded by several other bacterial genomes - these lack fusion to the cupin domain, but in all these cases a standalone cupin is encoded by a neighboring gene in the genome (Figure 2). While a given bacterium might have 1-3 copies of this paralogous MtRS, it always also contains the conventional MtRS involved in protein synthesis (Additional file 1). This suggested that the paralogous MtRS is likely to be dedicated to a pathway involved in synthesis of a distinct metabolite, separate from protein synthesis. Given that the catalytic residues required for adenylation of the amino acid are intact in this paralogous family, they are predicted to be catalytically active enzymes capable of adenylating methionine (Additional file 1). This also implies that the metabolite synthesized by these enzymes is likely to be a derivative of methionine.

To better understand the provenance of these MtRS paralogs, we included them in a phylogenetic analysis along with the conventional MtRS. The resultant tree showed that the conventional MtRSs fall in two major clusters separated by a long internal branch (Additional file 1). The first of these clusters is almost exclusively comprised of bacterial, chloroplast and mitochondrial MtRSs. Within this group several monophyletic bacterial lineages can be recognized such as cyanobacteria (including the chloroplast versions), firmicutes and various proteobacteria. This group is unified by the presence of a single Zn-ribbon inserted into the catalytic domain (Additional file 1). The second cluster includes the conventional MtRS from almost all archaea, eukaryotes (cytoplasmic) and certain bacterial lineages, such as actinobacteria, chloroflexi, spirochetes and bacteroidetes. This group is typified by presence of an insert with two Zn-ribbons or a variant thereof, i.e. a circularly permutated Zn-ribbon arising from the loss of the N- and C- terminal metal-chelating dyads, respectively of the first and second Zn ribbons (Additional file 1). This phylogenetic picture of the conventional MtRSs suggests that the internal long-branch represents the primary split between the archaeo-eukaryotic and bacterial lineages, with subsequent lateral transfer of the archaeo-eukaryotic versions to certain bacterial lineages along with xenologous gene-displacement [15]. The newly identified paralogous MtRSs, which are fused to or associated with the metal-binding cupin domain, are only found in bacteria and form a distinct cluster grouping with the "archaeo-eukaryotic" type MtRSs, albeit separated by a long branch (Additional file 1). This grouping is supported by the fact that they possess a duplicated segment-swapped Zn-ribbon, just like the "archaeo-eukaryotic" type MtRSs. 
This paralogous MtRS is widely distributed in various bacterial lineages, namely actinobacteria, proteobacteria and firmicutes. In particular, it is frequently found in diverse actinobacterial species, with Salinispora arenicola having three and Actinosynnema mirum two distinct paralogs. Interestingly, it is also found in several distantly related pathogenic or symbiotic bacteria that are known to secrete compounds into host cells: Ralstonia solanacearum (two paralogous versions), Burkholderia phytofirmans, multiple Dickeya species, Erwinia pyrifoliae, Pseudomonas syringae, Frankia species, Sinorhizobium meliloti and Mesorhizobium loti, which associate with plants, and Yersinia pseudotuberculosis, Photorhabdus luminescens, Pseudomonas entomophila, Bacillus cereus and Legionella pneumophila, which infect animals. The affinities of this paralogous group suggest that it was probably founded by an independent lateral transfer, perhaps from an archaeal source, into a bacterial lineage. Given its wide presence in actinobacteria, it is likely that this bacterial lineage was the actinobacterial lineage, from where it dispersed widely through lateral transfer to several distinct firmicute and proteobacterial lineages, probably in shared environments.

Contextual evidence suggests that the paralogous methionyl-tRNA synthetases might catalyze synthesis of a novel peptide

To better understand the biosynthetic pathway wherein this MtRS paralog functions we resorted to genomic neighborhood analysis, which, along with domain fusions, has been shown to be a powerful tool to infer functions of uncharacterized proteins [22]. This group of MtRS paralogs is particularly suited for such analysis as they are highly mobile in evolutionary terms and any gene-neighborhood associations that are detected between distantly related species are likely to be indicative of functionally relevant interactions. We found two gene neighborhood/domain-fusion associations that occur without exception with all these MtRS paralogs (Figure 2): 1) the above-noted association with the metal-binding cupin; 2) a neighborhood association with a gene encoding L-lysine 6-monooxygenase, a Rossmann fold oxidoreductase that catalyzes the NADPH-dependent hydroxylation of lysine at the N6 position. The N6-hydroxy-lysine is an intermediate in the biosynthesis of a non-ribosomally condensed peptide-derived siderophore, aerobactin [23]. In aerobactin biosynthesis the N6 position is further modified by acetylation by an acyltransferase of the GNAT fold (Figure 3). Interestingly, we found that several MtRS paralog gene neighborhoods additionally encode a member of the GNAT superfamily, indicating that a similar reaction is likely even in this system (Figures 2 and 3). These N6 modifications of lysine serve to block the ε-NH2 group, thereby favoring dipeptide condensation utilizing the main-chain α-NH2 by the aerobactin synthetase that belongs to the protein kinase fold [5]. The NH2 group of glucosamine, which is cysteinylated by CtRS in mycothiol synthesis, is also initially blocked by an acetyl group (i.e. as N-acetyl-glucosamine) prior to amide formation [10].
Given that modifications of lysine N6 comparable to those seen in aerobactin biosynthesis are predicted to be catalyzed by enzymes encoded in the MtRS paralog gene neighborhoods, we reasoned that the lysine N6 is likely to be modified similarly for peptide bond formation even in this system (Figure 3). This also suggests that the other enzyme common to all these neighborhoods, the MtRS paralog, by analogy to the CDPSs and the cysteinyl ligase in mycothiol biosynthesis, is likely to catalyze formation of a peptide bond in this system (Figure 3). Thus, we propose that the core of the biochemical pathway specified by this conserved gene neighborhood involves the synthesis of a dipeptide through the condensation of the adenylated carboxyl group of methionine with the α-NH2 group of a lysine derivative (i.e. modified at the N6 position). The presence of a gene encoding acireductone dioxygenase, an enzyme involved in methionine salvage, in a subset of these gene-neighborhoods is also consistent with methionine being channeled into this metabolite (Figure 2).
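The reasoning behind the neighborhood analysis above can be sketched in a few lines of Python: gene families present in every neighborhood form the conserved core, while the rest are variable accessories. This is only an illustrative sketch. The genome labels and gene sets below are hypothetical placeholders, not data from the study, and the function names are our own.

```python
from collections import Counter

# Hypothetical neighborhoods: genome labels and gene-family sets are
# illustrative assumptions, not taken from the study's data.
neighborhoods = {
    "genome_A": {"MtRS_paralog", "cupin", "lysine_N6_monooxygenase", "GNAT"},
    "genome_B": {"MtRS_paralog", "cupin", "lysine_N6_monooxygenase",
                 "saccharopine_dehydrogenase"},
    "genome_C": {"MtRS_paralog", "cupin", "lysine_N6_monooxygenase", "GNAT",
                 "acireductone_dioxygenase"},
}


def conserved_core(neighborhoods):
    """Return the gene families present in every neighborhood."""
    gene_sets = list(neighborhoods.values())
    return set.intersection(*gene_sets) if gene_sets else set()


def association_frequency(neighborhoods):
    """Return the fraction of neighborhoods containing each gene family."""
    counts = Counter(g for genes in neighborhoods.values() for g in genes)
    total = len(neighborhoods)
    return {gene: n / total for gene, n in counts.items()}


core = conserved_core(neighborhoods)
freq = association_frequency(neighborhoods)
variable = {gene for gene in freq if gene not in core}

print("core:", sorted(core))
print("variable:", sorted(variable))
```

With these toy inputs the core is the MtRS paralog, the cupin and the lysine N6-monooxygenase, mirroring the associations described as occurring without exception; the GNAT and other families fall into the variable set.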

In some organisms there are gene neighborhood associations suggestive of potential variations to the theme of modification of the N6 lysine. A subset of these predicted operons (e.g. in Rhodococcus jostii and two of the three paralogous gene-neighborhoods in Salinispora arenicola) also contain a tightly linked saccharopine dehydrogenase gene (Figure 2). This enzyme catalyzes the formation of saccharopine by linking 2-oxoglutarate to N6 of lysine. Thus, the modification of the ε-NH2 by this enzyme might effectively be similar to the acetylation of this position (Figure 3). Further, several of the predicted operons encode proteins such as (Figure 2): 1) Acyl-coA ligase, which ligates coA to a fatty acid [18]. 2) One or more acyl carrier proteins (ACP) that bear a serine-linked phosphopantetheinyl moiety, which in turn carries a fatty acyl group as a thioester. 3) Acyl condensation enzymes, which catalyze the condensation of an acyl-coA to another moiety resulting in an elongated chain due to addition of the acyl element. 4) Enzymes that could catalyze a transacylase reaction that delinks the fatty acid from the ACP or transesterifies it. These transacylase-like enzymes might belong to the previously recognized NTN-hydrolase superfamily or the α/β-hydrolase superfamily or the BtrH family with a papain-like fold [5], or are representatives of a novel family of proteins that we determined as also belonging to the papain-like fold (Additional file 1). 5) Acyl-coA dehydrogenases, which catalyze the modification of fatty acids via desaturation. These genes are present in the neighborhoods only if the gene cluster also encodes a member of the GNAT fold, suggesting that they might be involved in synthesis of an acyl-coA substrate that could be used by the GNAT enzyme (in place of the default acetyl-coA) to modify the N6 position of lysine.
However, in principle the ACP and associated enzymes could also be used as a substrate for attachment of the peptide synthesized by this system, comparable to what is observed in butirosin biosynthesis and peptide synthesis by giant multidomain peptide synthetases [24,25]. The other universally present component of this system, the metal-binding cupin, belongs to a large radiation of such enzymes, which consist of a single double-stranded β-helix domain with four conserved positions involved in chelating a metal ion [17,26]. They catalyze two distinct types of reactions: 1) isomerization of linearized sugars through an enediol intermediate and 2) a dioxygenase reaction, incorporating two oxygens into the substrates (e.g. cysteine dioxygenase, wherein the SH group of cysteine is oxidized to sulfinate). However, unlike the 2-oxoglutarate-dependent dioxygenases of the double-stranded β-helix fold they are not known to catalyze single hydroxylations of substrates [17]. Given that this cupin is tightly associated with the MtRS (either fused or, typically, as the gene 5' to the MtRS gene; Figure 2), it is possible that it functions in close association with MtRS, perhaps catalyzing a reaction on methionine (Figure 3). However, the exact nature of this modification remains unclear as there are currently no precedents for such modifications in other characterized peptide modification systems.

Beyond the conserved core, the majority of these neighborhoods encode several other enzymatic domains and transporters and peptide-binding proteins of the periplasmic binding protein superfamilies. While these tend to vary between neighborhoods, their close genomic linkage and predicted biochemistry suggest that they are part of the same biosynthetic pathway. One or more genes encoding ATP-grasp enzymes or multi-domain non-ribosomal peptide synthases with condensation domains are encountered in several of these operons. By analogy to other non-ribosomal peptide synthesis operons these enzymes are likely to catalyze the ligation of additional amino acids [5]. Consistent with this, enzymes for synthesis of other amino acids are also encountered in these operons (Figure 2, Additional file 1). These include a PP-loop ATPase that is closely related to the asparagine synthetase, suggesting that it might catalyze formation of asparagine. Furthermore, two ornithine-generating enzymes, namely arginase and glycine amidinotransferase, proline-generating ornithine cyclodeaminase, citrulline-generating dimethylargininase and different PLPDEs are also encoded by some of these neighborhoods. The latter enzymes include the diaminobutyrate transaminase, which synthesizes 2,3-diaminobutyrate, and cysteine synthase, which generates cysteine. It is possible that amino acids generated by the action of these enzymes are incorporated as further residues of the peptide or alternatively they modify the peptide via reactions such as decarboxylation (e.g. as catalyzed by the PLPDE, BtrK, in butirosin biosynthesis [24]). Other prevalent enzymes encoded by these predicted operons are diverse redox enzymes belonging to distinct folds (Figure 2): 1) Rossmann fold dehydrogenases, which utilize either FAD or NADH cofactors.
2) Members of the flavin- or F420-dependent monooxygenase superfamily, which includes the monooxygenase BtrO that hydroxylates the amino acid side-chain in butirosin biosynthesis [24]. 3) Double-stranded β-helix-fold 2-oxoglutarate-dependent dioxygenases, such as members of the JOR/JmjC superfamily, and the phytanoyl hydroxylase family of classical 2OGFeDOs [17]. 4) SnoaB/ActVA-Orf6-type ferredoxin-fold monooxygenases that catalyze the insertion of a single oxygen atom into substrates and are often found in the biosynthetic pathways of several antibiotics [27]. 5) α-helical diiron oxygenases of the heme oxygenase superfamily [28]. The last four classes of the above enzymes could in particular catalyze hydroxylation or oxygenation of side chains of the peptide and/or the fatty acyl moiety if present. Thus, in conclusion, the majority of these systems centered on the paralogous MtRS are predicted to catalyze synthesis of a highly oxygenated derivative of the dipeptide Met-Lys, which in some cases might be further extended with additional residues.

The presence of this system in diverse actinobacteria, which are known to produce diverse secondary metabolites, suggests that this peptide derivative might function as an antibiotic. However, outside of actinomycetes, it is primarily found in bacteria showing symbiotic or parasitic associations with eukaryotic cells, which normally do not encode any antibiotic production systems. In these cases it is likely that this peptide metabolite has a role in host-parasite interactions. Consistent with this, in these organisms the predicted operons usually contain a peptide-binding protein and a transmembrane efflux transporter (Figure 2). One possibility is that it functions as a siderophore in these cases, with the variability probably arising due to selection against siderophore-stealing or host immunity.

Identification of a parallel peptide synthesis system further supports the role of the MtRS paralog in peptide ligation

We also identified another set of predicted operons that closely paralleled the above system in diverse firmicutes, cyanobacteria, proteobacteria and actinobacteria (Figure 2). These were centered on a distinctive protein that combines a cupin domain, related to those fused or associated with the MtRS, with an N-terminal uncharacterized region (e.g. FRAAL4157, gi: 111223558). Iterative sequence searches seeded with this N-terminal region using the PSI-BLAST program recovered significant matches to the heme oxygenase superfamily of diiron oxygenases that were also detected in the above system (e.g. Frankia FRAAL4157 recovers the experimentally characterized oxygenase Ct610, PDB: 1RCW, from Chlamydia trachomatis, e = 10^-3, iteration 8). The matches showed that the N-terminal region of these proteins contains two diiron oxygenase domains - the N-terminal one predicted to be inactive and the C-terminal predicted to be active based on conservation of the iron-chelating histidines and acidic residues [28]. Thus, these proteins are predicted to possess two distinct oxygenase capabilities, with the diiron oxygenase domain probably functioning as a monooxygenase and the C-terminal cupin domain as a dioxygenase. They are encoded in predicted operons that show three related but distinct themes (Figure 2): 1) The simplest of these combine the gene encoding the oxygenase-cupin protein with one or two enzymes involved in modifying amino acids such as an amino acid methylase and a PLPDE. 2) The second set of these predicted operons combine the oxygenase-cupin gene with genes encoding one or two peptide ligases of the ATP-grasp fold [5]. Additionally, these operons encode a Rieske 2Fe-2S iron-sulfur protein involved in electron transport in redox reactions.
3) The final set of predicted operons is similar to the above operons - in place of the ATP-grasp peptide ligases they encode giant multi-domain non-ribosomal peptide synthetases with condensation domains and also acyl-coA ligases, which could charge amino acids with coA for use by the former enzymes. This set of operons also encodes a protein of the uncharacterized "YqcI/YcgG" superfamily that contains an absolutely conserved N-terminal cysteine (Additional file 1). Given that the multi-domain non-ribosomal peptide synthetases might utilize a protein anchor for elongation of peptides, it is conceivable that the conserved cysteine of this family serves as a means for anchoring the initial residue via a thiocarboxylate linkage. Furthermore, all three versions of these operons might additionally encode a transmembrane efflux transporter, suggesting that the metabolite synthesized by this operon is deployed in the environment (Figure 2). It is possible that in these gene-clusters, the cupin might play a role similar to the cognate cupin in the systems centered on the MtRS paralog, whereas the diiron oxygenase could function similarly to the lysine N6-monooxygenase. The simplest of these predicted operons are likely to modify a single amino acid. Those containing either of the two unrelated types of peptide-bond forming enzymes appear to be analogs of the system centered on the MtRS paralog with distinct oxygenases associated with at least one peptide ligase.

The inherent flexibility of type I non-ribosomal peptide synthetase multienzymes drives their catalytic activities

Non-ribosomal peptide synthetases (NRPSs) are multienzymes that produce complex natural metabolites with many applications in medicine and agriculture. They are composed of numerous catalytic domains that elongate and chemically modify amino acid substrates or derivatives and of non-catalytic carrier protein domains that can tether and shuttle the growing products to the different catalytic domains. The intrinsic flexibility of NRPSs permits conformational rearrangements that are required to allow interactions between catalytic and carrier protein domains. Their large size coupled to this flexibility renders these multi-domain proteins very challenging for structural characterization. Here, we summarize recent studies that offer structural views of multi-domain NRPSs in various catalytically relevant conformations, thus providing an increased comprehension of their catalytic cycle. A better structural understanding of these multienzymes provides novel perspectives for their re-engineering to synthesize new bioactive metabolites.

1. Introduction: the inherent flexibility of non-ribosomal peptide synthetases

Natural products are secondary metabolites synthesized by microorganisms in order to adapt to their environment [1]. Many of these natural products have been used for medical purposes, such as the antibiotics daptomycin and vancomycin [2] and the anti-cancer molecule bleomycin [3]. Among these natural products, non-ribosomal peptides (NRPs) make up a vast class of peptide-based metabolites, synthesized independently from the ribosome by large machineries named non-ribosomal peptide synthetases (NRPSs). For instance, the surfactin lipopeptide is synthesized by the 1 MDa surfactin NRPS from Bacillus subtilis (figure 1) [4]. Recent progress in whole genome sequencing has revealed the existence of numerous NRPS gene clusters among bacteria and fungi, mostly of unknown function [5–7]. Nevertheless, the structural understanding of the machineries that produce these metabolites has long remained limited to studies of isolated domains [8] until relatively recently, and has evolved dramatically in the last few years.

Figure 1. Organization of the surfactin NRPS. (a) The surfactin NRPS is composed of three polypeptides: SrfA-A, SrfA-B and SrfA-C. Each polypeptide is composed of one or several modules, each one being responsible for the incorporation of an amino acid, highlighted in red in the growing metabolite. Module 1 is the initiation module, module 7 is the termination module and those in between are elongation modules. The addition of an amino acid into the metabolite requires the cooperation between domains, represented as coloured spheres. All surfactin NRPS modules possess a unique non-catalytic domain, the PCP (in orange) that tethers the growing metabolite. Each module also contains at least two catalytic domains: a condensation domain (C, in blue) and an adenylation domain (A, in green). Additionally, modules 3 and 6 possess an optional epimerization domain (E, in grey). Finally, the termination module ends with a thioesterase (TE, in red) domain that releases and cyclizes the surfactin molecule. (b) Chemical structure of the surfactin.

NRPSs are classified into two categories, type I and type II. In type II NRPSs, the incorporation of an amino acid into the metabolite necessitates the involvement of several domains carried by distinct proteins [9]. By contrast, the type I NRPS megaenzymes use an assembly line strategy (figure 1) with modules that act sequentially, each being responsible for the incorporation of an amino acid into the final metabolite. An NRPS assembly line is composed of an initiation module (module 1 in figure 1a), a termination module (module 7 in figure 1a) and one or several elongation modules (2 to 6 in figure 1a). In general, several modules are fused into a single polypeptide. Polyketide synthases (PKSs), which are also modular megaenzymes, adopt the same assembly line logic as NRPSs but use small carbon chain substrates instead of amino acid substrates [10]. This similar strategy explains the existence of numerous hybrid NRPS/PKS assembly lines that produce hybrid peptide-polyketide metabolites [11].

Most modules are composed of two catalytic domains, the adenylation (A) and the condensation (C) domains (figure 1a); each NRPS module also incorporates the non-catalytic, but essential, peptidyl carrier protein (PCP) domain [12]. In this review, we will consider that a classical module starts with a C domain and ends with a PCP domain (figure 1). The PCP domain functions as an anchoring platform for shuttling substrates to the different catalytic domains within a module; it also allows the transport of the modified substrate from the upstream to the downstream module (figure 1a). PCP domains, whose masses are in the range of 10 kDa, have been mostly studied in isolated form by NMR [13–15]. They fold as a right-handed four helix bundle, with the four helices (I, II, III and IV) being connected by loops. At the N-terminus of helix II, all PCP domains possess a conserved serine residue that serves as an attachment point for a phosphopantetheine arm (PPant arm). This post-translational modification is catalysed by phosphopantetheinyl transferases (PPtases) that convert apo-PCPs into holo-PCPs [16]. The 20 Å long PPant arm displays a free thiol at its extremity that allows the loading of various substrates. In this review, the term ‘loaded-PCP’ will be used to refer to PCPs modified with a PPant arm and loaded with a substrate. Indeed, a number of NRPSs have been structurally characterized in various productive conformations by employing a promiscuous PPtase, such as Sfp from Bacillus subtilis [17], for the loading of substrates, mimics or dead-end inhibitors, onto the megaenzymes [18–22].
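The apo, holo and loaded forms of the PCP described above can be summarized as a small state model. The sketch below is a toy illustration under our own naming (the class `PCP` and the helper functions are hypothetical, not part of any library); it only encodes the ordering stated in the text, namely that the PPant arm must be attached before a substrate can be loaded onto its thiol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PCP:
    """Toy model of a peptidyl carrier protein domain (hypothetical class)."""
    has_ppant_arm: bool = False       # False corresponds to the apo form
    substrate: Optional[str] = None   # cargo on the PPant thiol, if any

    @property
    def state(self) -> str:
        if not self.has_ppant_arm:
            return "apo"
        return "loaded" if self.substrate is not None else "holo"


def phosphopantetheinylate(pcp: PCP) -> PCP:
    """PPtase step: attach the PPant arm to the conserved serine (apo -> holo)."""
    if pcp.state != "apo":
        raise ValueError("only an apo-PCP can receive a PPant arm")
    return PCP(has_ppant_arm=True)


def load(pcp: PCP, substrate: str) -> PCP:
    """Load a substrate onto the free thiol of the PPant arm (holo -> loaded)."""
    if pcp.state != "holo":
        raise ValueError("a substrate can only be loaded onto a holo-PCP")
    return PCP(has_ppant_arm=True, substrate=substrate)


apo = PCP()
holo = phosphopantetheinylate(apo)
loaded = load(holo, "valine")
print(apo.state, holo.state, loaded.state)
```

Trying to call `load` on the apo form raises an error, which reflects the dependence of substrate loading on the prior PPtase modification.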

The PCP domain delivers substrates to the catalytic domains. First, the adenylation (A) domain activates the incoming acid monomer. A domains select very diverse monomers including α-L- or α-D-amino acids, β-amino acids or aryl acids [23]. Subsequently, the condensation (C) domain catalyses peptide bond formation between two PCP-tethered monomers. Many optional tailoring domains that further chemically modify the metabolite under construction can also be found in NRPSs [24]. For example, the epimerization (E) domain converts natural L-amino acids into D-amino acids (modules 3 and 6 in figure 1a). Finally, each assembly line ends with a domain, such as a thioesterase (TE) or a reductase (Re) (module 7 in figure 1a), that releases the final product. This final stage can introduce further diversity in the peptide as the release can occur either by hydrolysis or cyclization. Interestingly, since surfactin release occurs via macrolactonization [25,26], surfactin metabolites contain both amide and ester bonds, thus deserving the designation of ‘depsipeptide’ (figure 1b) [27].
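The modular organization described for the surfactin NRPS can be written down as a simple data model. The following is a minimal sketch based only on the figure 1 description: seven modules, each with C, A and PCP domains, an optional E domain in modules 3 and 6, and a terminal TE domain. The class name and the placement of the E and TE domains after the PCP are our own illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Module:
    """One NRPS module (hypothetical container, not a library type)."""
    number: int
    role: str                  # "initiation", "elongation" or "termination"
    domains: Tuple[str, ...]   # domain order within the module


def build_surfactin_modules() -> List[Module]:
    """Assemble the seven surfactin modules as described for figure 1."""
    modules = []
    for n in range(1, 8):
        domains = ["C", "A", "PCP"]   # every module carries C, A and PCP
        if n in (3, 6):
            domains.append("E")       # optional epimerization domain (assumed after PCP)
        if n == 7:
            domains.append("TE")      # thioesterase releases and cyclizes the product
        if n == 1:
            role = "initiation"
        elif n == 7:
            role = "termination"
        else:
            role = "elongation"
        modules.append(Module(n, role, tuple(domains)))
    return modules


assembly_line = build_surfactin_modules()
for module in assembly_line:
    print(module.number, module.role, "-".join(module.domains))
```

Because each module incorporates one amino acid, the length of the list corresponds to the seven residues of the surfactin peptide chain.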

NRPS flexibility allows the conformational changes required for the interactions between the PCP and the catalytic domains. However, this flexibility is a drawback to structural characterizations of large NRPS fragments and, accordingly, successful structural studies have often required the employment of chemical tools to reduce conformational heterogeneity [19–22]. Nevertheless, characterizing the movements of NRPS multienzymes is a requirement for the detailed understanding of these fascinating machineries. In this review, we focus on recent aspects of NRPS flexibility that allow PCP movements during a catalytic cycle, by describing both the successive conformations adopted by these enzymes during a cycle as well as movements that engender passage from one conformation to the next.

2. Known non-ribosomal peptide synthetase structures

The field of NRPS structural biology achieved major breakthroughs in the last five years with the publication of several crystal structures of multi-domain NRPSs, largely due to the work of the Schmeing and Gulick groups (figure 2) [19–22,28,29]. These results add to a large number of structures of individual catalytic domains and PCP-containing didomains, solved by NMR, X-ray crystallography or a combination of both techniques [8]. Each domain adopts the same fold in different structures, independent of the number of domains present in the protein. Nevertheless, understanding the organization of full modules as well as module–module interactions is essential to provide a better insight into these assembly lines. For a long time, however, the intrinsic flexibility of NRPSs prevented the structural characterization of full modules at high resolution. The first structure of a full module, that of the termination module of surfactin, SrfA-C, was solved by X-ray crystallography in 2008 (figures 1a and 2a,b) [30]. Since then, crystal structures of three other termination modules have been solved: AB3403 from an uncharacterized pathway of Acinetobacter baumannii, the enterobactin termination module EntF and ObiF1 from the obafluorin assembly line (figure 2c–e) [20,29,30]. In 2017, a combined effort in X-ray crystallography and negative staining electron microscopy (EM) provided insights into the structures of the last two modules of DhbF involved in the synthesis of bacillibactin (figure 2f,g) [21]. Lastly, the structural elucidation of the dimodular protein LgrA, from the gramicidin synthetase complex that contains an initiation and an elongation module, was a breakthrough in the comprehension of supramodular NRPS organization. Indeed, 12 crystal structures of LgrA fragments complexed with substrates, substrate analogues and dead-end inhibitors were solved and provided a full picture of the catalytic cycle of a dimodular NRPS (figures 2h–j and 3) [19,22].

Figure 2. Structural gallery of NRPS modules. If present, the PPant arm attached to the PCP domain is represented as sticks. (a,b) Domain organization and crystal structures of the termination modules SrfA-C (PDB code: 2VSQ) [19], (c) AB-3403 (PDB code: 4ZXH) [21], (d) ObiF1 (PDB code: 6N8E) [22] and (e) EntF (PDB code: 5T3D) [21]. In addition to the PCP domain, these modules are composed of condensation (C), adenylation (A) and thioesterase (TE) domains. No electron density was detected for the TE domain of EntF. The catalytic His of the C domain is represented in spheres. (f,g) Domain organization and crystal structure of DhbF, a cross-module construct (PDB code: 5U89) [16]. (h–j) Domain organization and crystal structures of a five-domain construct of LgrA (PDB code: 6MG0) [24]. LgrA starts with a formylation (F) domain. The domain in pink in ObiF1 (d) and DhbF (g) represents MLP, an A domain activator.

Figure 3. Catalytic cycle of LgrA, a dimodular NRPS. (a) Domain organization of LgrA. For clarity, the inactive epimerization domain of LgrA, that follows the PCP2 domain, is not shown. (b) Catalytic cycle of LgrA illustrated by five crystal structures (PDB codes: 5ES5, 5ES8, 5ES9 and 6MFZ) [23,24]. Four different structures of module 1 reveal details about the catalytic cycle of an initiation module. The PCP1 domain is disordered in the open and closed states. First, the open state allows binding of valine and ATP. The closed state is the conformation which is relevant for the activation of valine by ATP (i.e. the adenylation state). The thiolation conformation captures the transfer of valine from the A1 domain to the PPant arm of the PCP1 domain. In the formylation conformation, the F1 domain adds a formyl group to valine, still attached to the PCP1 domain. The structure of the full dimodular NRPS allows the visualization of the condensation conformation. After condensation, Gly activated by module 2 is covalently bound to formyl-Val by a peptide bond. In all presented structures, the F1-Acore bidomain adopts similar conformations whereas the Asub subdomain and the PCP1 domain are positioned differently. Rotations of the Asub subdomain induce movements of the PCP1 domain due to the linker between them.

3. Loading of the amino acid onto the peptidyl carrier protein domain

The adenylation (A) domain is divided into two subdomains: the N-terminal subdomain, Acore, consists of around 400 amino acids, while the C-terminal subdomain, Asub, comprises around 100 amino acids [31]. The A domain catalyses two reactions: the activation of the acid monomer using ATP (adenylation) and its subsequent transfer to the PCP domain (thiolation or thioesterification). The A domain is able to adopt several conformations that have been described as ‘the domain alternation cycle’ and which are supported by several structures of complete NRPS modules (figure 3) [19,20,22,31]. Remarkably, the thiolation state has been characterized multiple times using non-hydrolysable analogues (figure 3b, thiolation conformation) [18–20,22]. For example, the structure of PA1221, a natural A-PCP didomain NRPS from Pseudomonas aeruginosa, was obtained both in its apo form and in a loaded form locked in the thiolation conformation through the use of the inhibitor valyl-adenosine vinylsulfonamide (AVS) [32]. In the apo form, the electron density for the PCP domain was absent, suggesting the domain was flexible, whereas the whole didomain was visible in the presence of the AVS inhibitor, indicating that it stabilized the A-PCP interface. This suggests that the numerous crystal structures of NRPSs in the thiolation conformation do not reflect a preferentially adopted conformation in vivo but, more likely, a conformation that favours crystallization.

The cycle starts when the A domain, in an open conformation, is available for substrate binding (figure 3b, open conformation) [31]. The Acore subdomain contains the monomer binding pocket that accommodates ATP and the acid monomer. Upon substrate binding, a 30° rotation of the Asub subdomain leads to the closed conformation, which is then suitable for adenylation (figure 3b, closed conformation) [19]. This conformation allows the entry of an Asub loop into the Acore subdomain; this loop contains a conserved catalytic lysine that stabilizes the acid substrate and ATP [33]. Subsequently, after adenylation and pyrophosphate release, a 140° rotation of the Asub subdomain allows the conversion between the closed and the thiolation conformations, the latter being able to catalyse thioesterification (figure 3b, thiolation conformation) [31].
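The sequence of conformations just described can be summarized as a small transition table. The following Python sketch is only an illustration: the state names and rotation angles are taken from the text, whereas the data layout and the names `TRANSITIONS` and `walk_cycle` are assumptions made for this example and do not come from the cited structural studies.

```python
# Illustrative sketch of the A domain alternation cycle described above.
# State names and Asub rotation angles come from the text; the table layout
# and the function name are assumptions made for this example.

# (from_state, to_state, Asub rotation in degrees, triggering event)
TRANSITIONS = [
    ("open", "closed", 30, "binding of ATP and the acid monomer"),
    ("closed", "thiolation", 140, "adenylation and pyrophosphate release"),
]

def walk_cycle(start="open"):
    """Return the ordered list of conformations visited from `start`."""
    path = [start]
    state = start
    for src, dst, _angle, _event in TRANSITIONS:
        if src == state:
            path.append(dst)
            state = dst
    return path

if __name__ == "__main__":
    for src, dst, angle, event in TRANSITIONS:
        print(f"{src} -> {dst}: Asub rotates {angle} deg after {event}")
```

The table only encodes the order of states and the reported rotations; it says nothing about kinetics or about which conformations are populated in solution.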

The rotations of the Asub subdomain are facilitated by a flexible hinge region containing a conserved aspartic acid or a lysine, located in the Acore-Asub linker [34]. The structure of an adenylate-forming enzyme with the hinge residue mutated into proline revealed an enzyme blocked in the adenylation conformation. Consistent with the structure, the mutant enzyme was still capable of adenylation, but not of thiolation [34]. The importance of the hinge residue has also been demonstrated in the context of the multi-domain NRPS EntF, in which the same hinge residue mutation abolished enterobactin production [35]. Therefore, the flexibility of the hinge residue in the Acore-Asub linker is essential to allow the movement of the Asub subdomain relative to the Acore, which is necessary to allow conformational changes of the whole module. Indeed, rotations of the Asub subdomain drive movement of the PCP owing to the linker connecting the two domains. Analysis of the A-PCP linker region reveals that it contains multiple prolines, absent in standalone A domains [35]. These prolines might rigidify the A-PCP linker, thus facilitating movements of the PCP domain in concert with the movements of the Asub subdomain [35].

4. Modification of the peptidyl carrier protein-tethered amino acid

The vast diversity of NRPs arises in part from the action of tailoring domains, such as cyclization (Cy), epimerization (E), formylation (F), ketoreductase (KR), methyltransferase (Met) and oxidase (Ox) domains, that modify the peptide under construction [19,36–40]. It is not unusual for tailoring domains to be inserted within A domains, which are then called interrupted A domains [39,41]. The structure of LgrA in the formylation conformation provides insight into the mechanism of amino acid modification by a tailoring domain, in this case the formylation of PCP-bound L-valine by an F domain using a formyltetrahydrofolate cofactor (figure 3b, formylation conformation) [19]. Since the F and PCP domains are separated by the large A domain (figure 3a), a substantial conformational change must occur to allow the PCP domain to position its PPant arm in the F active site. The Asub subdomain rotates 180° from its position in the thiolation conformation while the PCP domain rotates 75°, thus moving 60 Å away from its thiolation position (compare figure 3b, thiolation and formylation conformations). The interaction surface between the F and PCP domains is very limited. Unfortunately, there is no structure of the full dimodular LgrA in the formylation conformation. It is worth noting that the structure of dimodular LgrA (figure 3b, condensation conformation) reveals that the position occupied by the PCP1 domain in the formylation conformation is occupied by the C domain at a later stage of the catalytic cycle [22].

5. Elongation of the donor peptide chain with an acceptor amino acid

Elongation is the only reaction that necessitates interactions between domains belonging to different modules. This reaction is catalysed by a condensation (C) domain or, more rarely, by a cyclization (Cy) domain, also called a heterocyclization (HC) domain, which catalyses cyclization after condensation [42]. If the two modules belong to the same polypeptide, PCP and C domains are directly connected by a linker (figure 3b, condensation conformation). If they belong to different polypeptides, docking domains facilitate the interaction between the upstream PCP and the downstream C domain [43]. The C domain of an elongation module n catalyses peptide bond formation between the growing chain carried by the upstream PCP domain (donor PCP), which belongs to module n-1, and the activated amino acid carried by the downstream PCP domain (acceptor PCP) located on the same module n (figure 3b, condensation conformation) [42]. The growing peptide chain is directly transferred from one PCP domain to the next, without attachment to the C domain. Thus, the two PCP domains must bind simultaneously to two different binding sites on the C domain; these are referred to as donor and acceptor binding sites. However, the C domain must discriminate between the two PCP domains to maintain the directionality of the assembly line. Therefore, the condensation reaction results from the interaction between three domains (the C, the donor PCP and the acceptor PCP domains).
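The donor/acceptor logic of elongation can be illustrated with a toy simulation. The Python sketch below is a schematic only: the function `elongate` and its list representation of the chain are assumptions made for this example, and the monomer labels follow the formyl-Val and Gly substrates of LgrA mentioned in figure 3.

```python
# Minimal sketch of directional chain transfer along an NRPS assembly line.
# The C domain of module n joins the chain held by PCP(n-1) (donor) to the
# monomer held by PCP(n) (acceptor). Names are illustrative assumptions.

def elongate(monomers):
    """Return the chain carried by each successive PCP after its step."""
    if not monomers:
        return []
    chain = [monomers[0]]            # initiation module: PCP1 holds monomer 1
    history = [list(chain)]
    for acceptor in monomers[1:]:    # one condensation per elongation module
        # The donor chain is transferred directly onto the acceptor monomer;
        # it is never attached to the C domain itself.
        chain = chain + [acceptor]
        history.append(list(chain))
    return history

if __name__ == "__main__":
    for i, carried in enumerate(elongate(["fVal", "Gly"]), start=1):
        print(f"PCP{i} carries: {'-'.join(carried)}")
```

Because each step only ever appends the acceptor monomer to the donor chain, the chain can grow in one direction only, which is the directionality that the C domain has to enforce by discriminating between its two PCP partners.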

The C domain adopts the V-shaped pseudo-dimeric fold seen in the chloramphenicol acyltransferase family [42]. It is divided into N-terminal and C-terminal lobes, and the active site is located inside the N-terminal lobe, at the centre of a tunnel formed by the interface between the two lobes (figure 2). The structures of three termination modules, SrfA-C, AB3403 and ObiF1, show the acceptor PCP domain docked onto the C domain acceptor site (figure 2a–d) [20,29,30]. In SrfA-C, although the PCP domain is in its apo form due to a Ser-Ala mutation, the Ala is located 16 Å away from the catalytic His of the C domain, suggesting that this structure is compatible with a productive condensation reaction (figure 2b) [30]. Both holo-AB3403 and holo-ObiF1 show the PCP domain docked onto the C acceptor site with the PPant arm inserted through a tunnel that allows the positioning of the final thiol in proximity to the catalytic His of the C domain (figure 2c,d) [20,30]. It is worth noting that, as opposed to what was observed for the PCP interaction with the A domain, no substrate or inhibitor was needed to favour the interaction between the acceptor PCP and C domains.

Several dimodular LgrA structures have revealed for the first time the productive interaction between a donor PCP domain and its corresponding C domain (figure 3b, condensation conformation) [22]. Indeed, in four structures of LgrA, the donor PCP1 domain is docked in the C2 donor site, presenting its conserved Ser towards the catalytic His of the C domain, the two residues being separated by less than 20 Å. The donor PCP1 domain globally has the same orientation in these structures and three of them show electron density for the PPant arm in the donor tunnel of the C domain. However, the structure of F1-A1-PCP1-C2 with f-Val loaded onto the PPant arm of PCP1 reveals that the reactive thioester group must slightly modify its position in order to be properly positioned for attack by the acceptor amino acid. The authors hypothesized that the donor substrate could only be correctly positioned in the presence of the PCP-bound-acceptor substrate in the active site of the C domain [22]. The importance of the loading status of the carrier protein domain for megaenzyme conformation has already been demonstrated for the modular PKS megaenzymes that incorporate domains analogous to the C, A and PCP of the NRPS systems [44,45]. Indeed, structures of the PikAIII PKS module loaded with various substrates obtained by cryo-electron microscopy revealed that the acyl carrier protein (ACP) domain adopts dramatically different positions according to the nature of the substrate loaded onto the ACP.

A detailed comprehension of the condensation reaction requires structural information on a NRPS including at least a donor PCP, an acceptor PCP and a condensation domain. The structure of holo-LgrA F1-A1-PCP1-C2-A2-PCP2 reveals both donor and acceptor PCPs docked onto a single C domain (figure 3b, condensation conformation) [22]. 29 Å separate the two PCP Ser residues loaded with their 20 Å-long PPant arms. The Ser from the acceptor PCP2 is located 15 Å away from the catalytic His of the C domain while the Ser from the donor PCP1 is 18 Å away from it. These distances are compatible with the proximity between substrates required for nucleophilic attack. Unfortunately, there is no electron density for the PCP PPant arms so the detailed interaction of donor and acceptor substrates cannot be deduced from this structure.
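The statement that these distances are compatible with catalysis can be checked with simple arithmetic. The Python sketch below uses only the distances quoted above; the straight-line comparison in `arm_can_reach` is a deliberately crude assumption made for this example, since it ignores the path of the arm through the donor and acceptor tunnels.

```python
# Crude geometric check using the distances quoted in the text (in angstroms).
# Assumption for illustration: a fully extended PPant arm can span any
# straight-line distance up to its own length.

PPANT_ARM_LENGTH = 20.0
SER_TO_HIS = {"acceptor PCP2": 15.0, "donor PCP1": 18.0}

def arm_can_reach(distance, arm_length=PPANT_ARM_LENGTH):
    """True if an arm of `arm_length` can span `distance`."""
    return arm_length >= distance

reach = {name: arm_can_reach(d) for name, d in SER_TO_HIS.items()}

if __name__ == "__main__":
    for name, ok in reach.items():
        print(f"{name}: Ser-His {SER_TO_HIS[name]} A, within reach: {ok}")
```

Both Ser residues lie closer to the catalytic His than the length of one PPant arm, which is consistent with the conclusion drawn in the text.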

The detailed view of PCP-bound-substrates in the condensation conformation can be obtained using a mechanism-based probe, recently designed by the Gulick and Aldrich groups [46]. Although the structure of LgrA F1-A1-PCP1-C2-A2-PCP2 shows that an inhibitor is not necessary to lock a dimodular NRPS in the condensation conformation, this new chemical probe stabilized the interaction between the donor PCP, the acceptor PCP and the C domains. The enterobactin assembly line served as a model to prove the functionality of this probe [46]. The authors were able to mimic the PPant arm loaded with the natural substrate of the donor PCP by replacing the whole acyl-thioester portion of the substrate by a non-hydrolysable analogue incorporating a ketone functionality, thus preventing the release of the loaded substrate. The resulting crypto-PCP was shown to bind to the donor site of the C domain and these results allowed the construction of a model where the crypto-PCP inserts its unnatural PPant arm into the donor tunnel. The authors assumed that the pantetheine probe would then react with the natural acceptor substrate loaded on the acceptor PCP, forming an imine bond instead of the natural peptide bond formed between substrates. In this configuration, both PCP domains should be docked onto the C domain and linked together via their PPant arms connected through an imine bond. Therefore, this probe should help stabilize the interaction between the C and PCP domains during condensation and could lead to more crystal structures of bimodular NRPS locked in the condensation conformation.

The structure of the LgrA PCP1 domain has been solved in association with its three catalytic partners (A1, F1 and C2 domains, figure 3b); therefore, the comparison of the three crystal structures provides insights into the conformational changes that allow the PCP1 domain to shuttle between its three partners [19,22]. As described above, the large movements required for the PCP domain to reach its different catalytic partners are mainly driven by conformational changes of the Asub subdomain that are transferred to the PCP domain by the Asub-PCP rigid linker. Shifting from the formylation to the condensation conformation, the PCP1 domain must cross 30 Å, achieved through a rotation of 40° of the Asub subdomain (compare figure 3b, formylation and condensation conformations). Similarly, after condensation, the PCP1 domain must detach from the C domain and travel back 50 Å to return to the A1 active site (compare figure 3b, condensation and thiolation conformations), achieved by a rotation of 150° of the Asub subdomain. Even in the absence of structures that show the second LgrA module in the thiolation conformation, we can easily extrapolate that the movements seen in module 1 could be similar in module 2.

Interestingly, a NRPS module can start a second catalytic cycle before the first one is complete [20]. For example, the structure of LgrA in the condensation conformation (figure 3b) shows the PCP1 domain in the peptide donation conformation while the A1 domain is in the closed conformation and can thus catalyse adenylation [22]. After adenylation, the aminoacyl-AMP is tightly sequestered in the A domain active site in the absence of an available PCP [47,48]. Subsequently, the A1 domain can catalyse thiolation as soon as the C2 domain has catalysed condensation, which will liberate the PCP1 domain. This decoupling between different domain activities likely increases the synthesis rate of NRPSs.

6. Release of the peptidyl carrier protein-tethered peptide

The structures of four termination modules harbouring a TE domain (C-A-PCP-TE) are now available, i.e. SrfA-C, AB3403, EntF and ObiF1 (figure 2a–e) [20,28–30]. In all four crystal structures, the TE positions are dramatically different (figure 2a-d), suggesting that the TE domain is most probably a mobile element. Negative staining EM images of EntF showed the TE domain in various positions compared to the other domains and no density was observable for the TE domain in the corresponding crystal structure (figure 2e), confirming that the EntF TE domain can adopt multiple conformations [20]. The ObiF1 module has an unusual domain organization, since the TE domain is followed by a MbtH-like protein (MLP) [30]. Interestingly, in these conditions, the MLP domain anchors the TE domain to the module (figure 2d). These elements suggest that the high mobility of the TE domain is due to the flexibility of the short PCP-TE linker and to the fact that, in general, no successive domain imposes structural restraints on the final TE domain. As none of the structures of these four termination modules revealed the interaction between the PCP and the TE domains, it was characterized through the crystal structure of the EntF PCP-TE didomain [49]. The authors used a phosphopantetheinyl-based inhibitor loaded onto the PCP domain that stabilized the transient interaction between the PCP and TE domains [49], thus providing details of a productive PCP–TE interaction.

7. Non-ribosomal peptide synthetases flexibility at the supramodular scale, unrelated to the catalytic cycle

As described in the previous sections, NRPS flexibility allows the conformational rearrangements that are required for the PCP domain to interact with its catalytic partners. It is then legitimate to wonder whether NRPS flexibility is only restricted to the movements that shuttle PCP-tethered substrates to the different NRPS active sites or if there is flexibility at the supramodular scale, unrelated to the catalytic cycle. In other words, are there architectural rules governing the relationships between successive modules, or is their relationship random? In addition to the crystal structure of bimodular LgrA, structural models of multimodular NRPSs derive from low resolution techniques or from the combination of crystal structures. Indeed, in 2016, Marahiel and co-workers proposed a helical model for a hypothetical 7-module NRPS assembly by combining the C-A-PCP structure of the SrfA-C termination module (figure 2b) with the PCP-C cross-module structure of TycC from the tyrocidine synthetase [30,50,51]. The helical axis was occupied by the PCP domains and each module was rotated by 120° relative to the previous one.
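A geometric consequence of the helical model follows directly from the 120° step. The Python sketch below is only an illustration of that arithmetic: the function `module_orientations` is an assumption made for this example and is not part of the published model.

```python
# Angular position of each module around the helical axis, assuming (as in
# the helical model described above) a fixed 120 degree rotation per module.

def module_orientations(n_modules, step_deg=120):
    """Cumulative rotation of each module about the axis, modulo 360."""
    return [(i * step_deg) % 360 for i in range(n_modules)]

angles = module_orientations(7)

if __name__ == "__main__":
    for i, angle in enumerate(angles, start=1):
        print(f"module {i}: {angle} deg")
```

With a 120° step there are three modules per helical turn, so in the hypothetical 7-module assembly modules 1, 4 and 7 share the same angular position. Flexible architectures such as those discussed next would not obey any such fixed relationship.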

Several EM observations indicated that NRPSs probably adopt a more flexible architecture than the helical model mentioned above. An early negative staining EM observation of a fungal 11-module NRPS, responsible for the synthesis of cyclosporin, pictured this 1.7 MDa machinery as an assembly of globular moieties, most likely modules, that could adopt either very compact or elongated structures [52]. It led to the hypothesis that NRPS modules are arranged as ‘beads on a chain’, suggesting that an NRPS assembly line would not adopt any specific architecture.

More recently, the dimodular NRPS DhbF (C1-A1-PCP1-C2-A2-PCP2 + MLP) was also observed by negative staining EM [21]. Despite the presence of AVS inhibitors that limited its conformational heterogeneity, DhbF adopted a continuum of conformations as diverse as an elongated shape, an L shape or a very compact shape. Most particles could be sorted into five classes that differed by the relative positions of the first module in relation to the second one. Therefore, although the flexibility inside a module is limited due to the stable conformation adopted by the C-A didomain, it seems that there are few limitations to the position one module can adopt relative to the adjacent one. These data favour an irregular architecture for NRPSs; however, the fact that the number of classes is limited to five suggests that the supramodular architecture of NRPSs is not completely random. In the crystal structure of the A1-PCP1-C2 cross-module (figure 2f,g), there is no density for the PCP1-C2 linker, suggesting that it could be flexible [21]. Therefore, the movements of the PCP1-C2 linker combined with the absence of strong intermodule interactions could explain the various conformations adopted by the dimodular DhbF.

The six recent dimodular LgrA crystal structures provided further evidence that NRPSs do not adopt a unique stable architecture but rather a few conformations among a myriad of possibilities [22]. One striking example confirmed the flexibility of the PCP1-C2 intermodule linker. Indeed, the F1-A1-PCP1-C2-A2 variant was crystallized in the thiolation conformation for module 1 using a Val-AVS inhibitor and two molecules were found in the asymmetric unit (figure 2h-j). Within these two molecules, module 1 is identical but module 2 adopts two radically different positions. This behaviour results in two strikingly different LgrA shapes, reminiscent of the two DhbF structures observed by EM, one that is elongated and the other L-shaped. Therefore, it seems that locking one module in a specific conformation does not impose a unique conformation on the adjacent module. The most convincing evidence that NRPSs adopt a flexible architecture was obtained from the SAXS analyses of the LgrA F1-A1-PCP1-C2-A2 construct [22]. They indicated that the conformations adopted in the crystal structures do not exactly reflect the conformations adopted in solution. To better estimate these, Reimer and co-workers used the ensemble optimization method to generate different models that took into account flexibility parameters [53]. The ensemble generated fit very well with the experimental data, thus confirming the flexibility of LgrA. However, it cannot be excluded that LgrA flexibility is only apparent and is an effect of the absence of the other components of the assembly line. Indeed, in addition to LgrA, the linear gramicidin NRPS is composed of three other proteins [54] that could restrain the conformations that LgrA can adopt.

8. Concluding remarks

The structural and functional studies of NRPS multienzymes are not limited to providing details regarding the production of complex metabolites but can also be applied to the discovery of new antibiotics. Indeed, the products of these machineries are often essential for bacterial virulence, hence targeting their biosynthesis is a promising strategy to fight microbial pathogens [6]. For instance, the multi-drug resistant Klebsiella pneumoniae uses several siderophores for iron acquisition, including the NRPs enterobactin and yersiniabactin [55–57]. Strains deficient for yersiniabactin production are much less virulent than the wild-type strains [58], suggesting that the yersiniabactin NRPS machinery could potentially be explored as an antibacterial development target.

Moreover, the re-engineering of NRPS megaenzymes in order to produce new medically relevant molecules is of particular interest [59,60]. This prospect has existed since the discovery of the modular organization of NRPSs [61]. To date, a straightforward strategy to re-engineer NRPS assembly lines to produce artificial peptides has been difficult to establish, although some successful reports of re-engineering have been published [62–65]. Classical strategies using substitutions of A, C-A, PCP-C-A units or entire modules yielded only a small amount of synthesized peptide [59,60]. Recently, the Bode group successfully exploited a novel exchange strategy, using A-PCP-C exchange units (XUs) by fixing the borders of the XU within the flexible C and A domain linker [66]. They subsequently improved their strategy by dividing the C domain, placing the borders of the XU within the flexible linker that connects the N-terminal acceptor and C-terminal donor subdomains of C, yielding CAcc-A-PCP-CDon (XUs) [67]. This strategy allowed the authors to produce very high yields of novel NRPS peptides, paving the way for new biotechnological approaches that could optimize the production of novel bioactive compounds through NRPS engineering. Therefore, an increased knowledge of the supramodular architecture of NRPSs, especially regarding the linker regions that allow enzyme flexibility, raises interesting perspectives for natural product re-engineering.

New technology enables fast protein synthesis

Many proteins are useful as drugs for disorders such as diabetes, cancer, and arthritis. Synthesizing artificial versions of these proteins is a time-consuming process that requires genetically engineering microbes or other cells to produce the desired protein.

MIT chemists have devised a protocol to dramatically reduce the amount of time required to generate synthetic proteins. Their tabletop automated flow synthesis machine can string together hundreds of amino acids, the building blocks of proteins, within hours. The researchers believe their new technology could speed up the manufacturing of on-demand therapies and the development of new drugs, and allow scientists to design artificial proteins by incorporating amino acids that don’t exist in cells.

“You could design new variants that have superior biological function, enabled by using non-natural amino acids or specialized modifications that aren’t possible when you use nature’s apparatus to make proteins,” says Brad Pentelute, an associate professor of chemistry at MIT and the senior author of the study.

In a paper appearing today in Science, the researchers showed that they could chemically produce several protein chains up to 164 amino acids in length, including enzymes and growth factors. For a handful of these synthetic proteins, they performed a detailed analysis showing their function is comparable to that of their naturally occurring counterparts.

The lead authors of the paper are former MIT postdoc Nina Hartrampf, who is now an assistant professor at the University of Zurich, MIT graduate student Azin Saebi, and former MIT technical associate Mackenzie Poskus.

Rapid production

The majority of proteins found in the human body are up to 400 amino acids long. Synthesizing large quantities of these proteins requires delivering genes for the desired proteins into cells that act as living factories. This process is used to program bacterial or yeast cells to produce insulin and other drugs such as growth hormones.

“This is a time-consuming process,” says Thomas Nielsen, head of research chemistry at Novo Nordisk, who is also an author of the study. “First you need the gene available, and you need to know something about the cellular biology of the organism so you can engineer the expression of your protein.”

An alternative approach for protein production, first proposed in the 1960s by Bruce Merrifield, who was later awarded the Nobel Prize in chemistry for his work on solid-phase peptide synthesis, is to chemically string amino acids together in a stepwise fashion. There are 20 amino acids that living cells use to build proteins, and using the techniques pioneered by Merrifield, it takes about an hour to perform the chemical reactions needed to add one amino acid to a peptide chain.

In recent years, Pentelute’s lab has invented a more rapid method to perform these reactions, based on a technology known as flow chemistry. In their machine, chemicals are mixed using mechanical pumps and valves, and at every step of the overall synthesis they cycle through a heated reactor containing a resin bed. In the optimized protocol, forming each peptide bond takes on average 2.5 minutes, and peptides up to 25 amino acids long can be assembled in less than an hour.
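The per-step times quoted above translate into rough total synthesis times. The Python sketch below is a back-of-the-envelope estimate, not the researchers' own calculation: it assumes that a chain of n residues needs n - 1 couplings and that every coupling takes the quoted average time, and the names `coupling_time_minutes`, `FLOW_MIN_PER_COUPLING` and `BATCH_MIN_PER_COUPLING` are invented for this example.

```python
# Back-of-the-envelope coupling time from the per-step figures quoted above.
# Assumption (not from the article): n residues require n - 1 couplings.

FLOW_MIN_PER_COUPLING = 2.5     # automated flow synthesis, minutes per bond
BATCH_MIN_PER_COUPLING = 60.0   # Merrifield-style synthesis, about one hour

def coupling_time_minutes(n_residues, minutes_per_coupling=FLOW_MIN_PER_COUPLING):
    """Total coupling time for a linear chain of `n_residues`."""
    return max(n_residues - 1, 0) * minutes_per_coupling

if __name__ == "__main__":
    n = 164  # length of the longest chain reported in the article
    print(f"flow:  {coupling_time_minutes(n) / 60:.1f} h")
    print(f"batch: {coupling_time_minutes(n, BATCH_MIN_PER_COUPLING) / 60:.1f} h")
```

Under these assumptions a 164-residue chain needs a little under seven hours of coupling time in flow, compared with roughly a week at one hour per residue, before any cleavage, purification or folding.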

Following the development of this technology, Novo Nordisk, which makes several protein drugs, became interested in working with Pentelute’s lab to synthesize longer peptides and proteins. To achieve that, the researchers needed to improve the efficiency of the reactions that form peptide bonds between amino acids in the chain. For each reaction, their previous efficiency rate was between 95 and 98 percent, but for longer proteins, they needed it to be over 99 percent.

“The rationale was if we got really good at making peptides, we could expand the technology to make proteins,” Pentelute says. “The idea is to have a machine that a user could walk up to and put in a protein sequence, and it would string together these amino acids in such an efficient manner that at the end of the day, you can get the protein you want. It’s been very challenging because if the chemistry is not close to 100 percent for every single step, you will not get any of the desired material.”

To boost their success rate and find the optimal recipe for each reaction, the researchers performed amino-acid-specific coupling reactions under many different conditions. In this study, they assembled a universal protocol that achieved an average efficiency greater than 99 percent for each reaction, which makes a significant difference when so many amino acids are being linked to form large proteins, the researchers say.

“If you want to make proteins, this extra 1 percent really makes all the difference, because byproducts accumulate and you need a high success rate for every single amino acid incorporated,” Hartrampf says.
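Why the last percent matters can be seen from how per-step losses compound. The Python sketch below is a simplified model rather than the researchers' analysis: it assumes every coupling succeeds independently with the same efficiency and that a chain of n residues needs n - 1 couplings, and the function name `full_length_fraction` is invented for this example.

```python
# Fraction of chains that reach full length when every coupling succeeds
# with the same probability. Simplifying assumptions: independent steps,
# uniform efficiency, n - 1 couplings for n residues, no purification loss.

def full_length_fraction(step_efficiency, n_residues):
    """Expected fraction of chains with every coupling successful."""
    n_couplings = max(n_residues - 1, 0)
    return step_efficiency ** n_couplings

if __name__ == "__main__":
    for eff in (0.95, 0.98, 0.99):
        frac = full_length_fraction(eff, 164)
        print(f"{eff:.0%} per step -> {frac:.2%} full-length chains")
```

For a 164-residue chain, this model leaves far less than 1 percent of chains at full length at 95 percent per-step efficiency, but roughly a fifth at 99 percent.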

Using this approach, the researchers were able to synthesize a protein that contains 164 amino acids — Sortase A, a bacterial protein. They also produced proinsulin, an insulin precursor with 86 amino acids, and an enzyme called lysozyme, which has 129 amino acids, as well as a few other proteins. The desired protein has to be purified and then folded into the correct shape, which adds a few more hours to the overall synthesis process. All of the purified synthesized proteins were obtained in milligram quantities, making up between 1 and 5 percent of the overall yield.

Medicinal chemistry

The researchers also tested the biological functions of five of their synthetic proteins and found that they were comparable to those of the biologically expressed variants.

The ability to rapidly generate any desired protein sequence should enable faster drug development and testing, the researchers say. The new technology also allows amino acids other than the 20 encoded by the DNA of living cells to be incorporated into proteins, greatly expanding the structural and functional diversity of potential protein drugs that could be created.

“This is paving the way for a new field of protein medicinal chemistry,” Nielsen says. “This technology really complements what is available to the pharmaceutical industry, providing new opportunities for rapid discovery of peptide- and protein-based biopharmaceuticals.”

The researchers are now working on further improving the technology so that it can assemble protein chains up to 300 amino acids long. They are also working on automating the entire manufacturing process, so that once the protein is synthesized, the cleavage, purification, and folding steps also occur without any human intervention required.

Pentelute is a co-founder of a company called Amide Technologies that has licensed aspects of the peptide synthesis technology for possible commercial development. The research was funded by Novo Nordisk, a National Science Foundation Graduate Research Fellowship, and an MIT Dean of Science Fellowship.

This article is cited by 80 publications.

  1. Emel Adaligil, Aimin Song, Kenneth K. Hallenbeck, Christian N. Cunningham, Wayne J. Fairbrother . Ribosomal Synthesis of Macrocyclic Peptides with β2- and β2,3-Homo-Amino Acids for the Development of Natural Product-Like Combinatorial Libraries. ACS Chemical Biology 2021,16 (6) , 1011-1018.
  2. Sabrina E. Iskandar, Victoria A. Haberman, Albert A. Bowers . Expanding the Chemical Diversity of Genetically Encoded Libraries. ACS Combinatorial Science 2020,22 (12) , 712-733.
  3. Rumit Maini, Hiroyuki Kimura, Ryo Takatsuji, Takayuki Katoh, Yuki Goto, Hiroaki Suga . Ribosomal Formation of Thioamide Bonds in Polypeptide Synthesis. Journal of the American Chemical Society 2019,141 (51) , 20004-20008.
  4. Fred R. Ward, Zoe L. Watson, Omer Ad, Alanna Schepartz, Jamie H. D. Cate . Defects in the Assembly of Ribosomes Selected for β-Amino Acid Incorporation. Biochemistry 2019,58 (45) , 4494-4504.
  5. Omer Ad, Kyle S. Hoffman, Andrew G. Cairns, Aaron L. Featherston, Scott J. Miller, Dieter Söll, Alanna Schepartz . Translation of Diverse Aramid- and 1,3-Dicarbonyl-peptides by Wild Type Ribosomes in Vitro. ACS Central Science 2019,5 (7) , 1289-1294.
  6. Justine Charon, Aitor Manteca, C. Axel Innis . Using the Bacterial Ribosome as a Discovery Platform for Peptide-Based Antibiotics. Biochemistry 2019,58 (2) , 75-84.
  7. Alicia E. Mangubat-Medina, Samuel C. Martin, Kengo Hanaya, Zachary T. Ball . A Vinylogous Photocleavage Strategy Allows Direct Photocaging of Backbone Amide Structure. Journal of the American Chemical Society 2018,140 (27) , 8401-8404.
  8. Kelly L. George, W. Seth Horne . Foldamer Tertiary Structure through Sequence-Guided Protein Backbone Alteration. Accounts of Chemical Research 2018,51 (5) , 1220-1228.
  9. Jun Ohata , Matthew B. Minus , Morgan E. Abernathy , and Zachary T. Ball . Histidine-Directed Arylation/Alkenylation of Backbone N–H Bonds Mediated by Copper(II). Journal of the American Chemical Society 2016,138 (24) , 7472-7475.
  10. Takashi Kawakami , Koji Ogawa , Tomohisa Hatta , Naoki Goshima , and Tohru Natsume . Directed Evolution of a Cyclized Peptoid–Peptide Chimera against a Cell-Free Expressed Protein and Proteomic Profiling of the Interacting Proteins to Create a Protein–Protein Interaction Inhibitor. ACS Chemical Biology 2016,11 (6) , 1569-1577.
  11. Clarissa Melo Czekster , Wesley E. Robertson , Allison S. Walker , Dieter Söll , and Alanna Schepartz . In Vivo Biosynthesis of a β-Amino Acid-Containing Protein. Journal of the American Chemical Society 2016,138 (16) , 5194-5197.
  12. Tomoshige Fujino , Yuki Goto , Hiroaki Suga , and Hiroshi Murakami . Ribosomal Synthesis of Peptides with Multiple β-Amino Acids. Journal of the American Chemical Society 2016,138 (6) , 1962-1969.
  13. Jinfan Wang , Marek Kwiatkowski , Michael Y. Pavlov , Måns Ehrenberg , and Anthony C. Forster . Peptide Formation by N-Methyl Amino Acids in Translation Is Hastened by Higher pH and tRNAPro. ACS Chemical Biology 2014,9 (6) , 1303-1311.
  14. Satoru Horiya , Jennifer K. Bailey , J. Sebastian Temme , Yollete V. Guillen Schlippe , and Isaac J. Krauss . Directed Evolution of Multivalent Glycopeptides Tightly Recognized by HIV Antibody 2G12. Journal of the American Chemical Society 2014,136 (14) , 5407-5415.
  15. Takashi Kawakami , Takahiro Ishizawa , and Hiroshi Murakami . Extensive Reprogramming of the Genetic Code for Genetically Encoded Synthesis of Highly N-Alkylated Polycyclic Peptidomimetics. Journal of the American Chemical Society 2013,135 (33) , 12297-12304.
  16. Takashi Kawakami , Takahiro Ishizawa , Tomoshige Fujino , Patrick C. Reid , Hiroaki Suga , and Hiroshi Murakami . In Vitro Selection of Multiple Libraries Created by Genetic Code Reprogramming To Discover Macrocyclic Peptides That Antagonize VEGFR2 Activity in Living Cells. ACS Chemical Biology 2013,8 (6) , 1205-1214.
  17. Yollete V. Guillen Schlippe , Matthew C. T. Hartman , Kristopher Josephson , and Jack W. Szostak . In Vitro Selection of Highly Modified Cyclic Peptides That Act as Tight Binding Inhibitors. Journal of the American Chemical Society 2012,134 (25) , 10469-10477.
  18. Brian E. McKinney and Joseph J. Urban . Fluoroolefins as Peptide Mimetics. 2. A Computational Study of the Conformational Ramifications of Peptide Bond Replacement. The Journal of Physical Chemistry A 2010,114 (2) , 1123-1133.
  19. Takashi Kawakami , Hiroshi Murakami and Hiroaki Suga . Ribosomal Synthesis of Polypeptoids and Peptoid−Peptide Hybrids. Journal of the American Chemical Society 2008,130 (50) , 16861-16863.
  20. Xuechen Li , Yu Yuan , Cindy Kan and Samuel J. Danishefsky . Addressing Mechanistic Issues in the Coupling of Isonitriles and Carboxylic Acids: Potential Routes to Peptidic Constructs. Journal of the American Chemical Society 2008,130 (40) , 13225-13227.
  21. Haigang Song , Antony J. Burton , Sally L. Shirran , Jūratė Fahrig‐Kamarauskaitė , Hannelore Kaspar , Tom W. Muir , Markus Künzler , James H. Naismith . Engineering of a Peptide α‐N‐Methyltransferase to Methylate Non‐Proteinogenic Amino Acids. Angewandte Chemie 2021,133 (26) , 14440-14444.
  22. Haigang Song , Antony J. Burton , Sally L. Shirran , Jūratė Fahrig‐Kamarauskaitė , Hannelore Kaspar , Tom W. Muir , Markus Künzler , James H. Naismith . Engineering of a Peptide α‐N‐Methyltransferase to Methylate Non‐Proteinogenic Amino Acids. Angewandte Chemie International Edition 2021,60 (26) , 14319-14323.
  23. Yoshihiko Iwane , Hiroyuki Kimura , Takayuki Katoh , Hiroaki Suga . Uniform affinity-tuning of N -methyl-aminoacyl-tRNAs to EF-Tu enhances their multiple incorporation. Nucleic Acids Research 2021,461
  24. Markus Muttenthaler , Glenn F. King , David J. Adams , Paul F. Alewood . Trends in peptide drug discovery. Nature Reviews Drug Discovery 2021,20 (4) , 309-325.
  25. Sylwia Freza . Cyclo(DAA-DAA) dipeptide as a peptide linker and β-sheet inducer. Chemical Physics Letters 2020,758 , 137914.
  26. Yun Ding , Joey Paolo Ting , Jinsha Liu , Shams Al-Azzam , Priyanka Pandya , Sepideh Afshar . Impact of non-proteinogenic amino acids in the discovery and development of peptide therapeutics. Amino Acids 2020,52 (9) , 1207-1226.
  27. Hisaaki Hirose , Christos Tsiamantas , Takayuki Katoh , Hiroaki Suga . In vitro expression of genetically encoded non-standard peptides consisting of exotic amino acid building blocks. Current Opinion in Biotechnology 2019,58 , 28-36.
  28. Mariah J. Austin , Adrianne M. Rosales . Tunable biomaterials from synthetic, sequence-controlled polymers. Biomaterials Science 2019,7 (2) , 490-505.
  29. Taylor M. Barrett , Kristen E. Fiore , Chunxiao Liu , E. James Petersson . Thioamide-Containing Peptides and Proteins. 2019,,, 193-238.
  30. Pol Arranz-Gibert , Koen Vanderschuren , Farren J. Isaacs . Next-generation genetic code expansion. Current Opinion in Chemical Biology 2018,46 , 203-211.
  31. Marcin Czapla , Sylwia Freza . Functionalized ACC molecule as an effective peptide clasp. Chemical Physics Letters 2018,703 , 52-55.
  32. Anne E. d'Aquino , Do Soon Kim , Michael C. Jewett . Engineered Ribosomes for Basic Science and Synthetic Biology. Annual Review of Chemical and Biomolecular Engineering 2018,9 (1) , 311-340.
  33. Geunho Choi , Soon Hyeok Hong . Selective Monomethylation of Amines with Methanol as the C 1 Source. Angewandte Chemie International Edition 2018,57 (21) , 6166-6170.
  34. Geunho Choi , Soon Hyeok Hong . Selective Monomethylation of Amines with Methanol as the C 1 Source. Angewandte Chemie 2018,130 (21) , 6274-6278.
  35. Takayuki Katoh , Toby Passioura , Hiroaki Suga . Advances in in vitro genetic code reprogramming in 2014–2017. Synthetic Biology 2018,3 (1)
  36. Takayuki Katoh , Yoshihiko Iwane , Hiroaki Suga . Logical engineering of D-arm and T-stem of tRNA that enhances d-amino acid incorporation. Nucleic Acids Research 2017,45 (22) , 12601-12610.
  37. Zhen Chen , David R. Liu . Nucleic Acid-Templated Synthesis of Sequence-Defined Synthetic Polymers. 2017,,, 49-90.
  38. Yi Liu , Do Soon Kim , Michael C Jewett . Repurposing ribosomes for synthetic biology. Current Opinion in Chemical Biology 2017,40 , 87-94.
  39. Satoru Horiya , Jennifer K. Bailey , Isaac J. Krauss . Directed Evolution of Glycopeptides Using mRNA Display. 2017,,, 83-141.
  40. Takayuki Katoh , Kenya Tajima , Hiroaki Suga . Consecutive Elongation of D-Amino Acids in Translation. Cell Chemical Biology 2017,24 (1) , 46-54.
  41. Christine M. Ring , Emil S. Iqbal , David E. Hacker , Matthew C. T. Hartman , T. Ashton Cropp . Genetic incorporation of 4-fluorohistidine into peptides enables selective affinity purification. Organic & Biomolecular Chemistry 2017,15 (21) , 4536-4539.
  42. Rumit Maini , Shiori Umemoto , Hiroaki Suga . Ribosome-mediated synthesis of natural product-like peptides via cell-free translation. Current Opinion in Chemical Biology 2016,34 , 44-52.
  43. Tomoshige Fujino , Hiroshi Murakami . In Vitro Selection Combined with Ribosomal Translation Containing Non-proteinogenic Amino Acids. The Chemical Record 2016,16 (1) , 365-377.
  44. George E. Fox . Origins and Early Evolution of the Ribosome. 2016,,, 31-60.
  45. Michael Goldflam , Christopher G. Ullman . Recent Advances Toward the Discovery of Drug-Like Peptides De novo. Frontiers in Chemistry 2015,3
  46. Joerg H. Schrittwieser , Stefan Velikogne , Wolfgang Kroutil . Biocatalytic Imine Reduction and Reductive Amination of Ketones. Advanced Synthesis & Catalysis 2015,357 (8) , 1655-1685.
  47. Jessica C. Bowman , Nicholas V. Hud , Loren Dean Williams . The Ribosome Challenge to the RNA World. Journal of Molecular Evolution 2015,80 (3-4) , 143-161.
  48. Niels ten Brummelhuis . Controlling monomer-sequence using supramolecular templates. Polymer Chemistry 2015,6 (5) , 654-667.
  49. Ryota Nabika , Shinya Oishi , Ryosuke Misu , Hiroaki Ohno , Nobutaka Fujii . Synthesis of IB-01212 by multiple N-methylations of peptide bonds. Bioorganic & Medicinal Chemistry 2014,22 (21) , 6156-6162.
  50. Toby Passioura , Hiroaki Suga . Reprogramming the genetic code in vitro. Trends in Biochemical Sciences 2014,39 (9) , 400-408.
  51. Jean-Michel Kornprobst . Eubacteria - 2. 2014,,, 1-62.
  52. Toby Passioura , Takayuki Katoh , Yuki Goto , Hiroaki Suga . Selection-Based Discovery of Druglike Macrocyclic Peptides. Annual Review of Biochemistry 2014,83 (1) , 727-752.
  53. Takashi Kawakami , Toru Sasaki , Patrick C. Reid , Hiroshi Murakami . Incorporation of electrically charged N-alkyl amino acids into ribosomally synthesized peptides via post-translational conversion. Chemical Science 2014,5 (3) , 887.
  54. Christopher T. Walsh , Robert V. O'Brien , Chaitan Khosla . Nichtproteinogene Aminosäurebausteine für Peptidgerüste aus nichtribosomalen Peptiden und hybriden Polyketiden. Angewandte Chemie 2013,125 (28) , 7238-7265.
  55. Christopher T. Walsh , Robert V. O'Brien , Chaitan Khosla . Nonproteinogenic Amino Acid Building Blocks for Nonribosomal Peptide and Hybrid Polyketide Scaffolds. Angewandte Chemie International Edition 2013,52 (28) , 7098-7124.
  56. Yu Gao , Thomas Kodadek . Synthesis and Screening of Stereochemically Diverse Combinatorial Libraries of Peptide Tertiary Amides. Chemistry & Biology 2013,20 (3) , 360-369.
  57. E. Railey White , Timothy M. Reed , Zhong Ma , Matthew C.T. Hartman . Replacing amino acids in translation: Expanding chemical diversity with non-natural variants. Methods 2013,60 (1) , 70-74.
  58. C. Alexander Valencia , Jianwei Zou , Rihe Liu . In vitro selection of proteins with desired characteristics using mRNA-display. Methods 2013,60 (1) , 55-69.
  59. Kenichiro Ito , Toby Passioura , Hiroaki Suga . Technologies for the Synthesis of mRNA-Encoding Libraries and Discovery of Bioactive Natural Product-Inspired Non-Traditional Macrocyclic Peptides. Molecules 2013,18 (3) , 3502-3528.
  60. Jayanta Chatterjee , Florian Rechenmacher , Horst Kessler . N -Methylierung von Peptiden und Proteinen: ein wichtiges Element für die Regulation biologischer Funktionen. Angewandte Chemie 2013,125 (1) , 268-283.
  61. Jayanta Chatterjee , Florian Rechenmacher , Horst Kessler . N -Methylation of Peptides and Proteins: An Important Element for Modulating Biological Functions. Angewandte Chemie International Edition 2013,52 (1) , 254-269.
  62. Adrianne M. Rosales , Rachel A. Segalman , Ronald N. Zuckermann . Polypeptoids: a model system to study the effect of monomer sequence on polymer properties and self-assembly. Soft Matter 2013,9 (35) , 8400.
  63. Christopher J Hipolito , Hiroaki Suga . Ribosomal production and in vitro selection of natural product-like peptidomimetics: The FIT and RaPID systems. Current Opinion in Chemical Biology 2012,16 (1-2) , 196-203.
  64. R. Edward Watts , Anthony C. Forster . Update on Pure Translation Display with Unnatural Amino Acid Incorporation. 2012,,, 349-365.
  65. Takashi Kawakami , Hiroshi Murakami . Genetically Encoded Libraries of Nonstandard Peptides. Journal of Nucleic Acids 2012,2012 , 1-15.
  66. Hui Wang , Rihe Liu . Advantages of mRNA display selections over other selection techniques for investigation of protein–protein interactions. Expert Review of Proteomics 2011,8 (3) , 335-346.
  67. Alexander O. Subtelny , Matthew C. T. Hartman , Jack W. Szostak . Optimal Codon Choice Can Improve the Efficiency and Fidelity of N-Methyl Amino Acid Incorporation into Peptides by In-Vitro Translation. Angewandte Chemie 2011,123 (14) , 3222-3225.
  68. Alexander O. Subtelny , Matthew C. T. Hartman , Jack W. Szostak . Optimal Codon Choice Can Improve the Efficiency and Fidelity of N-Methyl Amino Acid Incorporation into Peptides by In-Vitro Translation. Angewandte Chemie International Edition 2011,50 (14) , 3164-3167.
  69. Philip R Effraim , Jiangning Wang , Michael T Englander , Josh Avins , Thomas S Leyh , Ruben L Gonzalez , Virginia W Cornish . Natural amino acids do not require their native tRNAs for efficient selection by the ribosome. Nature Chemical Biology 2009,5 (12) , 947-953.
  70. Rihe Liu , Brian K. Kay , Shaoyi Jiang , Shengfu Chen . Nanoparticle Delivery: Targeting and Nonspecific Binding. MRS Bulletin 2009,34 (6) , 432-440.
  71. Eiji Nakajima , Yuki Goto , Yusuke Sako , Hiroshi Murakami , Hiroaki Suga . Ribosomal Synthesis of Peptides with C-Terminal Lactams, Thiolactones, and Alkylamides. ChemBioChem 2009,10 (7) , 1186-1192.
  72. Keigo Mizusawa , Kenji Abe , Shinsuke Sando , Yasuhiro Aoyama . Synthesis of puromycin derivatives with backbone-elongated substrates and associated translation inhibitory activities. Bioorganic & Medicinal Chemistry 2009,17 (6) , 2381-2387.
  73. Yevgeny Brudno , David R. Liu . Recent Progress Toward the Templated Synthesis and Directed Evolution of Sequence-Defined Synthetic Polymers. Chemistry & Biology 2009,16 (3) , 265-276.
  74. Takatsugu Kobayashi , Tatsuo Yanagisawa , Kensaku Sakamoto , Shigeyuki Yokoyama . Recognition of Non-α-amino Substrates by Pyrrolysyl-tRNA Synthetase. Journal of Molecular Biology 2009,385 (5) , 1352-1360.
  75. M. Y. Pavlov , R. E. Watts , Z. Tan , V. W. Cornish , M. Ehrenberg , A. C. Forster . Slow peptide bond formation by proline and other N-alkylamino acids in translation. Proceedings of the National Academy of Sciences 2009,106 (1) , 50-54.
  76. John A. McIntosh , Mohamed S. Donia , Eric W. Schmidt . Ribosomal peptide natural products: bridging the ribosomal and nonribosomal worlds. Natural Product Reports 2009,26 (4) , 537.
  77. Salvador Tomas . Bioinspired organic chemistry. Annual Reports Section "B" (Organic Chemistry) 2009,105 , 440.
  78. Alessandro Moretto , Marta De Zotti , Marco Crisma , Fernando Formaggio , Claudio Toniolo . N-Methylation of N α-Acetylated, Fully Cα-Ethylated, Linear Peptides. International Journal of Peptide Research and Therapeutics 2008,14 (4) , 307-314.
  79. Atsushi Ohta , Hiroshi Murakami , Hiroaki Suga . Polymerization of α-Hydroxy Acids by Ribosomes. ChemBioChem 2008,9 (17) , 2773-2778.
  80. Shinsuke Sando , Hiroki Masu , Chika Furutani , Yasuhiro Aoyama . Enzymatic N-methylaminoacylation of tRNA using chemically misacylated AMP as a substrate. Organic & Biomolecular Chemistry 2008,6 (15) , 2666.

Related Biology Terms

  • Evolution – The process that changes populations of organisms over time, adapting them to the environment.
  • Inorganic – Molecules containing little carbon, not made in living organisms.
  • Organic – Molecules synthesized in living organisms that contain many carbon-carbon bonds.
  • Ribosome – One of the first cellular machines, capable of producing proteins from RNA molecules and amino acids.

1. A virus attaches to a cell and injects its DNA into the cell. The cell’s proteins and structures make proteins from that DNA, which in turn produce more viral DNA and protein cases. Currently, viruses are not considered to be “living”. Is this abiogenesis?
A. Yes, because the virus is not a living organism but it is creating proteins.
B. No, because the cell is still responsible for the new materials.
C. Yes, but only if there is no carbon in the new material.

2. Which of the following is a valid criticism of abiogenesis theory?
A. The levels of energy needed to produce self-replicating molecules aren’t attainable outside the lab.
B. We cannot know what the atmosphere of the early Earth looked like.
C. If RNA formed first, there would be no reason for DNA.

3. Why would molecules naturally combine in nature?
A. They are naturally attracted to each other.
B. The products of their reactions are more stable.
C. All of the above.
