
Directed evolution: Point mutation vs Insertion-Deletion vs Shuffling


When attempting to improve enzyme function via directed evolution, I can see three different strategies for generating variation in the gene sequence:

  1. Point mutations
  2. Insertions / deletions
  3. Shuffling

Is there a systematic difference in how successful each of these approaches is likely to be, either for a particular gene or in general? Does one choose one approach over the others in a specific project based on particular characteristics or goals?

A related question, on the methodological front: I know one can use error-prone PCR to achieve #1, i.e. point mutations.

What are the analogous techniques for #2 and #3? That is, how does one carry out gene shuffling or insertions / deletions in practice?


In general, point mutations are introduced into the protein of interest during directed evolution. These can be specific mutations in the active site of an enzyme, for example if you would like to change its specificity towards a new substrate. Point mutations can be introduced with primers when you amplify your gene of interest. However, if you don't know specifically which changes to make, then yes, a common way to generate mutations is error-prone PCR or QuikChange PCR. If you'd like to see whether a certain amino acid change abolishes activity, stability, etc., you would do alanine scanning (https://en.wikipedia.org/wiki/Alanine_scanning).
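To make the mutagenesis step concrete, here is a minimal Python sketch of the epPCR idea; the per-base error rate and transition/transversion bias are assumed round numbers for illustration, not measured values for any particular polymerase or protocol.

```python
import random

# Transitions: A<->G (purines), C<->T (pyrimidines).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}

def ep_pcr(seq, rate=0.005, ts_bias=0.7, rng=random):
    """Return a copy of seq with random point mutations.

    rate    -- assumed per-base substitution probability (illustrative)
    ts_bias -- assumed fraction of substitutions that are transitions
    """
    out = []
    for base in seq:
        if rng.random() < rate:
            if rng.random() < ts_bias:
                out.append(TRANSITION[base])  # transition
            else:
                # transversion: any base that is neither the original
                # nor its transition partner
                out.append(rng.choice([b for b in "ACGT"
                                       if b not in (base, TRANSITION[base])]))
        else:
            out.append(base)
    return "".join(out)

random.seed(0)
gene = "ATGGCTAAAGGTGAACTG" * 10               # toy 180-bp gene
library = [ep_pcr(gene) for _ in range(1000)]  # a small mutant library
```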

I haven't heard much about introducing deletions or insertions for directed evolution. First of all, they have to be in-frame so that the downstream coding sequence does not shift. People usually introduce epitope tags at the N- or C-terminus for two reasons: 1) to purify the protein; 2) to increase its solubility. In general, though, introducing a deletion leads to unpredictable changes in the 3D structure of the protein of interest.

DNA shuffling is very often used for directed evolution. Here is a figure from a Nature paper [1] describing DNA shuffling. It is usually performed using sequence homologs of the protein of interest. The different fragments can be obtained by PCR amplification, introducing short overlaps with the neighboring fragments. The fragments can then be assembled by various methods, including Gibson assembly and homologous recombination in yeast. The advantage of DNA shuffling over introducing single mutations is that you have to screen fewer mutants, and the activity or stability of the protein can be improved several hundred fold more.

DNA shuffling figure from [1]
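To illustrate the recombination step, the toy Python sketch below builds chimeras from two aligned homologs by stitching segments at random crossover points; this abstracts away the actual chemistry (DNase I fragmentation and reassembly, or overlap-based methods such as Gibson assembly), and the sequences are invented examples.

```python
import random

def shuffle_parents(parents, n_crossovers=3, rng=random):
    """Build one chimera from equal-length, aligned parent genes by
    drawing each segment between crossover points from a random parent."""
    length = len(parents[0])
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    segments = [rng.choice(parents)[start:end]
                for start, end in zip(bounds, bounds[1:])]
    return "".join(segments)

random.seed(1)
parent_a = "ATGGCTAAAGGTGAACTGGTT"
parent_b = "ATGGCAAAAGGCGAGCTGGTA"   # a close homolog of parent_a
chimera_library = [shuffle_parents([parent_a, parent_b]) for _ in range(10)]
```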

Interestingly, a DNA shuffling method called SCHEMA [2] was developed by Chris Voigt and co-workers. Briefly, SCHEMA is a computational algorithm used to identify the fragments of proteins, or schemas, that can be recombined without disturbing the integrity of the three-dimensional structure. It is based on the 3D structure (of one of the parents) and an alignment of the parental sequences. It calculates the interactions between residues and determines the number of interactions that are disrupted in the creation of a hybrid protein. A window of residues w is defined and the number of internal interactions within this window is counted. The window is slid along the sequence to create the schema profile. For example, in the figure below, the number of interactions broken in the possible chimeras is 0 for the right chimera and 3 for the left chimera. The right chimera would be carried forward into screening.

SCHEMA method from [2]
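The core SCHEMA count can be sketched in a few lines of Python: given a toy contact map taken from a parent structure, count, for each crossover point, the contacts whose residue pair in the chimera occurs in neither parent. Crossovers with zero disruption correspond to the favorable chimeras described above. This is a simplification of the published algorithm, for illustration only.

```python
def schema_disruption(parent1, parent2, contacts, crossover):
    """Count contacts broken by the chimera parent1[:crossover] + parent2[crossover:].

    contacts -- list of (i, j) index pairs of residues that interact
                in the parent 3D structure (a toy contact map here)
    """
    chimera = parent1[:crossover] + parent2[crossover:]
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        # a contact is intact if the same residue pair occurs in a parent
        if pair not in {(parent1[i], parent1[j]), (parent2[i], parent2[j])}:
            broken += 1
    return broken

# toy parents and contact map; sliding over all crossovers gives the profile
p1, p2 = "MKTAYIAKQR", "MKSAYLAKHR"
contacts = [(1, 4), (2, 8), (5, 9)]
profile = [schema_disruption(p1, p2, contacts, x) for x in range(1, len(p1))]
print(profile)   # choose crossover points where the disruption count is 0
```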


In the light of directed evolution: Pathways of adaptive protein evolution

Directed evolution is a widely-used engineering strategy for improving the stabilities or biochemical functions of proteins by repeated rounds of mutation and selection. These experiments offer empirical lessons about how proteins evolve in the face of clearly-defined laboratory selection pressures. Directed evolution has revealed that single amino acid mutations can enhance properties such as catalytic activity or stability, and that adaptation can often occur through pathways consisting of sequential beneficial mutations. When there are no single mutations that improve a particular protein property, experiments always find a wealth of mutations that are neutral with respect to the laboratory-defined measure of fitness. These neutral mutations can open new adaptive pathways by at least two different mechanisms. Functionally-neutral mutations can enhance a protein's stability, thereby increasing its tolerance for subsequent functionally beneficial but destabilizing mutations. They can also lead to changes in "promiscuous" functions that are not currently under selective pressure, but can subsequently become the starting points for the adaptive evolution of new functions. These lessons about the coupling between adaptive and neutral protein evolution in the laboratory offer insight into the evolution of proteins in nature.

Proteins are the molecular workhorses of biology, responsible for carrying out a tremendous range of essential biochemical functions. The existence of proteins that can perform such diverse tasks is a testament to the power of evolution, and understanding the forces that shape protein evolution has been a longstanding goal of evolutionary biology. More recently, it has also become a subject of interest among bioengineers, who seek to tailor proteins for a variety of medical and industrial applications by mimicking evolution. Although they approach the study of protein evolution from different perspectives and with different ultimate goals, evolutionary biologists and bioengineers are interested in many of the same broad questions.

In examining these questions, we begin by considering the continuing relevance of one of the earliest analyses of protein evolution, performed >40 years ago by the great chemist Linus Pauling and his colleague Emile Zuckerkandl (1). Working at the time when it was first becoming feasible to obtain amino acid sequences, Pauling and Zuckerkandl assembled the sequences of hemoglobin and myoglobin proteins from a range of species. They compared the sequences with an eye toward determining the molecular changes that accompanied the evolutionary divergence of these species. But although it was already known [in part from Pauling's earlier work on sickle cell anemia (2, 3)] that even a single mutation could alter hemoglobin's function, the number of accumulated substitutions seemed more reflective of the amount of elapsed evolutionary time than any measure of functional alteration. Summarizing their research, Pauling and Zuckerkandl wrote (1): "Perhaps the most important consideration is the following. There is no reason to expect that the extent of functional change in a polypeptide chain is proportional to the number of amino acid substitutions in the chain. Many such substitutions may lead to relatively little functional change, whereas at other times the replacement of one single amino acid residue by another may lead to a radical functional change. Of course, the two aspects are not unrelated, since the functional effect of a given single substitution will frequently depend on the presence or absence of a number of other substitutions." This passage highlights two key issues that continue to occupy researchers nearly a half-century later. First, natural proteins evolve through a combination of neutral genetic drift and functionally-selected substitutions. Although probably every evolutionary biologist would acknowledge the existence of both types of substitutions, their relative prevalence is debated with often startling vehemence (4, 5). The intractability of this debate is caused in large part by the difficulty of retrospectively determining whether long-ago substitutions were the subject of selective pressures.

The second issue highlighted by Pauling and Zuckerkandl, the potential for an adaptive mutation's effect to depend on the presence of other possibly nonadaptive mutations, has been a topic of much discussion among protein engineers (6–8). The reason is that the presence of epistatic coupling between mutations has the potential to profoundly affect the success of protein optimization strategies. In the absence of epistasis, a protein can always be improved by a simple hill-climbing approach, with each successive beneficial mutation moving further up the path toward some desired objective. But such a hill-climbing approach can in principle be confounded by epistasis, because selectively-favored "uphill" steps (beneficial mutations) may only be possible after several "sideways" or "downhill" steps (neutral or deleterious mutations).

Over the last decade, protein engineers have performed hundreds of directed evolution experiments to improve properties such as catalytic activity, binding affinity, or stability (9–11). The results of these experiments offer substantial insight into the possible pathways of adaptive protein evolution and the interplay between adaptive and neutral mutations. In the next section, we describe how a typical directed evolution experiment is implemented. We then provide a specific example of how directed evolution was successfully applied to a cytochrome P450 enzyme. Drawing on this example and a wealth of other work, we then generalize to draw what we consider to be three of the main empirical lessons from directed evolution. Finally, we discuss how these lessons can help inform an understanding of natural protein evolution.


Summary

Directed evolution is a powerful strategy for improving the functionality of proteins. By generating thousands of variants and using high-throughput screening, directed evolution can arrive at the best solution even when the biological function is obscure or when chemical knowledge of the substrate and protein structure is limited. Directed evolution has enabled the generation of numerous engineered proteins, including pure isomers of pharmaceuticals and proteins for critical life-science applications. Given the rapid turnaround time for generating new proteins through directed evolution, engineered proteins and enzymes will likely continue to have a significant impact in advancing our understanding of numerous biological processes and biocatalysis [14].


Deletion Mutant - Evolution - (Aug/20/2012)

@Pito: genome shuffling happens in nature: V(D)J recombination is DNA shuffling. This is the way your body makes antibodies, by combining gene segments at the immunoglobulin loci. Moreover, in the next step, your body evolves the produced antibodies by an epPCR-like process. Both methods used in directed evolution are happening in your body four days after infection with a virus.

The point of epPCR is that it actually recreates the most probable mutations/errors made by DNA polymerases in bacteria. There are tons of papers calculating the probabilities of certain transitions/transversions and the hot spots for mutations, and they are pretty much the same no matter which polymerase you are using. So what you get in the test tube is what nature produces over millions of years. This is why directed evolution is defined as fast-forwarding natural evolution in a test tube. (Some people use this cliché over and over again in the directed evolution field.)

@prabhubct: if you would like to read more about directed evolution: http://www.sesam-bio. ected-evolution A pretty good review of the state of the art 2-3 years ago (I wrote it :P). It also has references to material that parallels directed evolution to natural evolution.


@ascacioc: as you said, "So what you get in the test tube is what nature produces over millions of years. This is why directed evolution is defined as fast-forwarding natural evolution in a test tube." Could it be fast-backwarding natural evolution?

pito on Wed Aug 22 20:04:47 2012 said:


Of course genome shuffling happens in nature, but there is a difference between what you call genome shuffling in nature (there is a system behind it; I am talking about V(D)J recombination now) and the genome shuffling we do in the lab.

And I do not agree with what you state here; it's a bit "risky" to put it like this: "your body evolves the produced antibodies by an epPCR-like process; both methods used in directed evolution are happening in your body four days after infection with a virus."
This is not entirely correct.
The body does not "evolve" in the way you state it.
The body already has a pool of antibodies present; by pure luck a few of those happen to bind the antigen, and because they do, they are favored, while those that do not bind are not (or are less) enriched (recreated) by the body. This is called affinity maturation of antibodies. It happens because the B cells with the best-binding receptors bind the antigen and are selected (they survive, multiply and pass on their genes) because they are able to bind the follicular dendritic cells (which present the antigen, so the B cells bind it indirectly). This is how affinity maturation happens and how the "evolution" of antibodies works. And yes, due to simple mutations in those cells, you also end up with better cells. But that is a bit different from what you stated.


But at the start it's all a random process: your body just "creates" random antibodies by V(D)J recombination (random, but with a certain system behind it).

This is completely different from the genome shuffling you are speaking of. In genome shuffling you cut DNA and create random new pieces of DNA by extending and ligating the fragments again.
Taking genome shuffling and then comparing it with V(D)J is a bit odd, especially since I was talking about bacteria, but perhaps I did not state this clearly enough.
In bacteria there is no such thing as V(D)J recombination.
And I don't like linking systems we use in bacteria/yeast to human or animal systems/genes.

Also: epPCR can indeed be used as a tool to study what happens in bacteria/yeast, for example, and you can indeed call it fast evolution, but it doesn't really represent 100% of what happens in nature.
It's just a tool to cause mutations, nothing more.
And yes, you could argue that those mutations (or some of them) would also happen in nature, but nature is far more complex.

Also: linking genome shuffling and epPCR is a bridge too far for me.
epPCR is much more controlled, while genome shuffling is (or can be) less controlled, and you can get stranger results.
Although, in the end, it is all about how you define things.

Also, and you said it yourself: it's called "directed" evolution, and that's just it: we (the researchers) direct the evolution. You can't simply say: aha, this is what would happen in nature.
That is a bit too easy to claim.

What you create in the lab, I wouldn't call evolution. I would call it an observation of changes in DNA that cause a certain (observed/measured) effect, which in the end could indeed be a representation of a certain (possible) evolution.
You need to keep in mind that many of the so-called "evolutions" produced by directed evolution techniques would not survive in nature or stay evolved like this, because we want this evolution and keep it in, while in nature the change might be "stupid" and unwanted and thus be lost in the end.

But this is more about semantics.

But I do not agree with "directed evolution is defined as fast-forwarding natural evolution in a test tube", because for me this is not correct. A lot of so-called "directed evolution" outcomes are not changes that would happen in nature.

thanks.
@pito: as you said, "epPCR can indeed be used as a tool to study what happens in bacteria/yeast, for example, and you can indeed call it fast evolution, but it doesn't really represent 100% of what happens in nature." I agree with you that directed evolution cannot represent 100% of nature. But it does represent some probabilistic value for evolution, if we accept transitions, transversions and recombination as means of evolution.



Yes, it does represent some possibilities.
That's the idea behind it, but you should never forget what you are really doing compared with evolution.

prabhubct on Thu Aug 23 05:52:13 2012 said:

@ascacioc: as you said, "So what you get in the test tube is what nature produces over millions of years. This is why directed evolution is defined as fast-forwarding natural evolution in a test tube." Could it be fast-backwarding natural evolution?

You just cause mutations; backwarding or forwarding, who knows?

It's a bit more complicated. But we are speaking of bacteria here, so there is no real telling whether we are going forwards or backwards. Try to define backwards and forwards; it is not that easy in certain circumstances.

Wow, quite a lot happened here while I was in the lab. Just a few words on the first post after mine:
Somatic hypermutation is epPCR: in the lymphoid system the mutation rate is 10^6 times higher than in normal cells, and epPCR is amplification with a high mutation rate. And this is not semantics.
I'll give you that at least: indeed, V(D)J recombination is a totally different system from DNA shuffling, even though parallels can be drawn.
Both DNA shuffling and epPCR are very controlled. Now, if you use genome shuffling a la Stemmer (the first protocol ever published) you indeed cannot control it, but recent protocols for both epPCR and DNA shuffling are very well characterized, and you can predict what you will have in the end in your test tubes. There are programs and algorithms/scripts that do that for you. I worked on developing some myself during my master's thesis, and they were unexpectedly good at telling me what other people would get in their test tubes, which means there was not much uncontrollable stuff happening in the test tube (at the beginning of my master's I was like you: riiiight, you can control it).
I did not link DNA shuffling and epPCR beyond them being the two basic protocols used in directed evolution. They are totally different.
Of course what we call directed evolution in the tube would not necessarily happen in nature: it depends on where the selective pressure lies. I mean: if I evolve, for example, glucose oxidase to be more active in order to use it for a biofuel (a real project people are actually working on), you will not get the same result as in nature, because while you are lowering the Km and raising the kcat by directed evolution for your purpose, in real life it may not be good for this enzyme to use up all the glucose in an organism in a fraction of a second and release tons of H2O2 in that same fraction of a second, because the organism would starve and be killed by the toxicity immediately. However, it matters where you put the selection pressure: if you choose a smart selection pressure that keeps the good mutations, the ones beneficial for an organism, you can simulate evolution.

And about not agreeing with that sentence: well, then you are disagreeing with an entire field.

@prabhubct: fast-backwarding: I do not know how the selection pressure would work to fast-backward something. I only know how to improve things, not how to make them worse.


@ascacioc: you are right, my mistake.

I misread your first post and was talking more about what goes on before the mutation.

It is indeed as you said: this starts after antigen binding, but the receptors themselves (the first ones) are already made by that time. I thought you meant the entire process from the start, which is of course not right.

PS: you say that I am disagreeing with an entire field. I don't know whether you are an immunologist, but if you are, you should know that even immunologists don't agree completely; a lot of question marks are still out there in that specific field. It is a pretty new field. They do seem to agree on the process, but exactly how it happens is still being debated.

This is just it: either you control it, and then you are not really working with evolution, because evolution is not really controlled at all (at least not at the level at which people control it in test tubes in these kinds of experiments); or you cannot control it and you do a completely random shuffling, and then there is no link with evolution at all, because you shuffle far more than nature does.

The thing is: you can control it because you set certain values yourself at the start! You start controlling it yourself; that is not what you call evolution.
You pick the strains, you pick the enzymes (more often than not, people work with restriction enzymes that are not even natural to the organisms), you pick the working temperature, the media; you control a lot of the parameters.
BTW: the entire discussion here is nothing more than what physicists (and biologists) have been debating for years: chaos theory, randomness, etc.
Or, as a footnote: what religious people are also arguing about.

Also, I like what you said about "evolving", but this sentence kind of breaks it down:

All you can do is mimic what you think happens in nature; you define those boundaries. A lot of "you" here and not a lot of "nature".

A lot of the work done on micro-organisms in the field of "evolution" does not really engage with the debate about what evolution actually is. The definition of evolution changes almost every 10 years, and even while people (like you?) speak about fast-forward evolution in bacteria/yeast, we are not even able to correctly relate protists to one another based on evolutionary mutations.
All we do in the lab is cause mutations to occur, have genes rearranged, and check for "better" organisms we can use, or "new" organisms, and then we call this evolution. There are papers out there describing the "evolution" of a bacterium towards a better bacterium for biodegrading some toxic compound A; is this really evolution? Is putting genes from yeast 1 into bacterium 4 and then calling this evolution really correct?
Don't forget that in many shuffling experiments you mix genes from different yeasts.
Or you push the organisms towards a certain evolution; as you already said, you select based on what you (we) want. A weird view of evolution, to be honest.
Your own sentence makes my point:

"@prabhubct: fast-backwarding: I do not know how the selection pressure would work to fast-backward something. I only know how to improve things, not how to make them worse."

You only know how to improve things, not make them worse. Well, there you go: evolution doesn't work like that at all.
What we call improving is based on our needs, and the definition of making things worse is also based on our standards. Don't forget that in the history of evolution, a lot of the so-called "bad" changes turned out to be good.

Evolution seems to be a very broad term for many biotechnologists.

I think there should be a new definition, or a general agreement on what we do: evolving things the way we want versus (real) evolution.

I was not talking about disagreeing with immunology; I was talking about the directed evolution field. I used to work in directed evolution (for 3 years). I only took a specialization class in immunology (still, I can say that I have a bit more knowledge of immunology than the average trained biologist). So, the directed evolution people really do think that they are simulating evolution in their test tubes. BTW, FYI, here is the view of one of these people about directed evolution:
http://www.chymiatrie.de/index.php/component/content/article/133-video-37

One of the popes of directed evolution explains some things there.

And now, let's stop the discussion and call it a truce, because we are getting too philosophical, polemical and semantic, and too little scientific.


Engineering genetic variation

The processes of genetic mutation and recombination are often considered to be random in nature. However, the types of variation that can occur and their associated probabilities are often heavily biased and constrained by the biochemistry of the biosystem itself, limiting the paths accessible to evolution [25]. As these constraints are partly determined by the biosystem's genotype, genetic variation is something that can, in theory, be genetically engineered. For example, not all point mutations are equally likely: transversions and transitions differ in their likelihoods [26], and methylation [27], genomic context [28] and species [29] all influence local and global mutation rates. Furthermore, algorithmic mutations [30] may occur. These are mutations that result in changes of several nucleotides in one event (thus, an algorithm can describe the change) and can be thought of as shortcuts through sequence space (Fig. 1b, left). The likelihood of an algorithmic mutation may be much greater than the summed likelihoods of the equivalent sequence of individual point mutations. For example, the insertion of the two-base motif 'AC' into a tandem repeat region by slipped-strand mispairing may be more likely than two insertion events of 'A' and 'C' occurring independently [31]. Recombination [32] and mobile genetic elements [33] are other examples of biological processes capable of producing algorithmic mutations.
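As an illustration of how an algorithmic mutation differs from independent point mutations, the Python sketch below implements a slipped-strand-style operator that inserts a whole 'AC' unit into a tandem-repeat tract in one event, and compares invented per-event probabilities; the numbers are purely illustrative.

```python
import random
import re

def slip_insert(seq, unit="AC", rng=random):
    """Duplicate one repeat unit inside the first (unit)n tract (n >= 2),
    mimicking slipped-strand mispairing as a single 'algorithmic' event."""
    tract = re.search(f"(?:{unit}){{2,}}", seq)
    if tract is None:
        return seq
    pos = rng.randrange(tract.start(), tract.end(), len(unit))
    return seq[:pos] + unit + seq[pos:]

random.seed(4)
print(slip_insert("GGTACACACACGGT"))   # one event inserts the whole 'AC' motif

# Illustrative comparison: one slippage event vs. two independent
# single-base insertions (probabilities are invented for the example).
p_slip, p_single_insert = 1e-5, 1e-8
print(p_slip > p_single_insert ** 2)   # True: the shortcut dominates
```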

Sequence space is therefore not explored in a uniformly random way, even discounting the role of selection. Instead, the paths evolution can take are determined by the 'variation operator set', which defines all the different point and algorithmic mutations that can occur in the system. Each variation operator in this set has an associated probability distribution that represents the likelihood of arriving at a given sequence from another (i.e., by this operator acting on the design type). The distributions of the variation operator set combine to produce the 'variation probability distribution'. This describes the chance of arriving at any given sequence from the design type due to all the biochemical and physical processes capable of causing genetic variation that are present in the system (Fig. 1b, right). The variation operator set defines the rate and the likely directions in sequence space a design will explore during evolution. As a design type evolves, the variation probability distribution changes, as further dispositions become available.

The variation operator set depends on the specifics of the biosystem being engineered, and the set applied in practice depends on the available knowledge of the system. For example, the variation operator set of a design-type biosystem may be said to include transition mutations, transversion mutations and recombinations, each associated with a unique probability that varies across the design type's sequence. A sample population can be generated by applying the operator set to the design type. This population, with the design type at its centre, may be named a quasispecies, as in the related concept from viral evolution [34].
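A minimal Python sketch of these definitions follows; the operators and their probabilities are invented for illustration. The variation operator set is a list of (operator, probability) pairs, and applying it repeatedly to the design type yields a sample population, the quasispecies described above.

```python
import random

SWAP_TS = {"A": "G", "G": "A", "C": "T", "T": "C"}       # transitions
SWAP_TV = {"A": "CT", "G": "CT", "C": "AG", "T": "AG"}   # transversions

def transition(seq, rng):
    i = rng.randrange(len(seq))
    return seq[:i] + SWAP_TS[seq[i]] + seq[i + 1:]

def transversion(seq, rng):
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(SWAP_TV[seq[i]]) + seq[i + 1:]

def recombination(seq, rng, donor="ATGCCCGGGAAATTT"):
    x = rng.randrange(1, len(seq))
    return seq[:x] + donor[x:len(seq)]   # crossover with a fixed toy donor

# the variation operator set: (operator, per-replication probability)
OPERATOR_SET = [(transition, 0.10), (transversion, 0.05), (recombination, 0.01)]

def sample_quasispecies(design_type, n=1000, rng=random):
    population = []
    for _ in range(n):
        variant = design_type
        for op, p in OPERATOR_SET:
            if rng.random() < p:
                variant = op(variant, rng)
        population.append(variant)
    return population   # a sample population centred on the design type

random.seed(5)
quasispecies = sample_quasispecies("ATGGCTAAAGGTGAA")
```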

The variation probability distribution can be considered at all stages of the design process: from specifying the mutation rates of specific parts, to designing new biochemical mechanisms capable of specific forms of genetic variation, to thinking of genetic variation as a feature of a system that can be designed and built. Such integration would allow global and local mutation rates to be specified as part of the design, and standardised mutation rates could even be listed in part datasheets [35]. It is likely that improvements in the prediction of mutation probabilities will be made with the increasing availability of sequence data and associated computational methods. Furthermore, some design rules for influencing local genetic variability are already known (e.g., avoiding the reuse of parts and repetitive sequences to reduce homologous recombination and indel mutations) [36-38], and global mutation rates can also be rationally engineered and manipulated [12,39].

A large toolkit for controlling genetic variation has already been created by bioengineers, which could be used to improve evolutionary stability or to increase specific evolvability (i.e., the ability of the biosystem's evolution to be directed as the designer intended). New tools will doubtless be developed from the diverse mechanisms that generate genetic variation in nature. The variation probability distribution of the design type can be modified either by adding or removing variation operators (e.g., by adding or removing DNA-modifying enzymes) or by modulating existing operators in the system across the genotype. This may be done by altering DNA sequence properties (e.g., avoiding simple sequence repeats to reduce the chance of indels through slipped-strand mispairing [36]). Variation operators can be highly targeted, such as DNA methylation of specific bases to increase the likelihood of mutation through spontaneous deamination [27], or may have a global effect, such as the removal of error-prone polymerases from a host organism [40]. Orthogonal mutation systems that modulate the genetic variation of a specific plasmid or region of DNA can be used to overcome genomic error thresholds, increasing the potential for directed evolution [41].

Larger-scale genetic variation can be achieved through mechanisms such as site-specific recombination, which can be used for inserting, removing, duplicating, inverting or shuffling large segments of DNA, exemplified by the SCRaMbLE system used in the synthetic yeast Sc2.0 [42]. Finally, the acquisition of foreign DNA, whether from other organisms in the population through sex or horizontal gene transfer, or from free oligonucleotides in the environment [12], may also be engineered. The recombinant approaches of genetic engineering can be thought of as a highly orchestrated form of horizontal gene transfer, which is also increasingly being acknowledged as a source of innovation in natural evolution; for example, it is a major mechanism used by bacteria to acquire antibiotic resistance [43]. As with sexual recombination, it enables large jumps through sequence space. This increases the breadth of the search and potentially enables the crossing of valleys in the evolutionary landscape to access peaks that would otherwise be inaccessible.

By combining these and other biochemical tools, it may eventually be possible to precisely design the variation operator set to produce complex combinations of genetic variation. For example, the variation operator set of a genetic circuit may be engineered by avoiding repeated parts (removing the homologous recombination operator), using a host with a high-fidelity DNA polymerase (globally reducing the probability of point mutations), and incorporating DNA recombination sites (adding an operator for specific DNA recombination, perhaps to be used for future directed evolution). Table 2 provides some examples of methods for controlling variation operators that have been developed so far.


Stabilization of GPCRs by Point Mutations and their Combination

Currently there is no clear design strategy for the stabilization of GPCRs by mutations. Therefore, stabilizing mutations have to be identified experimentally by testing many different point mutations, either one by one or by ensemble evolutionary approaches. Once single mutations are identified, they can be combined to further increase the thermostability of the protein. The process of combining the mutations is also experimental, because the effects of individual mutations are not always additive and the structural basis for stabilization is not necessarily obvious. However, some general observations have been formulated. Replacing residues that are neighbors in sequence or structure usually does not lead to a further increase in stability and may even decrease it, as the effects of the single mutations may cancel each other when combined. Combinations of non-neighboring mutations may lead to further stabilization, though the increase in stability is usually smaller than the summed stabilization effects conferred by the single mutations (Magnani et al., 2008; Serrano-Vega et al., 2008; Shibata et al., 2009, 2013; Lebon et al., 2011a). It has been observed that mutations stabilizing the agonist-bound state are more difficult to combine, as they may stabilize slightly different active conformations. In addition, active conformations are more open on the intracellular side, which may be more difficult to stabilize compared with the more compact, less dynamic inactive state (Magnani et al., 2008).
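As a toy numeric illustration of this sub-additivity (the ΔTm values below are hypothetical, not data from the cited studies):

```python
# Hypothetical apparent-Tm gains for three single stabilizing mutations.
single_dtm = {"A54L": 4.0, "I117V": 3.0, "T277A": 2.5}   # degrees C, invented

additive_prediction = sum(single_dtm.values())  # 9.5 C if effects simply added
observed_triple = 7.0                           # assumed: combined gain is smaller

print(f"additive prediction: {additive_prediction:.1f} C; "
      f"observed (assumed): {observed_triple:.1f} C")
```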






Engineering enzymes to catalyze new biosynthetic reactions

Enzymes have indeed found significant applications in synthetic processes, but the full realization of their potential has been limited because they are often unstable, rarely accommodate alternate substrates, function under a limited range of reaction conditions and frequently require expensive cofactors. The drawbacks an enzyme exhibits are determined by the architecture and dynamics of its active site and the encompassing protein scaffold. To address these issues, structure-based redesign strategies are often undertaken that endeavor to identify mutations, usually based on analysis of enzyme active sites, to improve binding and catalysis for a desired substrate or reaction context. Many successes have been reported, but it is likely that for every success there are at least as many unreported failures. This is perhaps unsurprising, since there are inherent limits to the mutability of active sites toward very divergent substrates, and because enzyme-engineering hypotheses are often based on structural studies that are incomplete, both in terms of knowledge of substrate-, transition state- and product-binding modes and in understanding the dynamics of active sites and their ancillary protein scaffolding.

Directed evolution methodologies have been developed, in part, to bypass this knowledge gap [4,5]. Inspired by natural selection, these methods endeavor to identify productive protein-sequence substitutions and their useful combinations by generating libraries of enzyme variants, for instance by random mutagenesis of an encoding gene, and then selecting for improved characteristics via function- or property-based screening strategies. In principle, by emulating the search algorithm employed by nature (mutation followed by selection), directed evolution is capable of identifying solutions for the generation of new biocatalysts from progenitor enzymes lacking complete characterization, without knowledge-based biases. In practice, however, structural and biochemical data are often used to guide directed evolution experimental designs for greater success.

For over two decades, directed evolution methods have been applied in numerous studies to generate optimized biocatalysts, in some cases with impressive results. Rate enhancements in excess of five orders of magnitude have been reported [6], in addition to the generation of biocatalysts for previously unknown biochemical reactions. While these successes have been reviewed in other recent publications [2,5,7,8], we will highlight a few here to illustrate the power and scope of these methods.

P450 monooxygenases have been developed by Arnold and co-workers for the biochemical production of alcohols from straight-chain alkanes. Using a medium-chain fatty acid oxidase as a progenitor enzyme, directed evolution methodologies introduced catalytic competence for the oxidation of successively shorter and less oxidized alkyl-chain precursors, first octane [9] and later propane [10]. Step-wise improvements via mutagenesis throughout the individual domains of the monooxygenase variants ultimately resulted in an efficient P450 propane monooxygenase [11]. Furthermore, one of the variants resulting from these campaigns was evolved to convert ethane to ethanol, currently an important biofuel for use in automobiles [12]. Zhao and co-workers have demonstrated how the cofactor-regeneration problem may be addressed using directed evolution of phosphite dehydrogenase. Through a combination of multiple mutagenesis methods over several generations, the t1/2 of phosphite dehydrogenase at 45°C was improved more than 23,000-fold over the parent enzyme without sacrificing catalytic efficiency, providing a useful biochemical method for NADH cofactor regeneration [13,14]. Impressive results have also been reported in the improvement of enzyme enantioselectivity for new substrates through the process of iterative saturation mutagenesis and Combined Active-site Saturation Testing (CASTing) [15]. To demonstrate the utility of these methods, Reetz applied iterative saturation mutagenesis to Pseudomonas aeruginosa lipase, a well-studied enzyme, to increase its enantioselectivity for a selected chiral ester, succeeding in generating a mutant with an enantiomeric ratio (E-value) improved 594-fold over the original enzyme. In comparison, previous efforts using conventional mutagenesis protocols (error-prone polymerase chain reaction [epPCR], saturation mutagenesis of hot spots, gene shuffling and recombination) identified a lipase possessing a comparatively lower E-value of 51 [16].

While there are several similarly impressive success stories, a cursory survey of the literature also reveals a substantial number of studies in which the enhancements are more modest. Further complicating the task of quantifying the successes of many studies is that many articles report improvements in properties that do not refer directly to kinetic parameters (e.g., total yield of reaction). Presumably, this is not an intentional obfuscation. Rather, many directed evolution studies are simply goal-oriented, and the goal is a desired phenotype. Desired phenotypes may be 'an organism survives', 'a colony changes color' or 'a percent conversion is achieved'. Indeed, almost all reported studies to date appear to succeed in generating their desired phenotype or chemophenotype and, by these criteria, can be viewed as bona fide successes in directed evolution.

In the context of combating the problem of 'financial toxicity', two recent studies demonstrate the potential of directed evolution methods for reducing costs. A synthetic-biology campaign for the economical production of artemisinin has used directed evolution methodology as a tool in optimizing the heterologous production of this antimalarial metabolite [17-19]. In a more recent example, the sitagliptin synthetic pathway was streamlined via the directed evolution of an enantioselective transaminase, which obviates the need for a costly resolution step in the synthesis of this clinically prescribed antihyperglycemic compound. Given the annual market for sitagliptin (trade names Januvia/Janumet) of US$2.8 billion, these improvements have a large potential economic and health impact [20].

However, given the substantial body of literature describing successes in directed evolution, the diversity of the methods employed and the variety of enzyme classes improved, we became interested in performing an objective assessment of the success of directed evolution methods for the creation or optimization of discrete biosynthetic enzymes. Accordingly, for this study we selected articles from the last ten years using objective selection criteria. Using the GoPubMed.com semantic server [21], we identified the top 20 research journals publishing articles with the search phrases 'directed evolution' or 'directed molecular evolution'. Additionally, all citations comprised studies that: reported improvements in a small-molecule biotransformation enzyme; reported kinetic parameters, including kcat/Km and/or kcat (or Vmax) and/or Km, for both the progenitor enzyme and the evolved enzyme; were target-based studies, in other words the reported improved turnover was for the substrate used in the screen; and mutated more than one site, reflecting a departure from single-site saturation mutagenesis.

Before presenting the results of our literature analysis, we review the most common methods used in typical directed evolution campaigns, to investigate the current state of the art and to define the classifications used in our subsequent analysis.


Results and discussion

Screening high specific activity mutants produced by directed evolution

For the purpose of enhancing the specific activity of ManAK at 37°C, an error-prone PCR-based directed evolution methodology was undertaken to construct a mutation library in P. pastoris. As described in Fig. 1, the mutation library was first constructed in E. coli DH5α, in order to obtain a collection of recombinant plasmids that were used for the sequential transformation and selection of positive mutants in P. pastoris. Approximately 10 000 clones were picked and inoculated into 48-well deep plates for the first round of screening, during which the mutants showing > 20% higher enzymatic activity than the control group (~200 clones) were selected for the second round of screening. The second round of screening was also performed in 48-well deep plates to further confirm the results of the first round. From this, 120 clones were carried into the third round of screening in 100-ml shake flasks, and 42 transformants showing enhanced enzymatic activity were used for the preparation and purification of enzymes. Because these mutants were fusion proteins with 6xHis tags at the C-terminal end, the target protein was easily purified by Ni-affinity chromatography. Finally, four positive mutants (P191M, P194E, S199G and S268Q) with enhanced specific activity were selected for biochemical analysis. Multiple copies of target genes with apparently enhanced enzymatic yield can be obtained in P. pastoris (Teng et al., 2015). This might be responsible for most of the excluded mutants in the final selection process. Therefore, performing directed evolution in S. cerevisiae using episomal vectors, and thereafter transferring the best variants to P. pastoris for overproduction in a bioreactor, might be a good option for future campaigns (Molina-Espeja et al., 2015).
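The screening funnel can be summarized as a cascade of filters. The short Python sketch below reuses the clone counts from the text, with a made-up lognormal activity model standing in for the plate measurements; it is a summary of the workflow, not a model of the enzymology.

```python
import random

random.seed(3)
control = 1.0
# ~10,000 library clones with assumed lognormal relative activities
clones = [random.lognormvariate(0, 0.09) for _ in range(10_000)]

round1 = [a for a in clones if a > 1.2 * control]  # plate round 1: >20% over control (~200 clones)
round2 = sorted(round1, reverse=True)[:120]        # plate round 2 confirmation -> 120 clones
flasks = round2[:42]                               # shake-flask round -> 42 improved transformants
finalists = flasks[:4]                             # purification and kinetics -> 4 mutants
print(len(round1), len(round2), len(flasks), len(finalists))
```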

Biochemical characterization and kinetics of positive mutants

Resolution of the mutants by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) indicated that the four positive mutants shared the same molecular weight (~55 kDa) as the wild-type enzyme (Fig. S2), and the molecular weight of the deglycosylated mutants was about 45 kDa, which was 10 kDa lower than their glycosylated forms. This suggested that the glycosylation pattern of ManAK was not appreciably altered by these four single-point mutations (P191M, P194E, S199G and S268Q) (Cheng et al., 2015; Liu et al., 2021). As shown in Table 1, the specific activities of the four mutants were improved, with 25.5%-60.9% enhancement compared with ManAKH. According to the temperature profiles (Fig. 2A and B), three mutants (P194E, S199G and S268Q) shared the same optimal temperature (75°C), while that of the P191M mutant decreased to 70°C. It was also noteworthy that the relative activities of the four positive mutants in a low temperature range (40-65°C) were apparently increased compared with ManAKH. This might account for their enhanced specific activities at 37°C. As for thermostability, the half-life (t1/2) of enzymatic activity at 75°C decreased to 1.5 min for P191M and 15.6 min for P194E. By comparison, no significant changes to the thermostability of the S199G and S268Q mutants were detected. Compared with ManAKH, the pH properties of the four mutants were virtually unchanged (Fig. 2C and D). The optimal pH range for enzymatic activity was 2-6, and these four mutants retained 80% of their initial activity after a 2-h pre-incubation in buffers of different pH (pH 2-8).

