Engineering enzymes for noncanonical amino acid synthesis
Patrick J. Almhjell, Christina E. Boville, and Frances H. Arnold*
Division of Chemistry and Chemical Engineering 210-41, California Institute of Technology, 1200
East California Boulevard, Pasadena, California 91125, United States
Abstract
The standard proteinogenic amino acids grant access to a myriad of chemistries that harmonize to
create life. Outside of these twenty canonical protein building blocks are countless noncanonical
amino acids (ncAAs), either found in nature or created by man. Interest in ncAAs has grown as
research has unveiled their importance as precursors to natural products and pharmaceuticals,
biological probes, and more. Despite their broad applications, synthesis of ncAAs remains a
challenge, as poor stereoselectivity and low functional-group compatibility stymie effective
preparative routes. The use of enzymes has emerged as a versatile approach to prepare ncAAs, and
nature’s enzymes can be engineered to synthesize ncAAs more efficiently and expand the amino
acid alphabet. In this tutorial review, we briefly outline different enzyme engineering strategies
and then discuss examples where engineering has generated new ‘ncAA synthases’ for efficient,
environmentally benign production of a wide and growing collection of valuable ncAAs.
Key Learning Points:
• Important applications of ncAAs in medicine, chemistry, and biology
• Advantages and disadvantages of current approaches for synthesizing ncAAs
• Strategies for enzyme engineering
• Cases where ncAA synthases have been created or optimized with enzyme engineering
• Opportunities for further exploration and progress in biocatalytic ncAA synthesis
1. Introduction
The twenty canonical L-α-amino acids (Scheme 1a) that serve as the primary basis of protein
structure and function comprise only a small fraction of biologically and technologically
important amino acids. Noncanonical amino acids (ncAAs), which are not naturally
incorporated into proteins during translation, contain unusual side chains, D stereochemistry,
or atypical backbone connectivity (Scheme 1b). These features in turn impart distinct
chemical and biological properties, such as greater stability in vivo. These properties have
*Corresponding author: frances@cheme.caltech.edu.
Conflicts of Interest
There are no conflicts to declare.
HHS Public Access
Author manuscript
Chem Soc Rev. Author manuscript; available in PMC 2019 December 21.
Published in final edited form as:
Chem Soc Rev. 2018 December 21; 47(24): 8980–8997. doi:10.1039/c8cs00665b.
elicited considerable interest, and ncAAs are used as therapeutics[1] and synthetic
intermediates,[2] and are even encoded directly into proteins to confer useful new features.[3]
Noncanonical amino acids are challenging to synthesize because they often contain a
stereocenter at the α-carbon, which must be set in a precise configuration. In addition, the
amine and carboxylic acid groups are reactive and often have to be protected. These
problems are compounded when ncAAs have complex side chains that contain additional
stereocenters or reactive functional groups. Nature circumvents such challenges by using
enzymes, which bind and position substrates to accelerate a specific reaction, making
enantiopure amino acids in aqueous media without the need for protecting groups. However,
many enzymes that produce ncAAs in nature are not suitable for preparative ncAA synthesis
due to low activity, poor endogenous expression and stability, need for allosteric activation,
or limited substrate scope. Furthermore, many ncAAs are naturally synthesized via
complicated multi-enzyme cascades, which may be difficult to identify and use for synthesis
at scale. New strategies are needed for ncAA synthesis, and engineering new or improved
enzymes offers some promising leads.
A number of enzymes have activities that could be used to make ncAAs, and protein
engineering has been an indispensable tool for expanding this latent potential. Researchers
have been able to requisition existing enzymes and engineer them to create high-yielding
biocatalytic platforms that generate enantiomerically pure ncAAs. We will refer to these
enzymes as ‘ncAA synthases’, since they are used to couple two molecules without
requiring additional energetic input (e.g., ATP). Well-designed mutagenesis and screening
strategies facilitate the engineering process, and iterative rounds of mutagenesis and
screening (directed evolution) can produce greatly improved ncAA synthases; expanded
substrate scopes, enhanced stability and heterologous expression, and increased yields are all
achievable with the appropriate experimental design. In this tutorial review, we will illustrate
principles of protein engineering and evolution in the context of ncAA synthases. To focus
the scope of this review, we will primarily explore the synthesis and applications of α-amino
acids, and direct the readers interested in different configurations to other excellent works[4,5].
We will begin by introducing the importance of ncAAs in chemistry and medicine and
comparing current methods for producing ncAAs. We will then cover principles of protein
engineering and evolution before exploring cases where researchers have applied these
principles to address fundamental shortcomings in the synthesis of valuable ncAAs. Finally,
we will discuss how directed evolution could provide future biocatalyst improvements.
2. Applications of ncAAs in chemistry, medicine, and biology
Even with advances like photo-redox chemistry, metal-catalyzed cross-coupling, and
asymmetric catalysis, the bottleneck for drug discovery is chemical synthesis.[2] Important
pharmaceutical functionalities such as chiral amines, N-heterocycles, and unprotected polar
groups are challenging to work with in synthesis, but the incorporation of ncAAs directly
into synthetic pipelines can bypass many of these difficulties. For example, the diabetes
medication saxagliptin (Scheme 2) contains a chiral amine as well as a challenging β-quaternary
center that is important for the drug’s activity.[1] Improved methods to synthesize
this drug have focused on producing the ncAA L-α-3-hydroxy-1-adamantyl-glycine
enzymatically for use as a building block in the synthesis of saxagliptin.[6] Other medicines
contain alkaloids, pharmaceutically important natural products derived from amino acids.
Alkaloids and similar compounds make up many essential medicines such as dopamine
(heart failure), codeine and morphine (analgesic), vincristine and irinotecan (cancer), and
quinine (antimalarial).[7] Because ncAAs are both biologically active and synthetically useful,
it is unsurprising that they are present in 12% of the 200 top-grossing drugs.[8] Incorporating
ncAAs and ncAA-derived products into synthetic pipelines allows important
pharmaceuticals to be synthesized more easily than ever before. However, few ncAAs are
readily available, and improved synthetic and biocatalytic methods are needed to realize
their full potential.
Protein therapeutics including peptides and antibodies also make use of ncAAs. Therapeutic
peptides have been used since the 1920s, when insulin was extracted from animal pancreases
for diabetes treatment. Peptide drugs remain important to this day, with more than 60
approved for use.[1] However, most natural peptides are not suitable therapeutics because they
are present only in low concentrations and are susceptible to proteolysis, limiting
bioavailability. Incorporation of ncAAs with the D-configuration, unnatural backbones, or
bulky side chains can reduce proteolysis, and modified side chains can tune biological
specificity and pharmacokinetics.[1] For example, cyclic antimicrobial peptides such as
daptomycin disrupt the membranes of infectious microorganisms. Daptomycin incorporates
the ncAAs kynurenine, ornithine, and (2S,3R)-methylglutamate, as well as three D-amino
acids (Scheme 3).
Antibodies and antibody-drug conjugates (ADCs) are another class of protein therapeutics
that benefit from access to ncAAs. ADCs are versatile therapeutics composed of a
chemotoxic agent coupled via an amino acid linker to an antibody that specifically targets a
cellular component with limited side effects.[9] A common linker is a dipeptide composed of
valine and the ncAA citrulline, which is cleaved in the lysosome to release the toxic
‘payload’ (Fig 1). The linker and payload are typically attached to the antibody via non-specific
modifications of surface-exposed cysteine and lysine residues. With this non-specific
conjugation, the payload may be attached at different positions and in different
concentrations, resulting in a heterogeneous drug whose pharmacokinetics, safety, and
efficiency are not well-defined. Incorporation of ncAAs into ADCs can provide site-specific,
bio-orthogonal attachment points for the linker, affording tunable and reproducible control
over the payload concentration.[9]
Amino acid sequence dictates protein tertiary structure and affects protein function,
localization, recognition, and post-translational modification. Consequently, incorporation of
ncAAs at certain positions within a protein sequence can be used to modulate the physical
and chemical properties of that protein. For example, global replacement of methionine with
selenomethionine provides heavy atoms for X-ray crystallography,[10] while replacement of
amino acids with their fluorinated analogues can influence the substrate specificity and
stability of enzymes.[11] Furthermore, genetic code expansion permits the site-specific
incorporation of ncAAs into proteins where promiscuous global replacement might be
undesirable or impossible.[3,10] For example, ncAAs with side chains such as the
environmentally sensitive fluorophore 7-hydroxycoumarin, the metal chelator 2,2’-
bipyridine, and the metal-binding fluorophore 8-hydroxyquinoline have unique properties
that can be used to probe biomolecular interactions or induce metal-dependent assembly or
fluorescence.[12,13] Additionally, ncAAs with reactive unsaturated aliphatic, azido, and
carbonyl side chains can be used as site-specific bio-orthogonal handles for chemical
modification.[14] The ability to selectively manipulate proteins through the incorporation of
ncAAs promotes the understanding and engineering of protein stability, activity, and
mechanism.
3. Methods for ncAA production
Although ncAAs are valuable chemical and biological tools, their applications are limited by
inefficient routes of production. The most popular approaches, such as extraction from
protein hydrolysate, fermentation, chemical synthesis, and biocatalysis, fall short in terms of
cost, yield, or scope.[15–17] Extracting amino acids from hydrolyzed proteins is excellent for
large-scale production, especially when sourced from inexpensive industrial byproducts such
as hair, meat, or plants. However, this is only suitable for naturally occurring ncAAs with
unique physicochemical properties that enable purification, such as reactive side chains or
extreme pH stability.
Amino acids are also produced on a large scale by microorganisms that convert sugars and
other feedstocks into the desired products.[15] Bacterial strains of Escherichia coli or
Corynebacterium glutamicum have been extensively engineered for efficient metabolism and
mitigated stress response to enhance yields.[18] Production by fermentation, however, requires
an organism with the capacity to synthesize the ncAA. This is a major complication, since
biosynthetic pathways for many ncAAs either give poor yields or are simply unknown.
Chemical synthesis can access numerous ncAAs by employing intermediates such as serine-derived
lactones, hydantoins, or aziridines (Scheme 4, blue).[19,20] A significant benefit of
chemical synthesis approaches is their broad applicability, allowing a variety of ncAAs to be
produced from a single synthetic pipeline. Limitations are also apparent: chemical synthesis
can be labor-intensive, utilize hazardous reagents and produce significant waste products, or
generate racemic products that require further purification.
An increasingly useful approach to preparing ncAAs is biocatalysis, which can either
replace or supplement chemical synthesis and fermentation with enzymes (Scheme 4, red).[4,16]
Enzyme-catalyzed reactions benefit from mild reaction conditions and a broad range of
biocatalysts that can be used in derivatization or bond-forming reactions. For example,
aminotransferases are used to form chiral amines by transferring the amino group of one
amino acid to a prochiral α-keto acid while setting the stereochemistry at the α-carbon. The
process for manufacturing Januvia, a diabetes drug, incorporates an engineered
aminotransferase that replaces two steps of the chemical synthesis route.[21] Other enzymes
such as lyases capitalize on accessible, non-hazardous starting materials to synthesize
diverse optically pure ncAAs. For example, ammonia lyases can catalyze the asymmetric
amination of inexpensive, prochiral substrates such as fumarate or cinnamic acid derivatives
to make optically pure ncAAs (discussed in Section 5.1).[22,23] Other enzymes, such as
tyrosine phenol lyase (TPL)[24] and tryptophan synthase (TrpS),[25,26] can generate more
complexity from even simpler substrates by coupling a nucleophilic side chain to an amino
acid backbone (discussed in Section 5.2).
Although biocatalysts perform impressive chemical transformations, applications are limited
by reaction and substrate scope as well as enzyme stability and compatibility with process
conditions. Improved biocatalysts are needed for simple, green, and cost-effective access to
a variety of ncAAs. Protein engineering can improve catalyst performance, expand the scope
of substrates the enzymes can accept, and even generate new activities. Methods of directed
enzyme evolution are especially powerful for rapid engineering. In the following sections of
this review, we will focus on opportunities for engineering ncAA synthases that generate
complex products from simple (e.g., prochiral or inexpensive) starting materials. We believe
that these biocatalysts offer special advantages that will significantly improve the synthesis
of ncAAs.
4. Enzyme engineering strategies
This review will illustrate several successful strategies for engineering improved ncAA
synthases. Because the relationship between enzyme sequence and function is still largely
unknown, we will focus primarily on methods in which collections of mutant enzymes—
called ‘libraries’—are generated and screened for desired properties. This process provides
the basis for discovering beneficial mutations. A single round of mutation and screening is
sufficient to engineer an improved enzyme as long as the library contains the improved
variant and screening can accurately identify it. Indeed, several examples described in
Section 5 describe enzymes that have shown substantial improvements in a single round of
mutation and screening. However, accumulating mutations in iterations of mutagenesis and
screening, an approach known as directed evolution, can produce even better enzymes. The
mutagenesis and screening strategies are the foundation of enzyme engineering, and success
in an engineering project rests on the appropriate selection of complementary methods.
The first step is to select an enzyme as the starting point, known as the ‘parent’ enzyme.
Even poor baseline activity for the reaction of interest can be the foundation of a good
evolved enzyme, provided improvements in activity can be measured reliably, as discussed
below. If the starting activity is too low, it can be worthwhile to explore homologous
enzymes as parents, as they can differ drastically in substrate scope, activity, and stability.[27]
Enzymes with high stability are often preferred parents, especially for directed evolution, as
they can support the accumulation of activating but often destabilizing mutations.[28]
Starting from the parent gene, diversity is introduced to produce a collection of mutant
genes, which are used to transform a suitable host organism. Standard microbiology
techniques are typically used to array individual clones—containing individual mutant genes
—into physically separated compartments, such as the wells of microtiter plates, where gene
expression and protein production take place. The resulting library of enzyme variants is
then assayed for the desired activity with an appropriate screen. (It should be noted that
some screening techniques, such as those that use cell sorting instruments, as well as most
types of selections, do not require that the cells containing the protein variants be physically
separated prior to screening or selection, but instead rely on the technique itself to
accomplish separation. These methods, however, provide enrichment rather than full
separation of individual clones.) The sensitivity and reproducibility of the screen determine
the improvement that can be measured reliably and therefore discovered in these
experiments, while the mutagenesis method determines the sequence diversity that is
searched. Mutagenesis and screening methods should be evaluated together, as the
combination is critical to the success of any enzyme engineering project. The goal is to
generate libraries that are sufficiently rich in improved enzyme variants that the screening
method can find them efficiently.
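The link between screen reproducibility and the smallest improvement that can be found reliably can be illustrated with a short sketch. It assumes a hypothetical plate layout in which several wells contain the parent enzyme as a control, and it uses a simple "parent mean plus three standard deviations" cutoff. The cutoff rule, the function names, and the numbers are illustrative assumptions and are not taken from any study discussed in this review.

```python
from statistics import mean, stdev

def hit_threshold(parent_signals, n_sd=3.0):
    """Signal a variant must exceed to be called improved.

    Assumption: variation among parent-control wells reflects the
    noise of the screen (hypothetical rule: mean + n_sd * SD).
    """
    return mean(parent_signals) + n_sd * stdev(parent_signals)

def call_hits(variant_signals, parent_signals, n_sd=3.0):
    """Return the wells whose signal exceeds the parent-based threshold."""
    threshold = hit_threshold(parent_signals, n_sd)
    return {well: s for well, s in variant_signals.items() if s > threshold}

# Hypothetical, normalized screening data (parent activity ~ 1.0).
parent_controls = [1.00, 1.10, 0.90, 1.05, 0.95]
variants = {"A1": 1.10, "A2": 1.50, "A3": 0.40, "A4": 1.25}

hits = call_hits(variants, parent_controls)
print(sorted(hits))
```

In this sketch a noisier screen (larger spread among the parent controls) raises the threshold, so modest improvements such as well A1 go undetected; a more reproducible screen would lower the threshold and make smaller steps discoverable.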
The first rule to remember when developing a protein engineering screen is: you get what
you screen for. In other words, the screen needs to report faithfully on the desired property
or set of properties. Developing a very high throughput screen for synthesis of a ncAA can
be challenging, especially when the starting yield is low, and/or the product amino acid does
not have a chromophore or fluorophore that makes it readily visible in a complex medium
with high concentrations of substrates and other species. Indirect assays, such as surrogate
screens (see Section 5.1.2) or selections (a method of associating the fitness of an organism
with the activity of an enzyme) can be used. These have produced many notable successes, but
they can and often do yield variants that perform well in the surrogate screen or selection
but not in the desired task. Additionally, if the assay requires purification steps or
additional reactions, the throughput may be limited to a few hundred, rather than thousands,
of samples per day. The sample throughput capacity and cost determine whether one has to
take a more ‘designed’ approach to making variants or whether one can use more agnostic
mutagenesis methods such as random mutagenesis of the whole sequence. Because new
mutagenesis and library construction protocols appear regularly, we will not go into any
detail on specific methodologies. Rather, we will discuss four general classes of library
construction and the screening techniques commonly associated with them. Readers may be
interested in recent reviews from the Hilvert[29] and Liu[30] groups for further details on
selections and library construction methods.
The mutagenesis approach requiring the highest level of design is one in which the protein
engineer has identified one or more specific residues that influence enzyme behavior and
then decides which mutation to make at those residues using site-directed mutagenesis
methods. Making informed decisions, however, usually requires detailed structural
information, and success depends on having an accurate assessment of how a given mutation
will affect biocatalyst structure and function. An example of site-directed mutagenesis is
targeted modification of the active site to improve substrate access. Bulky residues (e.g.,
leucine or phenylalanine) in the active site may be mutated to smaller ones (e.g., alanine) to
improve binding of larger substrates. Such ‘designed’ mutations can provide a starting point
for further improvement by directed evolution or may even be sufficient to generate an
efficient biocatalyst. Since this approach explores a very limited sequence space, success
depends entirely on the validity of the hypothesis. It is important to keep in mind that failed
attempts are rarely published.
A somewhat more agnostic approach is to use site-saturation mutagenesis to sample most or
all possible mutations at a given residue or set of residues. Beneficial mutations found by
screening saturation mutagenesis libraries can be accumulated in sequential rounds (see
ref.
31 for further reading), by recombination (see below), or by screening combinatorially
saturated mutant libraries. If positions are sampled one or two at a time, these libraries are
small enough to be analyzed by techniques such as high-performance liquid chromatography
(HPLC), gas chromatography (GC), mass spectrometry (MS), and even thin-layer
chromatography (TLC) or nuclear magnetic resonance (NMR). Of course, success with
site-saturation mutagenesis depends on whether the engineer has chosen the right residue(s) to
target. For many enzymes and properties this is not easy, and libraries at many positions may
have to be screened in order to find beneficial mutations.
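To give a sense of why sampling one or two positions at a time keeps libraries within reach of low-throughput analysis, the following sketch estimates how many clones must be screened to cover a site-saturation library. It assumes NNK degenerate codons (32 codons per position, a common choice that is not specified in the text above), equal representation of every codon, and the usual Poisson sampling approximation. Real libraries are biased, so the numbers are only rough guides.

```python
import math

def clones_for_coverage(n_sites, codons_per_site=32, coverage=0.95):
    """Approximate number of clones to screen so that the expected
    fraction of distinct codon combinations observed equals `coverage`.

    Assumptions (idealized): all codon combinations are equally likely
    and clones are sampled independently.
    """
    library_size = codons_per_site ** n_sites
    return math.ceil(-library_size * math.log(1.0 - coverage))

for sites in (1, 2, 3):
    print(sites, "site(s):", clones_for_coverage(sites), "clones")
```

Under these assumptions a single saturated position needs roughly one 96-well plate, two positions saturated together need a few thousand clones, and three positions together already call for a much higher-throughput screen.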
In directed evolution, random mutagenesis is often used to introduce mutations at a low
frequency throughout the gene of interest. A great advantage of random mutagenesis is that
no prior knowledge of the enzyme structure or mechanism is required: the experiment tells
you what is important. For example, random mutagenesis and screening can identify
mutations that mimic allosteric activation, which may be distributed throughout the protein.[32]
Random mutagenesis and screening often find activating mutations that are distant from
the active site or substrate binding pocket, at residues that would likely not be chosen by any
‘design’ methods. (Examples of this can be seen in Section 5.2.2.) However, there are costs:
many randomly-mutated variants are parent-like, or unaltered in the property screened, and
therefore significant screening effort is required to identify the rare beneficial mutations.
Often this means screening hundreds to thousands of samples. Furthermore, random
mutagenesis by the most convenient method, error-prone PCR, only accesses a fraction of all
the possible single amino acid substitutions (roughly 6 out of 19) as it traverses the codon
table via single nucleotide mutations; at the low mutation rates used for directed evolution,
there is an underwhelming probability of making two mutations within the same codon.
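The limited reach of single-nucleotide changes can be checked directly against the standard genetic code. The sketch below enumerates, for every sense codon, the amino acid substitutions reachable by changing exactly one nucleotide. It assumes all nine single-nucleotide changes are equally accessible, which ignores the mutational bias of real error-prone PCR, and the function and variable names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def accessible_substitutions(codon):
    """Amino acids (other than the original) reachable from `codon`
    by a single nucleotide change; stop codons are excluded."""
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            neighbor = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[neighbor]
            if aa != "*" and aa != original:
                reachable.add(aa)
    return reachable

sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
mean_accessible = (sum(len(accessible_substitutions(c)) for c in sense_codons)
                   / len(sense_codons))

print("ATG (Met) ->", sorted(accessible_substitutions("ATG")))
print("Mean substitutions per sense codon:", round(mean_accessible, 2))
```

The printed mean can be compared with the "roughly 6 out of 19" figure quoted above; the per-codon sets also show which substitutions at a given position cannot be reached without two changes in the same codon.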
Recombination, performed either in vitro or in vivo, parallels the natural events of
homologous recombination, such as the shuffling of chromosomal DNA during meiosis or
diversity generation during V(D)J antibody recombination.[33,34] Rather than generating new
mutations, recombination navigates the evolutionary landscape by leveraging genetic
diversity that already exists. One can recombine homologous protein sequences to make
chimeric proteins or recombine previously identified mutations to create new combinations
in a single sequence. The latter is useful when several beneficial mutations are found in one
generation of random mutagenesis. Stemmer’s ‘DNA shuffling’ method[35] and Zhao’s
‘staggered extension’ method[36] perform random mutagenesis and recombination in one
operation. Sequence information is all that is necessary for most recombination methods.
In choosing a mutagenesis strategy, the goal is to find and accumulate beneficial mutations
with the least combined mutagenesis and screening effort. Site-directed and site-saturation
mutagenesis are highly focused approaches and produce small libraries that can be screened
with relatively low-throughput techniques. The downside is that the targeted positions are
limited and may not yield beneficial mutations. Random mutagenesis samples a far greater
sequence space, but the frequency of beneficial mutations can be low, and a higher-
throughput screen is generally needed to find improvements. As any library is likely to
contain variants exhibiting a range of performance, it is crucial to establish a robust
screening system that discriminates improved enzyme variants from the parent.[37] Improved
variants can always be used for subsequent rounds of engineering in a directed evolution
approach. Making incremental improvements is an effective way to navigate the enzyme’s
fitness landscape to create an exceptional enzyme for a given task; this evolutionary search
process can even create entire lineages of enzymes that excel at different tasks (see Fig 9 in
Section 5.2.2).
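The iterative logic of directed evolution (mutate, screen, keep the best variant as the next parent) can be caricatured in a few lines of code. The sketch below is a toy model only: the "fitness" is simply the number of positions matching a hidden target sequence, an artificial, purely additive landscape chosen for illustration, and all sequences, names, and parameters are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_fitness(seq, target):
    """Toy stand-in for a screen: count positions matching a hidden target."""
    return sum(a == b for a, b in zip(seq, target))

def mutate(seq, rng):
    """Introduce one random amino acid substitution."""
    pos = rng.randrange(len(seq))
    new_aa = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new_aa + seq[pos + 1:]

def directed_evolution(parent, target, rounds=10, library_size=200, seed=0):
    """Iterate mutagenesis and screening, keeping the best variant
    as the new parent only if it beats the current parent."""
    rng = random.Random(seed)
    history = [toy_fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: toy_fitness(s, target))
        if toy_fitness(best, target) > toy_fitness(parent, target):
            parent = best
        history.append(toy_fitness(parent, target))
    return parent, history

start = "MAAAAAAAAAAAAAAAAAAA"
goal = "MKTLYVDEGHWRSNPQCIFA"
evolved, trajectory = directed_evolution(start, goal)
print(trajectory)
```

Because a variant replaces the parent only when it scores higher, the trajectory never decreases, which mirrors the stepwise, uphill walk described above. Real fitness landscapes are not additive, and real screens are noisy, so this is a conceptual illustration rather than a model of any experiment.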
Although not discussed here, computational models are also showing promise for enzyme
engineering, with small in silico libraries yielding high frequencies of activating mutations
that reduce screening efforts. This typically requires identifying a parent enzyme with an
appropriate and well-understood mechanism and then computationally redesigning that
enzyme based on the desired mechanism. When used correctly, this approach can produce a
slightly active enzyme for further evolution or can even create highly proficient biocatalysts,
as demonstrated in a recent report from the Janssen and Wu groups on the development of
β-ncAA synthases.[5] Importantly, it is now possible to order entire libraries of mutant genes
made to individual specifications from various DNA suppliers. While expensive, synthetic
libraries of individual genes reduce labor associated with library construction and validation
and also reduce screening requirements compared to libraries made by various
randomization methods.
5. Engineering improved ncAA synthases
In this section, we discuss examples where enzyme engineering principles described above
have been applied to improve biocatalytic ncAA synthesis. These examples highlight key
aspects of engineering proteins through mutation and screening to quickly achieve
functional goals and cover different parent enzymes, mutation strategies, screening
approaches, and reaction conditions.
5.1 Asymmetric carbon-nitrogen bond formation by ammonia lyases
Ammonia lyases catalyze reversible carbon-nitrogen bond cleavage to produce a
trans-α,β-unsaturated carboxylic acid and ammonia (Scheme 5).[38] Of interest in this review are
ammonia lyases that competently catalyze the reverse reaction, resulting in the asymmetric
addition of ammonia to synthesize enantiopure α-amino acids. This occurs with readily
available, prochiral substrates such as fumarate or cinnamic acid analogues. Ammonia is
activated within the enzyme, either by internal residues or by a special electrophilic group,
which facilitates nucleophilic attack at the electrophilic alkene followed by proton donation
to the β-carbon by a catalytic base (Scheme 5). Ammonia lyases act on a diverse set of
substrates, such as aspartate (aspartate ammonia lyases, or DALs), β-methylaspartate
(MALs), and the aromatic amino acids (PALs, TALs, and HALs for phenylalanine, tyrosine,
and histidine, respectively). This class of enzymes has the potential to access diverse types
of functional ncAA products, from β-substituted aspartate analogues to aromatic ncAAs
(also called arylalanines).
5.1.1 MAL: Accessing bulky β-substituted aspartate analogues
Of all the
ammonia lyases, those that act on aspartate and its analogues are among the most specific.[38]
The MAL catalytic cycle requires the presence of a β-carboxylate (the aspartate side chain,
labeled blue in Scheme 6), which renders the α-carbon electrophilic and provides necessary
acidity at the β-carbon (labeled red in Scheme 6). In MALs, the species with the
electrophilic α-carbon (Scheme 6, rightmost resonance species) is further stabilized by a
bound magnesium ion that coordinates the negatively charged β-carboxylate. Because of
this, substitutions are better tolerated at the α-carboxylate group by MALs, enabling
production of certain β-ncAAs rather than the corresponding α-ncAA that lacks the
aspartate side chain (Scheme 7). This approach has recently been used with great success to
computationally engineer an aspartate ammonia lyase for β-amino acid production.[5]
Therefore, if used to synthesize α-ncAAs, MALs are best suited to synthesize aspartate
analogues as they contain the necessary β-carboxylate.
Wild-type MALs can produce β-methylaspartate, their eponymous natural substrate, from
the prochiral substrate 2-methylfumarate (Scheme 6, R = methyl). MALs show moderate
activity on substrates with slightly larger β-substituents such as ethyl, propyl, and ethoxy
groups.[23] Unfortunately, some of the more valuable aspartate analogues, such as the
excitatory neural ligand transporter inhibitor threo-3-benzyloxyaspartate, cannot be made
this way: MAL fails to show even trace activity toward the corresponding substrate, which is
attributed to the large size of the β-substituent.
To improve the biocatalytic synthesis of bulkier β-branched aspartate analogues, Raj and
colleagues engineered MAL from the bacterium Clostridium tetanomorphum (CtMAL).[23] A
crystal structure of β-methylaspartate in complex with CtMAL suggested that three residues,
F170, Y356, and L384, were likely to sterically occlude the binding of substrates with larger
β-substituents (Fig 3). The authors prepared three separate site-saturation mutagenesis
libraries targeting these residues, hypothesizing that the active site could be modified to
enable access by bulkier substrates. After expressing the corresponding CtMAL variants in
E. coli, they screened for amination of the model bulky substrate 2-hexylfumarate, toward
which the wild-type enzyme showed no detectable activity. The authors used an absorbance
assay to screen for 2-hexylfumarate amination by monitoring the change in 270-nm
absorbance in the reaction mixture over time. Due to the conjugation between the terminal
carboxylate groups in fumarate analogues, the substrate absorbs strongly at 270 nm (shown
in Scheme 6). Following substrate amination to produce an α-amino acid, this conjugation is
broken and the absorbance at 270 nm is significantly decreased. Using this approach, the
authors identified L384A as a beneficial mutation that increased the yield of β-hexylaspartate
from 0% to 85% (Table 1). The CtMAL L384A variant was also able to react
with fumarate, 2-hexylfumarate, and 2-benzyloxyfumarate, giving nearly complete
conversion within an hour at room temperature.
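A depletion assay of this kind lends itself to simple automated ranking. The sketch below fits a straight line to each well's 270-nm absorbance over time and ranks wells by the rate of absorbance decrease. The well names, time points, and absorbance values are invented for illustration, and the linear fit is an assumption that holds only for the early, roughly linear part of a reaction; this is not the analysis procedure reported by the original authors.

```python
def slope(times, values):
    """Ordinary least-squares slope of values versus times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den

def rank_wells(times, traces):
    """Rank wells by rate of A270 decrease (fastest substrate loss first)."""
    rates = {well: -slope(times, a270) for well, a270 in traces.items()}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical plate-reader data: minutes and A270 readings per well.
minutes = [0, 5, 10, 15]
a270_traces = {
    "parent": [1.00, 1.00, 0.99, 1.00],  # no detectable amination
    "B7": [1.00, 0.90, 0.80, 0.70],      # fast substrate consumption
    "C3": [1.00, 0.98, 0.96, 0.94],      # slow substrate consumption
}

ranked = rank_wells(minutes, a270_traces)
for well, rate in ranked:
    print(f"{well}: {rate:.4f} A270 units per minute")
```

Wells that rank well above the parent control would then be candidates for re-screening and sequencing.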
Chemical synthesis of
β
-branched aspartate analogues is challenging because, in addition to
containing two carboxylic acid groups and a reactive amine, they are diastereomeric, having
stereocenters at both the
α
- and
β
-carbon positions.
Ct
MAL L384A preferentially produces
the
threo
isomer from the prochiral substrate (Scheme 6), mirroring the native enzyme. This
specificity enabled synthesis of the valuable therapeutic compound
threo
-3-
benzyloxyaspartate in >95% excess over the
erythro
isomer. However, varying degrees of
diastereomeric excess (
de
) were seen for other substrates, especially for those that contained
thioether bonds, such as 2-benzylthioaspartate (Table 1). Nonetheless, this rapid engineering
of
Ct
MAL to access therapeutically relevant, optically pure, diastereomeric aspartate
analogues from prochiral starting materials demonstrates the utility of protein engineering in
ncAA synthesis.
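For readers less familiar with the term, diastereomeric excess can be computed from the relative amounts of the two isomers. The sketch below uses the standard definition, de = (major - minor) / (major + minor). The function names and the example ratio are illustrative assumptions, not values from Table 1.

```python
def diastereomeric_excess(major, minor):
    """Return de in percent from the amounts (any consistent unit) of two diastereomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("amounts must sum to a positive value")
    return 100.0 * (major - minor) / total


def isomer_fractions(de_percent):
    """Invert the definition: fractions of major and minor isomer for a given de."""
    major = (100.0 + de_percent) / 200.0
    return major, 1.0 - major


# Illustrative only: a 97.5 : 2.5 threo : erythro mixture.
print(diastereomeric_excess(97.5, 2.5))
print(isomer_fractions(95.0))
```

Under this definition, a de above 95% means that the minor isomer makes up less than 2.5% of the product.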
5.1.2 PAL: Efforts to evolve a D-arylalanine synthase.—
Aromatic ncAAs have
diverse applications.
1
,
3
This has made ammonia lyases with expansive aromatic substrate
scopes, such as PALs, a prime target for application and for engineering.
38
PALs natively
catalyze the reversible decomposition of
L
-phenylalanine to
trans
-cinnamic acid and
ammonia (Scheme 8). This is accomplished with the aid of the highly electrophilic 4-
methylidene-5-imidazolone (MIO) group, which is unique in that it is not formally a
cofactor. Rather, it is formed intramolecularly from a short three-amino acid sequence within
the active site, in a manner reminiscent of GFP chromophore maturation.
38
In the synthetic
direction, the electrophilic MIO group enables amination by activating ammonia for
nucleophilic attack, forming a stereospecific carbon-nitrogen bond (Scheme 8). The catalytic
base of PALs must be even stronger than that of the aspartate-specific ammonia lyases, as
aryl side chains do not render the protons of the
β
-carbon as labile as carboxylate groups do.
PALs are an example of productive wild-type ncAA synthases with broad arylalanine-
synthase activity that can be used to synthesize ncAAs
in vitro
. For example, wild-type
PALs readily aminate halogenated cinnamic acids to generate the corresponding halogenated
phenylalanine analogues.
22
The Turner group has carried out extensive engineering of PAL from the cyanobacterium
Anabaena variabilis
(
Av
PAL) for straightforward arylalanine synthesis.
Av
PAL has a broad
substrate scope and is able to synthesize many different types of
L
-arylalanines with high
activity and enantioselectivity. However, it was noted that the enantioselectivity of the
reaction is compromised with electron-deficient substrates, such as nitrocinnamic acid.
39
Further analysis suggested that these substrates go through a MIO-independent pathway. In
this case, the enzyme is still required for proton abstraction at the
β
-carbon, but ammonia
does not need to be activated by MIO for nucleophilic attack, due to the enhanced
electrophilicity of the
α
-carbon (Scheme 9). The effect is strong enough to produce a
racemic mixture whenever electron density is directed away from the electrophilic alkene,
i.e., with
para
or
ortho
substitutions;
meta
-substituted substrates are less affected.
While this is an unfortunate side reaction of
Av
PAL, the
L
-
α
-amino acids have a privileged
position in nature, and enzymes that interact with them are abundant. A biocatalytic toolbox
is lacking for the
D
-configured amino acids, which are particularly useful in peptide-based
therapeutics as they can reduce degradation
in vivo
(see Section 2). For example, the
gonadotropin-releasing hormone antagonist cetrorelix contains three
D
-arylalanine ncAAs
(Scheme 10). Therefore, this effect could potentially be leveraged as starting activity for the
evolution of a
D
-arylalanine synthase.
To improve the practicality and efficiency of synthesizing
D
-arylalanine products,
Parmeggiani and colleagues engineered
Av
PAL to increase its potential as a
D
-arylalanine
synthase.
40
Site-saturation mutagenesis libraries were prepared by targeting 48 residues near
the active site that they hypothesized might influence the enantioselectivity of the reaction.
To design a screen, the authors considered an approach similar to the MAL reaction
discussed in Section 5.1.1, as the PAL reaction results in an absorbance change at 290 nm
during the reaction. However, this approach does not report on the stereochemistry.
Parmeggiani and colleagues addressed this by implementing an enzyme cascade that
specifically interacted with
D
-amino acids, producing an intense color change. This would
only occur within
E. coli
colonies expressing an
Av
PAL variant with improved
D
-arylalanine
synthase activity (Fig 4). This high-throughput technique afforded rapid sampling of ~5,000
colonies, with numerous
Av
PAL variants showing significantly faster rates of
D
-arylalanine
production.
However, the screen had an important drawback: it only reported on
D
-amino acid
production and provided no information on the corresponding production of the
L
-amino
acids that might be expected from the less enantioselective MIO-independent mechanism.
Indeed, further examination demonstrated that the variants identified in the screen did not
change the distribution of enantiomers; they simply produced a racemic mixture more
rapidly. It is not clear that mutations at the 48 targeted residues actually failed to enrich
D
-
arylalanine production, because the screen only reported on an increase in
D
-arylalanine
production, exactly as designed. For example, it could be the case that some mutants
enriched for the
D
-isomer, but did so more slowly and were therefore not apparent in the
screen. Additional controls to differentiate racemic from enantioenriched product formation
could be implemented in future studies, but this would make an already complex screen even
more complicated. A recent report by Zhu and colleagues demonstrated that a single active-
site mutation to
Av
PAL (N347A), introduced by site-directed mutagenesis, resulted in a 2.3-
fold enrichment in production of
D
-
p
-nitrophenylalanine over
L
-
p
-nitrophenylalanine by
influencing the stereoselectivity of the reaction within the enzyme.
41
Av
PAL-N347A may
provide valuable starting activity for future directed evolution of an enantioselective
D
-
arylalanine synthase.
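The blind spot of a screen that reports only on D-amino acid formation can be illustrated with a toy calculation. The sketch below is purely hypothetical: the variant names, rates, and enantiomer fractions are invented for illustration, and the last function assumes that a "2.3-fold enrichment" can be read as a 2.3:1 D:L product ratio, which is an interpretation rather than a value stated in the source.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    total_rate: float  # total arylalanine formed per unit time (arbitrary units)
    d_fraction: float  # fraction of the product that is the D enantiomer

    @property
    def d_signal(self):
        """What a D-specific colorimetric screen would report."""
        return self.total_rate * self.d_fraction

    @property
    def ee_d(self):
        """Enantiomeric excess of the D product, in percent."""
        return 100.0 * (2.0 * self.d_fraction - 1.0)


def ee_from_ratio(d_to_l):
    """Convert a D:L product ratio into enantiomeric excess (percent)."""
    return 100.0 * (d_to_l - 1.0) / (d_to_l + 1.0)


# Hypothetical variants; none of these numbers come from the study.
variants = [
    Variant("parent (racemic)", total_rate=1.0, d_fraction=0.50),
    Variant("fast racemic", total_rate=4.0, d_fraction=0.50),
    Variant("slow D-selective", total_rate=1.5, d_fraction=0.90),
]

top_hit = max(variants, key=lambda v: v.d_signal)
for v in variants:
    print(f"{v.name:<18} D signal = {v.d_signal:.2f}  ee(D) = {v.ee_d:.0f}%")
print("Screen would pick:", top_hit.name)
print(f"A 2.3:1 D:L ratio corresponds to ee ~ {ee_from_ratio(2.3):.1f}%")
```

In this toy example the screen ranks the faster racemic variant above the slower but more selective one, which is the scenario the text describes.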
5.2 Synthesis of designer ncAAs through direct side chain addition to amino-acrylate
intermediates
Ideally, a ncAA synthase would be modular, in that it would attach desired side chains to an
amino acid backbone with perfect stereoselectivity. An advantage typically associated with
chemical synthesis, modularity allows different pieces to be incorporated into a diverse array
of products with the same technique (see Section 3, Scheme 4). The pyridoxal 5’-phosphate
(PLP)-dependent enzymes tyrosine phenol lyase (TPL) and tryptophan synthase (TrpS) have
this attractive feature. These enzymes catalyze the
β
-elimination of an
L
-amino acid
substrate to form an electrophilic amino-acrylate intermediate (Fig 5a). The amino-acrylate
is a versatile electrophile that allows diverse nucleophilic substrates to be incorporated as
amino acid side chains to form new
L
-
α
-amino acids. Due to the range of acceptable
nucleophiles, these enzymes are capable of C–C bond formation as well as C–N and C–S
bond formation. These enzymes also act with perfect enantioselectivity, as the
stereochemistry at the
α
-carbon is retained through proton abstraction and donation on the
same face of the amino-acrylate by the active-site lysine.
25
5.2.1 TPL: Active-site remodeling of tyrosine-analogue synthases.—
Tyrosine
phenol lyase (TPL) catalyzes the degradation of
L
-tyrosine (Tyr) to phenol, pyruvate, and
ammonia through a
β
-elimination reaction (Figs 5b and 6). The reaction is readily reversible,
and the addition of excess ammonia and pyruvate shifts the equilibrium to favor Tyr
production. This occurs by promoting the formation of the electrophilic amino-acrylate
intermediate, which then reacts with phenol to form a C–C bond via an electrophilic
aromatic substitution mechanism.
42
Phenol is nucleophilic at positions
para
and
ortho
to the
electron-donating hydroxyl group, and the enzyme positions this substrate such that the C–C
bond is formed exclusively at the
para
position.
TPL has been used for industrial-scale Tyr production as well as for the preparation of
important ncAAs.
18
As early as the 1970s, TPL was found to synthesize the therapeutic Tyr
analogue
L
-DOPA directly from catechol, ammonia, and pyruvate (Fig 6, R
1
= OH, R
2
= H).
Fluorotyrosine, used to study redox-active tyrosine residues, is also synthesized by TPL via
addition of the corresponding fluorinated phenol (Fig 6).
43
Unfortunately, wild-type TPL
enzymes fail to produce Tyr analogues with substituents larger than fluorine; catechol seems
to be the exception.
TPL variants have been engineered to expand the nucleophiles accepted in the reaction. For
example, 3-methyltyrosine (3-MeTyr), an anticancer drug precursor, is synthesized through a
three-step, low-yielding, racemic chemical synthesis with many protecting and deprotecting
steps, followed by biocatalytic kinetic resolution.
24
To improve synthesis of 3-MeTyr,
Seisser and colleagues engineered TPL from the bacterium
Citrobacter freundii
(
Cf
TPL),
one of the most extensively studied TPL variants.
24
Using site-directed mutagenesis, the
authors targeted residues in the active site (F36, T125, M288, M379, and F448) that were
hypothesized to interact unfavorably with substituted phenols (Fig 6). To reduce the steric
restrictions, the hydrophobic residues were individually mutated to valine, while T125 was
mutated to serine, and these variants were then screened by HPLC. While many of the
mutations increased the production of 3-substituted phenols, the M379V variant was found
to produce 3-MeTyr, 3-methoxyTyr (another anticancer precursor), and 3-chloroTyr (an
atherosclerosis marker) with good yields (Fig 6, red).
Engineering efforts by the Wang group have expanded the capacity of
Cf
TPL to synthesize
designer ncAAs that can be incorporated into a protein and exhibit a specialized function. In
two studies they used site-saturation mutagenesis and a thin-layer chromatography (TLC)-
based screen to identify
Cf
TPL variants that could synthesize the desired ncAA. The first
study focused on making 3-(methylthio)-
L
-tyrosine (MtTyr; Fig 6, purple), which is a Tyr-
Cys cofactor mimic.
44
The Tyr-Cys cofactor is known to modulate enzyme kinetics and is
common in metalloenzymes. However, Tyr-Cys cross-linking can only occur when the
cysteine residue is positioned in a particular orientation relative to the tyrosine residue,
which is difficult or impossible to engineer in many proteins. The authors reasoned that
MtTyr might offer the properties of the Tyr-Cys cofactor in a single residue and that a TPL
variant could be used to synthesize it in the laboratory. Wild-type
Cf
TPL was selected as the
parent for engineering a MtTyr synthase due to its well-characterized ability to generate
other Tyr analogues.
44
However,
Cf
TPL had no activity toward the
o-
(methylthio)phenol
nucleophile that forms the side chain of MtTyr. To accommodate MtTyr, Wang and
colleagues individually targeted three active-site residues (F36, M228, and F448) for site-
saturation mutagenesis. The authors chose only 96
E. coli
clones from these libraries for
analysis using the TLC-based screen, which identified
Cf
TPL F36L as having improved
activity. This variant synthesized MtTyr with 40% yield at preparative scale, which could
then be purified and incorporated into proteins of interest using an evolved orthogonal
aminoacyl-tRNA synthetase and amber stop codon suppression technology (as described in ref. 10).
In a following study, the Wang group engineered
Cf
TPL to synthesize the Tyr analogue 2-
amino-3-(8-hydroxy-5-quinolinyl)-
L
-alanine (HqAla; Fig 6, purple).
12
HqAla contains the
bidentate metal chelator 8-hydroxyquinoline (8-HQ) as its side chain. 8-HQ is a common
organic ligand of metal complexes noted for its high quantum yield of fluorescence,
particularly when bound to zinc(II). Wild-type
Cf
TPL again had no activity with the desired
nucleophile (8-HQ), so the authors repeated site-saturation mutagenesis targeting active site
residues F36, M228, and F448. This study targeted all three sites simultaneously, generating
a library of 4,096 possible variants.
Cf
TPL variants were again analyzed by TLC, and 1,024
clones were screened to identify the double active-site mutant M228S/F448C that produced
HqAla with 40% yield. Again, the synthesis of the ncAA was sufficient for downstream
experiments and could be used to create proteins that exhibited zinc-dependent fluorescence,
highlighting the capacity of enzyme engineering to access new chemical and biological
space rapidly and effectively.
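The relationship between library size and the number of clones screened can be made concrete with a simple sampling estimate. The sketch below assumes that every variant in the library is equally likely to appear in any given clone, which real libraries only approximate because of codon and cloning bias. The function names are hypothetical; the library size (4,096) and clone count (1,024) are the figures quoted above.

```python
import math


def expected_fraction_sampled(library_size, clones_screened):
    """Expected fraction of distinct variants seen at least once.

    Assumes each screened clone is an independent, uniform draw from the library.
    """
    return 1.0 - (1.0 - 1.0 / library_size) ** clones_screened


def clones_for_coverage(library_size, target_fraction):
    """Clones needed so that the expected sampled fraction reaches the target."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    per_clone_miss = math.log(1.0 - 1.0 / library_size)
    return math.ceil(math.log(1.0 - target_fraction) / per_clone_miss)


LIBRARY_SIZE = 4096
CLONES_SCREENED = 1024

coverage = expected_fraction_sampled(LIBRARY_SIZE, CLONES_SCREENED)
print(f"Expected coverage: {100.0 * coverage:.1f}% of possible variants")
print("Clones for 95% coverage:", clones_for_coverage(LIBRARY_SIZE, 0.95))
```

Under these assumptions the screen samples only a minority of the possible variants, so finding an active double mutant suggests that functional solutions are not rare in this library.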
One mechanistic limitation of TPL is that the reaction is under thermodynamic control. The
forward and reverse reaction rates of the net reaction depend strongly on the concentrations
of products and reactants, and excess reactants are needed to drive product formation.
Ammonia lyases (discussed in Section 5.1) also have this limitation, as do
aminotransferases. The need for excess reagents is not a major issue when using TPL and
ammonia lyases for preparative-scale synthesis, since ammonia and pyruvate are inexpensive
and easy to exclude during purification. Nonetheless, it would be preferable to have the
reaction under kinetic control such that product formation is effectively irreversible,
improving atom economy and making
in vivo
applications more accessible. This type of
reaction is possible when using
Cf
TPL with specialized substrates. For example,
S
-(
o
-
nitrophenyl)-
L
-cysteine can undergo rapid
β
-elimination in the presence of
Cf
TPL, as the
nitrothiophenol side chain acts as a good leaving group (Fig 7).
42
This subsequently forms
the reactive amino-acrylate intermediate, which is attacked by phenol to produce Tyr.
Furthermore, because
Cf
TPL binds
S
-(
o
-nitrophenyl)-
L
-cysteine more tightly than Tyr, as
long as
S
-(
o
-nitrophenyl)-
L
-cysteine is present in the reaction it preferentially undergoes
β
-
elimination and inhibits Tyr degradation. This approach gives yields of ~70%.
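The effect of excess reagent on a reaction under thermodynamic control, as discussed at the start of this paragraph, can be sketched with a simplified equilibrium model. The code below treats the synthesis as a generic bimolecular reaction A + B ⇌ P with equilibrium constant K = [P]/([A][B]). This is a deliberate simplification: the real TPL reaction has three substrates, and the value of K and the concentrations used here are hypothetical.

```python
import math


def equilibrium_conversion(k_eq, limiting_0, excess_0):
    """Fraction of the limiting substrate converted at equilibrium for A + B <=> P.

    Solves K * (a0 - x) * (b0 - x) = x for x and returns x / a0. The smaller
    root of the quadratic is the physically meaningful one (x <= min(a0, b0)).
    """
    if k_eq <= 0 or limiting_0 <= 0 or excess_0 <= 0:
        raise ValueError("all inputs must be positive")
    s = k_eq * (limiting_0 + excess_0) + 1.0
    disc = s * s - 4.0 * k_eq * k_eq * limiting_0 * excess_0
    x = (s - math.sqrt(disc)) / (2.0 * k_eq)
    return x / limiting_0


K_EQ = 1.0  # hypothetical equilibrium constant (per molar)
A0 = 1.0    # hypothetical concentration of the limiting substrate (molar)

for equivalents in (1, 5, 10, 50):
    conv = equilibrium_conversion(K_EQ, A0, equivalents * A0)
    print(f"{equivalents:>2} equiv of excess reagent -> {100.0 * conv:.0f}% conversion")
```

In this model, conversion of the limiting substrate approaches completion only as the second reagent is supplied in large excess. A reaction under kinetic control avoids that requirement.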
5.2.2 TrpB: Evolution of stand-alone TrpB function from an allosteric TrpS
complex.—
Due to the reversible nature of enzymatic reactions under thermodynamic
control, the ncAA synthases discussed thus far have suffered from inherently low substrate
coupling efficiencies, with a high concentration of one or more substrates remaining upon
reaching equilibrium. Although there are ways to circumvent this by using specialized
substrates, an ideal biocatalyst would couple stoichiometric proportions of simple substrates
at high rates and with quantitative yields. Additionally, these biocatalysts could have
applications
in vivo
, as physiological concentrations of reactants could be sufficient to form
products. To accomplish this, the enzymatic reaction should be under kinetic control. This is
the case for tryptophan synthase (TrpS), which catalyzes the final steps of
L
-tryptophan (Trp)
biosynthesis.
TrpS is a heterodimeric complex composed of an
α
-subunit (TrpA) that allosterically regulates
the
β
-subunit (TrpB).
25
In the native reaction, indole-3-glycerol phosphate undergoes a
retro-aldol reaction in TrpA to release indole. This induces TrpB to catalyze the
β
-
elimination of
L
-serine (Ser), which generates the amino-acrylate intermediate (Fig 8a).
Indole then diffuses through a hydrophobic tunnel connecting the subunits and attacks the
amino-acrylate to form Trp (Fig 8b). The wild-type TrpS enzyme can perform this C–C
bond-forming reaction with an array of indole analogues
in vitro
, synthesizing substituted
Trp analogues in a single step.
25
Numerous Trp derivatives have been made using this
strategy. For example, the Goss group demonstrated that
Salmonella enterica
TrpS can use
7-chloroindole and Ser to form 7-chloroTrp, part of the antibiotic rebeccamycin (Scheme
11a).
45
This reaction occurs in a single step, whereas nature would require an additional Trp
halogenase to add the chloro substituent to Trp. Additionally, nonindole nucleophiles have
been used to form C–S and C–N bonds, demonstrating that TrpS can also be a platform for
the production of
L
-cysteine and
L
-
β
-aminoalanine ncAAs.
25
Although TrpS is an impressive biocatalyst, there are roadblocks for its application.
Expression of the TrpS complex is metabolically challenging for the host cell, and the need
for both the TrpA and TrpB subunits complicates expression and engineering. TrpB
performs the synthetically interesting
β
-substitution reaction between indole and Ser to
generate Trp, while TrpA generates indole
in situ
so that this toxic metabolite is not released
into the cytosol. If the indole analogues are added exogenously, then TrpA is superfluous,
but removing TrpA significantly decreases the activity of TrpB, due to the allosteric
interactions between the subunits of TrpS.
26
Buller and colleagues engineered a stand-alone TrpB ncAA synthase by directed evolution
of TrpB from the hyperthermophilic archaeon
Pyrococcus furiosus
(
Pf
TrpB; evolution
shown in Fig 9, red).
26
Because it was unknown whether directed evolution could recover
the activity lost by removal of TrpA and, if so, what mutations would be beneficial, Buller
and colleagues used random mutagenesis to evolve the stand-alone
Pf
TrpB. Variants were screened for Trp
formation by monitoring an increase in 290-nm absorption, caused by a slight red-shift in
the absorption of indole as it is converted to Trp. Impressively, nearly 4% of the 528 first-
generation variants of
Pf
TrpB screened displayed an increase in Trp formation. This is a
higher rate of beneficial mutations than is usually seen in a random mutagenesis experiment
and shows that there are many possible ways to reactivate TrpB. The greatest single
improvement came from a T292S mutation that restored the
k
cat
to that of wild-type
Pf
TrpS
(
Pf
TrpB
2G9
, which we will simplify to
Pf
2G9). An additional five mutations (P12L, E17G,
I68V, F274S, and T321A) resulted in a
Pf
TrpB variant whose
k
cat
exceeded that of the wild-
type TrpB threefold (
Pf
0B2). Interestingly, none of these six beneficial mutations were in the
active site, but rather were distributed throughout the TrpB structure. Further analysis
showed that these mutations recapitulate the action of TrpA and stabilize the enzyme’s
‘closed’ conformation, a state that promotes formation of the reactive amino-acrylate
intermediate.
32
After establishing a stand-alone
Pf
TrpB platform, Buller’s team engineered catalysts for
making
β
-methyltryptophan (
β
-MeTrp) analogues.
46
The ncAA
β
-MeTrp is a component of
biologically important molecules such as indolmycin and streptonigrin (Scheme 11b). The