Coupled Reaction Networks for Noise Suppression

Noise is intrinsic to many important regulatory processes in living cells, and often forms obstacles to be overcome for reliable biological functions. However, due to stochastic birth and death events of all components in biomolecular systems, suppression of noise of one component by another is fundamentally hard and costly. Quantitatively, a widely-cited severe lower bound on noise suppression in biomolecular systems was established by Lestas et. al. in 2010, assuming that the plant and the controller have separate birth and death reactions. This makes the precision observed in several biological phenomena, e.g., cell fate decision making and cell cycle time ordering, seem impossible. We demonstrate that coupling, a mechanism widely observed in biology, could suppress noise lower than the bound of Lestas et. al. with moderate energy cost. Furthermore, we systematically investigate the coupling mechanism in all two-node reaction networks, showing that negative feedback suppresses noise better than incoherent feedforward achitectures, coupled systems have less noise than their decoupled version for a large class of networks, and coupling has its own fundamental limitations in noise suppression. Results in this work have implications for noise suppression in biological control and provide insight for a new efficient mechanism of noise suppression in biology.


I. INTRODUCTION
Many important processes in living cells, such as gene expression, are intrinsically stochastic [1]. The effect of noise is further enhanced by several biological factors, such as the low copy number of many important molecules including DNA, and the fact that even controllers performing noise suppression consists of intrinsically stochastic molecular components themselves [1], [2], [3]. In particular, assuming that the controller and the plant dynamics are separate biochemical processes, Lestas et. al. [2] showed that there is a severe lower bound on the noise for the plant component of a chemical reaction network due to the intrinsic stochasticity of controller components in chemical reactions. Quantitatively, it states that the lower bound for noise is typically inversely proportional to the quartic root of the signaling rate, in contrast with a square root result that is familiar in electrical engineering and statistics. This bound implies that significant noise is inevitable, despite the regulatory mechanisms found in biology that could suppress noise [4].
Specifically, the bound implies that noise suppression in cells is usually prohibitively expensive, as reducing noise by 10 fold would require 10,000 fold increase in signaling rate, which corresponds to 10,000 fold faster production and degradation of molecules.
On the other hand, we observe almost deterministic precision in many biological processes, such as the synchronization of molecular oscillations in mammalian circadian clock [5], precise timing of gene expression dynamics in cell cycle [6] and the cell type differentiation through gene expression patterns during development [7]. This observation naturally urges one to seek biologically relevant low-cost noise suppression mechanisms that are beyond the scope of systems investigated in [2].
In this work we propose coupling as such a noise suppression mechanism that is omnipresent in biological processes. Coupling is used here to describe chemical reactions where the number of more than one chemical species are altered simultaneously, e.g., reactions converting one species into another one, or those producing one molecule each of two species simultaneously. Hence, coupling naturally goes beyond the assumption in [2] that the plant and the controller dynamics are separate biochemical processes, since the increase and/or decrease of a plant component and a controller component could be coupled in a reaction once we allow coupling.
Many natural and genetically engineered circuits in biology have a coupling interpretation. In bacteria, genes are commonly grouped into operons, so they are transcribed and regulated together [8]. Non-coding RNAs with regulatory functions such as microRNA and siRNA are commonly or designed to be in the same transcript with the mRNA they regulate [9], [10]. In metabolic networks, signaling cascades and allosteric regulation, one protein may switch from an inactivated state to an activated state by a reaction which may be regulated by the protein or its downstream products [8], [11]. Regulatory functions in synthetic biology are commonly implemented in binding proteins linked with functional domains, where the link is cut to transform the protein-domain complex into a protein and a domain separately, which could have different regulatory properties than the complex [12], [13].
The fact that specific coupling mechanisms, e.g. putting genes on the same operon, could influence the system's performance is commonly known. Coupling as a noise suppression mechanism is also rather intuitive from an informationtheoretic point of view. Indeed, earlier studies [10], [14], [15], [16] have shown a few specific examples of coupled reactions could have less noise than the decoupled versions. However, it is not known what are the general conditions for coupling to suppress noise, the fundamental limitations on noise suppression once we allow coupling (in the spirit of [2]), or how to use coupling when designing biochemical network architectures.
In this work we set off to provide some initial answers to these problems. After reviewing the notions of chemical reaction networks and methods to analyze them in Section II, we show how coupling could suppress noise through a simple example in Section III-A. We then demonstrate the power of coupling via a biologically plausible coupled system that suppresses noise below the lower bound of Lestas et. al. [2] in Section III-B. Following Lestas et. al. [2], we establish a similar bound for coupled reaction networks in Section III-C to better understand how it compares with decoupled systems and when is coupling most useful. Lastly, in Section III-D, we investigate all possible two-node chemical reaction network architectures to gain insight on the advantage of coupled networks versus decoupled ones, feedback versus feedforward architectures, limitations of coupling, and how to choose network architectures to maximize coupling's effect on noise suppression.

A. Chemical Reaction Networks
Here we provide a minimum background on stochastic chemical reaction networks (CRNs) and relevant analysis methods. For more details, see [3], [17], [18].
A chemical reaction network (CRN) is a collection of reactions of the following form: where x ∈ Z n ≥0 is the state vector describing the number of molecules of the n chemical species in the system, k = 1, ..., m indexes m reactions, and r k : R n → R ≥0 is the propensity function which describes the rate of reaction k. Technically, the system dynamics is described as a continuous-time Markov jump process with x ∈ Z n ≥0 as states of the system, and r k (x)∆t describes the probability that, if the system is in state x at time t, it will jump to state x + d k within time ∆t, for small ∆t. Note that even though the mass action law, where r k are always a special form of polynomial of x, is commonly assumed [19], we do not make that assumption here.
The stochastic description of the system dynamics is described in terms of the chemical master equation (CME), which describes the evolution of the probability distribution for molecular counts x [20]: where P (x, t) is the probability that the system has x number of molecules at time t.
For the mean, we have d dt , where the last equality was used as the definition for R + i and R − i . R + i , the birth flux of x i , is the summed rate of all reactions that increases x i , while R − i , the death flux, is the summed rate of all reactions that decreases x i . Note that, at steady state, we have R + i = R − i . It is worth noting that, in the limit where the reaction volume is large while the number of molecules per volume remains finite (made precise in [21]), the system dynamics can be described by a deterministic rate equation: wherex ∈ R n ≥0 is the continuous concentrations of chemical species, instead of discrete molecular counts x ∈ Z n ≥0 . Through similar calculations as in the equation for the mean, we see that, at steady state, we have the following equation for covariance: One major achievement of the theoretical investigations in [3] is the re-writing of the above equation using physically interpretable quantities. The re-written equation is as follows: Here τ i , the average life time of an x i molecule, and s i|j , the average step sizes or co-step sizes, are introduced. As these are key concepts utilized for this work, they are explained in detail below.
Step sizes. For i = j, we define the average step size of x i as s i|i ≡ k |d ik | ρ ik , where ρ ik = |d ik | r k k |d ik | r k , the probability that when x i changes, that change comes from reaction k. The notation s i|i signifies that this is average step size of x i conditioning on that x i changes. For example, if x 1 has only one birth reaction x 1 → x 1 + 1 and one death reaction x 1 → x 1 − 10, then s 1|1 = 1+10 2 , because birth and death fluxes are always equal at steady state.
For i = j, the co-step size is defined as s j|i ≡ k ρ ik |d jk | sgn {d ik d jk }. So s j|i is the average change of species x j conditioning on that x i changes simultaneously in these reactions. Note that this could be positive or negative. From the definition, we see that the co-step size s i|j captures whether birth and death of x i and x j are coupled in some reaction. For example, if the only reaction that have simultaneous changes to x 1 and x 2 is (x 1 , x 2 ) → (x 1 − 1, x 2 + 1), i.e. one x 1 becomes one x 2 , and x 2 have no other birth reactions, then s 1|2 = − 1 2 . It is negative because when x 1 decreases, x 2 increases. It is divided by 2 because this reaction accounts for all of x 2 's birth, therefore half of x 2 's changes.
Note that the co-step sizes and the birth and death fluxes . CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under The copyright holder for this preprint (which was not this version posted March 23, 2019. are related to each other: s j|i R i = s i|j R j .
Life times. For general stationary stochastic processes describing the increase and decrease of a quantity, e.g. molecule count, we can define the average life time as the average time that one molecule lasts before its degradation. Little's law in queueing theory then states that the average life time is equal to the ratio between the average number of such molecules and its average birth or death rate, regardless of how nonlinear the reaction rates are. This could be intuitively understood through the following non-rigorous calculation, where , the average life time or the time scale of x i .

B. Linear Noise Approximation
We see that the covariance equation (1) involves terms of the form Cov x i , R ± j , which would involve higher order moments if R ± j is nonlinear. The resulting system of moment equations is generally infinite dimensional and hard to solve. This naturally calls for moment closure methods to approximately solve this system [22]. One particularly simple and analytically tractable method is linear noise approximation (LNA). LNA could be derived by assuming that the mean x is close to a fixed point so that x = R + j ( x ), and by simply approximating birth and death rates by their first order Taylor expansion: ∆x . Define the logarithmic gains In matrix form, we have where Note that an internal relation needs to be utilized when solving this system of equations: which is derived from the relation s i|j R j = s j|i R i . Equation (2) is called the fluctuation dissipation theorem in the statistical physics community [23] and Lyapunov equation in the control community [24]. Note that the normalized covariance η is a solution to this Lyapunov equation shows that it is finite if and only if the linearized deterministic x is asymptotically stable, i.e. −M is Hurwitz. Hence there exists a general relationship between deterministic stability and stochastic variance [25].
As LNA is the main tool of analysis in this work, more comments on the effectiveness of LNA is in order. Although LNA was historically derived as a second order system size approximation of the chemical master equation, resulting in a Fokker-Planck equation with a stochastic differential equation interpretation and a Gaussian steady state distribution [26], this is not necessary. LNA, as well as higher order approximations, could be derived as Taylor expansions on the moment equations [27]. In fact, it could be shown that LNA could be interpreted as a linear propensity CRN approximation that preserves discreteness and non-negativity of the state variables (see Section V-B). So violating discreteness and non-negativity are not valid grounds for rejecting the usefulness of LNA. LNA fails just like how deterministic linearization fails: when nonlinearity significantly influences the state trajectory.

A. Coupling could reduce noise
Here we illustrate how coupling could reduce noise through one simple example with very explicit analysis.
Consider the following feedforward network, called a "sniffer" system in biological literature [28].
It consists of two components x 1 and x 2 , both activated by an external input w, while x 2 acts as an enzyme that catalyze the degradation of x 1 . The system (4) does not couple the birth events of x 1 and x 2 , so they are produced with rate w through separate reactions. The corresponding coupled version of system (4) keeps the last two death reactions the same while substituting the following reaction for the first two birth reactions: For both coupled and decoupled versions, we see that the birth rates of x 1 and where D c is for the coupled case and D d is for the decoupled case.
Solving equation (2) with D = D d for the decoupled . CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under The copyright holder for this preprint (which was not this version posted March 23, 2019. case, we have We see that the noise of x 1 in the decoupled case, η 11,d , can be decomposed into two parts: the first term 1 x1 that is the intrinsic noise of x 1 doing birth-death by itself, and the second term that is the noise of x 2 carried over to x 1 through its regulation on x 1 . In comparison, solving equation (2) with D = D c for the coupled case, with the additional constraint τ 2 x 1 = τ 1 x 2 which is a consequence of equation (3), we have the following: Note that η 11,c is the same as that of a simple birth-death process of x 1 by itself, with no interaction with x 2 at all. The noise for the decoupled case is larger than the coupled case by η 11,d − η 11,c = τ2 τ1+τ2 η 22 , x 2 's noise passed onto x 1 through their interaction. This indicates that the noise contributed to x 1 from the interaction with x 2 in the decoupled case is cancelled out here due to coupling.
This example shows that coupling could indeed reduce noise. In fact, similar circuits have been analyzed in [14], [10] that report specific RNA circuit designs with coupling could suppress gene expression noise.

B. Coupling suppresses noise below Lestas et. al. bound
We show here through a biologically plausible example that coupled systems could have noise below the bound described in [2]. Consider the following system: Here n molecules of x 2 can be converted to 1 molecule of x 1 , resulting in a negative coupled reaction. Besides this, x 1 has first order death reaction by itself, while x 2 is produced through a reaction that is suppressed by x 1 . When n = 1, the coupled reaction could be an enzyme switching from an inactive state x 2 into an active state x 1 , which in turn suppresses the production of this enzyme x 2 . Alternatively, it could be a binding protein in complex with a functional domain x 2 being digested into separate parts, where the free binding protein x 1 suppresses the production of this complex x 2 . When n > 1, the coupled reaction could be protein subunits x 2 forming a protein complex x 1 involving n such subunits (e.g. n = 2 corresponds to a dimer), while x 1 has one of its active functions the suppression of subunit x 2 's production. Therefore, this system is biologically plausible with potential straight-forward implementations through tools in synthetic biology.
To compare with the Lestas et. al. bound, we need to delve into the details of their work [2]. Lestas et. al. considered reaction networks of the following form: where x 1 is the plant species whose noise is to be controlled, while x 2 is the signaling controller species whose information on x 1 is the only source of information on x 1 we could use to suppress its noise. The signaling of x 2 is through its birth events, as the birth rate f is dependent on x 1 . For example, if f (x 1 ) = x 1 , so x 1 catalyzes the production of x 2 , then we would estimate x 1 to be large if we observe a high density of x 2 's birth events. Therefore, information about x 1 's abundance could be extracted through a trajectory of x 2 's birth events. Because x 2 's death rate g does not depend on x 1 , the death events does not give any information about x 1 . So we ignore death events and focus on x 2 's birth events. On the other hand, with the information about x 1 extracted through observations on x 2 , we try to suppress x 1 's noise by controlling its (possibly non-Markovian) production rate, u[I t (x 2 )], which could be an arbitrary non-anticipatory functional allowing dependence of all the information I t (x 2 ) of past x 2 . In control theory terms, x 1 is the plant, birth events of x 2 with signaling rate f is the sensor, and x 1 's birth rate u is the controller actuation. Lestas et. al. then proceeded to use sensor and actuator separation in order to first bound the channel capacity in terms of f and then bound the noise in terms of channel capacity. The bounds are the following: where C is an achievable upper bound for the channel capacity of x 2 's birth events assuming finite mean and variance of f , and τ 1 is the average life time of x 1 as defined in Section II, which is equal to 1/γ for system (7). If we can write the mean and variance of f in terms of those of x 1 (e.g. when f ∝ x 1 ), then these two bounds could be combined to produce a lower bound of x 1 in terms of only x 1 's mean, variance, and time scales. Importantly, when f is a linear function of x 1 , i.e., f = αx 1 , this bound is: where N 1 = x 1 and N 2 = α x 1 τ 1 are the effective signaling rates, defined as the number of birth events of x 1 and x 2 during an average life time of x 1 , i.e., τ 1 , respectively. This is the quartic bound for the coefficient of variation ( √ η 11 ) mentioned in Section I. Going back to our example (6), we see that the number of x 2 molecules consumed in the coupled reaction, n, is equal to the relative signaling rate N2 N1 as defined above. So we expect to see better noise suppression with increasing n. The birth rate satisfies f = k/x 1 , which is not a linear function of x 1 , so we need to use equation (8) instead of the linear bound (9). In simulation, we need to estimate mean and variance of f to calculate C, which in turn enables calculation for lower bound of η 11 in equation (8). It is important to note that we violated none of the assumptions about (7) in Lestas et. al.
. CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under The copyright holder for this preprint (which was not this version posted March 23, 2019. ; https://doi.org/10.1101/440453 doi: bioRxiv preprint  (8) [2]. The y axis is the variance divided by mean of x 1 , or its Fano factor. The x axis is the signaling rate N = N 2 /N 1 , taking integer values from 1 to 10. The Gillespie simulation results of system (6) are the black dots (coupled-sim), in good agreement with the theoretical results based on LNA shown as the light grey curve (coupled-theory). The theoretical bound of equation (8) with estimated capacity is shown as the light red curve (bound). The same simulation results for the decoupled version of system (10) are the blue dots (decoupled-sim). Parameters used for simulation: k = 10000,τ 1 = 100,τ 2 = 1.
Using LNA, we could analytically solve system (6) to have η 11 = 1 x1 n+1 4n in the fast controller dynamics limit, where τ 2 τ 1 (see Section III-D). The simulation result comparing the simulated noise of x 1 , its LNA analysis, and the Lestas et. al. bound in equation (8) is shown in Figure 1. We see that the noise is indeed well below the theoretical bound of Lestas et. al. for not too large n, which is the more biologically plausible parameter regime.
We also did simulation for the decoupled version of system (6), where the following two reactions substituted the coupled reaction: The simulation results for this decoupled version are the blue dots in Figure 1, showing that it is indeed above the Lestas et. al. bound, as the decoupled case satisfied all of the assumptions in their work [2]. It should also be noted that although this specific example (6) could achieve noise below the bound of [2], the scaling of noise in N in this system is actually worse than quartic root because it has a lower bound of 1 4 , as is shown by the black curve's convergence to 1 4 in Figure 1 as well as the LNA calculation. We will see in Section III-D that there exists in general a non-zero lower bound on the part of noise that can be suppressed through coupling. In other words, once the architecture of a system is determined, there is a lower bound on how much noise can be reduced by coupling. For a direct comparison, in the following section, we establish a lower bound on noise suppression when coupling reactions are allowed using information theoretic methods in [2].

C. Bound on noise suppression with coupling
To better understand how coupling changes the fundamental limit on noise suppression, here we establish a bound similar to (9) that allows for coupled reactions. For detailed derivation, see V-C.
Let us first understand how coupled reactions go beyond bound (9). Bound (9) assumes that the birth events of x 2 is a Poisson channel whose rate is x 2 's birth rates. However, when the birth or death events of x 1 and x 2 are coupled, the mutual information between the two increases, since a change of x 2 due to this coupled reaction always comes with a known change of x 1 at exactly the same time. Hence the key to establish a bound that includes coupling is in bounding the mutual information through this coupled birth-death.
Consider a coupled reaction in place of the decoupled x 2 → x 2 + 1 in equation (7): which could be regarded a simple conversion reaction where one x 1 is converted into one x 2 . We could split this channel into two Poisson channels, where one is x 2 → x 2 + 1 with rate w(x 1 ) just like the decoupled case, and one is x 1 → x 1 − 1 with rate that is a function of x 1 and x 2 , such that it becomes infinitely fast whenever it sees x 2 change so as to bring x 1 to match the change of x 2 , and once x 1 changed it becomes 0. In other words, the x 1 → x 1 − 1 reaction is to track the change of x 2 instantaneously.
With this consideration, we could bound the channel capacity of reaction (11) by w ln(1+1/ w )+Var {w} / w . The first term comes from using equation (8) to bound the channel capacity of the x 2 → x 2 +1 reaction with rate w(x 1 ). The second term comes from the infinitely fast reaction x 1 → x 1 − 1 mimicing the coupling. If we assume first order reactions w(x 1 ) = βx 1 , then we can derive a bound in a form similar to Lestas et. al. bound in equation (9): where Q = N 2 ln(1 + τ 1 /N 2 ) ≤ τ 1 , and N 2 = w(x 1 ) τ 1 = β x 1 τ 1 now is the signaling rate of the coupled reaction. Note that N2 N1 = βτ 1 here. Compared to the decoupled bound (9), equation (12) contains one more term Q ≥ 0 due to coupling. Note that this bound depends on the absolute value of N 2 in addition to the relative signaling rate N 2 /N 1 .
This means the effect of noise suppression depends on both N 1 = x 1 and τ 1 . To unravel this dependence, we first notice that if τ 1 is small, then Q ≈ τ 1 is small as well, so the bound (12) approaches the decoupled bound (9). Furthermore, since N2 N1 ∝ τ 1 , this means signaling rate is small in general, yielding η 11 ≥ 1 x1 , the Poisson noise lower bound. In other words, a small τ 1 indicates that noise suppression below Poisson is infeasible. This is also intuitively clear, as small τ 1 means x 1 or plant dynamics is so fast that our controller is simply too slow to be effective.
If τ 1 is not too small, then Q is of order N 2 . So whether coupling effect is significant depends on how N 2 compares with N2 N1 . If N 1 = x 1 is small, then relative signaling rate N2 N1 dominates while coupling has little influence, and the bound (12) approaches the decoupled bound (9) again, with minimum CV (i.e., √ η 11 ) bounded by quartic root of N2 N1 . If . CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under The copyright holder for this preprint (which was not this version posted March 23, 2019. ; https://doi.org/10.1101/440453 doi: bioRxiv preprint N 1 = x 1 is not too small, then N 2 is comparable with N2 N1 and coupling is advantageous. Indeed, in the other extreme where N2 N1 N 2 , we have η 11 ≥ 1 x1 1 1+Q , so the minimum CV is inversely proportional to the square root of N 2 , in contrast to η 11 ≥ 1 x1 implied by the original Lestas et. al. bound, saying that minimum CV cannot be sub-Poisson. The above observation is consistent with example (6) in Section III-B, where small n corresponds to a N 2 comparable with N2 N1 , and large n corresponds to a N 2 dominated by N2 N1 . In summary, from our lower bound (12) on noise suppression for coupled reactions, we see that coupling is most effective when τ 1 is not too small and when absolute signaling rate N 2 is at least comparable with relative signaling rate N2 N1 . This suggests that coupling is most useful for energy efficient noise suppression, where N2 N1 is not too large, so utilizing N 2 through coupling to reduce noise further could be rather significant.

D. Coupling in two-component CRN
Here, in order to discover design principles for noise suppression with coupling, we study the achievable lower bound for noise in two-node chemical reaction networks using LNA, explicitly allowing coupling among reactions.
We start from the following formula for noise for stable two-node chemical reaction networks using LNA (for derivation see Section V-B): Here , with N 1 ,N 2 being the signaling rates as in the Lestas et. al. bound in equation (9), k = H12 H22 sgn s 1|2 , and K = − H12H21 H11H22 . k is the controller actuation gain, as H 12 is the logarithmic gain of x 2 's influence on x 1 's birth and death rates. The sign of s 1|2 enters k so that the sign of k is meaningful as well: k < 0 always implies amplification of noise, while k > 0 may result in suppression of noise. K is the closed loop gain, with negative sign in front to again make positive K correspond to noise suppression and negative K, noise amplification.
The two terms in the minimization in equation (13) are achievable (see Section V-A). η 11 achieves the first term if the controller time scale is much larger than the plant time scale with τ 1 τ 2 , while the second term is achieved when the controller time scale is much faster than the plant time scale with τ 1 τ 2 . It is reasonable that η 11 becomes H11 x1 when the controller x 2 's dynamics is very slow, as x 2 is essentially constant in x 1 's time scale, so x 1 is doing birth-death by itself with constant x 2 . Therefore, we can regard η 11,nc , the first term, as a baseline that is easily achieved when there is no control at all. Note that the no control case is not the same as open loop, as open loop could have controller actuation on the plant with no sensing.
The more interesting term is the second one, which could become less than the first no-control term in certain architec-tures. For the second term to be small, we need k > 0 and K > 0. K > 0 corresponds to negative feedback, with either x 1 activating x 2 and x 2 repressing x 1 , or x 1 repressing x 2 and x 2 activating x 1 . k > 0 corresponds to an actuation that is compatible with the coupling reaction. If we have positive coupling, i.e. s 1|2 > 0, then k > 0 implies H 12 > 0 or that the controller x 2 represses x 1 , as in example (4) of Section III-A. If we have negative coupling, i.e. s 1|2 < 0, then k > 0 implies H 12 < 0 or that the controller activates x 1 , as in example (6) in Section III-B. More detailed implications are discussed below.
1) Feedback vs. feedforward: Both incoherent feedforward and negative feedback architectures are widely found in natural biological regulatory systems with homeostasis [29]. However, our result shows that feedforward architectures always have larger noise compared to negative feedback architectures with the same plant dynamics, controller actuation, and coupling. Feedforward architectures allow controller actuation but no sensing, so K = 0. If K > 0, which is the case for negative feedback, then the lower bound for noise is always lower than the feedforward case by a factor of 1 1+K . Note that noise could be amplified by positive feedback with K < 0.
This is in strong contrast with the result for deterministic robust perfect adaptation, relevant for suppression of extrinsic noise (i.e. noise of parameters). Robust perfect adaptation is the property that a deterministic system could reach a homeostasis at steady state despite constant disturbances and uncertain dynamics [30], which is biologists' term for robust constant disturbance rejection. Both feedforward and feedback architectures could implement robust perfect adaptation equally well, with only differences in the physical controller implementation considerations [31]. Once noise is considered, we see that negative feedback could achieve smaller noise than feedforward, even when the underlying deterministic system dynamics is the same.
2) Coupled vs. decoupled.: The coupled case always have lower noise than the decoupled case if k > 0. If we assume our system does not have coupling, then s 1|2 = 0, so the term −2k s 1|2 , which is negative when k > 0, becomes zero for the decoupled case.
We also see that larger N or higher signaling rate always corresponds to less noise in the decoupled case, which agrees with what is implied by the Lestas et. al. bound. The lower bound for this case is η 11 ≥ η 11,f b ≡ s 1|1 H11(1+K) x1 , achieved in the large signaling (N → ∞) and fast controller (τ 2 τ 1 ) limit. Therefore we can regard η 11,f b as the lowest noise achievable with only feedback regulations and no coupling. Note that η 11,f b = 1 1+K η 11,nc , so we decrease noise by a factor of 1 1+K compared to no control case when we have negative feedback.
In comparison, the coupled case have a different interpretation of the relative signaling rate N , as N = s 2|1 s 1|2 if s 1|2 = 0, a consequence of equation (3). In this case, the relative signaling rate and the coupling strength are strongly related to each other. In particular, if the coupling reaction is known, then the optimal signaling rate is no longer infinite. This is shown explicitly in the following subsection.
3) Limit on noise suppression by coupling: We show that the relative signaling rate N is not larger the better for coupled systems and that the suppression of noise through one coupled reaction is limited to a factor of 1 2 beyond the feedback lower bound η 11,f b .
Consider the case where we have only one coupling reaction of the form {x 1 , are the molecule counts of x 1 and x 2 changed by this reaction, and σ 1 , σ 2 ∈ {1, −1} are signs of the change. We assume this coupled reaction has α 1 portion of the birth or death flux of x 1 , and α 2 portion of the flux of x 2 , where α 1 , α 2 ∈ (0, 1]. We also assume that all reactions other than the coupled one have step size 1. In terms of these parameters, we have s 1|1 = α1b1+(1−α1)+1 . Substituting these into equation (13) yields lower bounds for noise in terms of α i and b i . Notice that once the coupling reaction is known or b 1 , b 2 are determined, varying the relative signaling rate N is the same as varying α 1 and α 2 , the fractions of fluxes that the coupled reaction take.
The partial derivative of the second term of the lower bound in equation (13) with respect to b 1 is always positive, so b 1 should be as small as possible, while the partial derivative with respect to b 2 is always negative, so b 2 should be as large as possible. This makes sense, as larger b2 b1 means that x 1 's change is amplified in x 2 's change, so the signal is less corruptible by internal noise of x 2 .
If we take b 1 = 1, then the partial derivative with respect to α 1 is always negative, so α 1 = 1 is optimal. This is reasonable, as α 1 = 1 means the coupling reaction accounts for as large a fraction of change as possible of x 1 , so the knowledge of x 1 's change obtained from x 2 through the coupling reaction is more informative.
The partial derivative with respect to α 2 , however, is usually not as simple. The optimal α 2 is This shows that the optimal fraction of flux in x 2 of the coupled reaction is related to the controller actuation gain as well as the stoichiometry of the coupled reaction. In particular, this shows that the optimal relative signaling rate N is determined by parameters of the coupled reaction, such as b 1 , b 2 and k. For example, taking α 1 = 1 and large b 2 , we have 1 k as the optimal α 2 , and the optimal effective signaling rate N is k b1 b2 . It should be noted that the relative signaling rate is not larger the better when the coupled reaction is fixed does not contradict that larger relative signaling rate should always reduce noise as indicated in equation (13). Indeed, larger effective signaling rate N always corresponds to smaller lower bound for noise if we do not constrain b 1 and b 2 . The optimal b 2 is infinity, which corresponds to infinite N . However, once b 1 and b 2 are fixed, then N is no longer free to vary by itself. For example, if we fix parameters at their optimal values so that b 1 = 1, α 1 = 1 and b 2 is large, then N = 1 α2b2 , which could only become larger by decreasing α 2 . However, smaller α 2 would also mean that the coupling reaction accounts for a smaller fraction of x 2 's change, so the signal about x 1 becomes more easily obfuscated by noise of x 2 's other reactions not coupled with x 1 . In other words, since smaller α 2 corresponds to more signal amplification through x 2 's own birth-death reactions that are not coupled with x 1 , there is a tradeoff between noisy amplification of signals through the decoupled reactions and the preservation of accurate signals through the coupled reactions.
Another implication of the above observation is that with coupled reactions, a moderate signaling rate could result in significant noise suppression, as the optimal N could be small. This suggests that noise suppression with coupling could be rather cost-effective, therefore preferable for biological systems. Now, taking the optimal values of b 1 = 1, α 1 = 1, b 2 → ∞, and α 2 = 1 k results in η 11 = 1 2 η 11,f b in the fast controller limit (τ 2 τ 1 ), so the optimal lower bound with one coupling reaction while other reactions have step size 1 is half of the optimal lower bound of negative feedback. This shows that coupling cannot make noise arbitrarily small beyond the decoupled case.

IV. DISCUSSION
In this work we investigated the coupling mechanisms that is ubiquitous in biology for noise suppression. We constructed examples that can suppress noise below the Lestas et. al. bound [2] and, by systematically analyzing all two-node chemical reaction networks, showed that coupling has implications on effective network architectures and that coupling has its own limitations in noise suppression.
Our example constructed in Section III-B suppresses noise below the Lestas et. al. bound, but broke one of their assumptions that require the plant and the controller to have separate birth-death reactions. It may therefore seem natural that we could go below their bound. However, it should be noted that not only is the Lestas et. al. bound hypothesized to be true in much more general situations [2], previous efforts in going beyond this bound either employed complex nonbiological controllers [32] or could only show a theoretically better lower bound with unknown achievability [33]. Hence, finding an example or a class of systems that suppresses noise beyond the Lestas et. al. bound is highly nontrivial with significant implications. In particular, the example in Section III-B shows that biologically plausible reaction systems could indeed suppress noise below the Lestas et. al. bound, and situations beyond the bound should be seriously explored and considered for regulation and information processing in biological systems.
With that said, the results in this work does not try to contradict the results in [2], but rather follows its pioneering efforts in exploring fundamental limits of noise suppression in biological systems. We showed that noise suppression with coupling could further suppress noise, but still has its own fundamental limitations. This is also connected with the work of [3], which does not assume LNA and derives tight exact lower bounds for noise, but the lower bounds are harder to analyze for different cases. It is therefore of high interest to derive exact lower bounds in the spirit of [3] that includes coupling. Furthermore, theoretical investigations that relate LNA analysis with the methods of [3] would be highly beneficial, as the LNA method is scalable and analytically tractable, hence useful for controller design and architecture exploration.
This work focused on intrinsic noise, while noise in biology could be intrinsic as well as extrinsic. Intrinsic noise arises from stochastic nature of the system itself, while extrinsic noise comes from fluctuations in system parameters [34]. Extrinsic noise could be treated rather well by ignoring intrinsic noise [34], so perfect extrinsic noise suppression could be considered as perfect adaptation in the deterministic system. Robust perfect adaptation and its constraints on system architecture for biomolecular systems is studied in [31]. It is therefore desirable to connect intrinsic noise suppression together with robust perfect adaptation concerns. After all, noise suppression is not meaningful if the fixed point is not desirable.
While coupling considered in this work is of a binary kind, where two components are either coupled or decoupled, several biological phenomena observed in nature also have a "soft" coupling interpretation. For example, genes that are spatially close to each other due to their position on the genome or its 3D structure are more likely to be transcribed together in a short period of time [8]. Indeed, the positioning of genes are evolutionarily selected and the 3D structure of the genome is highly regulated [12]. This urges the study of a generalization of the coupling notion here to include these cases. It is also worth investigating how these general "soft" coupling notions relate to the cases investigated in Section III-D where coupling is only a fraction of the birth-death fluxes.
Lastly, coupling is a phenomenon rather specific to biomolecular control. Although covariance is considered in canonical stochastic control theory, coupling is a structually and physically different way of influencing variables' covariance than regulations through birth and death rates. This echoes the results in [31], where physical constraints specific to biomolecular reaction systems is shown to make integral feedback implementation a nontrivial design problem in biology. With these observations, it is highly desirable to develop theoretical tools custom-fit for biological systems so as to gain insight for biomolecular control problems.
By the causality nature of such channel, the Liptser-Shiryaev formula is actually the formula for directed information: I T (θ → n) (see [36] for a complete introduction of directed information). Here we show that if we have one another parallel channel: where β t (t, θ, n) is again F t -measurable, then the mutual information (or more precisely, the directed information) from θ t to n t is the sum of mutual (directed) information of each channel: Equation (18) can be bounded by using Lagrangian multiplier methods, assuming the first and second moments of the propensities are finite, therefore we obtain an upper bound of the mutual information of two parallel Poisson channels (equation (14) and (17) Now let's turn to the coupled reaction, using equation (11) as an example. Equation (11) is equivalent to two Poisson channels:  (20) where λ switches between 0 and a large constants K. It switches from 0 to K when x 2 increases by 1, then switches back to 0 when x 1 also increases by 1. Hence the overall effect of the first reaction to tracking the dynamics of the second reaction, when the second reaction occurs, the first immediately occurs, when setting K → ∞. Then the channel capacity of this coupled reaction can be calculated by summing up the channel capacity of the two channels (equation (20)). The second one, as the same as a usual Poisson channel, has a channel capacity C 2 = Var {w(x 1 } / w(x 1 ) . For the first reaction, since λ basically tracks the number of occurrence of the second one, we instead can use a classic conclusion from large deviation theory, which claims that Var {λ} is equal to its mean [37]. On the other hand, in the limit of long time, the empirical mean of number of events is equal to w(x 1 ) , then from ergodic theorem we conclude that λ = w(x 1 ) . Therefore the channel capacity of the coupled reaction is then: Furthermore, coupling does not alter the lower bound on mutual information, which is derived from Pinsker's nonanticipatory ε-entropy [33], [38]. So the lower bound on noise suppression is only due to an increase of channel capacity.
. CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under The copyright holder for this preprint (which was not this version posted March 23, 2019. ; https://doi.org/10.1101/440453 doi: bioRxiv preprint