Reconciling kinetic and thermodynamic models of bacterial transcription - journal.pcbi.1008572.pdf

RESEARCH ARTICLE

Reconciling kinetic and thermodynamic models of bacterial transcription

Muir Morrison (1), Manuel Razo-Mejia (2), Rob Phillips (1,2)*

1 Department of Physics, California Institute of Technology, Pasadena, California, USA, 2 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA

* phillips@pboc.caltech.edu

Abstract

The study of transcription remains one of the centerpieces of modern biology with implications in settings from development to metabolism to evolution to disease. Precision measurements using a host of different techniques including fluorescence and sequencing readouts have raised the bar for what it means to quantitatively understand transcriptional regulation. In particular, our understanding of the simplest genetic circuit is sufficiently refined both experimentally and theoretically that it has become possible to carefully discriminate between different conceptual pictures of how this regulatory system works. This regulatory motif, originally posited by Jacob and Monod in the 1960s, consists of a single transcriptional repressor binding to a promoter site and inhibiting transcription. In this paper, we show how seven distinct models of this so-called simple-repression motif, based both on thermodynamic and kinetic thinking, can be used to derive the predicted levels of gene expression and shed light on the often surprising past success of the thermodynamic models. These different models are then invoked to confront a variety of different data on mean, variance and full gene expression distributions, illustrating the extent to which such models can and cannot be distinguished, and suggesting a two-state model with a distribution of burst sizes as the most potent of the seven for describing the simple-repression motif.

Author summary

With the advent of new technologies allowing us to query biological activity with ever increasing precision, the deluge of quantitative biological data demands quantitative models. Transcriptional regulation, a feature that lies at the core of our understanding of cellular control in myriad contexts ranging from development to disease, is no exception, with single-cell and single-molecule techniques being routinely deployed to study cellular decision making. These data have served as a fertile proving ground to test models of transcription that mainly come in two flavors: thermodynamic models (based on equilibrium statistical mechanics) and kinetic models (based on chemical kinetics). In this paper we study the correspondence between these theoretical frameworks in the context of the simple repression motif, a common regulatory architecture in prokaryotes in which a repressor with a single binding site regulates expression. We explore the consequences of different levels of coarse-graining of the molecular steps involved in transcription, finding that, at the level of mean gene expression, the different models cannot be distinguished. We then study higher moments of the gene expression distribution, which allows us to discard several of the models that disagree with experimental data and supports a minimal kinetic model.

PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1008572 | January 19, 2021

OPEN ACCESS

Citation: Morrison M, Razo-Mejia M, Phillips R (2021) Reconciling kinetic and thermodynamic models of bacterial transcription. PLoS Comput Biol 17(1): e1008572. https://doi.org/10.1371/journal.pcbi.1008572

Editor: James R. Faeder, University of Pittsburgh, UNITED STATES

Received: June 14, 2020

Accepted: November 28, 2020

Published: January 19, 2021

Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pcbi.1008572

Copyright: © 2021 Morrison et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All data and custom scripts were collected and stored using Git version control. Code for Bayesian inference and figure generation is available on the GitHub repository (https://github.com/RPGroup-PBoC/bursty_transcription).

Introduction

Gene expression presides over much of the most important dynamism of living organisms. The level of expression of batteries of different genes is altered as a result of spatiotemporal cues that integrate chemical, mechanical and other types of signals. As our ability to experimentally observe and measure the dynamical processes that constitute the central dogma improves, there is an opportunity to undertake a theory-experiment dialogue in order to sharpen our understanding of such a fundamental biological process. One of the remaining outstanding challenges to have emerged in the genomic era is our continued inability to predict the regulatory consequences of different regulatory architectures, i.e. the arrangement and affinity of binding sites for transcription factors and RNA polymerases on the DNA. This challenge stems first and foremost from our ignorance about what those architectures even are, with more than 60% of the genes even in an ostensibly well understood organism such as E. coli having no regulatory insights at all [1–4]. But even once we have established the identity of key transcription factors and their binding sites for a given promoter architecture, there remains the predictive challenge of understanding its input-output properties, an objective that can be met by a myriad of approaches using the tools of statistical physics [5–24]. One route to such predictive understanding is to focus on the simplest regulatory architecture and to push the theory-experiment dialogue to increase the predictive power of our theoretical models [25, 26]. If we demonstrate that we can pass that test by successfully predicting both the means and variance in gene expression at the mRNA level, then that provides a more solid foundation upon which to launch into more complex problems, for instance some of the previously unknown architectures uncovered in [2] and [27].

To that end, in this paper we examine a wide variety of distinct models for the simple repression regulatory architecture. This genetic architecture consists of a DNA promoter regulated by a transcriptional repressor that binds to a single binding site as developed in pioneering early work on the quantitative dissection of transcription [28, 29]. One of the main features of the models we explore is that, by construction, all features related to the microstates in which the repressor is bound to the promoter can be separated from the microstates in which the RNA polymerase (RNAP) is bound. From a modeling perspective, this means that some of the models can be written as effective two-state models for which there is a rich literature [17, 21, 24, 30–36]. Here, we systematically compare the predictions of several models with different levels of coarse graining written in terms of thermodynamic and kinetic parameters.

One goal in exploring such coarse-grainings is to build towards the future models of regulatory response that will be able to serve the powerful predictive role needed to take synthetic biology from a brilliant exercise in enlightened empiricism to a rational design framework as in any other branch of engineering. More precisely, we want phenomenology in the sense of coarse-graining away atomistic detail, but still retaining biophysical meaning. In particular, a key question is: at this level of coarse-graining, what microscopic details do we need to explicitly model, and how do we figure that out? For example, do we need to worry about all or even any of the steps that individual RNA polymerases go through each time they make a transcript? Turning the question around, can we see any imprint of those processes in the available data? If the answer is no, then those processes are irrelevant for our purposes. Forward modeling and inverse (statistical inferential) modeling are necessary to tackle such questions. We combine both approaches in order to discard models that cannot empirically satisfy the main features of experimental data. First we apply forward modeling to demonstrate that none of the models are distinguishable at the level of mean gene expression. We then extend the modeling to look at higher moments of the distribution, eliminating models that do not empirically satisfy the observed cell-to-cell variability. Finally we arrive at a minimal model on which we can apply inverse modeling in order to infer the parameters that explain the data.

Funding: This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1745301 (to M.J.M.). This work was also supported by La Fondation Pierre-Gilles de Gennes, the Rosen Center at Caltech, and the NIH 5R35GM118043-05 (MIRA) to R.P. M.R.M. was supported by the Caldwell CEMI fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Fig 1A shows the qualitative picture of simple repression that is implicit in the repressor-operator model. An operator, i.e., the binding site on the DNA for a repressor protein, may be found occupied by a repressor, in which case transcription is blocked from occurring. Alternatively, that binding site may be found unoccupied, in which case RNA polymerase (RNAP) may bind and transcription can proceed. The key assumption we make in this simplest incarnation of the repressor-operator model is that binding of repressor and RNAP in the promoter region of interest is exclusive, meaning that one or the other may bind, but never may both be simultaneously bound. It is often imagined that when the repressor is bound to its operator, RNAP is sterically blocked from binding to its promoter sequence. Current evidence suggests this is sometimes, but not always the case, and it remains an interesting open question precisely how a repressor bound far upstream is able to repress transcription [1]. Suggestions include “action-at-a-distance” mediated by kinks in the DNA, formed when the repressor is bound, that prevent RNAP binding. Nevertheless, our modeling in this work is sufficiently coarse-grained that we simply assume exclusive binding and leave explicit accounting of these details out of the problem.

The logic of the remainder of the paper is as follows. In Section 1, we show how both thermodynamic models (Fig 1B) and kinetic models based upon the chemical master equation (Fig 1C) all culminate in the same underlying functional form for the fold-change in the average level of gene expression, with an effective free energy $\Delta F_R$ capturing the regulation given by the transcription factor, and a term $\rho$ describing the level of coarse-graining of the transcriptional events as shown in Fig 1D. Section 2 goes beyond an analysis of the mean gene expression by asking how the same models presented in Fig 1C can be used to explore noise in gene expression. To make contact with experiment, all of these models must make a commitment to some numerical values for the key parameters found in each such model. Therefore in Section 3 we explore the use of Bayesian inference to establish these parameters and to rigorously answer the question of how to discriminate between the different models.

Materials and methods

All data and custom scripts were collected and stored using Git version control. Code for Bayesian inference and figure generation is available on the GitHub repository (https://github.com/RPGroup-PBoC/bursty_transcription).

Results

1 Mean gene expression

Fig 1. An overview of the simple repression motif at the level of means. (A) Schematic of the qualitative biological picture of the simple repression genetic architecture. (B) and (C) A variety of possible mathematicized cartoons of simple repression, along with the effective parameter $\rho$ which subsumes all regulatory details of the architecture that do not directly involve the repressor. (B) Simple repression models from a thermodynamic perspective. (C) Equivalent models cast in chemical kinetics language. (D) The “master curve” to which all cartoons in (B) and (C) collapse. https://doi.org/10.1371/journal.pcbi.1008572.g001

As noted in the previous section, there are two broad classes of models in play for computing the input-output functions of regulatory architectures as shown in Fig 1. In both classes of model, the promoter is imagined to exist in a discrete set of states of occupancy, with each such state of occupancy accorded its own rate of transcription, including no transcription for many of these states. This discretization of a potentially continuous number of promoter states (due to effects such as supercoiling of DNA [37, 38] or DNA looping [39]) is analogous to how the Monod-Wyman-Changeux model of allostery coarse-grains continuous molecule conformations into a finite number of states [40]. The models are probabilistic with each state assigned some probability and the overall rate of transcription given by

$$ \text{average rate of transcription} = \sum_i r_i p_i, \tag{1} $$

where $i$ labels the distinct states, $p_i$ is the probability of the $i$th state, and $r_i$ is the rate of transcription of that state. Ultimately, the different models differ along several key aspects: what states to consider and how to compute the probabilities of those states.
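As a minimal numerical sketch of Eq 1, the snippet below weights each promoter state's transcription rate by its occupancy probability. The three states, their probabilities, and the rate of 0.3 transcripts per second are illustrative assumptions chosen for the example, not values from this work.

```python
def mean_transcription_rate(rates, probs):
    """Average transcription rate: sum over states i of r_i * p_i (Eq 1)."""
    if len(rates) != len(probs):
        raise ValueError("need exactly one rate per promoter state")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(r * p for r, p in zip(rates, probs))


# Hypothetical three-state promoter: repressor-bound, empty, RNAP-bound.
# Only the RNAP-bound state transcribes (illustrative rate, in 1/s).
rates = [0.0, 0.0, 0.3]
probs = [0.7, 0.2, 0.1]
avg_rate = mean_transcription_rate(rates, probs)
print(f"average transcription rate = {avg_rate:.3f} per second")
```

Only the states with a nonzero rate contribute, which is why models that differ in their silent states can still agree on the mean.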

The first class of models, which is the subject of the present section, focuses on predicting the mean level of gene expression. These models, sometimes known as thermodynamic models, invoke the tools of equilibrium statistical mechanics to compute the probabilities of the promoter microstates [5–11, 13–15]. As seen in Fig 1B, even within the class of thermodynamic models, we can make different commitments about the underlying microscopic states of the promoter. Model 1 considers only two states: a state in which a repressor (with copy number $R$) binds to an operator and a transcriptionally active state. The free energy difference between the repressor binding the operator, i.e. a specific binding site, and one of the $N_{NS}$ non-specific sites is given by $\Delta\varepsilon_R$ (given in $k_B T$ units with $\beta \equiv (k_B T)^{-1}$). Model 2 expands this model to include an empty promoter where no transcription occurs, as well as a state in which one of the $P$ RNAPs binds to the promoter with binding energy $\Delta\varepsilon_P$. Indeed, the list of options considered here does not at all exhaust the suite of different microscopic states we can assign to the promoter. The essence of thermodynamic models is to assign a discrete set of states and to use equilibrium statistical mechanics to compute the probabilities of occupancy of those states.

The second class of models that allow us to access the mean gene expression use chemical master equations to compute the probabilities of the different microscopic states [16–23]. The main differences between the two modeling approaches can be summarized as: 1) Although for both classes of models the steps involving transcriptional events are assumed to be strictly irreversible, thermodynamic models force the regulation, i.e., the control over the expression exerted by the repressor, to be in equilibrium. This does not need to be the case for kinetic models. 2) Thermodynamic models exclude the mRNA count from the state of the Markov process, while kinetic models keep track of both the promoter state and the mRNA count. 3) Finally, thermodynamic and kinetic models coarse-grain to different degrees the molecular mechanisms through which RNAP enters the transcriptional event. As seen in Fig 1C, we consider a host of different kinetic models, each of which will have its own result for both the mean (this section) and noise (next section) in gene expression.

1.1 Fold-changes are indistinguishable across models. As a first stop on our search for the “right” model of simple repression, let us consider what we can learn from theory and experimental measurements on the average level of gene expression in a population of cells. One experimental strategy that has been particularly useful (if incomplete since it misses out on gene expression dynamics) is to measure the fold-change in mean expression [25]. The fold-change FC is defined as

$$ FC(R) = \frac{\langle \text{gene expression with } R > 0 \rangle}{\langle \text{gene expression with } R = 0 \rangle} = \frac{\langle m(R) \rangle}{\langle m(0) \rangle} = \frac{\langle p(R) \rangle}{\langle p(0) \rangle}, \tag{2} $$

where angle brackets $\langle \cdot \rangle$ denote the average over a population of cells and mean mRNA $\langle m \rangle$ and mean protein $\langle p \rangle$ are viewed as a function of repressor copy number $R$. What this means is that the fold-change in gene expression is a relative measurement of the effect of the transcriptional repressor ($R > 0$) on the gene expression level compared to an unregulated promoter ($R = 0$). The third equality in Eq 2 follows from assuming that the translation efficiency, i.e., the number of proteins translated per mRNA, is the same in both conditions. In other words, we assume that mean protein level is proportional to mean mRNA level, and that the proportionality constant is the same in both conditions and therefore cancels out in the ratio. This is reasonable since the cells in the two conditions are identical except for the presence of the transcription factor, and the model assumes that the transcription factor has no direct effect on translation.
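The cancellation behind the third equality of Eq 2 can be written out in one line. Here $b$ is a symbol introduced only for this illustration; it denotes the assumed condition-independent proportionality constant between mean protein and mean mRNA, so that $\langle p \rangle = b \langle m \rangle$ both with and without repressor:

```latex
\frac{\langle p(R) \rangle}{\langle p(0) \rangle}
  = \frac{b \, \langle m(R) \rangle}{b \, \langle m(0) \rangle}
  = \frac{\langle m(R) \rangle}{\langle m(0) \rangle}.
```

If the repressor changed $b$ (for example by affecting translation), the two ratios would no longer agree, which is why the assumption is stated explicitly.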

Fold-change has proven a very convenient observable in past work [41–44]. Part of its utility in dissecting transcriptional regulation is its ratiometric nature, which removes many secondary effects that are present when making an absolute gene expression measurement. Also, by measuring otherwise identical cells with and without a transcription factor present, any biological noise common to both conditions can be made to cancel out. Fig 1B and 1C depict a smorgasbord of mathematicized cartoons for simple repression using both thermodynamic and kinetic models, respectively, that have appeared in previous literature. For each cartoon, we calculate the fold-change in mean gene expression as predicted by that model, deferring most algebraic details to the S1 Supporting Information. What we will find is that for all cartoons the fold-change can be written as a Fermi function of the form

$$ FC(R) = \left(1 + \exp(-\Delta F_R(R) + \log(\rho))\right)^{-1}, \tag{3} $$

where the effective free energy contains two terms: the parameter $\Delta F_R$, an effective free energy parametrizing the repressor-DNA interaction, and $\rho$, a term derived from the level of coarse-graining used to model all repressor-free states. In other words, the effective free energy of the Fermi function can be written as the additive effect of the regulation given by the repressor via $\Delta F_R$, and the kinetic scheme used to describe the steps that lead to a transcriptional event via $\log(\rho)$ (see Fig 1D, left panel). This implies all models collapse to a single master curve as shown in Fig 1D. We will offer some intuition for why this master curve exists and discuss why, at the level of the mean expression, we are unable to discriminate “right” from “wrong” cartoons given only measurements of fold-changes in expression.
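The collapse onto a single curve can be illustrated numerically. The sketch below assumes the Fermi function of Eq 3 with the exponent read as $-\Delta F_R + \log(\rho)$, which agrees with Eq 4 when $\rho = 1$; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def fold_change_master(delta_F_R, rho=1.0):
    """Fermi-function fold-change, assuming the exponent -delta_F_R + log(rho)."""
    return 1.0 / (1.0 + math.exp(-delta_F_R + math.log(rho)))


# Two hypothetical "models" that differ in rho but share the same effective
# free energy delta_F_R - log(rho) land on the same point of the curve.
fc_a = fold_change_master(delta_F_R=2.0, rho=1.0)
fc_b = fold_change_master(delta_F_R=2.0 + math.log(0.5), rho=0.5)
print(fc_a, fc_b)
```

Because only the combination of $\Delta F_R$ and $\log(\rho)$ enters, a fold-change measurement alone cannot separate the two contributions.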

1.1.1 Two- and three-state thermodynamic models

We begin our analysis with models 1 and 2 in Fig 1B. In each of these models the promoter is idealized as existing in a set of discrete states; the difference is whether the RNAP-bound state is included. Gene expression is then assumed to be proportional to the probability of the promoter being in either the empty state (model 1) or the RNAP-bound state (model 2). We direct the reader to the S1 Supporting Information for details on the derivation of the fold-change. For our purposes here, it suffices to state that the functional form of the fold-change for model 1 is

$$ FC(R) = \left(1 + \frac{R}{N_{NS}} e^{-\beta \Delta\varepsilon_R}\right)^{-1}, \tag{4} $$

where $R$ is the number of repressors per cell, $N_{NS}$ is the number of non-specific binding sites where the repressor can bind, $\Delta\varepsilon_R$ is the repressor-operator binding energy, and $\beta \equiv (k_B T)^{-1}$. This equation matches the form of the master curve in Fig 1D with $\rho = 1$ and $\Delta F_R = \beta \Delta\varepsilon_R - \log(R / N_{NS})$. For model 2 we have a similar situation. The fold-change takes the form

$$ FC(R) = \left(1 + \frac{\frac{R}{N_{NS}} e^{-\beta \Delta\varepsilon_R}}{1 + \frac{P}{N_{NS}} e^{-\beta \Delta\varepsilon_P}}\right)^{-1} \tag{5} $$

$$ = \left(1 + \exp(-\Delta F_R + \log \rho)\right)^{-1}, \tag{6} $$

where $P$ is the number of RNAP per cell, and $\Delta\varepsilon_P$ is the RNAP-promoter binding energy. For this model we have $\Delta F_R = \beta \Delta\varepsilon_R - \log(R / N_{NS})$ and $\rho = \left(1 + \frac{P}{N_{NS}} e^{-\beta \Delta\varepsilon_P}\right)^{-1}$, so that Eq 5 and Eq 6 agree term by term. Thus far, we see that the two thermodynamic models, despite making different coarse-graining commitments, result in the same functional form for the fold-change in mean gene expression. We now explore how kinetic models fare when faced with computing the same observable.
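To see where these closed forms come from, the following sketch builds the Boltzmann weights of each promoter state directly and takes the ratio of active-state probabilities with and without repressor. The copy numbers, the number of non-specific sites, and the binding energies are illustrative placeholders rather than fitted values, and all function names are hypothetical.

```python
import math


def boltzmann_weight(copy_number, n_ns, beta_eps):
    """Statistical weight (copy_number / N_NS) * exp(-beta * eps) of a bound state."""
    return (copy_number / n_ns) * math.exp(-beta_eps)


def fold_change_two_state(R, n_ns, beta_eps_R):
    """Model 1: states are {active, repressor bound}; expression ~ p(active)."""
    w_R = boltzmann_weight(R, n_ns, beta_eps_R)
    p_active_with_R = 1.0 / (1.0 + w_R)
    p_active_without_R = 1.0  # only one state remains when R = 0
    return p_active_with_R / p_active_without_R


def fold_change_three_state(R, P, n_ns, beta_eps_R, beta_eps_P):
    """Model 2: states are {empty, RNAP bound, repressor bound}; expression ~ p(RNAP bound)."""
    w_R = boltzmann_weight(R, n_ns, beta_eps_R)
    w_P = boltzmann_weight(P, n_ns, beta_eps_P)
    p_bound_with_R = w_P / (1.0 + w_P + w_R)
    p_bound_without_R = w_P / (1.0 + w_P)
    return p_bound_with_R / p_bound_without_R


def eq5_closed_form(R, P, n_ns, beta_eps_R, beta_eps_P):
    """Closed form of Eq 5, used here only as a cross-check."""
    w_R = boltzmann_weight(R, n_ns, beta_eps_R)
    w_P = boltzmann_weight(P, n_ns, beta_eps_P)
    return 1.0 / (1.0 + w_R / (1.0 + w_P))


# Illustrative placeholder parameters (energies in units of k_B T).
params = dict(R=260, P=5000, n_ns=4.6e6, beta_eps_R=-13.9, beta_eps_P=-5.0)
fc_model1 = fold_change_two_state(params["R"], params["n_ns"], params["beta_eps_R"])
fc_model2 = fold_change_three_state(**params)
print(fc_model1, fc_model2, eq5_closed_form(**params))
```

With a weak promoter (small RNAP weight) the two models give nearly the same fold-change, which is one way to see why mean measurements struggle to tell them apart.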

1.1.2 Kinetic models

One of the main differences between the models shown in Fig 1C, cast in the language of chemical master equations, and the thermodynamic models discussed in the previous section is the probability space over which they are built. Rather than keeping track only of the microstate of the promoter, and assuming that gene expression is proportional to the probability of the promoter being in a certain microstate, chemical master equation models are built on the joint state space of both the promoter microstate and the current mRNA count. Therefore, in order to compute the fold-change, we must compute the mean mRNA count on each of the promoter microstates, and add them all together [32]. Again, we consign all details of the derivation to the S1 Supporting Information. Here we just highlight the general findings for all five kinetic models. As already shown in Fig 1C and 1D, all the kinetic models explored can be collapsed onto the master curve. Given that the repressor-bound state only connects to the rest of the promoter dynamics via its binding and unbinding rates, $k_R^+$ and $k_R^-$ respectively, all models can effectively be separated into two categories: a single repressor-bound state, and all other promoter states with different levels of coarse graining. This structure then guarantees that, at steady-state, detailed balance between these two groups is satisfied. What this implies is that the steady-state distribution of each of the non-repressor states has the same functional form with or without the repressor, allowing us to write the fold-change as a product of the ratio of the binding and unbinding rates of the repressor, and the promoter details. This results in a fold-change of the form

$$ FC = \left(1 + \frac{k_R^+}{k_R^-} \rho\right)^{-1} \tag{7} $$

$$ = \left(1 + \exp(-\Delta F_R + \log(\rho))\right)^{-1}, \tag{8} $$

where $\Delta F_R \equiv -\log(k_R^+ / k_R^-)$, and the functional forms of $\rho$ for each model change as shown in Fig 1C. Another intuitive way to think about these two terms is as follows: in all kinetic models shown in Fig 1C the repressor-bound state can only be reached from a single repressor-free state. The ratio of these two states (repressor-bound and adjacent repressor-free state) must remain the same for all models, regardless of the details included in other promoter states, if $\Delta F_R$ represents an effective free energy of the repressor binding the DNA operator. The presence of other states then draws probability density from the promoter being in either of these two states, making the ratio between the repressor-bound state and all repressor-free states different. The log difference in this ratio is given by $\log(\rho)$. Since model 1 and model 5 of Fig 1C consist of a single repressor-free state, $\rho$ is then necessarily 1 (see the S1 Supporting Information for further details).
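The separation into a repressor-bound state and "everything else" can be checked numerically on a small example. The sketch below assumes a hypothetical three-state promoter (repressor bound, empty, RNAP bound) in which a transcription event at rate `r` returns the RNAP-bound promoter to the empty state; for that assumed scheme, $\rho$ works out to the fraction of repressor-free probability in the empty state. The specific rates are arbitrary, and the forms of $\rho$ for the models of Fig 1C are not reproduced here.

```python
import numpy as np


def promoter_generator(k_R_on, k_R_off, k_P_on, k_P_off, r):
    """Transition-rate matrix for a hypothetical three-state promoter.

    States: 0 = repressor bound, 1 = empty, 2 = RNAP bound.
    A transcription event (rate r) returns the RNAP-bound promoter to empty.
    """
    Q = np.zeros((3, 3))
    Q[1, 0] = k_R_on
    Q[0, 1] = k_R_off
    Q[1, 2] = k_P_on
    Q[2, 1] = k_P_off + r
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a generator sum to zero
    return Q


def stationary_distribution(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1 as a least-squares problem."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def fold_change_kinetic(k_R_on, k_R_off, k_P_on, k_P_off, r):
    """Ratio of steady-state RNAP-bound probabilities with and without repressor."""
    p_with = stationary_distribution(
        promoter_generator(k_R_on, k_R_off, k_P_on, k_P_off, r))[2]
    p_without = stationary_distribution(
        promoter_generator(0.0, k_R_off, k_P_on, k_P_off, r))[2]
    return p_with / p_without


# Arbitrary illustrative rates (same time units throughout).
rates = dict(k_R_on=2.0, k_R_off=0.5, k_P_on=1.0, k_P_off=3.0, r=1.0)
fc_numeric = fold_change_kinetic(**rates)

# For this assumed scheme, rho is the empty-state share of the repressor-free states.
rho = (rates["k_P_off"] + rates["r"]) / (rates["k_P_on"] + rates["k_P_off"] + rates["r"])
fc_eq7 = 1.0 / (1.0 + (rates["k_R_on"] / rates["k_R_off"]) * rho)
print(fc_numeric, fc_eq7)
```

The mRNA production and degradation rates do not appear because, in this sketch, the mean mRNA is proportional to the RNAP-bound probability and the constant cancels in the ratio.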


The key outcome of our analysis of the models in Fig 1 is the existence of a master curve shown in Fig 1D to which the fold-change predictions of all the models collapse. This master curve is parametrized by only two effective parameters: $\Delta F_R$, which characterizes the number of repressors and their binding strength to the DNA, and $\rho$, which characterizes all other features of the promoter architecture. The key assumption underpinning this result is that no transcription occurs when a repressor is bound to its operator. Given this outcome, i.e., the degeneracy of the different models at the level of fold-change, a mean-based metric such as the fold-change that can be readily measured experimentally is insufficient to discern between these different levels of coarse-graining. The natural extension that the field has followed for the most part is to explore higher moments of the gene expression distribution in order to establish if those contain the key insights into the mechanistic nature of the gene transcription process [24, 35]. Following a similar trend, in the next section we extend the analysis of the models to higher moments of the mRNA distribution as we continue to examine the discriminatory power of these different models.

2 Beyond means in gene expression

In this section, our objective is to explore the same models considered in the previous section, but now with reference to the question of how well they describe the distribution of gene expression levels, with special reference to the variance in these distributions. To that end, we repeat the same pattern as in the previous section by examining the models one by one. In particular we will focus on the Fano factor, defined as the variance/mean. This metric serves as a powerful discriminatory tool to compare our different models to the null model that the steady-state mRNA distribution must be Poisson, resulting in a Fano factor of one.
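As a small illustration of the Poisson null model, the sketch below estimates the Fano factor from simulated counts. The mean of 10 transcripts per cell, the sample size, and the random seed are arbitrary choices for the example, not experimental values.

```python
import numpy as np


def fano_factor(counts):
    """Fano factor of a sample of counts: variance divided by mean."""
    counts = np.asarray(counts, dtype=float)
    return counts.var() / counts.mean()


# Simulated mRNA counts drawn from a Poisson distribution (illustrative mean).
rng = np.random.default_rng(seed=0)
poisson_counts = rng.poisson(lam=10.0, size=200_000)
fano_poisson = fano_factor(poisson_counts)
print(f"Fano factor of Poisson sample: {fano_poisson:.3f}")
```

A measured Fano factor clearly above or below one is therefore evidence against a purely Poissonian description, provided measurement noise has been accounted for.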

2.1 Kinetic models for unregulated promoter noise. Before we can tackle simple repression, we need an adequate phenomenological model of constitutive expression. The literature abounds with options from which we can choose, and we show several potential kinetic models for constitutive promoters in Fig 2A. Let us consider the suitability of each model for our purposes in turn.

2.1.1 Poisson noise promoter

The simplest model of constitutive expression that we can imagine is shown as model 1 in Fig 2A and assumes that transcripts are produced as a Poisson process from a single promoter state. This is the picture from Jones et al. [33] that was used to interpret a systematic study of gene expression noise over a series of promoters designed to have different strengths, but no regulation. This model insists that the “true” steady-state mRNA distribution is Poisson, implying the Fano factor $\nu$ must be 1. In [33], the authors carefully attribute measured deviations from Fano = 1 to intensity variability in fluorescence measurements, gene copy number variation, and copy number fluctuations of the transcription machinery, e.g., RNAP itself. In this picture, all the corrections to Poisson behavior are derived as additive corrections to the Fano factor. This picture is appealing in its simplicity, with only two parameters, the initiation rate $r$ and degradation rate $\gamma$. In other words, the model is not excessively complex for the data at hand. But for many interesting questions, for instance in the recent work [47], attributing all deviations from the model to extrinsic noise sources limits the kinds of predictions that can be made. To make progress, then, we need a (slightly) more complex model than model 1 that would allow us to incorporate the non-Poissonian features of constitutive promoters directly into a master equation formulation.
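The claim that a one-state promoter yields a Poisson steady state can be checked with a short numerical sketch. It builds the master equation for constant production at rate `r` and first-order degradation at rate `gamma`, truncated at a maximum mRNA count, and solves for the stationary distribution. The truncation level `m_max` and the parameter values are assumptions made for this illustration.

```python
import numpy as np


def birth_death_generator(r, gamma, m_max):
    """Rate matrix for mRNA count m: production m -> m+1 at rate r,
    degradation m -> m-1 at rate gamma * m, truncated at m_max."""
    n = m_max + 1
    Q = np.zeros((n, n))
    for m in range(n):
        if m < m_max:
            Q[m, m + 1] = r
        if m > 0:
            Q[m, m - 1] = gamma * m
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a generator sum to zero
    return Q


def stationary_distribution(Q):
    """Solve p @ Q = 0 with sum(p) = 1 as a least-squares problem."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p


# Illustrative parameters; m_max is chosen far above the mean r / gamma.
r, gamma, m_max = 5.0, 1.0, 40
p = stationary_distribution(birth_death_generator(r, gamma, m_max))
m = np.arange(m_max + 1)
mean = (m * p).sum()
variance = ((m - mean) ** 2 * p).sum()
fano = variance / mean
print(f"mean = {mean:.4f}, Fano factor = {fano:.4f}")
```

Adding further promoter states to the rate matrix is the natural route to non-Poissonian behavior, since the one-state scheme has no free parameter that can move the Fano factor away from one.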

2.1.2 Sub-Poissonian noise promoters with RNAP escape

A natural extension of the one-state promoter studied in the previous section is to explicitly include an empty promoter state. This state allows a single RNAP to bind and unbind from the promoter with rates $k_P^+$ and $k_P^-$, respectively, before engaging in a transcriptional event.
