Principles of Genetic Design Automation

Jul 6, 2023

Imagine a world where cells operate like tiny, sophisticated computers, capable of processing environmental data, making informed decisions, and finely tuning their internal functions. This isn’t a plot from a science fiction novel, but rather, a reality that unfolds within us at every moment, powered by the intricate machinery of genetic circuits.

In the same way a computer’s circuitry manages the flow of data, genetic circuits — complex networks of interplaying genes and their products — guide the intricate dance of life at a cellular level. They deftly regulate gene expression, adroitly adjusting to a variety of environmental signals like a perfectly calibrated, high-tech sensor.

Consider transcription factors, the key actors in these genetic circuits. These cellular ‘microprocessors’ can sense a multitude of environmental inputs and, depending on their interpretation, can adjust gene expression, switching genes on or off as required.

The new problem that we face today is understanding how we can construct these genetic circuits ourselves.

What is a genetic circuit?

Genetic circuits are regulatory interaction networks that enable cells to make decisions and integrate signals from their environment.

A classic example is the genetic circuit that governs the phage lysis-lysogeny decision.

Phages (short for bacteriophages) are viruses that infect bacteria. When a temperate phage such as lambda infects a bacterium, it follows one of two life-cycle paths: the lytic cycle or the lysogenic cycle. The choice is often dictated by environmental conditions and is controlled by a genetic switch.

In the lytic cycle, the phage takes over the machinery of the bacterium to replicate its own genetic material and produce more phages. The host bacterium is ultimately destroyed or lysed to release the newly produced phages, which can then go on to infect other bacteria.

In the lysogenic cycle, instead of killing the host bacterium, the phage integrates its genetic material into the bacterium’s genome. This integrated phage DNA is known as a prophage. The bacterium can then continue to live and reproduce normally, replicating the prophage DNA along with its own. The prophage can remain dormant in the bacterium’s genome for many generations until certain conditions trigger it to switch to the lytic cycle.

The decision between lysis and lysogeny is controlled by a genetic switch. In bacteriophage lambda, one of the most well-studied phages, this switch involves two key genes — cI and cro. The cI gene codes for the lambda repressor protein which promotes lysogeny, while the cro gene codes for a protein that promotes lysis.

When the phage DNA is first injected into the bacterium, both cI and cro genes are transcribed and translated. In a simplified picture, if the cI protein accumulates faster than the cro protein, it represses the cro gene and promotes the lysogenic cycle. If the cI protein is later inactivated (e.g. cleaved as part of the host's response to DNA damage), the cro protein accumulates, switches off the cI gene, and shifts the phage to the lytic cycle.

This genetic switch is an example of a bistable system — it can stably exist in either of two states (lysis or lysogeny), and the state of the system is determined by which gene product, cI or cro, gains dominance first.
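The bistability described above can be illustrated with a deliberately simplified model of two mutually repressing genes. This is a toy sketch, not a quantitative model of phage lambda: the function name `simulate_switch`, the dimensionless parameters `alpha` and `n`, and the symmetric form of the repression are all assumptions chosen for illustration.

```python
def simulate_switch(ci0, cro0, alpha=10.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate a toy mutual-repression model (illustrative values only).

    d(cI)/dt  = alpha / (1 + cro**n) - cI
    d(cro)/dt = alpha / (1 + cI**n)  - cro
    """
    ci, cro = ci0, cro0
    for _ in range(steps):
        d_ci = alpha / (1.0 + cro ** n) - ci    # cro represses cI production
        d_cro = alpha / (1.0 + ci ** n) - cro   # cI represses cro production
        ci += dt * d_ci
        cro += dt * d_cro
    return ci, cro


# Same circuit, same parameters; only the starting levels differ.
lysogenic = simulate_switch(ci0=2.0, cro0=1.0)
lytic = simulate_switch(ci0=1.0, cro0=2.0)
print("cI starts ahead  -> cI=%.2f, cro=%.2f" % lysogenic)
print("cro starts ahead -> cI=%.2f, cro=%.2f" % lytic)
```

Whichever protein starts ahead ends up high while the other is held low, which is the "winner takes all" behaviour of a bistable switch.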

What is the process of Modelling & Analysing Genetic Circuit behaviour?

1. Genetic circuit topology: This refers to the structure of regulatory interactions. In essence, it outlines the ‘roadmap’ of how different genetic elements communicate and influence each other, providing the structural blueprint for the circuit.

2. Biochemical reactions: Once we understand the topology, we can identify the biochemical reactions that occur within the genetic circuit. These reactions represent the functional aspects of the circuit, such as how proteins interact and how genes are activated or deactivated.

3. Detailed ODE: At this stage, a mathematical model, typically in the form of ordinary differential equations (ODEs), is developed. This model captures the biochemical reactions, often describing the rate of production, kinetic parameters, and concentration of reactants. The ODE offers a mathematical representation of the dynamic behaviour of the circuit.

4. Reduced model: Due to the complexity of the detailed ODE, a simplified or ‘reduced’ model is often created. This model, containing fewer parameters, is more amenable to detailed analysis and evaluation. It allows us to better understand and predict the behaviour of the circuit without getting lost in the complexity.

5. Simulation & Evaluation of the results: Finally, we simulate the genetic circuit behaviour using the reduced model and evaluate the results. The simulation provides a dynamic picture of the circuit in action, enabling us to test and validate our model against experimental data.
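Steps 3 to 5 can be sketched for the simplest possible case: a single gene expressed at a constant rate. The "detailed" model below tracks mRNA and protein; the "reduced" model assumes mRNA is short-lived enough to be treated as being at steady state. The function names and all rate constants are hypothetical and chosen only so the two models can be compared.

```python
def simulate_detailed(k_tx=2.0, d_m=10.0, k_tl=5.0, d_p=0.1, dt=0.001, steps=60000):
    """Two-variable model: mRNA (m) and protein (p), integrated with Euler steps."""
    m = p = 0.0
    for _ in range(steps):
        dm = k_tx - d_m * m        # transcription minus mRNA decay
        dp = k_tl * m - d_p * p    # translation minus protein decay
        m += dt * dm
        p += dt * dp
    return p


def simulate_reduced(k_tx=2.0, d_m=10.0, k_tl=5.0, d_p=0.1, dt=0.001, steps=60000):
    """One-variable model: mRNA replaced by its steady state k_tx / d_m."""
    beta = k_tl * k_tx / d_m       # lumped protein production rate
    p = 0.0
    for _ in range(steps):
        p += dt * (beta - d_p * p)
    return p


detailed = simulate_detailed()
reduced = simulate_reduced()
print(f"detailed model: p = {detailed:.3f}")
print(f"reduced model:  p = {reduced:.3f}")
```

With these assumed rates, mRNA decays 100 times faster than protein, so the reduced model tracks the detailed one closely while using one fewer variable and one fewer free parameter.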

An important concept: The Hill Function

One way we typically represent this reduced model is through the Hill function.

The Hill function is a mathematical model commonly used to represent the regulatory effects of transcription factors on gene expression. The function is named after Archibald Hill, who originally developed it to describe the binding of oxygen to haemoglobin, a process which is cooperative in nature (i.e., the binding of one oxygen molecule to haemoglobin increases the probability of subsequent oxygen molecules binding).

In the context of gene regulation, the Hill function can be used to represent the effect of a transcription factor (TF) on a gene’s expression rate. For an activator that binds cooperatively, it models the idea that as the TF concentration increases, the rate of gene expression changes in a sigmoidal (S-shaped) manner: initially, when TF concentration is low, the gene expression is barely affected, then it rapidly increases as TF concentration reaches a certain threshold, and finally it plateaus as the TF concentration continues to increase.

Mathematically, the Hill function looks like this:

f(x) = x^n / (k^n + x^n)

Where:

- x is the concentration of the transcription factor.

- n is the Hill coefficient, representing the degree of cooperativity. If n > 1, there is positive cooperativity (i.e., the binding of one molecule promotes the binding of subsequent molecules). If n = 1, there is no cooperativity, and if n < 1, there is negative cooperativity (i.e., the binding of one molecule inhibits the binding of subsequent molecules).

- k is the so-called “dissociation constant,” representing the TF concentration at which the gene expression rate is half its maximum value.

This model’s intuition is that the TF molecules and the gene’s promoter sites can be thought of as interacting particles, and the gene expression rate can be thought of as being proportional to the likelihood of the TFs and promoter sites being bound together.

While the Hill function is a simplification of the complexities of gene regulation, it captures the essential non-linear and cooperative nature of transcription factor activity and provides a useful tool for mathematical and computational modelling in systems biology.
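A minimal sketch of the Hill function in code makes the role of n easier to see. `hill_activation` follows the formula above; `hill_repression` is its complement (1 minus the activation form), which is the usual way to describe a repressor and is added here as an assumption, since the article only gives the activator form.

```python
def hill_activation(x, k, n):
    """Fraction of maximal expression driven by an activator at concentration x."""
    return x ** n / (k ** n + x ** n)


def hill_repression(x, k, n):
    """Fraction of maximal expression remaining under a repressor at concentration x."""
    return k ** n / (k ** n + x ** n)


concentrations = (0.25, 0.5, 1.0, 2.0, 4.0)
for n in (1, 2, 4):
    row = [round(hill_activation(x, k=1.0, n=n), 3) for x in concentrations]
    print(f"n={n}: {row}")
```

At x = k the output is 0.5 for every n; increasing n only makes the transition around that threshold steeper.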

Modularity

Much like how many electronic devices are constructed from a few basic components — transistors, resistors, and capacitors — the modularity of these regulatory interactions is crucial to build more complex genetic circuits.

The last two decades have witnessed the development of several modular units that exhibit unique genetic functions. In 2000, Gardner et al. pioneered the creation of the toggle switch, a fundamental genetic circuit component, similar to a binary switch in electronics. It can rest stably in either of two states and flips from one to the other in response to distinct external stimuli.

Following this, the genetic oscillator emerged as a tool that produces consistent, rhythmic oscillations. These serve as the bedrock for exploring more complex genetic modules, with the capacity to coordinate downstream biochemical reactions.

Building upon the idea of oscillation, Elowitz and Leibler developed the repressilator, a synthetic genetic oscillator. This construct orchestrates a regular, rhythmic expression of certain genes, functioning as a biological clock or controller for periodic cellular processes. The repressilator consists of three interconnected transcriptional repressor systems, each inhibiting the next while being suppressed by the preceding one, resulting in a sustained oscillation of gene expression.
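The ring of three repressors can be sketched with a protein-only model in which each protein's production is a decreasing Hill-type function of the previous protein in the ring. This is a simplification made for illustration (it ignores mRNA), and the name `simulate_repressilator` and the values of `alpha` and `n` are assumptions rather than measured parameters.

```python
def simulate_repressilator(alpha=50.0, n=3, dt=0.005, steps=40000):
    """Euler-integrate dp_i/dt = alpha / (1 + p_(i-1)**n) - p_i for three genes."""
    p = [1.0, 2.0, 3.0]  # unequal starting levels so the system is not stuck at symmetry
    trace = []
    for _ in range(steps):
        # Gene i is repressed by the protein of gene i-1 (index -1 wraps to the last gene).
        dp = [alpha / (1.0 + p[i - 1] ** n) - p[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        trace.append(p[0])
    return trace


trace = simulate_repressilator()
late = trace[len(trace) // 2:]  # discard the initial transient
peaks = sum(1 for a, b, c in zip(late, late[1:], late[2:]) if b > a and b > c)
print(f"protein 1 ranges from {min(late):.2f} to {max(late):.2f} with {peaks} peaks")
```

In this toy model the level of the first protein keeps rising and falling instead of settling, which is the qualitative behaviour the repressilator was designed to produce.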

With this modular approach, we can shift our focus from the nitty-gritty of gene sequences to designing and creating more sophisticated genetic circuits, much like how electrical engineers design advanced circuits from basic electronic components.

What are factors that control genetic circuit behaviour?

Circuit Topology: The architecture of the regulatory networks, also known as the circuit topology, plays a significant role in the behaviour of genetic circuits. For example, negative feedback loops, where the output of a pathway inhibits its own production, can generate oscillatory behaviour. On the other hand, positive feedback loops, in which the output enhances its own production, can create an amplification effect, like a snowball rolling down a hill, growing larger and faster as it progresses.

Kinetic Parameters: These are essentially the “speed limits” of the genetic circuit, governing the rates of various biological processes. For instance, the rate at which proteins are synthesised is controlled by factors like promoter strength, ribosome binding site (RBS) strength, and the copy number of the plasmid that houses the gene of interest. By manipulating these kinetic parameters, we can fine-tune the performance of our genetic circuits.

Stochastic Fluctuations: It’s also important to account for the inherent variability or “noise” in biological systems. This is due to the fact that the processes of transcription and translation, crucial for gene expression, are inherently probabilistic at the molecular level. These stochastic fluctuations can lead to non-genetic cell-to-cell variability, meaning that even genetically identical cells can behave differently due to the small number of regulators and the random nature of their interactions. Understanding this biological noise is key for the reliable design and operation of genetic circuits.
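One common way to explore this kind of noise is a stochastic simulation in the style of the Gillespie algorithm. The sketch below applies it to the simplest case, a protein that is produced at a constant rate and degraded one molecule at a time. The function name `gillespie_birth_death` and the rates are illustrative assumptions; each seed stands in for one genetically identical cell.

```python
import random


def gillespie_birth_death(k_prod=10.0, k_deg=1.0, t_end=50.0, seed=0):
    """Return the protein copy number at time t_end for one simulated cell."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    while True:
        a_total = k_prod + k_deg * n          # total rate of "something happening"
        t += rng.expovariate(a_total)         # random waiting time to the next event
        if t > t_end:
            return n
        if rng.random() * a_total < k_prod:   # choose which event occurred
            n += 1                            # one protein produced
        else:
            n -= 1                            # one protein degraded


counts = [gillespie_birth_death(seed=s) for s in range(200)]
mean_copies = sum(counts) / len(counts)
print(f"mean = {mean_copies:.2f}, min = {min(counts)}, max = {max(counts)}")
```

Every simulated cell has the same parameters, yet the copy numbers differ from cell to cell and scatter around the deterministic steady state k_prod / k_deg.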

What factors can we use to fine-tune gene expression in a circuit?

Understanding the expression tuning knobs is crucial for achieving precision in gene expression of circuit regulators. In the same way that a conductor manipulates various parts of the orchestra to achieve a precise symphony of instruments, we need to understand the various knobs we can turn to attain our desired behaviour from a genetic circuit.

1. Promoter Strength: The promoter is a DNA sequence that determines where transcription by RNA polymerase begins. Different promoters can initiate transcription at different rates, and thus, adjusting promoter strength can effectively tune the level of gene expression.

2. Ribosome Binding Site (RBS) Strength: The RBS is a sequence on the mRNA where the ribosome attaches to begin protein synthesis. The strength of the RBS influences how efficiently the ribosome can bind and initiate translation, thus affecting the rate of protein production.

3. Copy Number: This refers to the number of copies of the circuit's DNA, typically the plasmid that carries it, within a cell. Higher copy numbers can increase gene expression levels by providing more templates for transcription.

4. Protein Degradation Tags: These are molecular “tags” added to a protein that signal it for degradation. By manipulating these tags, we can control the lifespan of a protein within the cell, providing another way to tune gene expression.

5. Cooperativity of Repressor Binding to Promoter: Repressors are proteins that inhibit gene expression. The number and position of operators (specific DNA sequences where repressors bind) within a regulated promoter can influence the level of gene repression, affecting overall gene expression.

6. Expression of a Sequestering Molecule: This involves the production of molecules that can bind and “sequester” a circuit regulator, keeping it away from its target site and thereby tuning gene expression. These can include decoy DNA operators, small RNAs, or anti-sigma factors that sequester sigma factors, the proteins needed for transcription initiation in bacteria.
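The first four knobs can be combined in a back-of-the-envelope estimate of steady-state protein level, assuming constant production and first-order decay of mRNA and protein. The function `steady_state_protein`, its parameter names, and the numbers in `baseline` are hypothetical and carry arbitrary units; the point is only how each knob scales the result.

```python
def steady_state_protein(copy_number, promoter_strength, rbs_strength,
                         mrna_decay, protein_decay):
    """Steady-state protein level for constant production and first-order decay."""
    mrna = copy_number * promoter_strength / mrna_decay   # transcripts per cell
    return mrna * rbs_strength / protein_decay            # proteins per cell


baseline = dict(copy_number=10, promoter_strength=1.0, rbs_strength=5.0,
                mrna_decay=0.2, protein_decay=0.02)

variants = {
    "baseline": baseline,
    "double copy number": {**baseline, "copy_number": 20},
    "weaker RBS (x0.5)": {**baseline, "rbs_strength": 2.5},
    "degradation tag (x5 decay)": {**baseline, "protein_decay": 0.1},
}
for label, params in variants.items():
    print(f"{label:28s} {steady_state_protein(**params):10.1f}")
```

In this simple model the production knobs multiply the output and the decay rates divide it, so a degradation tag that speeds up protein decay five-fold lowers the steady-state level five-fold.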

What types of modules of gene regulators can we use to control transcription in genetic circuits?

1. DNA-binding proteins: These are transcription factors that can either enhance or inhibit the flux of RNA polymerase (RNAP) on DNA. An example is the lac repressor, a protein that inhibits the transcription of lactose metabolism genes in the absence of lactose.

2. CRISPRi regulators: These make use of a catalytically inactive version of the Cas9 protein from the CRISPR system. When fused with a domain that recruits RNAP, it can act as a transcriptional activator, guided by a small guide RNA to the appropriate location on the DNA. Conversely, when used alone, it can function as a transcriptional repressor, blocking the transcription of target genes.

3. Recombinases: These are enzymes that can catalyse the rearrangement of DNA sequences, such as the unidirectional inversion catalysed by serine integrases. These rearrangements can affect the flux of RNAP, and hence gene expression. Due to their ability to flip DNA sequences, recombinases have been used to build logic gates and memory circuits. They are also considered more efficient than transcription factors due to their switch-like behaviour.
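To make the link between recombinases, logic and memory concrete, here is a purely logical sketch of an AND gate. It assumes a hypothetical design in which two terminators block the output gene and each one is irreversibly disabled (flipped or removed) by its own integrase. The class name `RecombinaseAndGate` and its attributes are invented for illustration and do not describe any specific published construct.

```python
class RecombinaseAndGate:
    """Toy AND gate: output is on only after both terminators have been disabled."""

    def __init__(self):
        self.terminator_a_blocking = True
        self.terminator_b_blocking = True

    def expose(self, integrase_a=False, integrase_b=False):
        # The DNA change is treated as one-way, so removing the input later
        # does not restore the terminator. This is what gives the gate memory.
        if integrase_a:
            self.terminator_a_blocking = False
        if integrase_b:
            self.terminator_b_blocking = False

    @property
    def output_on(self):
        return not (self.terminator_a_blocking or self.terminator_b_blocking)


gate = RecombinaseAndGate()
gate.expose(integrase_a=True)
print("after input A only:", gate.output_on)
gate.expose(integrase_b=True)
print("after input B as well:", gate.output_on)
gate.expose()  # both inputs withdrawn
print("after inputs are removed:", gate.output_on)
```

Because the state is written into the DNA itself, the gate remembers that both inputs were seen even if they arrived at different times and are no longer present.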

4. RNA regulators of transcription: Some genetic circuits make use of RNAs to control transcription. A mechanism that has been adapted for use in genetic circuits is Rho-mediated transcription termination, where short non-coding RNA (termed “RNA-out”) can repress the transcription of a target mRNA (“RNA-in”). This mechanism has been adapted from the tna operon, a gene cluster in bacteria involved in the metabolism of the amino acid tryptophan. These RNA regulators can be used to generate orthogonal circuits, which are circuits that function independently of each other within the same cell.

Common Failure Modes

Genetic circuits, while highly functional, are not immune to failure. These failures can manifest in several ways:

Genetic circuits are often very sensitive to their surrounding context: slight changes in the cellular environment can alter the function of a genetic circuit. For instance, temperature changes or alterations in nutrient availability can lead to changes in gene expression and thus alter the function of the circuit.

They also often require a delicate balance of their component regulators to function correctly. If the balance is disrupted, the circuit can fail. An example of this is the lac operon in E. coli, a naturally occurring genetic circuit. If there is an imbalance in the levels of the regulatory proteins, the lac operon may not respond appropriately to lactose levels, causing the bacteria to lose the ability to effectively metabolise lactose.

Sometimes, the regulators that are part of the genetic circuit can be toxic to the host cell or cause cross-talk with other circuits or the host’s native machinery. For instance, if a synthetic genetic circuit is introduced into a host bacterium, the bacterium’s native machinery could interfere with the circuit’s function, leading to circuit failure.

Over time, the sequences that make up a genetic circuit can change due to errors in DNA replication or other mutagenic processes, potentially leading to circuit failure.

Finally, genetic circuits can fail if they have to compete for shared cellular resources, such as nucleotides for DNA replication, or amino acids for protein synthesis. If the cell is under stress and these resources become scarce, the function of the genetic circuit can be compromised.

Sources:
This article summarises and expands upon the workshop held by Hatem Mohamed Gaber Abdelrahman: https://www.youtube.com/watch?v=Fy2v5DF8FCI

Written by Sarrah Rose