Tag Archives: codon

Codon Usage Bias – Part II

Yesterday I got caught up writing about the way that biologists use codon usage bias to optimize cloned genes for expression. Today, I want to finish this up by discussing two ideas about why codon bias arises in the first place.

#1: Through some stochastic (random) process, one or more tRNA genes corresponding to certain codons were amplified through gene duplication, resulting in more efficient translation when these codons are used. This implies that efficiency acts as a selective pressure that reinforces the once-random preference.

Evidence for this model lies in the observed relationship between tRNA genes and codon usage. Work done at the Institut Pasteur shows that, “By analyzing 102 genomes we showed that as minimal generation times get shorter, the genomes contain more tRNA genes, but fewer anticodon species.”


Codon usage correlates with gene copy number of corresponding tRNA

#2: Weatheritt and Babu suggest that there are additional codes overlaying the DNA->RNA->Protein code. These codes arise from protein binding to DNA, RNA looping, or micro-RNA binding, each of which may impose its own restrictions on sequence.

The original paper, by Stergachis et al. of the Stamatoyannopoulos laboratory at the University of Washington, used DNase footprinting to determine the areas of DNA that were bound by proteins.

Imagine DNA as a long clothesline. In some locations socks hang from the clothesline, covering up small areas of the string. DNase is an enzyme that can chew up open DNA, but it is not capable of displacing proteins to chew up the sequences they bind. That is, wherever the clothesline is empty, it is gobbled up; wherever a sock hangs, that region is protected and we can go back to see what it is.

Stergachis et al. decided to look at these sequences to determine whether any of them correlated with the preferred codons for several amino acids that have a number of possible codon alternatives.

What they found actually does account for some of the observed codon bias. In the figure below, taken from their paper, note how much stronger the preference for the codon CTG is when this codon appears in an area where proteins bind to the DNA. The paper does not specifically identify the proteins that bind the DNA at any given location, but it is clear that this sequence serves two distinct functions: encoding an amino acid and binding protein.


CTG codon preference is greater when occurring at a protein-binding site

Because CTG remains a preferred codon even in the absence of protein binding, it is reasonable to think that both models may be correct. That is, protein binding may have tipped the balance in favor of certain codons, setting up an environment in which multiple tRNA genes for that specific codon, rather than for others coding for the same amino acid, are favored.

Lastly, I was alerted to the following video blog addressing a different interpretation of the same data:


Posted on February 10, 2014 in Uncategorized



Codon Usage Bias – Part I

To the molecular biologists:

Optimize ye codons while ye may

For time is a-flying

And this clone you have in R & D today

Tomorrow will be … in manufacturing- and it’s just impossible to change anything at that point, so forget it.


I’m no rocket surgeon


I read an article yesterday about codon bias that has been stuck in my head ever since. The article appeared as a ‘Perspectives’ piece in the 13 Dec 2013 issue of Science, with the title, ‘The Hidden Codes That Shape Protein Evolution.’

This article addresses some details not often considered in how DNA directs the synthesis of proteins.

I spend a lot of time in my classes discussing the basic flow of information, DNA –> RNA –> Protein, known as the Central Dogma. A lot gets left out of these lectures in order to keep them simple, which sometimes keeps my own thinking about the flow of information pretty simplified as well.

Fortunately, this article rattled my cage enough to open my mind to the myriad influences that go into the stuff of life. Here, Weatheritt and Babu look at how DNA sequences may be under selective pressures independent of just the proteins they encode.

I’ve done a fair amount of molecular biology in my life, including cloning genes and moving them into other organisms for expression as drugs or drug components. One example of this was in a lab where we used live vectors as immunogens in order to take advantage of the uniquely broad immune response that the single-celled pathogen we worked with elicits. The immune responses we wanted to trigger or amplify were typically against human tumor proteins or the products of human viruses (e.g. HIV, HPV); however, the organism we were using as a vaccine was a bacterium.

As I said above, I usually teach the Central Dogma in a way that omits many of the complications seen in the real world. So, when we look at a codon chart, we point to its redundancy (multiple codons specify the same amino acid) to illustrate how a change in the DNA sequence can often fail to change the protein sequence at all. Such changes are called ‘silent mutations’.


The way these codon charts work is by triangulating a position in the middle of the chart using the bases depicted along three of the four edges. For example, the codon AUG is read by locating the ‘A’ on the left margin, the ‘U’ on the top margin, and the ‘G’ on the right margin. The location this identifies is an amino acid called Methionine (abbreviated as met) on this chart.

Notice that if the first two bases in a codon are CU, then it does not matter what the third base is; this codon will always call for a Leucine (leu). This means that if the sequence of RNA is originally CUU, but mutates to CUC, there will be no change in the protein.
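To make the chart lookup concrete, here is a minimal Python sketch. It builds the standard genetic code from the usual U, C, A, G ordering of a codon chart and checks whether a single-codon change is silent. The names CODON_TABLE and is_silent are my own, chosen for illustration.

```python
# Standard genetic code, laid out the way a codon chart is read:
# first base picks the row block, second base the column, third base the line.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

# Map every RNA codon to a one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def is_silent(codon_before, codon_after):
    """Return True if both RNA codons specify the same amino acid."""
    return CODON_TABLE[codon_before] == CODON_TABLE[codon_after]


print(CODON_TABLE["AUG"])       # M (methionine)
print(is_silent("CUU", "CUC"))  # True: both codons call for leucine
print(is_silent("CUU", "CCU"))  # False: CCU calls for proline
```

All four CU_ codons come back as 'L', which is the redundancy the chart shows.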

What codon optimization addresses is the fact that different organisms tend to prefer some codons over others, even if they encode the same amino acid. This has been appreciated for many years now, so when a molecular biologist takes a protein (e.g. from a human tumor) that they want made by bacteria, they redesign the DNA sequence so that it uses the codons preferred by the organism that will express the protein.

This figure examines the percentage of times a gene uses a particular codon to make Leucine. In the bacterium E. coli, CTG is used nearly 50% of the time. Meanwhile, in the yeast S. cerevisiae, TTA and TTG are preferred.

 (Note that T in DNA = U in RNA)
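Percentages like these come from counting codons in coding sequences. As a rough sketch, the Python below tallies how often each of the six Leucine codons appears in a single in-frame DNA sequence. The function name and the toy sequence are invented for illustration; real usage tables are built from many genes.

```python
from collections import Counter

# The six DNA codons that encode Leucine.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def leucine_codon_usage(coding_dna):
    """Return the fraction of Leucine codons accounted for by each synonym.

    Assumes coding_dna starts in frame (position 0 is the first base of a codon).
    """
    coding_dna = coding_dna.upper()
    # Split into complete, non-overlapping triplets; ignore a trailing partial codon.
    codons = [coding_dna[i:i + 3] for i in range(0, len(coding_dna) - 2, 3)]
    counts = Counter(codon for codon in codons if codon in LEUCINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in LEUCINE_CODONS}
    return {codon: counts[codon] / total for codon in LEUCINE_CODONS}


# Toy sequence (made up): ATG CTG CTG TTA CTG TAA
usage = leucine_codon_usage("ATGCTGCTGTTACTGTAA")
print(usage["CTG"])  # 0.75
print(usage["TTA"])  # 0.25
```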




So, what does this mean? Consider a simplified example…

I want to clone this protein from a yeast and grow it up in bacteria:

Met – Leu – Leu   [stop]


 We would take this DNA from the yeast and then modify the sequence by changing the two Leucine codons into the preferred sequence in bacteria (CTG):

Met – Leu – Leu   [stop]


The result should be a sequence of DNA that the bacteria will be able to optimally translate into protein. 
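The recoding step can be sketched in a few lines of Python. The yeast sequence used here (ATG TTA TTG TAA) is an assumption I picked to match Met – Leu – Leu [stop] using the yeast-preferred Leucine codons. The function name is likewise hypothetical, and real optimization tools consider every amino acid, not just Leucine.

```python
# The six DNA codons that encode Leucine.
LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}


def recode_leucine(coding_dna, preferred="CTG"):
    """Swap every in-frame Leucine codon for the host's preferred synonym.

    Assumes coding_dna starts in frame; all other codons are left untouched,
    so the encoded protein does not change.
    """
    coding_dna = coding_dna.upper()
    recoded = []
    for i in range(0, len(coding_dna), 3):
        codon = coding_dna[i:i + 3]
        recoded.append(preferred if codon in LEUCINE_CODONS else codon)
    return "".join(recoded)


yeast_dna = "ATGTTATTGTAA"        # assumed: Met - Leu - Leu - [stop]
print(recode_leucine(yeast_dna))  # ATGCTGCTGTAA
```

The protein is the same before and after; only the spelling of the two Leucine codons has changed.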


This has worked out to be much longer and more technical than I intended – and I haven’t even addressed the new ideas brought up in the Science article.

Therefore, I’m going to stop here and continue tomorrow with Part II.


Posted on February 9, 2014 in Uncategorized



DNA –> RNA –> Protein (in greater detail)

I emphasize the importance of ‘the central dogma’ pretty regularly throughout the semester in General Biology. This idea represents one of the core theories of biology and helps to explain an enormous amount about life. This ‘dogma’ explains how the information contained in DNA is replicated prior to cell division – and used to make drafts of that information (RNA) that can guide the construction of proteins that get the work of the cell done.

In the current unit, we are expanding this theory and examining the processes and the molecules they create more closely. Fortunately, each of these processes is elegantly illustrated in a set of animations available on the HHMI website.

The first video presents a model of replication. The model (as shown) is correct in principle, but it is not intended to show HOW replication actually occurs, only that, in the words of Watson and Crick, “… [T]he specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

The actual mechanism of DNA replication is complicated by the anti-parallel arrangement of the paired DNA strands and the fact that DNA Polymerase, the enzyme responsible for copying the DNA, can work only in one direction.
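The anti-parallel arrangement is easy to see in a small Python sketch: the partner of a strand written 5'->3' is its complement read in the opposite direction. The function name is my own.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written 5'->3'.

    Because the two strands run in opposite directions, we pair each base
    and then reverse the order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3.upper()))


print(reverse_complement("ATGCTG"))  # CAGCAT
```

Applying the function twice returns the original strand, which is one way to check that the pairing rules are symmetric.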

The action of the DNA polymerase, along with some additional enzymes (Helicase and Ligase) is illustrated mechanistically in the following animation:

Once DNA is replicated, during S Phase of the cell cycle, the cell is ready to divide and provide one complete copy of the DNA to each of the two daughter cells. In this way, DNA replication allows for the continuity of genetic information from one generation (of cells or whole organisms) to the next.

Throughout the cell’s life, it is necessary to produce proteins to accomplish the work of that particular cell. Again, the information contained in the DNA is copied, this time to a messenger RNA (mRNA) strand, and the instructions to make the protein are carried into the cytoplasm. This process, called Transcription, is carried out primarily by the enzyme, RNA polymerase, as illustrated below:

Once an mRNA is constructed, it is transported from the nucleus (where the DNA resides) into the cytoplasm. There, a ribosome coordinates the recruitment of transfer RNAs (tRNAs), each bearing the specific amino acid called for at that point in the synthesis of the protein. This process, called translation, is illustrated by HHMI below:
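Separately from the animations, the two steps can be sketched in Python as simple string operations. This is a deliberately simplified model: the template strand and the function names are invented for illustration, and translation here just starts at the first AUG it finds, which glosses over how a real ribosome chooses its start site.

```python
# Standard genetic code in the usual U, C, A, G chart order ('*' marks a stop).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# RNA base that pairs with each DNA base on the template strand.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') by pairing each template base, read 3'->5'."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3_to_5.upper())


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("TACGACGACATT")  # hypothetical template strand, written 3'->5'
print(mrna)                        # AUGCUGCUGUAA
print(translate(mrna))             # ['M', 'L', 'L']
```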


Posted on April 29, 2013 in Uncategorized

