
Tag Archives: Protein

A quick followup on that lac operon post

Last week I posted a quick link about operons for my micro class to check out before taking their quiz on bacterial gene regulation. This post is intended to complement that one. To go back to that post, click here. If there’s one thing to remember about operons, it is that bacteria, lacking a nuclear membrane, regulate their genes differently than Eukaryotes do. A nuclear membrane separates transcription and translation into two distinct compartments, allowing for more subtle tweaking of Eukaryotic mRNAs before they are exported for translation.

Image

Click on this figure to go to a good description of how polycistronic genes work

One consequence for bacteria is that it is very beneficial to package genes with related functions close together on the genome and use a single regulatory region to control them all. They wind up packed so closely together that they are actually expressed as a single messenger RNA, known as a polycistronic (meaning ‘many gene’) message.

Upstream of this polycistronic cassette are regulatory elements. One element common to all operons is the promoter. The promoter consists of several elements which ‘promote’ the binding of an RNA polymerase to the DNA. Additional regulatory elements exist to ensure that this polymerase only transcribes the genes if they are needed. In doing so, the cell conserves energy and components (e.g. amino acids) for only the necessary processes.

In the case of the paradigm lac operon, lactose is a fuel source, but not as good as glucose. Therefore, enzymes to digest lactose are only needed when lactose is present, but glucose is not. In order to interrogate both conditions, two additional regulatory elements are present.

First, the operator sequence. This sequence binds a repressor protein that physically blocks the polymerase’s path in the absence of lactose. However, if lactose is present, the sugar binds to the repressor, causing a conformational (shape) change that causes the protein to release its grip on the operator sequence.

Second, a catabolite activator protein (CAP) will only bind to the DNA just upstream of the RNA polymerase binding site if cAMP is present. Let’s not get too distracted, other than to say that cAMP levels are high in the ABSENCE of glucose, and low when that sugar is present. When cAMP binds to CAP, the protein can bind the DNA and do its other job: making a nice binding site for the RNA polymerase. Without CAP, the polymerase binds very inefficiently.

Together, the production of lactase enzymes (those that digest lactose) is exquisitely controlled in a way that conserves the most energy.
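The two checks described above can be written out as a small truth table. What follows is only a simplified on/off sketch of the logic in this post: the function name and the output labels are my own, and real expression levels are graded rather than all-or-nothing.

```python
def lac_operon_state(glucose: bool, lactose: bool) -> str:
    """Simplified on/off model of the lac operon logic described above."""
    # The repressor sits on the operator unless lactose is there to release it.
    repressor_bound = not lactose
    # cAMP is high only when glucose is absent, and CAP needs cAMP to bind DNA.
    cap_bound = not glucose
    if repressor_bound:
        return "off (repressor blocks the polymerase)"
    if not cap_bound:
        return "very low (polymerase binds inefficiently without CAP)"
    return "on"


for glucose in (True, False):
    for lactose in (True, False):
        state = lac_operon_state(glucose, lactose)
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {state}")
```

Only one of the four combinations (lactose present, glucose absent) gives full expression, which is the energy-saving behavior described above.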

Image

P.S. Take a look at this graph and tell me why (not mechanistically, but rationally) the cell does not make lactase enzymes when both glucose and lactose are present.

 

Posted by on March 15, 2014 in Uncategorized

 


BLyS Sequence Analysis

I’ve been playing with some sequence analysis and phylogenetic tree construction programs recently because I would like to introduce these sorts of data analysis into my biology classes. As a sample protein, I decided to use BLyS / BAFF, a protein important in regulating B cell numbers. I’ve wondered about the origin of this kind of molecule ever since working on it in grad school, and this seemed like a decent way to get some ideas about where it might come from.

The first thing I did was go to the NIH’s National Library of Medicine website: http://www.ncbi.nlm.nih.gov

It’s easy to search for any protein / gene / whole genome you are interested in examining. Knowing that BLyS is vital in humans and mice, I chose to start with the human sequence. I retrieved it as the following:

>gi|20196464|dbj|BAB90856.1| BLyS [Homo sapiens]
MDDSTEREQSRLTSCLKKREEMKLKECVSILPRKESPSVRSSKDGKLLAATLLLALLSCCLTVVSFYQVA
ALQGDLASLRAELQGHHAEKLPAGAGAPKAGLEEAPAVTAGLKIFEPPAPGEGNSSQNSRNKRAV

The easiest tool to find similar proteins in other animals is the Basic Local Alignment Search Tool for proteins, or BLASTp. Just using default settings, I pasted the sequence in the search field and hit go. (Note: I actually just used the accession number, not the whole sequence.)

Image

This retrieved tons of proteins with similar sequences from the vast database of sequence information, from which I chose several model species. One thing I wanted to do was to include several primates as a sort of internal calibration (assuming that they would all have very similar sequences compared to more distantly related species). I also wanted sequences from a few animals that are quite distantly related to humans (frog and ground tit fit that bill).

Once I had a list, I put them all into a single text file and then used that in a second program. This time, I decided that the best ‘multiple alignment tool’ would be CLUSTALX. It’s been around for a while and can create data in a number of different forms. Besides, it’s free and versions are available for both Mac and PC.

Again, for starters, I just accepted the default parameters and did a quick alignment:

Image

Obviously, there’s something odd about the Canis familiaris (dog) sequence, but before I did anything about that, I just wanted to see what a phylogenetic tree looked like. This is another thing that Clustal does well: it will export your sequence alignment as tree data in a number of formats, and I could then plug that data into one final program. This last is a web-based program that I access through a French site (but you can probably find it in a number of places). The program is called DRAWGRAM. It accepts the tree data and outputs a graphical tree representation of the alignment.

This is an important logical step… What I’m doing is asking for a family tree of sorts to be displayed that represents the relationship of the sequences I provided. We might want to assume that this also tells us how related the organisms that have these proteins are – and that’s not wrong, but it’s also not thorough as we’re only using ONE protein to make that assumption.
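To make ‘the relationship of the sequences’ a bit more concrete, here is a rough sketch of one simple measure a tree can be built from: the fraction of aligned positions at which two sequences differ. The first toy sequence is just the first ten residues of the human sequence above; the other two, and all three names, are invented for illustration and are not the real alignment. ClustalX’s own distance calculation may differ in detail.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, gap-free columns where two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    # Skip any column where either sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy aligned sequences (hypothetical, for illustration only).
aligned = {
    "primate_A": "MDDSTEREQS",
    "primate_B": "MDDSTEREQS",
    "distant_C": "MDESA-REKS",
}

for (name1, seq1), (name2, seq2) in combinations(aligned.items(), 2):
    print(f"{name1} vs {name2}: {p_distance(seq1, seq2):.2f}")
```

Identical sequences score 0, and the score grows as sequences diverge, which is why the primates would be expected to cluster tightly while a frog or bird sits on a longer branch.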

Here’s my first tree:

Image

Note how isolated Canis is on this representation.

Finally, I went back and truncated the Canis sequence to a place where I suspect the protein actually starts. The sequence I got from the NCBI has a string of amino acids at the front of the protein that I think are probably not really there, but were added by some computer algorithm without proper human oversight.
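I did this trimming by hand, but the same step could be sketched in a few lines. Everything in this example is a stand-in: the function name, the record name, the ‘XXXXX’ prefix and the cut position are all hypothetical, not the real Canis data.

```python
def truncate_record(header: str, sequence: str, start: int, new_name: str):
    """Drop the residues before `start` (0-based) and rename the record."""
    if not 0 <= start < len(sequence):
        raise ValueError("start must fall inside the sequence")
    # Keep the old name in the description so the edit stays traceable.
    new_header = f">{new_name} (trimmed from {header.lstrip('>')})"
    return new_header, sequence[start:]


# Hypothetical record: 'XXXXX' stands in for the suspect extra residues.
header, seq = truncate_record(">Canis_original", "XXXXXMDDSTEREQS", 5, "DOG")
print(header)
print(seq)
```

Giving the trimmed record a new name makes it easy to tell which version ended up in the alignment.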

Once I did that, Canis (by the way, I renamed the sequence ‘DOG’ so I was sure it was the new one) fell in line with a sequence more similar to that seen in cats (Felis):

Image

That’s it for now, although I expect that I will dig a little deeper with more animals to see if I can come closer to an ‘original BLyS’.

 References:

  1. Dereeper A., Audic S., Claverie J.M., Blanc G. BLAST-EXPLORER helps you building datasets for phylogenetic analysis. BMC Evol Biol. 2010 Jan 12;10:8. (PubMed)
  2. Dereeper A.*, Guignon V.*, Blanc G., Audic S., Buffet S., Chevenet F., Dufayard J.F., Guindon S., Lefort V., Lescot M., Claverie J.M., Gascuel O. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W465-9. Epub 2008 Apr 19. (PubMed) *: joint first authors
  3. Felsenstein J. PHYLIP – Phylogeny Inference Package (Version 3.2). 1989, Cladistics 5: 164-166
  4. Larkin,M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G. (2007) Clustal W and Clustal X version 2.0. Bioinformatics, 23:2947-2948.
  5. Thompson,J.D., Gibson,T.J., Plewniak,F., Jeanmougin,F. and Higgins,D.G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research, 25:4876-4882.
 

Posted by on March 7, 2014 in Uncategorized

 


Playing with sequence data

I’ve been playing with sequence data a lot recently. It’s been so long since I have done any real work with this stuff that I’ve fallen behind in my knowledge of the tools for analyzing the data. 

I hope to post a couple of the things I have been working on here, as a sort of tutorial on how to use ClustalX as an alignment tool that (I think) can even build phylogenetic trees from homology data.

 

Posted by on March 5, 2014 in Uncategorized

 


Codon Usage Bias – Part II

Yesterday I got caught up writing about the way that biologists use codon usage bias to optimize cloned genes for expression. Today, I want to finish this up by discussing two ideas about why codon bias arises in the first place.

#1: Through some stochastic (random) process, one or more tRNA genes corresponding to certain codons were amplified through gene duplication, resulting in higher-efficiency translation when these codons are used. This implies that efficiency acts as a selective pressure that reinforces the once-random preference.

Evidence for this model lies in the observed relationship between tRNA genes and codon usage. Work done at the Institut Pasteur shows that, “By analyzing 102 genomes we showed that as minimal generation times get shorter, the genomes contain more tRNA genes, but fewer anticodon species.”

Image

Codon usage correlates with gene copy number of corresponding tRNA

#2: Weatheritt and Babu suggest that there are additional codes overlaying that of the DNA->RNA->Protein code. These codes are the result of DNA binding to proteins, RNA looping, or micro-RNA binding that may impose their own restrictions on sequence.

The original paper, by Stergachis et al. of the Stamatoyannopoulos laboratory at the University of Washington, used DNAse footprinting to determine the areas of DNA that were bound by proteins.

Imagine DNA as a long clothesline. In some locations socks hang from the clothesline, covering up small areas of the string. DNAse is an enzyme that can chew up open DNA, but is not capable of displacing proteins to chew up the sequences they bind. That is, wherever the clothesline is empty, it is gobbled up; wherever a sock hangs, those regions are protected and we can go back to see what they are.
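The clothesline picture can be mimicked with a toy example: given a sequence and the positions where ‘socks’ hang, keep only the covered stretches. The sequence and positions here are invented, and this illustrates only the idea of protection, not how footprinting data are actually processed.

```python
def protected_fragments(dna: str, bound_sites: list[tuple[int, int]]) -> list[str]:
    """Return the stretches of `dna` covered by proteins (half-open intervals)."""
    fragments = []
    for start, end in sorted(bound_sites):
        if not 0 <= start < end <= len(dna):
            raise ValueError(f"bad interval: {(start, end)}")
        # Everything outside these intervals is treated as digested.
        fragments.append(dna[start:end])
    return fragments


clothesline = "ATGCTGAAACCCGGGTTT"  # made-up sequence
socks = [(3, 6), (12, 15)]          # made-up protein-bound positions
print(protected_fragments(clothesline, socks))
```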

Image

Stergachis et al. decided to look at these sequences to determine if any of these correlated with the preferred codons for several amino acids that have a number of possible codon alternatives.

What they found actually does account for some of the observed codon bias. In the figure below, taken from their paper, note how the preference for the codon CTG differs when this codon appears in an area where proteins bind to the DNA. This paper does not specifically define the proteins that bind the DNA at any given location, but it is clear that this sequence is vital to two distinct functions.


CTG codon preference is greater when occurring at a protein-binding site

Because CTG remains a preferred codon even in the absence of protein binding, it is reasonable that both models may be correct; i.e., protein binding may have tipped the balance in favor of certain codons, setting up an environment in which having multiple tRNA genes for this specific codon, rather than for others coding for the same amino acid, is favored.

Lastly, I was alerted to the following video blog addressing a different interpretation of the same data:

 

Posted by on February 10, 2014 in Uncategorized

 


Codon Usage Bias – Part I

To the molecular biologists:

Optimize ye codons while ye may

For time is a-flying

And this clone you have in R & D today

Tomorrow will be … in manufacturing- and it’s just impossible to change anything at that point, so forget it.

Image

I’m no rocket surgeon


I read an article yesterday about codon bias that has been stuck in my head ever since. The article appeared as a ‘Perspectives’ piece in the 13 Dec 2013 issue of Science, with the title ‘The Hidden Codes That Shape Protein Evolution.’

This article addresses some details not often considered in how DNA directs the synthesis of proteins.

 I spend a lot of time in my classes discussing the basic mechanism by which DNA –>RNA –> Protein, known as the Central Dogma. A lot gets left out of these lectures in order to keep it simple, which sometimes keeps the way I think about the flow of information pretty simplified as well.

Fortunately, this article rattled my cage enough to open my mind to the myriad influences that go into the stuff of life. Here, Weatheritt and Babu look at how DNA sequences may be under selective pressures independent of just the proteins they encode.

I’ve done a fair amount of molecular biology in my life, including cloning genes and moving them into other organisms for expression as drugs or drug components. One example of this was in a lab where we used live vectors as immunogens in order to take advantage of the uniquely broad immune response this single-cell pathogen elicits. The immune responses we wanted to trigger / amplify were typically against human tumor proteins or the products of human viruses (e.g. HIV, HPV); however, the organism we were using as a vaccine was a bacterium.

As I said above, I usually teach the Central Dogma in a way that omits many of the complications seen in the real world. So, when we look at a codon chart, we use the redundancy (multiple codons encode the same amino acid) to illustrate how a change in the DNA sequence can often fail to change the protein sequence at all. These are called ‘silent mutations’.

 Image

The way these codon charts work is by triangulating a position in the middle of the chart using the bases depicted along three of the four edges. For example, the codon AUG is read by locating the ‘A’ on the left margin, the ‘U’ on the top margin, and the ‘G’ on the right margin. The location this identifies is an amino acid called Methionine (abbreviated as met) on this chart.

Notice that if the first two bases in a codon are CU, then it does not matter what the third base is; this codon will always call for a Leucine (leu). This means that if the sequence of RNA is CUU originally, but mutates to CUC, there will be no change in the protein.
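The same lookup can be done in a few lines of code. This sketch builds the standard genetic code as a dictionary of RNA codons to one-letter amino acid codes (‘*’ marks a stop) and checks whether a codon change is silent; the names `CODON_TABLE` and `is_silent` are my own.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_silent(before: str, after: str) -> bool:
    """True if changing one codon into the other leaves the amino acid unchanged."""
    return CODON_TABLE[before] == CODON_TABLE[after]


# Every codon starting with CU encodes leucine (L).
print({codon: CODON_TABLE[codon] for codon in ("CUU", "CUC", "CUA", "CUG")})
print(is_silent("CUU", "CUC"))  # third-base change within the CU family
print(is_silent("CUU", "CCU"))  # second-base change: leucine becomes proline
```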

What codon optimization addresses is the fact that different organisms tend to prefer some codons over others, even if they encode the same amino acid. This has been appreciated for many years now, so when a molecular biologist takes a protein (e.g. from a human tumor) that they want made by bacteria, they redesign the DNA sequence so that it uses the codons preferred by the organism that will express the protein.

This figure examines the percentage of times a gene uses a particular codon to make Leucine. In the bacterium E. coli, CTG is used nearly 50% of the time. Meanwhile, in the yeast S. cerevisiae, TTA and TTG are preferred.

 (Note that T in DNA = U in RNA)
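Percentages like the ones in this figure come from simply counting codons. As a rough sketch, the function below tallies how often each of the six leucine codons is used in an in-frame coding sequence; the example sequence is made up, and real codon usage tables are built from many genes rather than one short string.

```python
from collections import Counter

LEU_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}


def leucine_codon_usage(coding_dna: str) -> dict[str, float]:
    """Fraction of leucine codons of each type in an in-frame coding sequence."""
    # Walk the sequence three bases at a time, ignoring any trailing partial codon.
    codons = [coding_dna[i:i + 3] for i in range(0, len(coding_dna) - 2, 3)]
    counts = Counter(c for c in codons if c in LEU_CODONS)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


# Made-up coding sequence: Met, then four Leu codons, then a stop.
print(leucine_codon_usage("ATGCTGCTGTTACTGTAG"))
```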

 Image

 

————

So, what does this mean? Consider a simplified example…

I want to clone this protein from a yeast and grow it up in bacteria:

Met – Leu – Leu – [stop]

ATG – TTA – TTG – TAG

 We would take this DNA from the yeast and then modify the sequence by changing the two Leucine codons into the preferred sequence in bacteria (CTG):

Met – Leu – Leu – [stop]

ATG – CTG – CTG – TAG

The result should be a sequence of DNA that the bacteria will be able to optimally translate into protein. 
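The swap in this simplified example can be sketched in code. Only the leucine preference (CTG) comes from this post; the `PREFERRED` table is otherwise left empty, so every other codon passes through unchanged, and the function names are my own. A real optimization would use a full codon usage table for the host organism.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Host preferences: only leucine -> CTG is taken from the text above.
PREFERRED = {"L": "CTG"}


def split_codons(coding_dna: str) -> list[str]:
    if len(coding_dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return [coding_dna[i:i + 3] for i in range(0, len(coding_dna), 3)]


def translate(coding_dna: str) -> str:
    return "".join(CODON_TABLE[c] for c in split_codons(coding_dna))


def optimize(coding_dna: str) -> str:
    """Swap each codon for the host's preferred synonym, if one is listed."""
    return "".join(
        PREFERRED.get(CODON_TABLE[c], c) for c in split_codons(coding_dna)
    )


yeast_version = "ATGTTATTGTAG"
bacterial_version = optimize(yeast_version)
print(bacterial_version)
# The protein must be identical before and after the swap.
print(translate(yeast_version) == translate(bacterial_version))
```

Checking that the translation is unchanged is the important part: the DNA differs, but the protein does not.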

———

This has worked out to be much longer and more technical than I intended – and I haven’t even addressed the new ideas brought up in the Science article.

Therefore, I’m going to stop here and continue tomorrow with part II

 

Posted by on February 9, 2014 in Uncategorized

 


Cellular robotics? A cute video summarizing cellular functions from TedEd

Check out this video. I think I like it, but I’m not positive yet. It’s so well done that I’m kind of taken by the aesthetics, however, I’m not sure that this makes cell biology easier to understand. What’s your opinion?

 

Posted by on November 25, 2013 in Uncategorized

 


Signal sequence and translation of secreted or membrane-bound proteins

I’ve been looking for a good animation illustrating how signal sequences of proteins are bound by signal recognition particles (SRPs) that bring ribosomes into contact with the Endoplasmic Reticulum (ER) during translation, and I’ve finally found one. This particular animation has no narration, but it does show the process of translation fairly well. Note that once a signal sequence emerges from the ribosome, it is captured by an SRP and the whole system is taken to the membrane. This video illustrates the process in prokaryotes, which is very similar, except that prokaryotes don’t have an ER and instead secrete material through the plasma membrane. It turns out that just about everything else is the same though; just imagine this as being a larger Eukaryotic cell and the membrane being that of the ER, not the plasma membrane.

http://www.dnatube.com/video/1227/Signal-Recognition-Particle

A second video, which I came across at the same site but much more recently, shows the process as it occurs in Eukaryotic cells. The animation is much less realistic, but the message is the same.

http://www.dnatube.com/nuevo/player.swf
 

enjoy.

 

Posted by on September 4, 2013 in Uncategorized

 
