Tag Archives: analysis

Mozart and Harper Lee

I could really use some music to honor my wife on the anniversary of her death...

In 1791 Mozart died while working on a beautiful piece of music, his Requiem Mass in D minor. I love much of Mozart’s work, but I think that this is probably my favorite (perhaps a tie with The Marriage of Figaro). Yet there has always been discussion about how much Mozart himself completed and how much his friend and copyist (possibly student), Franz Xaver Süssmayr, wrote as he completed the manuscript for delivery to Count Franz von Walsegg in 1792.

What is relevant here is that it does not really matter to me who wrote it. It’s attributed to Mozart, so I assume that he did the majority of the work in at least shaping it and providing hints as to how it would develop.

Similar questions have been raised about Harper Lee’s authorship (or lack of authorship) of To Kill a Mockingbird, and now Go Set a Watchman. Ms. Lee was good friends with another iconic writer, Truman Capote. The two were childhood friends, and she worked with him for some time as a research assistant for his opus, In Cold Blood.

To address the authorship question, Maciej Eder and Jan Rybicki assembled a data analysis algorithm to compare writing styles. With only a single novel, it is difficult to say much about the authorship of Mockingbird. With the release of a second novel, however, a data analysis technique known as ‘cluster analysis’ becomes more meaningful. Using a number of analyses, the two data miners assert that Ms. Lee’s voice is her own, distinct from Capote’s. One of these analyses is presented below (taken from the Computational Stylistics Group website), examining most-frequent-word usage by Lee and a number of other Southern authors.

Lee_and_others_consensus

We see that both of Lee’s books cluster together (as do those of other authors), and that her own style appears to more closely resemble that of authors she professed were influential to her than that of her friend, Capote.
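For anyone curious what this kind of analysis involves, here is a minimal Python sketch of one common stylometric measure, Burrows's Delta, computed over the most frequent words. This is not Eder and Rybicki's actual pipeline. The function names, the toy texts, and the choice of five words are all assumptions made for illustration. A real analysis would use whole novels and hundreds of words, and would then pass the resulting distances to a clustering routine.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def burrows_delta(texts, n_mfw=5):
    """Return {(name_a, name_b): delta} for every pair of texts.

    texts maps a label to a raw string. n_mfw is how many of the
    corpus-wide most frequent words to compare (an illustrative default).
    """
    tokens = {name: tokenize(t) for name, t in texts.items()}

    # Relative frequency of every word within each text.
    freqs = {}
    for name, words in tokens.items():
        counts = Counter(words)
        freqs[name] = {w: c / len(words) for w, c in counts.items()}

    # Most frequent words across the whole corpus.
    corpus = Counter()
    for words in tokens.values():
        corpus.update(words)
    mfw = [w for w, _ in corpus.most_common(n_mfw)]

    # z-score each word's frequency across the texts.
    z = {name: {} for name in texts}
    for w in mfw:
        values = [freqs[name].get(w, 0.0) for name in texts]
        mu, sigma = mean(values), pstdev(values)
        for name in texts:
            f = freqs[name].get(w, 0.0)
            z[name][w] = 0.0 if sigma == 0 else (f - mu) / sigma

    # Delta is the mean absolute difference of z-scores over the word list.
    names = sorted(texts)
    return {
        (a, b): mean(abs(z[a][w] - z[b][w]) for w in mfw)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Toy strings standing in for whole books (hypothetical, not real excerpts).
toy_texts = {
    "A1": "the cat and the dog and the bird",
    "A2": "the dog and the cat and the fish",
    "B": "a cat or a dog or a bird or a fish",
}
for pair, delta in burrows_delta(toy_texts).items():
    print(pair, round(delta, 3))
```

A smaller Delta means two texts use the common words in more similar proportions, which is the sense in which two books by the same author would be expected to "cluster together".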

What is most important to me, though, is how I feel about the text. At this point I am nearing the end, but have not gone far enough that I can say definitively what my conclusions are. I admit that it took some time to get into the novel (the first chapter or so didn’t feel right to me), but most of the book has developed well in my opinion. I think what will make or break this book in terms of real importance to me is where things go with respect to the central question of race that it deals with.

Regardless of that conclusion, I have greatly enjoyed this book (as I did Mockingbird) for its ability to transport the reader into the mind and body of the protagonist, Scout. It takes us on a journey through time, twice: once to Scout’s childhood, and again to her adulthood, itself many years in the past now, just after World War II.

These books and Mozart’s Requiem Mass, they are what they are. And I intend to enjoy them by that standard.

The Requiem would be no less a masterpiece if it were written by Donald Duck. And Go Set a Watchman is what it is regardless of who wrote it or who wanted it published. The fact is, it’s out there and the whole world is devouring it this week. I say discuss the politics of the book all you want (it’s all quite interesting too), but judge it on its own merits, irrespective of all these other questions.

That said… are you reading it? What do you think? I’ll probably be finished by the time anyone gets around to reading this, so answer as thoroughly as you like. Let’s consider this a SPOILER ALERT for anything beyond this point – don’t read the comments if you have not finished the text. (OK, with all that lead up, I need some comments….)

 
Posted on July 17, 2015 in Uncategorized

 

Flow Rate

I received an extra credit essay from one of my students, based on a question from the textbook, and I had to do a little modeling to understand it. The question concerned patients with atherosclerosis and could be answered using Poiseuille’s Law. This law describes the relationship between the flow rate of a liquid through a vessel and the pressure difference, the radius and length of the vessel, and the viscosity of the liquid.

Basically, it is presented as:

Flow Rate = (change in Pressure * pi * radius^4) / (8 * viscosity * Length of the vessel)

The question asks, ‘why symptoms of myocardial ischemia do not usually occur until ~75% of a vessel has been occluded.’

The easy answer is that this is the cutoff beyond which blood flow can no longer deliver enough oxygen for the heart’s metabolism. However, this can be visualized qualitatively simply by graphing the equation. To do this, I made up a quick spreadsheet and just plugged in ‘1’ for all the variables, then solved for the flow rate. From here, I simply plugged fractions into the radius variable.

Here’s the raw data:

Screen Shot 2015-05-08 at 5.00.16 PM

The number relevant to the question is a radius of 1.00 – 0.75 = 0.25 (i.e. a 75% blockage). Here’s the analysis:

Screen Shot 2015-05-08 at 5.02.20 PM

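Note how the Flow Rate has dropped to essentially ZERO when the radius is occluded 75%. Because the radius is raised to the fourth power, a radius of 0.25 leaves only 0.25^4, or about 0.4%, of the unobstructed flow.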
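The same spreadsheet exercise can be sketched in a few lines of Python. This is only a rough illustration under the same assumptions as above: every variable other than the radius is set to 1, and a "75% blockage" is treated as a 75% reduction in radius. The function name `poiseuille_flow` and the list of occlusion values are made up for this example.

```python
import math


def poiseuille_flow(delta_p, radius, length, viscosity):
    """Flow rate from Poiseuille's Law: (dP * pi * r^4) / (8 * viscosity * L)."""
    return (delta_p * math.pi * radius ** 4) / (8 * viscosity * length)


# Unobstructed vessel, with every variable set to 1 as in the spreadsheet.
baseline = poiseuille_flow(1.0, 1.0, 1.0, 1.0)

for occlusion in (0.0, 0.25, 0.5, 0.75, 0.9):
    radius = 1.0 - occlusion  # fraction of the original radius still open
    relative = poiseuille_flow(1.0, radius, 1.0, 1.0) / baseline
    print(f"{occlusion:.0%} occluded: radius {radius:.2f}, relative flow {relative:.4f}")
```

Because everything except the radius cancels out in the ratio, the relative flow is just the remaining radius raised to the fourth power, which is why the curve collapses so quickly.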

There may be more to this, but I think that just looking at this analysis of the equation answers a lot.

P.S. I just spent a hell of a lot of time and effort messing around in the terminal of my Mac changing the screen capture file type, only to realize that my Mac wasn’t the problem at all. I simply was not using the largest image size available in WordPress, and then tried to scale up my image after it was inserted. Don’t do this. You lose all of your image quality.

 
Posted on May 8, 2015 in Uncategorized

 

BLyS Sequence Analysis

I’ve been playing with some sequence analysis and phylogenetic tree construction programs recently because I would like to introduce these sorts of data analysis into my biology classes. As a sample protein, I decided to use BLyS / BAFF, a protein important in regulating B cell numbers. I’ve wondered about the origin of this kind of molecule ever since working on it in grad school, and this seemed like a decent way to get some ideas about where it might come from.

The first thing I did was go to the NIH’s National Library of Medicine website: http://www.ncbi.nlm.nih.gov

It’s easy to search for any protein / gene / whole genome you are interested in examining. Knowing that BLyS is vital in humans and mice, I chose to start with the human sequence. I retrieved it as the following:

>gi|20196464|dbj|BAB90856.1| BLyS [Homo sapiens]
MDDSTEREQSRLTSCLKKREEMKLKECVSILPRKESPSVRSSKDGKLLAATLLLALLSCCLTVVSFYQVA
ALQGDLASLRAELQGHHAEKLPAGAGAPKAGLEEAPAVTAGLKIFEPPAPGEGNSSQNSRNKRAV

The easiest tool to find similar proteins in other animals is the Basic Local Alignment Search Tool for proteins, or BLASTp. Just using default settings, I pasted the sequence in the search field and hit go. (Note: I actually just used the accession number, not the whole sequence.)

Image

This retrieved tons of proteins with similar sequences from the vast database of sequence information, from which I chose several model species. One thing I wanted to do was to include several primates as a sort of internal calibration (assuming that they would all have very similar sequences compared to more distantly related species). I also wanted sequences from a few animals that are quite distantly related to humans (frog and ground tit fit that bill).

Once I had a list, I put them all into a single text file and then used that in a second program. This time, I decided that the best ‘multiple alignment tool’ would be CLUSTALX. It’s been around for a while and can create data in a number of different forms. Besides, it’s free and versions are available for both Mac and PC.
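Putting the sequences into a single text file just means stacking FASTA records one after another. Here is a small sketch of how that could be scripted rather than done by hand. The helper names `parse_fasta` and `to_fasta` and the short toy sequences are hypothetical; they are not the records I actually downloaded.

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def to_fasta(records, width=70):
    """Write {header: sequence} back out as FASTA text, wrapped at width."""
    lines = []
    for header, seq in records.items():
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy records standing in for separately downloaded sequences.
human_like = ">toy_human example\nMDDSTEREQS\nRLTSCLKK\n"
frog_like = ">toy_frog example\nMDESAEREQS\n"

combined = {}
for chunk in (human_like, frog_like):
    combined.update(parse_fasta(chunk))

print(to_fasta(combined))
```

The combined text can then be saved to one file and loaded into the alignment program.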

Again, for starters, I just accepted the default parameters and did a quick alignment:

Image

Obviously, there’s something odd about the Canis familiaris (dog) sequence, but before I did anything about that, I just wanted to see what a phylogenetic tree looked like. This is another thing that Clustal does well: it will export your sequence alignment as tree data in a number of formats, which I could then plug into one final program. This last is a web-based program that I access through a French site (but you can probably find it in a number of places). The program is called DRAWGRAM. It accepts the tree data and outputs a graphical representation of the tree.

This is an important logical step… What I’m doing is asking for a family tree of sorts to be displayed that represents the relationship of the sequences I provided. We might want to assume that this also tells us how related the organisms that have these proteins are – and that’s not wrong, but it’s also not thorough as we’re only using ONE protein to make that assumption.
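To make the idea of "relationship of the sequences" a little more concrete, here is a rough sketch of the simplest kind of distance a tree can be built from: the fraction of aligned positions at which two sequences differ. This is an illustration only, not what Clustal computes internally. The function name `p_distance` and the short aligned toy sequences are assumptions invented for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues among aligned, gap-free positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return mismatches / compared


# Toy aligned sequences (hypothetical, not real BLyS data).
aligned = {
    "primate_a": "MKLLAATLLA",
    "primate_b": "MKLLAATLLS",
    "frog": "MRLV-ASLIA",
}

for a, b in combinations(aligned, 2):
    print(a, b, round(p_distance(aligned[a], aligned[b]), 3))
```

Sequences with a small distance end up close together on the tree and those with a large distance end up far apart, which is all the tree really encodes when it is built from a single protein.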

Here’s my first tree:

Image

Note how isolated Canis is on this representation.

Finally, I went back and truncated the Canis sequence to a place where I suspect the protein actually starts. My sequence from the NCBI gave me a string of amino acids at the front of the protein that I think are probably not there, but just got added by some computer algorithm without proper human oversight.

Once I did that, Canis (by the way, I renamed the sequence ‘DOG’ so I was sure it was the new one) fell in line with a sequence more similar to that seen in cats (Felis):

Image

That’s it for now, although I expect that I will dig a little deeper with more animals to see if I can come closer to an ‘original BLyS’.

 References:

  1. Dereeper A., Audic S., Claverie J.M., Blanc G. BLAST-EXPLORER helps you building datasets for phylogenetic analysis. BMC Evol Biol. 2010 Jan 12;10:8.
  2. Dereeper A.*, Guignon V.*, Blanc G., Audic S., Buffet S., Chevenet F., Dufayard J.F., Guindon S., Lefort V., Lescot M., Claverie J.M., Gascuel O. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W465-9. Epub 2008 Apr 19. (*: joint first authors)
  3. Felsenstein J. PHYLIP – Phylogeny Inference Package (Version 3.2). Cladistics. 1989;5:164-166.
  4. Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J., Higgins D.G. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947-2948.
  5. Thompson J.D., Gibson T.J., Plewniak F., Jeanmougin F., Higgins D.G. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876-4882.
 
Posted on March 7, 2014 in Uncategorized

 
