## Saturday, April 9, 2016

Serum titers have, inevitably, been plaguing me since I started doing B cell research for my dissertation. Human serum is diluted two-fold down a plate, and I determine which dilution constitutes the EC50, as determined by prior experiments. Overall, a fairly straightforward assay. When I finally sat down to graph the data, I did a quick Google search to see how this sort of data has typically been presented. With two-fold dilutions, serum titers are usually presented as a bar graph of the geometric mean, with the axis transformed to the log2 scale. After graphing, I shared the final figure with my PI, who promptly asked me why I didn’t just graph the mean instead of the geometric mean. Unfortunately, “everyone else is doing it” is not typically an accepted answer. So let’s take a step back and determine exactly how I should be graphing continuous data and what makes the most sense for my data set…

Luckily, Motulsky has the answer to everything. Looking at the set-up for my assay, I am working with two-fold dilutions. Inherently, this means that my data are distributed logarithmically and NOT normally: each step down the plate doubles the dilution, so the values are evenly spaced on a log scale rather than on a linear one. This is obvious, especially when comparing my data graphed on a logarithmic axis (left) versus a linear axis (right).
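To make the point about two-fold dilutions concrete, here is a minimal sketch. The starting dilution of 1:50 and the list of titers are made-up values for illustration, not the data from this post. It only shows that a log2 transform turns each doubling into a step of exactly one unit.

```python
import math

# Hypothetical reciprocal titers from a two-fold dilution series
# starting at 1:50 (illustrative values only, not the data in this post).
start_dilution = 50
dilution_steps = [0, 1, 1, 2, 3, 4, 6]  # how many doublings each sample reached
titers = [start_dilution * 2 ** step for step in dilution_steps]

# On a linear scale the gap between neighbouring dilutions doubles each time.
# After a log2 transform, every two-fold step is exactly one unit apart.
log2_titers = [math.log2(t) for t in titers]
steps_recovered = [lt - math.log2(start_dilution) for lt in log2_titers]

print(titers)
print([round(s, 3) for s in steps_recovered])
```

On the linear scale the highest titer sits far away from the rest, while on the log2 scale the same samples are spread out in evenly sized steps.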

So now the question is, if I were to plot this as a bar graph, do I show the mean or the geometric mean (i.e., what everyone else is doing)? I hadn't even tried to visualize my data with any other measure, so I tried several options. Here are the geometric mean (left), median (center), and mean (right).
Graphically, I can see that the mean is clearly much higher than most of the raw titer values that I had measured. It’s not a practical summary of my data, which is presumably due to the multiplicative nature of the two-fold dilutions of my serum: a few high titers pull the mean up. However, I noticed that the geometric mean and median values were about the same. Unfortunately, Motulsky doesn’t address this, so I had to turn to a quick Google search. The answer wasn’t quite as exciting and groundbreaking as I expected: the geometric mean is simply the preferred measure for logarithmic data such as mine. Although it is accepted that the median will more or less give a similar “average” of sorts, the median should really only be needed were I to have a couple of zeros in my data set, since a zero has no logarithm and breaks the geometric mean.
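A quick way to see the difference between the three summaries is to compute them side by side. This is only a sketch using made-up titers (not my actual data); the geometric mean is calculated as the mean of the log2 values, transformed back to the original scale.

```python
import math
import statistics

# Hypothetical reciprocal titers (illustrative values only).
titers = [50, 100, 100, 200, 400, 800, 3200]

arithmetic_mean = statistics.mean(titers)
median = statistics.median(titers)

# Geometric mean: average the log2-transformed titers, then back-transform.
log2_mean = statistics.mean([math.log2(t) for t in titers])
geometric_mean = 2 ** log2_mean

print(f"mean:           {arithmetic_mean:.0f}")
print(f"median:         {median:.0f}")
print(f"geometric mean: {geometric_mean:.0f}")
```

With these example values, the single high titer drags the arithmetic mean well above most of the samples, while the geometric mean and the median stay close to the bulk of the data. Python's `statistics.geometric_mean` (available from Python 3.8) should give the same value as the manual log2 calculation.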

So, in the end, I guess the lesson I have learned is that we want to graph our data in the most palatable and most informative way possible. For my data set, it is the geometric mean that provides the best summary of the serum titers. I also really shouldn’t be so intimidated by having to transform my data logarithmically. And the Motulsky textbook is pretty informative.