This is not an easy subject.
Here's an email thread from the other day between Rick Kahn, TJ Murphy and Nancy Bliwise. Since both TJ and Nancy direct courses at Emory for PhD students and undergraduates, respectively, Rick consulted each with his question.
I've injected a final thought at the end.
Rick: An issue arose at lab meeting yesterday when I suggested to a student that they include a second set of samples, same as the first, to provide a biological replicate. This is a control experiment demanded by a reviewer that is very unlikely to change the point of the paper, but of course that is not really the point here. You were quoted in our discussion as saying that it is not a biological replicate unless performed on a different day - which is pretty much in agreement with this article, which I like a lot but disagree with on this point:
Given that the cells are all "the same" or come from one vial or even separate vials all frozen down together, I see no value in adding extra days and it is impractical to obtain multiple aliquots of cell lines from different vendors for example. Certainly if doing rats or mice or human patients you gather from different animals. But the assumption in cell line experiments is that they are clonal and therefore supposed to be identical.
I think it is a discussion worth having and I am thinking of soliciting views across the GDBBS, for those who use cell culture in experiments. Thoughts?
TJ: A sample is comprised of replicates. Statistically, here are the two unbreakable rules on replicates within a sample:
1. Each replicate should be independent from every other replicate.
2. They should be randomized (in some way, shape, or form).
The problem with continuous cultured cells is that they are immortalized clones. Another problem is that it is hard to randomize culture wells to treatments given the need to be efficient with pipetting, etc.
Yes, you can argue that cultured cells are so similar from day to day that it seems absurd to consider a replicate taken on Monday as independent from one taken on Tuesday. I see that point.
It is harder to argue there is anything different at all between two replicates that are taken on the same day, side-by-side.
The day-to-day rule introduces a modicum of random chance into the design, given a system where the homogeneity is so heavily stacked against random chance.
Monday and Tuesday are more independent from each other than are the left and right sides of the bench on Monday.
Rick: I think we agree in principle here. As the article says it is incrementally better. But I would argue impractical or not worth the increase in terms of time and resources. I suspect if put to a vote of others my way is the overwhelmingly popular solution, though admittedly incrementally less independent.
TJ: I always made cell culture passage number the mark of independence. In my own system, a continuous primary cell culture line, there was good variation from passage to passage, much more so than within passage.
I think one of the big reasons we're in the unreliable research mess we're in is because when it's been a choice between sound, unbiased statistical practice and efficiency/cost, the latter wins too often. You may not like to hear that, it is surely an unpopular viewpoint for the go-go-go PI set, but I don't have any doubt that is what underlies the problem.
"I'm not cutting corners, I'm saving money."
The thumbs are on the scales in all kinds of ways in the name of efficiency.
It's like the culture of conducting unbiased scientific research has been replaced by a manufacturing culture. We should have a few beers just to hammer out the latter point.
Nancy: Rick, do you know if the reviewer was requesting a biological replicate (e.g., lines from different animals) or a technical replicate to show that the procedure/phenomenon can be reproduced by others? While the article you provided suggests that replicating over different days or weeks, depending on what is appropriate for the experiment, provides a purer technical replicate, I am not particularly convinced by that argument. I would argue that any replicate depends on the conditions/issues that you think need to be demonstrated. For example, if the experimental manipulation is highly technical and requires great skill, I would think a replication by different technicians/experimental staff would be important. If lab conditions are important, then different days/weeks might be meaningful. If it requires particular equipment calibrated carefully, then having another lab replicate it might be important. If the concern is just that this might be a fluke of a particular set of circumstances, then a single replication that shows the same phenomenon would be ok.
TJ was addressing independence and "n". It is possible to do a replicate on the same day/week and model the analysis to show that there were multiple cultures created nested within the original line. Doing the experiment on different days/weeks produces some independence but only if the lab conditions across days/weeks are relevant to the question.
TJ: I'd argue that statistical independence is only critical when doing significance testing. Not every experimental observation needs a significance test. For example, a central observation related to the main scientific argument should be designed for significance testing. Observations that are ancillary, adding only texture to the central argument, should be repeated to ensure reliability.
Let's distill this down to a more general takeaway.
We want to know any or all of three things about our data: Is the observation accurate? Are the findings precise? How reliable are the results?
It's up to the researcher to decide which of these is the most important objective of an experiment and then to go about collecting their information accordingly.
When accuracy is the key objective, replication improves the estimate, since the standard error decreases as the number of replicates grows.
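To put rough numbers on that, here is a minimal sketch in Python of the usual standard error of the mean, the sample standard deviation divided by the square root of n. The function names and the standard deviation of 10 units are made up for illustration, and the formula only holds if the replicates are independent, which is exactly what the thread above is arguing about.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    return statistics.stdev(values) / math.sqrt(n)


def sem_for_n(sd, n):
    """Standard error expected for n independent replicates with a given SD."""
    return sd / math.sqrt(n)


# Hypothetical assay with a replicate-to-replicate SD of 10 units.
for n in (3, 6, 12, 48):
    print(n, round(sem_for_n(10.0, n), 2))
```

Because of the square root, halving the standard error takes four times as many replicates, so the gain from each extra replicate shrinks quickly.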
We standardize our techniques and analyze variance when precision is important.
When reliability of the observation is important, then the biggest threats are biases, both expected and unexpected. The device invented to deal with that problem is randomization.
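For cell culture, one simple way to randomize is to let a random number generator, rather than pipetting convenience, decide which well gets which treatment. The sketch below is only an illustration: the function name, the well labels, and the "vehicle"/"drug" treatments are hypothetical, and it assumes a balanced design with the same number of wells per treatment.

```python
import random


def randomize_wells(wells, treatments, seed=None):
    """Randomly assign treatments to wells, with equal wells per treatment."""
    if len(wells) % len(treatments) != 0:
        raise ValueError("number of wells must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    reps = len(wells) // len(treatments)
    labels = list(treatments) * reps
    rng.shuffle(labels)
    return dict(zip(wells, labels))


# Hypothetical 2 x 3 block of wells: A1..A3, B1..B3.
wells = [f"{row}{col}" for row in "AB" for col in range(1, 4)]
layout = randomize_wells(wells, ["vehicle", "drug"], seed=1)
for well, treatment in layout.items():
    print(well, treatment)
```

Recording the seed alongside the plate map lets the layout be regenerated later. Running the procedure again with a new seed for each day or passage gives every replicate its own random layout.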