Saturday, January 30, 2016
Quick thought on randomization and normalization
@ereinerts @rchenmit plenty of good tricks to 'normalize' non-normal random variables, no good way to fix a non-random variable
— TJ Murphy (@TJ__Murphy) January 29, 2016
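As a rough illustration of the tweet's point, here is a small R sketch (the simulated data and the choice of a log transform are just examples, not anything from the tweet): a skewed variable can often be made to look normal with a simple transform, but no transform can rescue a sample that wasn't collected randomly.
set.seed(1)
x <- rlnorm(1000)   # simulated right-skewed (log-normal) data, purely illustrative
hist(x)             # clearly non-normal: long right tail
hist(log(x))        # after a log transform the distribution looks normal
# There is no analogous trick for a sample that was not collected at random.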
Tuesday, January 26, 2016
Another Podcast
Hi Everyone.
I'm stealing Arielle's idea and sharing this Freakonomics podcast about how to become a "super-forecaster" that aired last week. It's an interesting take on how important understanding probabilities and statistics can be outside our hard-science world.
I particularly appreciate the comments that what sets apart a bad or overconfident "forecaster" from a "superforecaster" is dogmatism. In the context of this podcast, dogmatism is discussed as a personal desire to come up with reasons to support a preferred prediction, along with a tendency to disregard reasons that go against it. I think this idea extrapolates quite well to the scientific community, and to the community at large. Often, what has been done previously, or what is discussed with the most passion, steers decision making just as much as, if not more than, the direction in which most of the evidence points.
Listening to this podcast (for the third time, now) reinforced in my mind that we should embrace open-mindedness and flexibility while allowing data, whether in a scientific context or not, to drive our opinions and to change them over time.
The podcast closes by campaigning for more accountability in public debate: following up on all predictions rather than only those that are convenient to revisit later. I completely agree that this strategy could curb the tendency of public figures to make sweeping, dramatic promises and predictions that rile up the populace with little (or no) basis or consequence. If we are going to continue as (or return to) a civil society, it is important to remember that thinking and effort are critical components.
Friday, January 22, 2016
The problem is in the design of experiments
Here's another well-written lay article, this time from The New Yorker, on irreproducible findings, the way science is being conducted, and our inherent biases as human beings.
Money quote:
That’s why Schooler argues that scientists need to become more rigorous about data collection before they publish. “We’re wasting too much time chasing after bad studies and underpowered experiments,” he says. The current “obsession” with replicability distracts from the real problem, which is faulty design. [....] “Every researcher should have to spell out, in advance, how many subjects they’re going to use, and what exactly they’re testing, and what constitutes a sufficient level of proof. We have the tools to be much more transparent about our experiments.”
Indeed, "We have the tools." We've chosen not to use them for the last couple of generations.The design of experiments is a concept that was invented decades ago to minimize the very problem we're grappling with today. We've simply not taught it well or learned to use it or chosen not to use it, perhaps because our great ideas couldn't possibly be wrong.
h/t Ken Liu
Thursday, January 21, 2016
Jackpot!!
Have you ever had a double-yolk egg?
This is what one looks like...the pair of smaller yolks flanked by two large ones. It's always a fun little surprise. Just another mundane morning, frying up some eggs for breakfast, and then...boink! Something you don't expect to see.
I say always a fun surprise, but it has only happened to me twice: just the other morning, and once before, about a year or so ago.
Back in grad school I worked about a half-year rotation on a project measuring vitamin D receptors in the chorioallantoic membrane of Japanese quail embryos. I cracked open hundreds of eggs on that project, if not thousands, and never once saw a twin. Though I did see a couple of monsters: sad little misshapen embryos with strange developmental defects.
In any event, since this is "statistics semester" and I was thinking about upcoming probability lectures, I wondered how lucky I must be to have witnessed two twin yolks in my lifetime!
The internet, which never disappoints and which I have no reason to doubt, says the random chance of seeing one of these even once is 1 in 1000.
When I run this little script
pDY <- dbinom(2, size=10000, prob=0.001)  # Yes, I seem to like to eat a lot of eggs
through the R machine to see how lucky I must have been to witness this twice, it says my luck is 1 in 22,400. Which seems pretty nice. It makes me feel lucky.
I have also done something else that is considered a reasonably rare feat: I have scored a hole-in-one in golf not just once, but 3 times!
The internet says the "risk" of a random single hole-in-one for somebody like me is 1 in 12,500. I actually have better reason to believe this probability than I do for the double yolk frequency, because the value of the hole-in-one probability comes from a company that makes a living betting that people WON'T score one. So they probably have a good idea what the probability of a random hole-in-one truly is, since their livelihood depends on it.
I ran a similar script through the R machine to calculate the probability that I could have had 3 holes-in-one. I've been playing golf for 30 years, and guesstimate I've played 20 rounds per year over that period. Some years more than others.
pHOI <- dbinom(3, size=600, prob=0.00000008)  # 3 holes-in-one over ~600 lifetime rounds
The R machine says the probability of me accomplishing this three hole-in-one feat is actually very, very, very low. About 1 in 54 trillion.
Now I'm feeling even luckier!
So what are the chances that somebody like me would have seen 2 double-yolk eggs AND have 3 holes-in-one playing golf? That's a simple joint probability (pHOI * pDY), and that value comes out to a whopping 1 in 1.2 octodecillion!!
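For what it's worth, here is roughly what that last step looks like as one more pass through the R machine, repeating the pDY and pHOI lines from above and assuming the egg and golf feats are independent:
pDY  <- dbinom(2, size=10000, prob=0.001)     # two double-yolk eggs
pHOI <- dbinom(3, size=600, prob=0.00000008)  # three holes-in-one
pJoint <- pDY * pHOI                          # joint probability of both feats
1 / pJoint                                    # expressed as "1 in X" odds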
Had I known I was actually this lucky I would have bought one lottery ticket last week. This all assures me that I had a very good chance at winning the $1.5 billion jackpot.
Data phenotypes... a prelude
They are made up of numbers, and if you think of them as just numbers then they look a lot alike. Almost like twins, but not exactly. Yet, discrete and random variables really are not the same...at all.
Planet Money Podcast
Hey everyone, after our blog posts earlier this week I found an interesting podcast regarding scientific bias and replication errors. It's about 20 minutes long and a good overview of the subject.
NPR's Planet Money, Episode 677: The Experiment Experiment
Tuesday, January 19, 2016
To not be a fraud: notes on critiques of scientific research
I was nervous when I was reading the assigned articles about bias and irreproducibility. Because the word "fraud" appeared so many times, it seems it would not be hard for a researcher to end up in that category. As an undergraduate student doing independent research, I am of two minds about these critiques of the unreliability of scientific research. On the one hand, I understand the eagerness to prove the worth of one's work, and results can never be perfect. On the other hand, I think that what makes scientific research special is the objectivity we claim for it. Beyond all the critiques, warnings, and concerns, I found these articles constructive in two respects.
First, researchers need to give enough detail. Anecdotes, highlights of the experimental design, and so on can reduce irreproducibility. In other words, honesty does not simply mean reporting one's findings as they are; it also means coming as close as possible to sharing them in full. The right attitude toward publishing research is to share it the way you would show your diary, as Jeremy Berg suggested in "The reliability of scientific research": "[investigators] made comments indicating that the experiment 'worked' only one out of 10 times but that successful result is the result that they published". Adequate information reveals the presence of imperfect results, and even failures.
Second, we need peer review on a broader scale. Recent studies show that people omit or rarely admit their own experimental mistakes, yet often catch mistakes made by colleagues; this shows why attention from peers is necessary. But at what point in a study would it be most helpful? Post-publication peer review is becoming more and more popular, in addition to the traditional pre-publication review, which many of the articles point out can be more or less perfunctory. Public comment sections, such as the one at the journal eLife, on PubMed Commons, and elsewhere, give people in related fields a chance to weigh in. PubPeer is another platform for post-publication peer review, one that in my view is more public and less field-specific, which can be helpful for future studies. The soundest form of peer review, I think, is reproducing the data from the same experiment. Failures to reproduce original experiments have been used as warning signs about particular claims, but systematic sharing of the results of those attempts would be even more helpful.
Taking Science Public
As I was reading through the posted articles, two in particular caught my attention, because their topics were near and dear to my interests. I am interested in how the general population interacts with science and scientists, and how this perception can be shaped or warped. The articles posted on Vox and written by Julia Belluz took different angles on examining, or negotiating, the relationship between the lay and the lab. While I absolutely don't deny the prevalence of irreproducibility in "science," I'm not really ready to hop on the hype train like some others who seem happy to disregard the myriad advances science has made despite it apparently being broken. One thing I liked from the articles was the frankness of the interviewee in the Why you can't always believe what you read in scientific journals piece. (S)he spoke candidly about the politics of science, specifically in the last comment, which was only tangentially related to irreproducibility or "bad science" but highlighted one of the real problems with science: it's done by humans. This seems like a far more foundational or central issue, and a more interesting conversation to have. It's not "peer review" or "good stats" (as someone who is good in the lab and terrible at stats, I am herein making the assumption that someone who is as good at stats as I am in the lab would find it just as easy to "massage" the data and obtain the answer they desire); it's that we are prideful, ambitious, and defensive at perfectly normal, human levels.
I honestly have no segue into my next thoughts, but I didn't want to go on for too long about something that wasn't really that related to the dialogue we're trying to have. One thing I will note about Ms. Belluz (herself a decorated science journalist) is that she seems a little quick to redirect attention away from the people who disseminate the majority of these falsehoods or exaggerated claims: the journalists themselves. While I don't believe it is ethical to continue portraying science to the public as an infallible discipline whose participants are entirely unfazed by their own human emotions, I also think it is important to consider how we phrase and shape arguments out of the statistics or data we have. The example to which I alluded earlier is a clear instance of presentation dictating the takeaway. Belluz doesn't write something like "while it is true that doctors were the ones using these terms in about 30% of cases, and that use is sometimes unjustified, the 55% of cases in which journalists used these phrases dwarfs that figure," a short paragraph I could write in a manner that seems a little biased in favor of scientists. Instead she begins the paragraph by getting that 55% out in the open, then spends the remainder elaborating on the smaller percentage of cases perpetrated by doctors, leaving readers with the notion of doctors "medically overhyping," not journalists. This is then of course laid to rest with a warning statement addressing the grave danger that is medical overhype. The order in which we present data, the careful phrasing we use, and the overall presentation of specific data all have a significant effect on the takeaway message a reader gets. One question I struggle with after reading some of the articles is: how can we have an honest discussion about the realities of science and how reliable or unreliable studies are, without creating a million tiny Jenny McCarthys? Is the nature of public debate and discussion nuanced enough to handle the realities of scientific research, many of which have been true for centuries? How can the largest proportion of cases, perpetrated by journalists, be policed? Should they be?
Labels: ethics, honesty, jenny mccarthy, journalism, Julia Belluz, policy, science, vox
Intrinsically Intertwined
After reading the posted articles on irreproducibility and bias in science, I am surprised that there are not more measures in place to combat these issues. The use of anonymous post-publication peer review, and of Bayesian statistics to justify redoing an experiment, seem like common-sense measures. Why have these not become standard practice in the scientific community?
As the article “Trouble in the Lab” states, “more than half of positive results could be wrong.” This was revealed by John Ioannidis's 2005 paper, which demonstrated the cost of a seemingly small rate of false positives. When I connect this thought to my own research, I am horrified. What if the claims that helped me develop my experimental theory are unreliable? Though they were published in peer-reviewed journals, perhaps their results do not reflect reality. These “discoveries” might not have been discoveries at all, but simply instances in which the data told an incorrect story. Because research builds on previously published results, a false published result could lead to a chain reaction of incorrect assumptions. How can this chain of events be halted?
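To see where a claim like “more than half of positive results could be wrong” can come from, here is a minimal back-of-the-envelope sketch in R in the spirit of Ioannidis's argument. The prior, significance threshold, and power values below are purely illustrative assumptions, not numbers taken from the article:
prior <- 0.1     # assumed fraction of tested hypotheses that are actually true
alpha <- 0.05    # false-positive rate (significance threshold)
power <- 0.35    # assumed power of a typical underpowered study

true_pos  <- prior * power          # true effects that reach significance
false_pos <- (1 - prior) * alpha    # null effects that reach significance by chance
ppv <- true_pos / (true_pos + false_pos)
ppv    # about 0.44 with these inputs, so more than half of the "positive" results are false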
Statistics has shown that irreproducibility is a rampant issue across many life-science research investigations. I believe that statistics can similarly be used to combat this problem. Even though “most scientists are not statisticians,” acceptance of the explanation laid out by Ioannidis should be a prerequisite for performing research. It should guide scientists to perform more experiments and not be fooled by false positives. Perhaps, then, greater care will be taken to distinguish discoveries that are statistical anomalies from discoveries that represent the laws of science. Because of the nature of their work, scientists should take it upon themselves to become as versed as possible in statistics. The two are so intrinsically intertwined; this fact can no longer be ignored in the scientific community.