## Saturday, April 9, 2016

### The Power of Continuous Variables

Ever since elementary school, we have learned how to perform mathematical operations, from simple addition and subtraction to more complex multiplication and division. Our early math studies also covered how to determine the mean, median, mode, etc. For years, we were asked to calculate these values at all levels of academia, from simple sixth-grade homework assignments to determining the mean fluorescence level of viral entry in an undergraduate thesis. As a result, we have taken for granted the power contained in the values we use in these calculations: the power of continuous variables.

Before this class, I was one of those people who didn’t appreciate the power of continuous variables in science. However, simply flipping through Harvey Motulsky’s Intuitive Biostatistics makes it clear that I was missing out. A quick glance at the table of contents shows that nine chapters are included under the continuous variables subsection of the book. Since the book has 49 chapters, the study of continuous variables makes up about 18% of the textbook. Looking more closely at these chapters, you see that continuous variables make quite a contribution to the study of statistics. These variables are used to make scatter plots, determine confidence intervals, describe Gaussian distributions, and so forth. The use of continuous variables opens the door to a huge range of possible statistical tests and analyses.

What impresses me the most about continuous variables is not the vast number of analyses that can result, but rather the ability of these variables to connect the scientific community. As described in the lecture, a continuous variable is a scalar physical property, such as mass, concentration, etc. For each of these properties, a standard unit of measurement has been established and is used by scientists all over the world. This allows scientists around the world to compare data easily, without obstacles in data analysis or anything being lost in translation. As a result, a scientist in the United States can have a collaborator in Russia and share data back and forth without giving a second thought to possible data conversion. This ability is achieved by the power of continuous variables.

1. Your comment on collaboration between a Russian and an American scientist reminded me of a psychology seminar I attended on Self-Determination Theory (SDT), which is a theory of human motivation. In brief, SDT states that three universal, innate psychological needs are necessary to achieve psychological need satisfaction: competence, autonomy, and psychological relatedness. This seminar discussed a study that sought to explore the role of autonomy support in self-motivation and well-being across Russian and U.S. adolescents. My point is that the data collected were largely… not discrete, but ordinal variables! They were in the form of questionnaire answers chosen from a list of rankings, and they were also seamlessly communicated between cultures. There are different ways to think of how continuous and ordinal variables are related. You could divide variables into (1) numerical and (2) categorical, with continuous and discrete variables being numerical and ordinal and nominal being categorical. Or you could divide variables into (1) continuous and (2) discrete, with interval-scale and ratio-scale variables being continuous and ordinal and nominal being discrete. I’m sure there are even more ways to categorize variables (and thus divide the scientific community). At the end of the day, though, all of them connect the scientific community and allow us to measure, study, and present what we observe.

2. I admire the zeal towards continuous variables you have presented in your post. I agree about the power they have and the role they can and do play in the field of statistics and in science as a whole.

But it is of course worth mentioning how absent true continuous variables are from the research that we conduct. Most of the data we collect is discretized to some degree. Even if we're collecting electrophysiological data from neurons every 0.00001 second, we're still discretizing their activity, which is inherently continuous.

True continuous variables are rare in observation, and they are incredibly difficult to perform mathematics on. Let's say we have a perfect, continuous Gaussian curve, and we want to find the area under it. Most of our statistical methods will require us to actually discretize this curve and find the area under something that looks very much like a Gaussian curve but technically isn't one.

And of course, the area under this continuous curve is equal to exactly 1. Discretizing it and finding the area under it will give us an answer pretty close to 1, but not exactly 1. Thankfully, it's close enough that most of the time, when it comes to being practical with your results, I guess it doesn't really matter.
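
To make the point about discretizing a Gaussian curve concrete, here is a minimal sketch in Python. It assumes a standard normal curve (mean 0, standard deviation 1), cuts the curve off at ±6 standard deviations, and uses the trapezoid rule as the discretization; the function names `gaussian_pdf` and `trapezoid_area` are made up for this illustration and are not from the lecture or the textbook.

```python
import math


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Height of the Gaussian (normal) curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def trapezoid_area(f, lo, hi, n):
    """Approximate the area under f between lo and hi using n equal-width trapezoids."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))  # the two endpoints count half
    for i in range(1, n):
        total += f(lo + i * h)     # interior sample points count fully
    return total * h


# Assumption: the curve is truncated at +/- 6 standard deviations,
# so even a perfect sum over this range falls slightly short of 1.
for n in (4, 16, 64, 1024):
    area = trapezoid_area(gaussian_pdf, -6.0, 6.0, n)
    print(f"{n:5d} trapezoids: area = {area:.12f}, distance from 1 = {abs(1.0 - area):.2e}")
```

With only a handful of trapezoids the answer is visibly off, and with a finer grid it lands very close to 1 without being the exact integral, which is the "close enough to be practical" situation described above.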