Personal Data

As a field of study, Statistics is essentially the study of how to make sense of data. It is full of formal, quantitative methods for dealing with uncertainty. There are fundamental questions that a successful analysis should touch upon. What do we know (the data)? What do we expect (our assumptions)? What should we expect given what we know (what should the data look like)? How does what we know affect the validity of the assumptions about what we expect (do the data seem excessively implausible given our assumptions)? The goal typically takes the form of setting up a fall guy (or gal) named Null Hypothesis and checking to see whether you can knock him (or her) down with your data. Traditionally, the null hypothesis is the enemy. If you are a drug company, the null hypothesis is that your drug is worthless (compared to the current standard of treatment).
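To make the fall-guy setup concrete, here's a minimal sketch in Python of the drug-company version; the trial numbers are invented, and scipy's two-sample t-test stands in for whatever analysis a real trial would actually use.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical trial data: improvement scores under the current standard
# of treatment versus the new drug. The null hypothesis is "the drug is
# worthless," i.e., both groups have the same mean improvement.
standard = rng.normal(loc=5.0, scale=2.0, size=100)
drug = rng.normal(loc=5.8, scale=2.0, size=100)

# The test asks: do the data seem excessively implausible under the
# assumption that the null hypothesis is true?
t_stat, p_value = stats.ttest_ind(drug, standard)

if p_value < 0.05:
    print(f"p = {p_value:.3f}: the null goes down; the drug looks useful.")
else:
    print(f"p = {p_value:.3f}: the null is still standing.")
```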

On a structurally similar but less formal level, we’re all statisticians. We all compute our own personal statistics on a daily basis, using incredibly biased data derived from our muddled thoughts and our 5 senses. For instance, when I look myself over before I leave the house in the morning, I implicitly create an average rating of my appearance based on any number of specific observable data points. How’s my hair (when I have hair)? Are my teeth still crooked? Is my forehead doing that flaky, dry skin thing? Is my shirt suitably ironed? Does this stomach fat make it look like I have stomach fat?

All these things are weighted by importance and aggregated to create an overall “this is how I look” score. I then take this score and compare it to a null hypothesis about my appearance. My personal appearance null is usually “I look good” and I’m perfectly happy not knocking it down at all. Choosing a null for your personal appearance is a complicated process that incorporates your current state of mind and a whole slew of conscious and unconscious thoughts based on a lifetime’s worth of accumulated experience.

So my perception of my appearance is my data. I use it to create an overall “this is how I look” score, which is then compared to what I’d expect the score to be if I looked okay (my null). This process is very sensitive to outliers. So if I looked like this guy, except that I had a giant, bulbous blister on my forehead, that blister would likely dominate my “this is how I look” score despite being a small, atypical portion of an otherwise magnificent package, and I’d probably reject the null hypothesis that I look good.
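If you wanted to caricature that whole process in code, it might look something like the sketch below; every score, weight, and threshold is invented purely for illustration.

```python
# A caricature of the morning-mirror test: each observation gets a score
# (out of 10, higher is better) and an importance weight, and the weighted
# average is compared against the null hypothesis ("I look good").
observations = {
    # feature: (score, importance weight)
    "hair": (8, 2.0),
    "teeth": (6, 1.0),
    "forehead": (7, 1.5),
    "shirt": (9, 1.0),
    "stomach": (5, 1.5),
}

def appearance_score(obs):
    total_weight = sum(w for _, w in obs.values())
    return sum(s * w for s, w in obs.values()) / total_weight

I_LOOK_GOOD = 6.0  # the null: any score at or above this and I leave happy

score = appearance_score(observations)
print(score, score >= I_LOOK_GOOD)  # 7.0 True: fail to reject, out the door

# One heavily weighted outlier is all it takes: the giant, bulbous blister.
observations["blister"] = (0, 8.0)
score = appearance_score(observations)
print(score, score >= I_LOOK_GOOD)  # ~3.3 False: reject the null
```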

The results of my personal appearance test have ramifications. Whether I decide that I look okay or not will partially determine how I interact with the world. It will inform my interactions with other people. During these interactions, I will be subconsciously gathering and aggregating more data and performing more informal statistical tests: Is this person listening to me? Do they understand what I’m talking about, or am I completely incoherent? Was that joke I told actually funny, or was that laugh just a polite acknowledgment of attempted comedy? For each of these questions, I’d compare the person’s behavior (the data) with how I’d expect them to behave (the null) and draw conclusions accordingly.

If I feel especially dashing, then perhaps I’m slightly more apt to feel confident in my interactions with other people and this confidence can result in more positive ad hoc estimates of how well these interactions are going. These ad hoc estimates are data points that go into a running calculation of how well my day is going.

At the end of the day, I have all these internal data points describing the quality of my day. These too are then aggregated into a relative rating describing the overall quality of my day. If the average quality of my day seems especially good or bad (compared to my expectations), I might spend some time going through the data trying to figure out whether there was a specific event that made my day especially good or bad. Or, I’ll probably be asleep within moments of lying down.

I bet you do this too.

A lot of the time, this process is completely subconscious. One interesting thing you can do is try and figure out what your null hypotheses are and how they got to be that way. It can be enlightening to see how much of an effect things that happened ages ago can have in providing context for the decisions you make now.

Internships and Measurements

My first nonretail job was as a research assistant. I was finishing up my bachelor’s degree in mechanical engineering right about the same time that the financial sector was figuring out that subprime mortgage-backed securities might not be as sound as they first appeared. I felt fairly lucky to have scored an internship as a research assistant at an energy-focused nonprofit prior to my senior year. It paid $14/hr, which was more than I had ever made before.

I would primarily be working on projects whose aim was to quantify secondhand smoke exposure in various settings. The research was funded by an organization whose sole purpose was to spend 3% of Minnesota’s tobacco settlement fund on ameliorating the effects of tobacco smoke. My bosses were extremely busy people, and my role was essentially to fill in the gaps between their obligations and their time-constrained abilities. The work primarily took the form of equipment babysitting: offloading data, applying grease, and checking that dates and times were synchronized. Occasionally, my bosses’ requirements resulted in odd tasks, like estimating the outer surface area of an oddly shaped five-story building or testing building code interpretation software. I remember several instances of cooking and eating various delicacies in a sealed room while our equipment measured the particulates generated by my culinary skillz.

After I graduated, I kept the job, lost the intern title, and added field work to my job description. This posed a whole new set of challenges for me. The field work consisted of hauling around large, heavy, sturdy plastic cases full of seven or so different devices with a total retail value of around $18,000. We’d take the ominous, occasionally humming, intermittently beeping box to apartments chosen at random (from a pool of agreeable participants) and leave it there for a week. One thing I learned about apartments: all apartments built after 1980 are essentially the same. They might have different layouts, but they all have off-white carpet and smell like your neighbor’s dinner every night around 6:30. You pay for location, not quality. I also learned that conspiracy theorists love a captive audience and that other people’s lives are occasionally fairly depressing: I got to do an install and removal here, though when I went, the tenant I was visiting only mentioned the bedbug infestation. Incidentally, bedbugs can live for a year and a half without eating, and one way to rid a big box of fancy equipment of bedbugs is to leave it in an unheated garage while the daily highs drop below zero for a week.

The participants in this study were nonsmokers who had reported smelling smoke in their apartments on a survey that my organization had distributed. The purpose of the box, which was left in place for a week, was to collect a sort of indoor air quality fingerprint for that person’s apartment. The idea was that we could then use the data to estimate nonsmokers’ levels of secondhand smoke exposure. The boxes collected data on temperature, humidity, and CO2 concentration. We collected three different sets of airborne particulate concentration data and week-averaged polycyclic aromatic hydrocarbon concentration. Participants were also required to fill out a log of daily activities. We measured the hell out of everything we could.

Measuring things is easy. Measuring things in an accurate and consistent manner gets really difficult really quickly. Throughout my short career as a research assistant studying secondhand smoke, and also during every lab of my undergraduate experience (in Biostatistics you’re typically just given the data, and the trick is to reformat it so that it can be appropriately fit to a statistical model), the trickiness of proper data collection was constantly in the back of my mind. There is always some distinction between what you want to measure and what you’re actually measuring. For instance, when you take real-time measurements of airborne particulate concentration in a restaurant, you are probably actually measuring varying levels of voltage induced in a photodiode by the concentration-dependent scattering of a laser as it’s pointed through a sample of air from that restaurant. You’re not even necessarily measuring tobacco smoke, because there are other constituents of restaurant air that scatter laser light identically to tobacco smoke, like ambient air pollution from outside and byproducts of cooking. In order to estimate the actual concentration of pollutants due to tobacco smoke, you need to estimate the proportion of your laser scatter that is due solely to tobacco smoke. Estimates of this proportion are available in the literature, and this method seems to be acceptable for publication purposes.
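Stringing that chain of estimation together might look roughly like the sketch below; every constant in it is invented for illustration, since real calibration factors and literature proportions vary by device, aerosol, and setting.

```python
def estimated_smoke_concentration(voltage, calibration_factor, tobacco_proportion):
    """Back out an estimate of tobacco-smoke particulate concentration
    from what the device actually measures: photodiode voltage.

    voltage: raw photodiode reading (V), driven by laser scatter
    calibration_factor: maps voltage to total particulate mass (ug/m^3 per V)
    tobacco_proportion: literature-based fraction of the scatter
        attributable to tobacco smoke, between 0 and 1
    """
    total_particulates = voltage * calibration_factor  # everything that scatters
    return total_particulates * tobacco_proportion     # the part we care about

# Hypothetical numbers, purely for illustration:
estimate = estimated_smoke_concentration(
    voltage=0.042,              # what you're actually measuring
    calibration_factor=1000.0,  # device-specific, ug/m^3 per volt
    tobacco_proportion=0.7,     # setting-specific, from the literature
)
print(f"Estimated secondhand smoke concentration: {estimate:.1f} ug/m^3")
```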

Even if your measurement scheme is all nailed down, there can still be complications that induce bias. The measurement mechanism outlined above is essentially the mechanism utilized by a device called a Sidepak. For a device to work as it is supposed to, it must be deployed correctly. You must know how your equipment can fail to measure what you think it’s measuring. One example: if you want to properly estimate the overall concentration of secondhand smoke particulates in a restaurant using a Sidepak, you shouldn’t sit near anyone who is smoking, because you’ll then be measuring the concentration of secondhand smoke particulates around that person and not the whole room, and your resulting data will be biased towards higher concentrations.
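A toy simulation makes the bias obvious; the room geometry and the decay curve below are made up, but the conclusion doesn't depend on them.

```python
import math
import random

random.seed(0)

# Toy model: one smoker at the center of a 10 m x 10 m room, with
# particulate concentration falling off with distance from the smoker.
def concentration(x, y, smoker=(5.0, 5.0), background=20.0, peak=300.0):
    distance = math.hypot(x - smoker[0], y - smoker[1])
    return background + peak * math.exp(-distance)  # invented decay curve

# Unbiased protocol: sample uniformly over the whole room.
room = [concentration(random.uniform(0, 10), random.uniform(0, 10))
        for _ in range(10_000)]

# Biased protocol: park the device right next to the smoker.
nearby = [concentration(random.uniform(4, 6), random.uniform(4, 6))
          for _ in range(10_000)]

print(f"room-wide average:  {sum(room) / len(room):.0f} ug/m^3")
print(f"next to the smoker: {sum(nearby) / len(nearby):.0f} ug/m^3 (inflated)")
```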

The process of deriving meaningful information via measurement requires at least three things. First, you must know what you want to know – you must have your goal clearly defined. You have to be able to determine what data is sufficient to answer your question of interest. If your goal is misspecified here, you’ll likely end up answering a question you didn’t intend to, and that question will likely be a completely uninteresting one. Second, you must know how to measure what you need to measure to be able to estimate what you want to know. If you can’t actually relate your desired estimate to collectable data, then you shouldn’t collect data. Maybe you could simulate your data instead. Third, you must measure in a way that ensures the least amount of measurement error possible. For instance, if your goal is to estimate the number of people who are committing vote fraud and your data collection method is to seek the expertise of knowledgeable people (not necessarily my first choice, but probably not a worthless exercise if done right), don’t rely solely on politicians for your information.

This last example, while kind of a joke, shows that mistakes in measurement aren’t solely problems in the sciences. Indeed, the human brain is very susceptible to failing all three criteria listed above, resulting in erroneous beliefs based on biased measures of irrelevant information. For more concrete examples, see any PAC-funded political ad.