“Pulling an All-Nighter”

Whether you stocked up on coffee and snacks and holed up in the library to finish the programming project due tomorrow, or stayed up all night playing video games until your eyes could only squint, you are probably familiar with how the day after a sleepless night feels. Unfortunately, studies suggest that sleep deprivation is cognitively comparable to being drunk, with impairments to memory, reasoning, reaction time, and decision making. I think we can all agree that nobody wants critical software written by a drunk programmer. But is it really that bad? Given the trope that programmers are up all night shouting into their gaming headsets or hacking the CIA, how bad could it be to lose a little sleep? Answer: bad. It’s really bad.

This lesson walks through just how bad it is to write code while sleep deprived, and teaches you how to interpret the experimental design and results of an academic paper. Hopefully by the end you’ll understand statistical concepts such as experimental design, hypothesis testing, mean, variance, standard deviation, the normal distribution, Type I and Type II errors, the Bonferroni correction, correlation, Cliff’s delta (an effect size), and the Kruskal-Wallis test. And hopefully you’ll realize how abysmal your code will be if you trade those extra hours of sleep for 3am pizza.

# There is actually A LOT to unpack here. All the mean/variance/SD/normal distribution material should be covered before this lesson, including basic comparisons between groups; perhaps in the holiday lesson.
# It gets a little messy once we reach nonparametric tests, effect size, and particularly the Bonferroni correction. Not sure how deeply I'd like to go into that, but basically I'm going to say: "when you start fishing around in your data, you're bound to find something significant, and that's bad science. So we use this special correction to make it *less* likely that we find significance, so that we are conservative and therefore more confident in our results."
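To make the "fishing" point concrete, here is a minimal sketch in base R. It uses simulated data rather than anything from the paper; the number of tests (20), the group size (15), and all variable names are assumptions chosen for illustration. Every comparison is between two samples drawn from the same distribution, so any "significant" result is a Type I error.

```r
# Simulated illustration only: nothing here comes from the study's data.
set.seed(42)

n_tests <- 20    # hypothetical number of comparisons we "fish" through
alpha   <- 0.05

# Each test compares two groups drawn from the SAME distribution,
# so the null hypothesis is true every single time.
p_raw <- replicate(n_tests, {
  group_a <- rnorm(15)
  group_b <- rnorm(15)
  wilcox.test(group_a, group_b)$p.value
})

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Chance of at least one false positive across 20 independent tests.
familywise_error <- 1 - (1 - alpha)^n_tests

cat("Uncorrected 'discoveries':", sum(p_raw < alpha), "\n")
cat("After Bonferroni:", sum(p_bonf < alpha), "\n")
cat("Family-wise error rate without correction:", round(familywise_error, 2), "\n")
```

With 20 independent tests at the 0.05 level, the chance of at least one false positive is about 64%, which is why the corrected p-values are the ones to trust when many comparisons are made.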

Research Question

This lesson discusses and follows the methods of “Need for Sleep: The Impact of a Night of Sleep Deprivation on Novice Developers’ Performance” (Fucci et al. 2018). This gives you practice reading academic research while trying out the authors’ analysis methods in R.

Given that we already know sleep deprivation affects cognitive functioning, asking whether it affects programming is not the most interesting question. The more interesting question is: how much? The authors phrase their first research question as follows:

“To what extent does sleep deprivation impact developers’ performance?”

Immediately, you can tell that we are not looking for a “Yes” or “No” answer, but for some measure of effect size, along with some operationalization of what “performance” really is. Remember that every measure must be carefully defined, and that no measure will perfectly capture a larger concept like “performance”, “creativity”, or “skill”. Before we describe the methods in the paper, try to think of how you might measure developer performance. What would you have the participants do? How would you determine success or failure? And how would you check how much sleep deprivation affected that measure?
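A "how much" question calls for an effect size rather than a yes/no answer. As a minimal sketch, Cliff's delta (one of the effect sizes named in the introduction) can be computed by hand in base R. The scores below are made up for illustration, and the function name `cliffs_delta` is our own, not something taken from the paper's scripts.

```r
# Cliff's delta: P(x > y) - P(x < y) over all pairs (x from one group, y from the other).
# It ranges from -1 (every x below every y) to +1 (every x above every y);
# 0 means the two groups overlap completely.
cliffs_delta <- function(x, y) {
  diffs <- outer(x, y, "-")   # all pairwise differences x_i - y_j
  (sum(diffs > 0) - sum(diffs < 0)) / (length(x) * length(y))
}

# Hypothetical performance scores (NOT the study's data)
rested   <- c(70, 65, 80, 75, 60)
deprived <- c(50, 55, 65, 45, 60)

delta <- cliffs_delta(rested, deprived)
print(delta)   # 0.84
```

Of the 25 possible pairs here, the rested score is higher in 22 and lower in 1, giving (22 - 1) / 25 = 0.84. Because it only counts which value in each pair is larger, Cliff's delta does not assume the scores are normally distributed.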

Who are the participants?

Let’s start by getting to know our participants, particularly how they differ between the two conditions (Sleep Deprivation vs. Regular Sleep).


# Return the rows of `data` whose column `nameCol` equals `val`
subTable <- function(data, nameCol, val){
  return (data[which(data[[nameCol]] == val),])
}
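To see what `subTable` does before pointing it at the real spreadsheet, here is a small sketch on a made-up data frame. The rows are invented; only the column names `METHOD` and `PVT-remove` and their values mirror the ones used in this lesson.

```r
# Equivalent to the helper above, repeated so this chunk runs on its own.
subTable <- function(data, nameCol, val) {
  data[which(data[[nameCol]] == val), ]
}

# Made-up rows for illustration (NOT the study's data)
toy <- data.frame(
  ID     = 1:6,
  METHOD = c("SD", "RS", "SD", "RS", "SD", "RS"),
  stringsAsFactors = FALSE
)
# The hyphen makes this a non-syntactic name, so it is added with [[ ]]
toy[["PVT-remove"]] <- c("NO", "NO", "YES", "NO", "NO", "NO")

sd_rows    <- subTable(toy, "METHOD", "SD")           # sleep-deprived rows
sd_cleaned <- subTable(sd_rows, "PVT-remove", "NO")   # drop flagged participants

print(sd_rows$ID)      # 1 3 5
print(sd_cleaned$ID)   # 1 5
```

Wrapping the comparison in `which()` means rows with a missing value in the filter column are silently dropped rather than returned as rows of `NA`.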


library(readxl)   # read_xlsx
library(ggplot2)  # plots below

# Load the experiment data, including the sleep-deprived set with some participants removed.
# Note: `<<-` assigns each table in the global environment so that later code can use it.
loadAllData <- function(fileName = "../data/sleepDepr/piglatin.xlsx"){
  Exp <<- read_xlsx(fileName)
  Exp_Cleaned <<- subTable(Exp, "PVT-remove", "NO")
  SlD <<- subTable(Exp, "METHOD", "SD")
  NOSD <<- subTable(Exp, "METHOD", "RS")
  SD_Cleaned <<- subTable(SlD, "PVT-remove", "NO")
  NOSD_Cleaned <<- subTable(NOSD, "PVT-remove", "NO")
}

loadAllData()

# this is the post-questionnaire
post <- read_xlsx("../data/sleepDepr/post-questionnaire.xlsx")

# merge everything so we can work with one data frame;
# the cleaned SD participants are appended as a third group, "SD Cleaned"
data <- merge(Exp, post, by = "ID")
SD_Cleaned$METHOD <- "SD Cleaned"
SD_Cleaned <- merge(SD_Cleaned, post, by = "ID")
data <- rbind(data, SD_Cleaned)

# who is in what condition?
plt <- ggplot(data, aes(METHOD, fill = METHOD)) +
  geom_bar() +  # assumed layer: the original geom is missing here
  ggtitle("Distribution of Participants to Conditions")
plt

# 22 in RS, 22 in SD, 14 in the cleaned set (differs from the paper, but only by 1)
Var1        Freq
RS            22
SD            22
SD Cleaned    14
# Density plot of the age distribution. Age is probably significantly different between those who could forgo an entire night of sleep and those who could not. That is expected, and could not ethically be controlled anyway. The authors ran tests to make sure age did not interact with the software quality measures.
plt <- ggplot(data, aes(Age, fill = METHOD)) +
  geom_density(alpha = .5) +  # overlaid semi-transparent curves; `bins` is not a geom_density argument
  ggtitle("Age by Condition")
plt
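A density plot can suggest a difference, but a test is needed to back it up. The following is a minimal sketch, assuming an `Age` column and a `METHOD` column like the ones used above. The ages are invented, and the choice of the Kruskal-Wallis test is ours, picked because it does not assume normally distributed ages; it is not necessarily the check the authors ran.

```r
# Invented ages for illustration (NOT the study's data)
toy <- data.frame(
  METHOD = rep(c("RS", "SD"), each = 6),
  Age    = c(24, 26, 25, 27, 29, 28,    # regular sleep
             21, 22, 20, 23, 22, 21)    # sleep deprived
)

# Summary per condition: median and standard deviation of age
age_summary <- aggregate(Age ~ METHOD, data = toy,
                         FUN = function(a) c(median = median(a), sd = sd(a)))
print(age_summary)

# Kruskal-Wallis rank-sum test: do the age distributions differ between groups?
kw <- kruskal.test(Age ~ METHOD, data = toy)
print(kw$p.value)
```

A small p-value here would only say that the groups differ in age. Whether that difference matters for the outcome is a separate question, which is why checking for an effect of age on the quality measures is still needed.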

# Survey answers are stored as text: convert to numeric and note which rows fail to parse
oopCol <- "During your education, how many years of experience did you have with the Object Oriented Paradigm?"
data[[oopCol]] <- as.numeric(as.character(data[[oopCol]]))
nas <- which(is.na(data[[oopCol]]))
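The double conversion `as.numeric(as.character(...))` matters when a column has been read as a factor: calling `as.numeric` directly on a factor returns its internal level codes, not the values printed on screen. A minimal sketch with made-up answers (the vector `answers` is hypothetical, not taken from the questionnaire):

```r
# Made-up questionnaire answers, stored as a factor
answers <- factor(c("3", "10", "2", "none", "10"))

# Wrong: returns the factor's internal level codes, not the years that were typed
codes <- as.numeric(answers)

# Right: go through character first; non-numeric text becomes NA (normally with a warning)
years <- suppressWarnings(as.numeric(as.character(answers)))

# Rows that could not be parsed, worth inspecting by hand
unparsed <- which(is.na(years))

print(codes)      # 3 1 2 4 1
print(years)      # 3 10 2 NA 10
print(unparsed)   # 4
```

The level codes come from sorting the distinct answers as text ("10", "2", "3", "none"), which is why "10" turns into 1. Keeping the indices of unparsed rows makes it easy to decide whether to drop or recode them.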

# did participants in the different groups have different amounts of experience?
plt <- ggplot(data, aes(METHOD, `During your education, how many years of experience did you have with the Object Oriented Paradigm?`, fill = METHOD)) +
  geom_boxplot() +  # assumed layer: the original geom is missing here
  ylab("Years experience with OOP (educational)") +
  ggtitle("Educational OOP Experience by Condition")
plt

plt <- ggplot(data, aes(METHOD, as.numeric(as.character(`During your education, how many years of experience did you have with programming?`)), fill = METHOD)) +
  geom_boxplot() +  # assumed layer: the original geom is missing here
  ggtitle("Educational Programming Experience by Condition") +
  ylab("Years experience with programming (educational)")
plt