Digging Into Data
At DataFest ’24, teams of students spent a weekend sifting through information to ask questions and glean insights
A buzz of productivity filled a classroom in the F.W. Olin Science Center on an early spring Friday afternoon as teams of students dug into a large, real-world data set on a quest to find useful insights and figure out how to best share those insights with others.
And they had to race the clock to finish by the end of the weekend.
The 40 or so students were participating in the American Statistical Association’s DataFest, an annual international undergraduate event that turns the often solitary pursuit of data analysis into a fast-paced team activity.
Associate Professor of Statistics Jerzy Wieczorek said that DataFest brings together the data science community and provides students with a hands-on, practical experience that differs in important ways from what they learn in the classroom. A three-judge panel awarded prizes to teams in categories including best insight, best modeling, best visualization, and best presentation. Faculty and staff mentors were available throughout the weekend to help students brainstorm and polish their ideas.
“We think it’s such a great opportunity for students to see a real, large, messy data set. The kind of data that isn’t a simple textbook exercise,” he said. “Here, they don’t know what’s coming or what tools they’re going to need. They don’t even have a specific question assigned to them. It’s very open-ended, which is the way real data analysis is.”
This year’s data set was gleaned from an online statistics textbook that logs student activity and interaction. The DataFest teams explored the data and focused on questions about the statistics the textbook’s readers studied, the problems they tackled, and the challenges and struggles they experienced with the material, Wieczorek said.
“It’s really rich data and a really nice variety of data,” he said. “Are there ways that things seen in the data can inform how this course should be taught, or how the textbook could be redesigned, or how students could rethink their study strategies? It’s open-ended—and pretty exciting.”
That resonated with students like Maddie Zullow ’24, an economics and statistics double major who took a statistical graphics course this spring with Wieczorek and was looking forward to making data visualizations with the information shared that weekend. “In some ways, this is probably more effective than showing what you’ve learned in a 50-minute exam where you’re copying a lot of code that you’ve already had,” she said.
Jack Nguyen ’24, an economics and math double major who was part of the team that won an award for best modeling, agreed. “I think it’s a pretty hands-on experience, to apply what I’ve learned in class,” he said. “It is very exciting.”
Students worked through the weekend to sift through the data, figure out interesting questions to ask, make graphs and models, and find insights. On Sunday, they finished their presentations and submitted a short slideshow showing what they had discovered. One team found that students who took study breaks tended to perform better on end-of-chapter quizzes than those who rushed through the material without pausing.
DataFest started more than a decade ago at UCLA when a group of students spent 48 hours analyzing five years of arrest records from the Los Angeles Police Department. It’s grown since, with more than 2,000 students taking part each year from colleges and universities around the world. Colby first participated in 2021, when the data set covered prescription drug abuse across countries, years, and demographics.
At that time, alumni who volunteered as judges and mentors for DataFest were so impressed by the skills of a couple of participating students that they ultimately offered them jobs, Wieczorek said.
The College hasn’t hosted DataFest since then, the professor said, adding that he was pleased to bring it back. The timing made sense—Colby now offers a data science major and has a full slate of five professors in the Statistics Department, marking the College’s commitment to the field.
“We want students who are excited about data to not be siloed and sitting in their own dorm rooms, doing their own homework assignments,” the professor said. “We’d love them to see how much of an engaging interpersonal interaction component there is to doing this and to see how many other people on campus are also excited about it. So hopefully, we can continue growing interest and enthusiasm about this.”