Data is everywhere and is growing at a dizzying speed. The World Economic Forum predicts that our entire digital universe will be 44 zettabytes by 2020. (What’s a zettabyte? It’s 1000^7 bytes.) If that’s not eye-popping enough, here’s how the WEF breaks that down on a daily basis:
- 500 million tweets are sent
- 294 billion emails are sent
- 4 petabytes (1 petabyte = 1000^5 bytes) of data are created on Facebook
- 4 terabytes of data are created from each connected car
- 65 billion messages are sent on WhatsApp
- 5 billion searches are made
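To get a feel for units this large, here is a minimal Python sketch of the arithmetic. It assumes decimal (SI) units, where each step up is a factor of 1,000; the names `UNITS` and `bytes_in` are made up for illustration, and the only figures used are the 44 zettabytes and 4 petabytes quoted above.

```python
# Decimal (SI) byte units: each step up the list is a factor of 1000.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one decimal unit, e.g. 'petabyte'."""
    power = UNITS.index(unit) + 1  # a kilobyte is 1000**1
    return 1000 ** power


zettabyte = bytes_in("zettabyte")  # 1000**7, a 1 followed by 21 zeros
petabyte = bytes_in("petabyte")    # 1000**5, a 1 followed by 15 zeros

# How many days of 4 petabytes per day would it take to reach 44 zettabytes?
days = (44 * zettabyte) // (4 * petabyte)
print(f"{days:,} days")
```

By this arithmetic, 4 petabytes a day would take 11 million days, or roughly 30,000 years, to add up to 44 zettabytes.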
Data affects our everyday lives in ways we often take for granted. Google and Facebook analyze the content of our emails, searches, and posts, and then use that data to target ads relevant to our interests. Amazon and Netflix track our online behavior, compare it to the behavior of other users, and recommend products and movies that suit our tastes, often with uncanny accuracy. Some colleges and universities track prospective students’ online interactions with the school to gauge their level of interest, and use geotracking via Instagram to tell whether a student is visiting campus. Once students are enrolled, some schools turn to big data to determine whether a student is struggling academically. Underlying all of this, and more, is data science.
What is Data Science?
According to Tech Terms, data science is the study of data. It involves developing methods of recording, storing, and analyzing data to effectively extract useful information. The goal of data science is to gain insights and knowledge from any type of data — both structured and unstructured. Data science is related to computer science, but is a separate field. Computer science involves creating programs and algorithms to record and process data, while data science covers any type of data analysis, which may or may not use computers.
So how does data science differ from statistics? Yale’s Department of Statistics and Data Science helps clarify:
Statistics is the science and art of prediction and explanation. The mathematical foundation of statistics lies in the theory of probability, which is applied to problems of making inferences and decisions under uncertainty… Data Science expands on Statistics to encompass the entire lifecycle of data, from its specification, gathering and cleaning, through its management and analysis, to its use in making decisions and setting policy. It is a natural outgrowth of Statistics that incorporates advances in Machine Learning, Data Mining and High-Performance Computing along with domain expertise in the Social Sciences, Natural Sciences, Engineering, Management, Medicine and Digital Humanities.
Five years ago, a bachelor’s degree in data science was nearly nonexistent. Today, according to Discover Data Science, more than 50 colleges and universities offer a data science major. These majors typically include courses in computer science, mathematics, and statistics. No surprise: the growing career opportunities for students with a background in data science are fueling the surging interest. In fact, the Harvard Business Review labeled data scientist the “sexiest job of the 21st century.”
Is It All About Tech?
New interdisciplinary fields designed to merge data science with disciplines across the liberal arts are springing up on college campuses. Among our favorites: Dartmouth, Northwestern, Emory, Carnegie Mellon, and Wesleyan. Called Quantitative Social Sciences, Mathematical Methods in Social Sciences, Quantitative Analysis, or something similar, these interdisciplinary approaches enable you to learn quantitative theory and methods that apply directly to your academic and career interests, whatever they happen to be. Yes, you could always pair a major in applied math with one in a social science or humanities field, but we think these new programs offer you exciting opportunities to explore your own questions, whether that’s the frequency of hat tricks in the NHL, the temporal effects and evidence of racial prejudice in traffic stops in Illinois, or whether first names can signal political party affiliation. These are just a few examples of the kinds of questions Dartmouth students tackled as part of their Quantitative Social Sciences major.
Want to use statistics to explore your passion? Two Wesleyan University professors created Passion-Driven Statistics, a course described as statistics in the service of your own research – in the service of your passion. It is a multidisciplinary, project-based curriculum that supports students in conducting original research, asking original questions, and communicating methods and results using the language of statistics. The program includes 15 hours of video lessons and supporting materials. It’s adapted for virtual learning on edX and is offered free of charge to learners worldwide.
Ready to Tackle Real-World Problems?
Inspired? Use your passion and curiosity now to tackle the MathWorks Math Modeling competition. Last year’s problem focused on substance use and abuse. Student teams were challenged to create a mathematical model that predicts the spread of nicotine use due to vaping over the next 10 years and compares vaping to cigarette use. They were then asked to build a second model that simulates the likelihood that a given individual will use a given substance, taking into account social influence and characteristic traits as well as characteristics of the drug itself, and to predict how many high school seniors will use these substances.
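Curious what a mathematical model of "spread" can look like? Here is a minimal Python sketch of logistic growth, a common starting point in which adoption grows quickly at first and then levels off as it nears a ceiling. It is only an illustration, not the competition's method or any team's solution, and the starting share, growth rate, and ceiling are made-up placeholder numbers rather than real data.

```python
def logistic_step(share: float, rate: float, ceiling: float) -> float:
    """Advance the model one year; growth slows as share nears the ceiling."""
    return share + rate * share * (1 - share / ceiling)


def project(share0: float, rate: float, ceiling: float, years: int) -> list[float]:
    """Return the projected share for year 0 through the final year."""
    shares = [share0]
    for _ in range(years):
        shares.append(logistic_step(shares[-1], rate, ceiling))
    return shares


# Placeholder inputs chosen only for illustration, NOT real data:
# 5% of the group today, a growth rate of 0.5 per year, a 30% ceiling.
projection = project(share0=0.05, rate=0.5, ceiling=0.30, years=10)

for year, share in enumerate(projection):
    print(f"year {year:2d}: {share:.1%}")
```

A real entry would estimate those inputs from actual survey data and test how sensitive the forecast is to each of them.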
You could also grab a few friends and tackle the Modeling the Future Challenge. This year’s theme is Agriculture, Water, and Climate Change. Your team will be tasked with analyzing historical data and building your own models to project how changes to climate and water access could affect the agricultural industry in a region of the country you select. Then you will make recommendations on how the industry and government can respond to help mitigate and manage the potential risks.
What’s your passion? You can use data in all sorts of amazing ways to change the world!