The Importance of Data Science

    April 2024

    As the volume of data grows and more decisions are based on it, students must develop a deeper understanding of the methods and ethics associated with collecting, analyzing, visualizing, and communicating data. The growing trend of offering data science courses in high school and integrating elements of data science throughout the PK-12 curriculum is commendable. Incorporating data science skills into teaching can help students see the relevance and utility of the mathematics they are learning.

    Too often, students view mathematics as that dreaded subject in which they need to memorize a series of unrelated procedures. However, by building data science into the math curriculum and integrating more datasets relevant to students’ lived experiences, we can transform this perception and inspire more interest in the subject as a whole. It also presents an opportunity for us as educators to innovate for our students, especially now when they are counting on us to do so.

    Data science is generally recognized as the field of study that integrates statistics, computer science, and domain knowledge to make sense of data in context. Many people associate data strictly with numbers; data science broadens this understanding by recognizing that data can also include text, images, sound, and video. Data science draws on a wide range of data representations to help make informed decisions. The technology available to students now allows them to work with larger data sets that have multiple categorical and numerical variables.

    When we think of statistics in the PK-12 setting, we often think of measures of central tendency, which we ask students to calculate with a small data set because we recognize it’s impractical to calculate the mean of a set of 100 numbers manually. By using technology to generate those measures, we allow students to analyze the results to make predictions. Data science also involves working with “messy” multivariate data sets in which some data may be missing. This creates an opportunity to teach students some practical skills, such as how to ethically and effectively clean and preprocess data to ensure robust analyses and accurate conclusions. It allows them to explore more complex mathematical and statistical ideas such as modeling with multiple variables or assessing statistical significance.
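
    As a small illustration of the ideas above, the sketch below lets technology compute the measures of central tendency for a "messy" data set. The survey records, the use of None to mark a missing response, and the summarize helper are all hypothetical, invented only for this example. Dropping incomplete records is just one possible cleaning choice, and students should be asked to weigh whether it is appropriate for the question being asked.

```python
from statistics import mean, median, mode

# Hypothetical survey records: (grade level, hours of sleep).
# None marks a missing response; these values are invented for illustration.
records = [
    (9, 7.5), (9, None), (10, 6.0), (10, 8.0),
    (11, 6.0), (11, 7.0), (12, None), (12, 6.5),
]

def summarize(records):
    """Drop records with a missing measurement, then report central tendency."""
    values = [hours for _, hours in records if hours is not None]
    return {
        "n_total": len(records),                   # records collected
        "n_missing": len(records) - len(values),   # records dropped as incomplete
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }

summary = summarize(records)
print(summary)
```

    Reporting how many records were dropped alongside the summary keeps the cleaning decision visible, which gives students something concrete to discuss when they judge how far the results can be trusted.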

    Because data science is a rapidly evolving field, some educators are uneasy about offering these courses at the high school level. As a mathematics education community, we must work together to assuage those fears by clearly defining what data science at the high school level should look like. I suspect that if today you ask ten mathematics educators to describe data science, you are likely to get ten different answers! Agreeing on common descriptions and course content will lead to a shared recognition of the deep mathematical concepts involved. Data science is not a course set aside for those who have often been inaccurately deemed incapable of understanding mathematics. Instead, a clearly defined data science course can offer a meaningful and challenging option for all learners to engage deeply with mathematical thinking and problem solving in real-world contexts.

    Regardless of whether data science concepts are incorporated into the PK-12 curriculum, as mathematics educators, we must continue to develop deeper data literacy skills with our students. Data literacy is a critical global citizenship skill that requires a strong conceptual understanding of statistics. Building it requires learning opportunities for every student throughout the year, not only at the end of the school year. When students are learning these concepts, educators should make sure that the statistical question they are posing is prominent. Students must be able to recognize that the question they are answering and the data they are analyzing address an authentic domain problem.

    Making sense of our world requires that our students be able to use data to make informed decisions and predictions. As a mathematics education community, we have an obligation to continue having conversations about the importance of integrating data science concepts in PK-12 and offering high school data science courses. Consistently imbuing our students with these concepts will help them see the relevance of the mathematics they are learning, fostering a sense of engagement and proficiency.

    Kevin Dykema
    NCTM President
    @kdykema