Introduction to Data Science

  9-12 graders

  Credits awarded on transcript  

  Algebra I completed with B- or better

  UC A-G approved for [C] Mathematics credits

  90 minutes per class

  8-10 students per class

  Twice per week over 36 weeks

  1149 per student, per semester  

  Self paced instructor-guided  

  Online community

  Office hours on-demand

  795 per student, per semester  

Data is everywhere around us. We generate more data every 40 minutes than all of the data generated since the dawn of civilization until 2003. The ability to work with data, understand what it tells us, and use it to communicate has become an essential life and career skill.

90%

of all the world's data was created in just the last two years

1000x

computing power in a smartphone vs. a 1970s mainframe computer

11%

of all U.S. high school students complete any statistics coursework

Decisions that used to be straightforward are increasingly complex and driven by data. Individuals across all disciplines constantly need to separate fact from fiction. The need to analyze and interpret data has permeated every discipline: engineering, business, finance, social sciences, humanities, and even journalism. Several leading academics now agree that the mathematics we teach in high school is rooted in the 1950s space race and needs to be updated to reflect the realities of today's digital and information age.

2Sigma School takes an interactive approach to data exploration rather than a lecture-based approach. Our classes are hands-on and use several tools that are used by leading data scientists as well as universities, as illustrated by the following video clip of a live session in a small cohort.

To make the most of our time together during the live sessions, we use a flipped classroom model that includes pre-work for every class. This allows students to program with the support of an instructor during class. The pre-work includes pre-recorded videos, online reading, and some programming practice.

  University of California A-G approved for [C] Mathematics credits.

Course Outline

  1. Data Tells a Story
  2. The Data of Our Community: Learning from Data Distributions
  3. Water In Your Life: Bivariate Data and Causality vs. Correlation
  4. Shuffling Songs: Probabilistic Modeling
  5. Skin Tones and Representation: Categorical Data and Linear Algebra
  6. What’s the Best Place for Me? Modeling with Data and Understanding Bias
  7. Predicting My Preferences: Introduction to Machine Learning
  8. Being a Data Scientist

This is a high-school level course that introduces students to the exciting opportunities available at the intersection of data analysis, computing, and mathematics. In this course students will learn to understand, ask questions of, and represent data through project-based units. The units will give students opportunities to be data explorers through active engagement, developing their understanding of data analysis, sampling, correlation/causation, bias and uncertainty, modeling with data, making and evaluating data-based arguments, and the importance of data in society. At the end of the course, students will have a portfolio of their data science work to showcase their newly developed knowledge and understanding.

This is a beginner course, and no prior programming experience is required. During the first half of the course we cover key programming concepts, including variables, data types, comparisons and boolean operators, functions, control structures, and iteration. We will be using industry-standard tools like Jupyter Notebooks, Python, and Data Commons. Students will get the chance to explore data sets in areas that they are familiar with. The course ends with a capstone project where students get to apply what they have learned and round out their portfolio of data science work to showcase their newly developed abilities.
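As a rough illustration of how the programming concepts listed above fit together, here is a minimal Python sketch. It is not taken from the course materials: the temperature values, the `threshold` variable, and the `count_warm_days` function are hypothetical names and data chosen only for this example.

```python
# Hypothetical example data (not from the course): daily highs in °F for one week.
temperatures = [68, 71, 75, 80, 77, 64, 59]  # a variable holding a list (a data type)
threshold = 70                               # a variable holding an integer


def count_warm_days(values, cutoff):
    """Return how many values are strictly above the cutoff."""
    count = 0
    for value in values:       # iteration: visit each value in turn
        if value > cutoff:     # control structure using a comparison
            count += 1
    return count


warm_days = count_warm_days(temperatures, threshold)

# Boolean operators combine two comparisons into a single True/False answer.
mostly_warm = warm_days > len(temperatures) / 2 and min(temperatures) > 50

print("Warm days:", warm_days)
print("Mostly warm week?", mostly_warm)
```

Each idea appears once: a list and an integer stored in variables, a function that loops over the list, an `if` test built on a comparison, and an `and` expression that joins two conditions.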

Some key differences between a traditional statistics course and the data science course include:

  • Larger data sets (Big Data) that can only be analyzed programmatically vs. small, tailored data sets.
  • Use of modern statistical analysis and simulation tools vs. a formula-based approach.
  • Use of Python programming for data analysis vs. pen-and-paper computations.
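To make the contrast between simulation and a formula-based approach concrete, the following sketch estimates the chance that two fair dice sum to 7, once by counting outcomes exactly and once by simulating many rolls. The dice problem, the function name `simulate_sum_of_seven`, and the trial count are illustrative assumptions, not part of the course outline.

```python
import random


def simulate_sum_of_seven(trials, seed=0):
    """Estimate P(two fair dice sum to 7) by rolling them `trials` times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        if roll == 7:
            hits += 1
    return hits / trials


# Formula-based answer: 6 of the 36 equally likely outcomes sum to 7.
exact = 6 / 36

# Simulation-based answer: an estimate that gets closer as trials increase.
estimate = simulate_sum_of_seven(20_000)

print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

The simulated value will not match the exact one digit for digit, but it should land close to it, and the same approach still works for questions where no simple formula is available.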

Our technology requirements are similar to those of most online classes.

  • A desktop or laptop computer running Windows (PC), Mac OS (Mac), or Chrome OS (Chromebook).
  • The ability to run the Zoom client.
  • A working microphone, speaker, and webcam.
  • A high-speed internet connection with at least 10 Mbps download speed (check your Internet speed).

Students must have a quiet place to study and participate in the class for the duration of the class. Some students may prefer a headset to isolate any background noise and help them focus in class.

Most course lectures and content may be viewed on mobile devices, but programming assignments and certain quizzes require a desktop or laptop computer.

This course includes several timed tests where you will be asked to complete a given number of questions within a 1-3 hour time limit. These tests are designed to keep you competitively prepared, and you can take them as often as you like. We do not proctor these exams, nor do we require you to install a special lockdown browser.

In today's environment, when students have access to multiple devices, most attempts to avoid cheating in online exams are symbolic. Our exams are meant to encourage you to learn and push yourself using an honor system.

We do assign a grade at the end of the year based on a number of criteria, including class participation, completion of assignments, and performance on the tests. We do not reveal the exact formula, to minimize students' incentive to optimize for a higher grade.

We believe that your grade in the course should reflect how well you have learned the skills. A couple of timed tests, while traditional, aren't the best way to evaluate your learning.
