Smith College Botanic Garden
Course Catalog 2025-2026

Statistical and Data Sciences

SDS 100 Laboratory: Reproducible Scientific Computing with Data (1 Credit)

The practice of data science rests upon computing environments that foster responsible uses of data and reproducible scientific inquiries. This course develops students’ ability to engage in data science work using modern workflows, open-source tools and ethical practices. Students learn how to author a scientific report written in a lightweight markup language (e.g., markdown) that includes code (e.g., R), data, graphics, text and other media. Students also learn to reason about ethical practices in data science. S/U only. Students who have completed SDS 100 in a previous semester need not repeat it. Prerequisite: Concurrent registration in any of SDS 192, SDS 210, SDS 290 or SDS 291. Enrollment limited to 30.

Fall, Spring

SDS 192 Introduction to Data Science (4 Credits)

An introduction to data science using Python, R and SQL. Students learn how to scrape, process and clean data from the web; manipulate data in a variety of formats; contextualize variation in data; construct point and interval estimates using resampling techniques; visualize multidimensional data; design accurate, clear and appropriate data graphics; create data maps and perform basic spatial analysis; and query large relational databases. Students who have completed SDS 100 in a previous semester need not repeat it. Corequisite: SDS 100. Enrollment limited to 40. Mathematics

Fall, Spring

SDS 210 Introduction to Statistics (4 Credits)

(Formerly SDS 201). An application-oriented introduction to statistical modeling, covering topics of descriptive statistics, data visualization, point and interval estimates, bivariate and multiple regression modeling, and inferential hypothesis tests using both distributional and resampling methods. Lectures include “hands-on” demonstrations of statistical phenomena, with labs and assignments that emphasize analysis of real data. Students who have completed SDS 100 in a previous semester need not repeat it. Corequisite: SDS 100. Restrictions: Students do not normally earn credit for more than one course on this list: ECO 220, GOV 203, MTH 220, PSY 201, SDS 201, SDS 210, SDS 220 or SOC 204. Enrollment limited to 40. Mathematics

Fall, Spring

SDS 236 Data Journalism (4 Credits)

Data journalism is the practice of telling stories with data. This course focuses on journalistic practices, interviewing data as a source, and interpreting results in context. The course discusses the importance of audience in a journalistic context and focuses on statistical ideas of variation and bias. The course includes hands-on work with data, using appropriate computational tools such as R, Python, and data APIs. In addition, the course explores the use of visualization and storytelling tools such as Tableau, plot.ly, and D3. No prior experience with programming or journalism is required. Prerequisites: An introductory statistics course (including SDS 220, SOC 204, GOV 203, ECO 220, PSY 201). Enrollment limited to 20. WI Mathematics

Fall, Spring, Variable

SDS 237 Data Ethnography (4 Credits)

This course introduces the theory and practice of data ethnography, demonstrating how qualitative data collection and analysis can be used to study data settings and artifacts. Students will learn techniques in field-note writing, participant observation, in-depth interviewing, documentary analysis and archival research and how they may be used to contextualize the cultural underpinnings of datasets. Students will learn how to visualize datasets in ways that foreground their sociopolitical provenance in R. Students will also learn how ethnographic methods can be leveraged to improve data documentation and communication. The course will introduce debates regarding the politics of technoscientific fieldwork. Recommended prerequisite: SDS 192. Enrollment limited to 40. Social Science

Fall, Spring

SDS 238 Community-Based Data Science (4 Credits)

This course introduces concepts in human-centered design and design justice, considering how their principles can be applied in the context of community-based data science work. Students learn how to define social problems, engage stakeholders, design data science solutions, and evaluate social impact. Students also learn techniques in collaborative data science project planning and execution, engaging best practices (e.g. version control and code review) in the context of a community-based data science project. Strategies for effectively communicating project approach, outcomes, and impact are addressed throughout the course. Enrollment limited to 24.

Fall, Spring, Alternate Years

SDS 239 Colloquium: Data Science Goes to the Movies (4 Credits)

Movies tell stories with data and about data. How is the understanding of data, data science, and the power of data science influenced and reinforced by popular media? Students explore the social, ethical, and cultural dimensions of data and data science using contemporary film and TV shows. Through close reading of visual media, students develop critical thinking about data provenance, data integrity, and the social stakes of data science. Students develop social, relational, and ethical analyses of contemporary uses of data science across domains like healthcare, law, and environmental science, and articulate the ways that data science is used to influence society. Prerequisite: SDS 192 or FMS 150. Enrollment limited to 25. Arts; Social Science

Fall, Spring

SDS 270 Programming for Data Science in R (4 Credits)

This course is not about data analysis—rather, students learn the R programming language at a deep level. Topics may include data structures, control flow, regular expressions, functions, environments, functional programming, object-oriented programming, debugging, testing, version control, documentation, literate programming, code review and package development. The major goal for the course is to contribute to a viable, collaborative, open-source, publishable R package. Prerequisites: SDS 192 and CSC 110, or equivalent. Enrollment limited to 40. Mathematics

Fall, Spring

SDS 271 Programming for Data Science in Python (4 Credits)

This course covers the skills and tools needed to process, analyze and visualize data in Python and work on collaborative projects. Topics include functional and object-oriented programming in Python, data wrangling in Pandas, visualization in Matplotlib and seaborn, as well as creating a reproducible workflow: debugging, testing and documenting programs, and effectively using version control. The major goal for the course is to create a viable, open-source Python package like those in the Python Package Index (PyPI). Prerequisites: SDS 192 and CSC 110. Enrollment limited to 40. Mathematics

Fall, Spring, Variable

SDS 290 Research Design and Analysis (4 Credits)

(Formerly MTH/SDS 290). A survey of statistical methods needed for scientific research, including planning data collection and data analyses that provide evidence about a research hypothesis. The course can include coverage of analyses of variance, interactions, contrasts, multiple comparisons, multiple regression, factor analysis, causal inference for observational and randomized studies and graphical methods for displaying data. Special attention is given to analysis of data from student projects such as theses and special studies. Statistical software is used for data analysis. Students who have completed SDS 100 in a previous semester need not repeat it. Corequisite: SDS 100. Prerequisite: One of the following: PSY 201, SDS 201, GOV 203, ECO 220, SDS 220 or a score of 4 or 5 on the AP Statistics examination or the equivalent. Enrollment limited to 25. Mathematics

Fall, Spring

SDS 291 Multiple Regression (4 Credits)

(Formerly MTH 291/SDS 291). Theory and applications of regression techniques: linear and nonlinear multiple regression models, residual and influence analysis, correlation, covariance analysis, indicator variables and time series analysis. This course includes methods for choosing, fitting, evaluating and comparing statistical models and analyzes data sets taken from the natural, physical and social sciences. Students who have completed SDS 100 in a previous semester need not repeat it. Corequisite: SDS 100. Prerequisite: SDS 201, PSY 201, GOV 203, SDS 220, ECO 220 or equivalent or a score of 4 or 5 on the AP Statistics examination. Enrollment limited to 40. Natural Science; Mathematics

Fall, Spring

SDS 293 Modeling for Machine Learning (4 Credits)

In the era of “big data,” statistical models are becoming increasingly sophisticated. This course begins with linear regression models and introduces students to a variety of techniques for learning from data, as well as principled methods for assessing and comparing models. Topics include bias-variance trade-off, resampling and cross-validation, linear model selection and regularization, classification and regression trees, bagging, boosting, random forests, support vector machines, generalized additive models, principal component analysis, unsupervised learning and k-means clustering. Emphasis is placed on statistical computing in a high-level language (e.g. R or Python). Prerequisites: SDS 291 and MTH 211 (MTH 211 may be concurrent). Enrollment limited to 25. Mathematics

Fall, Spring, Annually

SDS 300di Seminar: Topics in the Applications of Statistics and Data Science-Disability Inclusion and Data Analytics (4 Credits)

Students learn the social model of disability and critical disability theory as well as research design and process, and work on a research project analyzing disability inclusion public data. The statistical methods covered in this course may include logistic regression, multivariate analysis, factor analysis, etc. Students are expected to submit their final projects to a journal, conference or competition by the end of the semester. Prerequisite: SDS 201, SDS 220 or ECO 220. Restrictions: Juniors and seniors only. Enrollment limited to 15. Instructor permission required. Mathematics

Fall, Spring, Variable

SDS 300fs Seminar: Topics in the Applications of Statistics and Data Science-Understanding Food Systems through Engaging the Data (4 Credits)

This course examines the global Food System, with a focus on the US, through the examination of established data used to study the system and recommend food policy. In the United States, the US Department of Agriculture (USDA) oversees much of the food system, both promoting food products and regulating health impacts. In general, the operation of these systems generates a vast amount of data, much of which is in large open or semi-open online databases. Researchers and policy makers draw on these databases to aid in their decision making. This course aims to familiarize students with the data and its uses. Prerequisite: SDS 201 or SDS 220. Restrictions: Juniors and seniors only. Enrollment limited to 15. Instructor permission required. Social Science; Mathematics

Fall, Spring, Alternate Years

SDS 355 Seminar: Sports Analytics (4 Credits)

This course applies methods from the statistical and data sciences to sports to address fundamental questions of interest to players, coaches, team executives, journalists, and fans alike. Simple questions (e.g., who are the best players?) are complicated by the interdependent nature of team sports, the omnipresence of randomness (i.e., luck), and frequent changes to personnel, rules, equipment, league alignments, and other structures. However, in many ways sports provides an ideal laboratory for applied statistical analysis, as many sports generate copious amounts of data under regularized conditions. Students explore the big ideas in sports analytics (e.g., expected points, win probabilities, team strengths, etc.) and how they manifest across a variety of different sports. They develop a working knowledge of the most prominent statistical models for sports analytics and apply them to a variety of public sources of sports data. Prerequisites: SDS 192 and (SDS 201 or SDS 210). Restrictions: Juniors and seniors only. Enrollment limited to 12. Instructor permission required. Mathematics

Fall, Spring, Variable

SDS 390 Topics in Statistical and Data Sciences (4 Credits)

Topics in statistics and data science. Statistical methods for analyzing data must be chosen appropriately based on the type and structure of the data being analyzed. The particular methods and types of data studied in this course vary, but topics may include: categorical data analysis, time series analysis, survival analysis, structural equation modeling, survey methodology, Bayesian methods, resampling methods, spatial statistics, missing data methods, advanced linear models, statistical/machine learning, network science, relational databases, web scraping and text mining. Prerequisites: MTH/SDS 290 or MTH/SDS 291 or MTH/SDS 292. Restrictions: SDS 390 may be taken a total of 3 times with different topics. Enrollment limited to 30.

Fall, Spring, Annually

SDS 390be Topics in Statistical and Data Sciences-Methods in Biostatistics and Epidemiology (4 Credits)

Epidemiology concerns the distribution and determinants of disease in human populations, while biostatistics focuses on the development and application of statistical methods to a wide range of topics in biology, medicine and public health. This course focuses on foundational concepts in epidemiology, including measures of association and common epidemiological study designs, and statistical methods for public health data. Discussions include categorical data analysis (contingency table analysis, multinomial regression, ordinal regression and Poisson regression) and survival analysis (Kaplan-Meier estimators and Cox proportional hazards models). No background in biology is expected or required. Prerequisites: SDS 291 and [MTH 112 or (MTH 111 and MTH 153)]. Restrictions: SDS 390 may be taken a total of 3 times with different topics. Enrollment limited to 30. Mathematics

Fall, Spring, Variable

SDS 390cd Topics in Statistical and Data Sciences-Categorical Data Analysis (4 Credits)

Theory and applications of statistical methods for the analysis of categorical data. The course includes an overview of statistical methods for analyzing discrete data including binary, multinomial and count response variables. Nominal and ordinal responses are considered. Discussions may include contingency table and chi-squared analyses, logistic, Poisson and negative-binomial regression models. R statistical software is used. Prerequisites: SDS 291 or SDS 290 or equivalent. Restrictions: SDS 390 may be taken a total of 3 times with different topics. Enrollment limited to 30.

Fall, Spring, Variable

SDS 400 Special Studies (1-4 Credits)

Normally for juniors and seniors. Instructor permission required.

Fall, Spring

SDS 410 Capstone in Statistical & Data Sciences (4 Credits)

This one-semester course leverages students’ previous coursework to address a real-world data analysis problem. Students collaborate in teams on projects sponsored by academia, government or industry. Professional skills developed include: ethics, project management, collaborative software development, documentation and consulting. Regular team meetings, weekly progress reports, interim and final reports, and multiple presentations are required. Prerequisites: SDS 192, SDS 291 and CSC 111. Restrictions: Statistical and Data Science majors only. Enrollment limited to 20. Instructor permission required. Mathematics

Fall, Spring

SDS 430D Honors Thesis (4 Credits)

Department permission required.

Fall, Spring