DATA170
Exploring Data

Syllabus

A summary of the course objectives, content, policies, and schedule.

Instructor: Dr. Durell Bouchard
Office Hours: via Zoom by appointment
Office: Trexler 365-C
E-Mail:
Phone: 375-4901

Course Objectives


The amount of data we generate when interacting with the digital world is staggering. Every minute, users create 347,222 posts on Instagram, upload 500 hours of video to YouTube, and order 6,659 packages from Amazon. To understand the vast quantities of information generated, we need sophisticated algorithms to simplify, analyze, and visualize the data. The companies collecting this digital information need data scientists who understand those algorithms. This class will introduce you to the tools and techniques required to become a data scientist. You will learn to apply machine learning to large datasets to explore data and make predictions.

Intended Learning Outcomes: At the end of the course, the successful student will be able to

  1. write programs that use machine learning to make predictions.

  2. correctly format and manage data.

  3. report and interpret the performance of prediction models.

Course Content


Prerequisites: CPSC120

Text: How to Think like a Data Scientist: Second Edition, by Brad Miller, Jacqueline Boggs, and Janice L Pearce, Runestone Interactive.

Project: The course will culminate in a project that uses machine learning to create a model that can make predictions from data. This project is designed to allow you to put together all of the skills and techniques you learn throughout the semester to explore a dataset that interests you.

Assignments: We will have regular small programming assignments that are designed to reinforce class concepts. These assignments are an opportunity for you to demonstrate that you are ready to apply what you have learned to the project.

Activities: Programming activities during class give you a structured experience in data cleaning and analysis. The activities connect the reading and lectures to the practice of data science and prepare you for assignments.

Co-curricular: The Department of Mathematics, Computer Science, and Physics is offering a series of lectures designed to engage the campus community in discussions of ongoing research, novel applications, and other issues that face these disciplines. You are invited to attend all of the events, but attending at least two is mandatory. Within one week of attending an event, you must submit a one-page, single-spaced paper (to Inquire) reflecting on the discussion. If you do not turn in the paper within one week, the event will not count as one you attended.

Grading: Course grades are assigned based on the following weights and scale:

Grade Weights

  Category        Weight
  Project         38%
  Activities      30%
  Assignments     30%
  Co-curricular    2%

Grade Scale

  Grade  Range     Grade  Range
  A      93-100    C      73-76
  A-     90-92     C-     70-72
  B+     87-89     D+     67-69
  B      83-86     D      63-66
  B-     80-82     D-     60-62
  C+     77-79     F      0-59
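
As a rough illustration of how the weights and scale combine, the following Python sketch computes a weighted course score and looks up the letter grade. The function names and the example scores are hypothetical. Rounding the score to the nearest whole number before the lookup is an assumption, since the syllabus does not state a rounding policy.

```python
import math

# Weights from the Grade Weights table (percent of the course grade).
WEIGHTS = {"Project": 38, "Activities": 30, "Assignments": 30, "Co-curricular": 2}

# Lower bound of each letter grade, from the Grade Scale table.
SCALE = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"), (77, "C+"),
         (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"), (0, "F")]


def course_score(category_scores):
    """Return the weighted average of category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter grade.

    Assumption: the score is rounded half up to a whole number first.
    The syllabus does not specify how fractional scores are handled.
    """
    rounded = math.floor(score + 0.5)
    for lower_bound, letter in SCALE:
        if rounded >= lower_bound:
            return letter
    return "F"


# Hypothetical example scores for one student.
example = {"Project": 90, "Activities": 95, "Assignments": 85, "Co-curricular": 100}
score = course_score(example)
print(score, letter_grade(score))
```

With these example scores the weighted score is (38*90 + 30*95 + 30*85 + 2*100) / 100 = 90.2, which falls in the A- range.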

Course Policies


Zoom: Our class will be meeting synchronously, during our scheduled time block, via Zoom. The following are some suggested best practices based on student feedback from previous courses:

Attendance: Class attendance is vital to your success in this course. Conversations held in class illuminate the class materials and are necessary for completing the activities and assignments. If you anticipate being unable to attend class, email me before class to be excused.

Late Work: If you anticipate being unable to meet a deadline, email me before the deadline to request an extension. Unexcused late work will receive no credit.

Academic Integrity: Collaboration is a fundamental part of learning. You are encouraged to discuss and learn from one another on the activities. However, unless expressly stated otherwise, all work on assignments and the project should be solely your own. It is accepted that you have read and understood the standards for academic integrity at Roanoke College. If you are ever uncertain about how the policy pertains to any assignments in this course, please ask me for clarification.

Subject Tutoring: Subject Tutoring, located on the lower level of Fintel Library (Room 5), is open 4 pm – 9 pm, Sunday – Thursday. We are a Level II Internationally Certified Training Center through the College Reading and Learning Association (CRLA). Subject Tutors are friendly, highly trained Roanoke College students who offer free, one-on-one tutorials in a variety of general education and major courses, such as Business, Economics, Mathematics, INQ 240, Modern Languages, Lab Sciences, INQ 250, and Social Sciences (see all available subjects at <www.roanoke.edu/tutoring>). Tutoring sessions are available in-person or online in 30- or 60-minute appointments (please specify whether you prefer to meet with a tutor online or in-person when you make your appointment). All in-person appointments will maintain at least 6 feet of physical distance, desks will be cleaned between appointments, and masks must be worn in all indoor, public spaces. Schedule an appointment at <www.roanoke.edu/tutoring>, or contact us at 540-375-2590. We hope to see you soon!

Accessible Education Services: Accessible Education Services (AES) is located in the Goode-Pasfield Center for Learning and Teaching in Fintel Library. AES provides reasonable accommodations to students with documented disabilities. To register for services, students must self-identify to AES, complete the registration process, and provide current documentation of a disability along with recommendations from the qualified specialist. Please contact Laura Leonard, Assistant Director of Academic Services for Accessible Education, at 540-375-2247 or by e-mail to schedule an appointment. If you have registered with AES in the past and would like to receive academic accommodations for this semester, please contact Laura Leonard at your earliest convenience to schedule an appointment and/or obtain your accommodation letter for the current semester.

Diversity: I consider this classroom to be a place where you will be treated with respect, and I welcome individuals of all ages, backgrounds, beliefs, ethnicities, genders, gender identities, gender expressions, national origins, religious affiliations, sexual orientations, abilities, and other visible and nonvisible differences. All members of this class are expected to contribute to a respectful, welcoming, and inclusive environment for every other member of the class.

Preferred Name/Pronoun: I will gladly honor your request to address you by an alternate name or gender pronoun. Please advise me of this preference early in the semester so that I may make appropriate changes to my records.

Course Schedule


You are expected to spend at least 12 hours each week on this course, counting work both inside and outside of class.

Week Topic
1 NumPy
2 Pandas
3 Scikit-learn
4 K-nearest Neighbors
5 Cross Validation
6 Naive Bayes
7 Feature Selection
8 Support Vector Machines
9 Feature Reduction
10 Decision Trees
11 Neural Networks
12 Projects
13 Project Presentations

In place of a spring break this year, we will have the following days off from class:

Thu Apr 15
Thu May 6