About CMP202

Course Description: An introduction to programming using C++ for students with no previous computer programming experience. Includes introduction to algorithms and object-oriented programming techniques.

Required Texts and Materials: None. All materials will be provided on the course website.

Optional Texts:
Python Standard Library
Python Tutorials
How to Think Like a Computer Scientist (Interactive Edition)
Dive Into Python
Google's Python Class

Contact Details

Stephanie Rosenthal
Chatham University
Falk 116C
Office Hours:

s.rosenthal@chatham.edu

Assignments

In this class, we will focus on programming techniques that are useful for data analysis. I have been using the Pittsburgh Port Authority's Developer's Guide to record the locations of each bus in Pittsburgh every 5 minutes since June 23, 2017. Each assignment will build up your ability to compute new information based on this data.

Assignment 1

Out 9/1, Due 9/8

Due 9/8 at 11:59pm. There are three types of data in the files I'm collecting: routes, vehicles, and errors. Every 5 minutes, my code asks the Pittsburgh Port Authority which routes have buses running. For each route that has a bus running, the Port Authority reports the route number, name, color, etc. (Developer Guide, bottom of page 16). My code then reads the route information to ask for a report of all the buses (vehicles) that are running on each route. Each vehicle reports information about where it is located, where it is heading, etc. This information is described in the Developer Guide (page 13). Routes that have no buses running report an error.

Your job is to use this data snippet to determine the data types that are represented in each of the route, vehicle, and error variables. Submit a file [yourchathamid]-1.txt that contains the name of each variable in the route, vehicle, and error messages, along with 1) the current data type that the Port Authority reports and 2) the true data type that is contained in the variable. Note: I'm not asking you to create this list for every route and vehicle; I'm asking you to use all the information provided to determine the data types of the "rt", "pid", "vid", etc. variables. You may use a Word document and create a table, or you may create a text file.
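To illustrate the difference between a reported type and a true type, here is a minimal sketch. The field names come from the text above, but the values are invented, and it is only an assumption that every value arrives as a string; check the actual data snippet before drawing conclusions.

```python
# Hypothetical record: the field names appear in the assignment text,
# but these values are made up for illustration only.
vehicle = {"vid": "5401", "rt": "71A", "pdist": "12345"}

# Ask Python what type each value currently has.
for name, value in vehicle.items():
    print(name, "is reported as", type(value).__name__)

# In this made-up record, "pdist" is reported as a string but holds a
# whole number, so it can be converted to an int.
pdist_as_number = int(vehicle["pdist"])
print("pdist converts to", type(pdist_as_number).__name__)
```

A value such as "71A" cannot be converted with int(), which is one way to tell that its true type really is a string.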

Assignment 2

Out 9/8, Due 9/15

Due 9/15 at 11:59pm. Now that you have learned about data types and conversion between them, you will create a small program to compute the distance between two GPS coordinates. The distance between GPS coordinates represents (approximately) how far the bus has traveled in the last 5 minutes. You will compute the distance in kilometers and in miles.

Let's refresh your knowledge about GPS. GPS coordinates are given in degrees of latitude and longitude on the sphere that is the Earth. You can assume that the Earth is a perfect sphere of radius 6371 km or 3959 mi. There is an equation, known as the Haversine Equation, that computes the distance between two GPS coordinates (lat1,lon1) and (lat2,lon2):

a = \sin^2\left(\frac{\text{lat2} - \text{lat1}}{2}\right) + \cos(\text{lat1}) \cos(\text{lat2}) \sin^2\left(\frac{\text{lon2} - \text{lon1}}{2}\right)
\text{distance} = \text{radius} \cdot 2 \cdot \arctan\left(\frac{\sqrt{a}}{\sqrt{1 - a}}\right)

Code: Your job is to create a Python file called [yourchathamid]-2.py that has two functions in it: convertToKM(lat1,lon1,lat2,lon2) and convertToMiles(lat1,lon1,lat2,lon2). The algorithm you should write as code is the following:

  • 1) Each function takes two GPS coordinates as strings and converts to the proper data type for degrees,
  • 2) then it converts those no-longer-string GPS coordinates to radians (radians = degrees*PI/180),
  • 3) and then computes the variable "a" and the Haversine distance with the radius in km or miles respectively (see value in the paragraph above).
You may include other helper functions if you want (if you are writing the same code more than once, you probably should include a helper function that can be called more than once). Note: At the top of your file, you will need to "import math" in order to use the sin/cos/tan/etc functions.
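The three steps above can be sketched as a single helper function. This is one possible shape, not the required solution: the name haversine_distance and its radius parameter are hypothetical, and math.atan2 is used in place of arctan(sqrt(a)/sqrt(1-a)) because it gives the same angle while avoiding a division by zero when a equals 1.

```python
import math

EARTH_RADIUS_KM = 6371  # radius given in the assignment text


def haversine_distance(lat1, lon1, lat2, lon2, radius):
    """Distance between two GPS points given in degrees as strings."""
    # Step 1: convert each string to a float.
    # Step 2: convert degrees to radians (radians = degrees * PI / 180).
    lat1 = float(lat1) * math.pi / 180
    lon1 = float(lon1) * math.pi / 180
    lat2 = float(lat2) * math.pi / 180
    lon2 = float(lon2) * math.pi / 180

    # Step 3: compute "a" and then the distance on a sphere of this radius.
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return radius * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
```

As a sanity check, two points on the equator that are 90 degrees of longitude apart should be a quarter of the Earth's circumference apart, which is radius * PI / 2.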

Style: Remember to comment your code with #'s and name your variables useful names so that I can read it. Comments should include summaries of what the code does, what each variable represents, and what it computes in the end. You should not be showing other people your working code. Use whiteboards to write equations and collaborate, but write your own code. However, if you referred to another person's code, please cite them in the comments.

Testing: You may (and should) test your code using print statements and other code in the file, but they should not run when I run your code (i.e., you should remove or comment out all print and test statements). You can use websites like this one to ensure you are computing the equation correctly.

Assignment 3

Out 9/15, Due 9/29

Due 9/29 at 11:59pm. I have collected bus data every 5 minutes, so now I wonder how many and which buses do the following: 1) stop running (are in the first file but not in the second), 2) start running (are in the second file but not in the first), 3) change routes, 4) change route patterns, and 5) stay on the same route. For extra credit, I additionally wonder, for the buses that continue on the same route, how different our distance computation is from the "pdist" distance computed by the Port Authority.

Code: Your job is to download this file (renaming it yourchathamemail-3.py) and complete it so that it outputs the counts and bus IDs of each category listed above. I will run your code with the command "python yourchathamemail-3.py file1.json file2.json". The two file names are called command line inputs. You should "import sys" and get the names of the files using sys.argv[1] and sys.argv[2] to pass into your main function (where I open and read the files for you).

If you choose to complete the extra credit, for the buses that continue on the same route, you should 1) use your code from Assignment 2 to compute and print the distance in miles that each bus has traveled and 2) compute the difference between the "pdist" values (the distance traveled along the bus route in feet) and convert and print that number in miles (hint: how many feet are in a mile?). You should use page 13 of the Developer Guide to indicate which values you should be using for your computations.
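The unit conversion in the extra credit can be sketched as follows. The function name and the sample numbers are hypothetical; the only fact relied on is that one mile is 5280 feet.

```python
FEET_PER_MILE = 5280


def pdist_change_in_miles(pdist_before, pdist_after):
    """Difference between two pdist readings (in feet), converted to miles."""
    return (pdist_after - pdist_before) / FEET_PER_MILE


# Made-up readings: the bus moved 2640 feet along its route.
print(pdist_change_in_miles(1000, 3640))
```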

The file I have provided contains the outline of the code you should write and the exact print statements that I expect to see. Please do not alter the print statements provided in this file. I will be checking this output to determine your grade on the assignment. Because we will not be covering JSON early enough, I have provided the code to open each file, read it, and parse the JSON for you. My JSON code gives you one dictionary per file, where the keys are the vehicle IDs and the values are the dictionaries of information described on page 13 of the Developer Guide. You should access these values for the same vehicle in each dictionary to determine if they have changed.
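One way to compare two such dictionaries is sketched below. The two small dictionaries are invented stand-ins for the parsed files, and using Python sets for the comparison is an assumption about approach; a loop with "in" checks would work equally well.

```python
# Made-up stand-ins for the two parsed files (keys are vehicle IDs).
first = {
    "100": {"rt": "71A", "pid": 1},
    "200": {"rt": "61C", "pid": 7},
    "300": {"rt": "P1", "pid": 3},
}
second = {
    "200": {"rt": "61D", "pid": 9},
    "300": {"rt": "P1", "pid": 3},
    "400": {"rt": "28X", "pid": 5},
}

# Vehicle IDs only in the first file stopped running; only in the second, started.
stopped = sorted(set(first) - set(second))
started = sorted(set(second) - set(first))

# For vehicles present in both files, compare the route value.
in_both = sorted(set(first) & set(second))
changed_route = [vid for vid in in_both if first[vid]["rt"] != second[vid]["rt"]]
same_route = [vid for vid in in_both if first[vid]["rt"] == second[vid]["rt"]]

print(len(stopped), stopped)
print(len(started), started)
print(len(changed_route), changed_route)
print(len(same_route), same_route)
```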

You may include other helper functions if you want (if you are writing the same code more than once, you probably should include a helper function that can be called more than once). If you do the extra credit, you can import your assn2 code or copy the code into your file. Note: At the top of your file, you will need to "import math" in order to use the sin/cos/tan/etc functions, "import sys" to access the command line arguments, and "import json" to input the json files.

This time, you should also include error checking code. The main section or function should print an appropriate and helpful error and exit if the wrong number of arguments is included on the command line, or if the first file was collected later than the second (hint: because the files are dated, you can just compare whether the first date is less than the second date).
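A minimal sketch of these checks follows. It assumes the file names themselves contain the date in a form that sorts correctly as text (for example 2017-09-15_1200.json); that naming scheme, the function names, and the wording of the messages are all hypothetical.

```python
def check_arguments(argv):
    """Return an error message, or None if the arguments look valid."""
    # argv[0] is the script name, so two files means three entries.
    if len(argv) != 3:
        return "Usage: python yourchathamemail-3.py file1.json file2.json"
    # Assumption: dated file names compare correctly as strings.
    if argv[1] >= argv[2]:
        return "Error: the first file must be collected before the second file."
    return None


def main(argv):
    error = check_arguments(argv)
    if error is not None:
        print(error)
        return 1  # a nonzero exit code signals failure
    # ... open and compare the two files here ...
    return 0


# In the real script, after "import sys", the last line would be:
#     sys.exit(main(sys.argv))
```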

Style: Remember to comment your code with #'s and name your variables useful names so that I can read it. Comments should include summaries of what the code does, what each variable represents, and what it computes in the end. You should not be showing other people your working code. Use whiteboards to write equations and collaborate, but write your own code. However, if you referred to another person's code, please cite them in the comments.

Testing: You may (and should) test your code using print statements and other code in the file, but they should not run when I run your code (i.e., you should remove or comment out all unnecessary print and test statements). I have provided several test files, located here, for you to test your code with.

Assignment 4

Out 10/11, Due 10/23 (NOTE: YOUR PROJECT WRITEUP IS DUE 10/27)

START EARLY!!! In this assignment, we'll practice reading in different files in different formats and printing parts of them. First, download each of the three files located here. Each of these files has data in a different format. The first contains one sentence per line. The second contains lists of keywords per line. The third contains a formatted data file.

Part 1 Code: (25%) Your job is to create a file with three functions: readSentences(filename), readKeywords(filename), and readData(filename). Each function should 1) open the file named filename, 2) read the file line by line (you can test this by printing the line), and 3) close the file before returning. At the bottom of your file, you should call the functions with their appropriate filename. Extra credit +1 point if you call the functions within a main function and call the main function from an if __name__ == "__main__": block.
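The open, read, close pattern can be sketched with a single generic function. The name read_and_print, the returned line count, and the file name in the comments are hypothetical; the three required functions would each follow the same shape.

```python
def read_and_print(filename):
    """Open filename, print each line, and close the file before returning."""
    input_file = open(filename, "r")
    line_count = 0
    for line in input_file:
        # Remove the trailing newline so print does not add a blank line.
        print(line.rstrip("\n"))
        line_count += 1
    input_file.close()
    return line_count


# For the extra credit, the calls would go inside a main function:
#
#     def main():
#         read_and_print("sentences.txt")  # hypothetical file name
#
#     if __name__ == "__main__":
#         main()
```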

Part 2 Code: (25%) Modify your readData function to print ONLY the number in the 5th column (counting from 0) of each line. To do this, for each line in the file (which you are already printing), call line.split(" ") to divide the line into a list of numbers, index the 5th number, and print only that number instead of the whole line.
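Splitting and indexing a single line might look like the sketch below. The sample line is invented, and the helper name column_five is hypothetical; the call to strip() is an added precaution so that a trailing newline does not end up attached to the last column.

```python
def column_five(line):
    """Return the item at index 5 of a space-separated line."""
    fields = line.strip().split(" ")
    return fields[5]


# Made-up line with seven columns, numbered 0 through 6.
print(column_five("10 20 30 40 50 60 70\n"))
```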

Part 3 Code: (25%) Modify your readSentences function to print ONLY the ith letter on the ith line (i.e., the first letter on the first line, the second letter on the second line, etc). To do this, you will need to create a counter variable, initialize it to 0 outside of your read lines loop, and then for each line print the appropriate letter and then increment the counter. When you run your code, it will print one letter per line all the way through the file.
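The counter pattern described above can be sketched on a list of lines rather than a file. The function name and sample sentences are hypothetical, and the length check is an added assumption to guard against a line that is shorter than its line number.

```python
def diagonal_letters(lines):
    """Collect the ith character of the ith line, counting from 0."""
    letters = []
    counter = 0  # initialized outside the loop
    for line in lines:
        if counter < len(line):  # skip lines that are too short
            letters.append(line[counter])
        counter += 1  # move one position to the right for the next line
    return letters


print(diagonal_letters(["apple", "berry", "cherry"]))
```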

Part 4 Code: (25%) Modify your readKeywords function to print ONLY the keywords after the word "find_me" on each line. To do this, for each line in the file, you will need to call line.split(" ") to divide the line into an array of words, iterate through the array of words to find "find_me" and then print the word in the list after that one (you'll have to figure out how to do the last part, but as a hint you could use a counter variable again).
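One way to find the word that follows a marker is sketched below using a counter as the position in the list. The function name, the default argument, and the sample lines are hypothetical; returning None when nothing follows the marker is a design choice for this sketch, not a requirement.

```python
def word_after(line, marker="find_me"):
    """Return the word that follows marker on this line, or None."""
    words = line.strip().split(" ")
    counter = 0  # position of the current word in the list
    for word in words:
        # Only look ahead if there is a next word to return.
        if word == marker and counter + 1 < len(words):
            return words[counter + 1]
        counter += 1
    return None


print(word_after("red find_me blue green\n"))
```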

Project

Out 10/23, Due 12/8

You have now learned enough concepts that you should be able to create some of your own code for a project of your choosing. That project could be anything from data analysis to calculator functionality to user interface design. You should aim to be writing about 75-150 lines of code - roughly three times the length of assignment 4. You may work in pairs if you would like and complete 150-300 lines of code instead.

Task 1 - Proposal: Due 10/27. Please write a 2-3 paragraph proposal detailing the project you would like to create and who you will be working with, if anyone. It should include what materials you'll need (for example, a robot or a dictionary of words) and how you will find the data, if you need any. It should also include a rough description of the algorithms you will need to write (not exact pseudocode). If you need help thinking of a project, please ask. I can supply you with project ideas and also more of the bus data if you would like to use that.

Task 2 - Mid-Point Check In: Due 11/17. Please submit your code (I expect roughly half of it to be completed) and a 2-page writeup detailing what you have accomplished, your algorithm design decisions, what you have remaining to do, and what you need help with. We will review both documents in meetings during our class period. I will read but not try to run your code unless asked.

Task 3 - Final Presentation: Due 12/4-12/8. Please prepare a 15 minute presentation including an introduction to the problem your code is trying to solve, the data/materials you used and how it was collected (if you used any), the algorithms you wrote and why you made the design decisions you did, the challenges you faced while writing the code, and lessons learned.

Task 4 - Final Code: Due 12/8. You should submit your final code, any data files that are needed to run it, documentation on how to run it, and commented code so that I can read it. Any place where you think I might wonder why you wrote code the way you did, please add extra comments. Also attach and comment any test functions/files you created.

Schedule

Monday

Wednesday

Friday

8/28: Introduction

8/30: Data Types, Abstraction

9/1: Terminal and Python
Bring Computer to Class
Assignment 1 out

9/4: Labor Day (no class)

9/6: Variables, Conditionals, Booleans

9/8: Files and Input/Output
Bring Computer to Class
Assignment 1 due
Assignment 2 out

9/11: Main Functions, Scope

9/13: Loops

9/15: Command Line Input
Bring Computer to Class
Assignment 2 due
Assignment 3 out

9/18: File IO

9/20: Data Structures

9/22: No class

9/25: Data Structures

9/27: Search
Bring Computer to Class

9/29: Prepare for Midterm
Assignment 3 due
Assignment 4 out

10/2: Big-O Notation

10/4: Midterm

10/6: Go Over Midterm Answers

10/9: Long Weekend (no class)

10/11: Sorting Day 1

10/13: Sorting Day 2
Bring Computer to Class
Assignment 4 Due
Project Out

10/16: Data Structures

10/18: Recursion

10/20: File Compression
Bring Computer to Class

10/23: The Internet

10/25: Graphs

10/27: Meet about project proposals
Project Proposals Due

10/30: Web Programming

11/1: Events, User Input, UI

11/3: Python UI
Bring Computer to Class

11/6: Parallel Programming

11/8: Parallel Programming

11/10: Graph Traversal
Bring Computer to Class

11/13: Integer Overflow and Other Issues

11/15: How a Computer Interprets Code

11/17: Meet about project progress
Mid-Point Write-Up Due

11/20: Work in class
Bring Computer to Class

11/22: Thanksgiving
(no class)

11/24: Thanksgiving
(no class)

11/27: Computer Hardware

11/29: Data Science

12/1: Review for Final

12/4: Presentation Day 1

12/6: Presentation Day 2

12/8: Presentation Day 3

Get In Touch.

Contact Details

Stephanie Rosenthal
Chatham University
Falk 116C
Appointments required

s.rosenthal@chatham.edu