Homework 1

General assignment information. Note that this isn't a template notebook, hence there's no 🚀 above. You will create a blank notebook for this one.

Tutorials

Coding

You'll do the following in a notebook. Make it read like a blog post. Pretend you're explaining to a peer who hasn't taken this class. You don't need to teach them to code, but they should be able to follow what's going on.

Steps

  1. Find a dataset.
    • It must have at least one numeric column.
    • Don't spend too long on this step.
  2. If there's more than one numeric column, pick one.
  3. Create a new notebook.
  4. Using pandas:
    1. Read in the data.
    2. Compute:
      • The mean
      • The median
      • The mode
    3. Do a groupby() with an aggregation.
  5. Repeat step 4 (read in the data, compute the statistics, group and aggregate) in pure Python, without pandas.
  6. Write a conclusion, covering both:
    • The takeaways of the analysis
    • Your reflections on the process
  7. Did you use an external source, including generative AI? Please explain, or say that you didn't.
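
As a rough illustration of steps 4 and 5, here is a minimal sketch that computes the same statistics twice, once with pandas and once without. The tiny inline dataset, its column names (`species`, `weight_kg`), and all variable names are made up for this example; your own dataset will differ. It also assumes that standard-library modules such as `csv` and `statistics` count as "pure Python" for this assignment, which you may want to confirm.

```python
import csv
import io
import statistics
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in data; the assignment expects you to find a real dataset.
CSV_TEXT = """species,weight_kg
cat,4.0
cat,5.0
dog,20.0
dog,30.0
dog,20.0
"""

# --- Step 4: with pandas ---
df = pd.read_csv(io.StringIO(CSV_TEXT))
pandas_mean = df["weight_kg"].mean()
pandas_median = df["weight_kg"].median()
# mode() returns a Series because ties can produce more than one value.
pandas_modes = df["weight_kg"].mode().tolist()
# groupby() with an aggregation: mean weight per species.
pandas_group_means = df.groupby("species")["weight_kg"].mean().to_dict()

# --- Step 5: without pandas ---
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
weights = [float(row["weight_kg"]) for row in rows]  # csv yields strings
python_mean = sum(weights) / len(weights)
python_median = statistics.median(weights)
python_modes = statistics.multimode(weights)

# Manual "groupby": collect the values for each key, then aggregate each group.
grouped = defaultdict(list)
for row in rows:
    grouped[row["species"]].append(float(row["weight_kg"]))
python_group_means = {
    species: sum(values) / len(values) for species, values in grouped.items()
}

print(pandas_mean, pandas_median, pandas_modes, pandas_group_means)
print(python_mean, python_median, python_modes, python_group_means)
```

Comparing the two sets of results is a quick sanity check, and the difference in effort between the two halves is good material for the reflection in step 6.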

Now turn in the assignment.

Tutorials, continued

  1. Read The Joys (and Woes) of the Craft of Software Engineering
    • Note that not everything in it applies to data analysis.
  2. Filtering/indexing DataFrames
  3. Learn about functions
  4. Brackets in Python and pandas
  5. Coding Style Guides - Please skim these; I don't expect you to understand and follow everything in them. The most important guidelines to pay attention to are indentation and keeping each statement on its own line.
  6. Guide to commenting your code
  7. Quartz Guide to Bad Data
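
To make the two style guidelines highlighted above concrete (consistent indentation, and one statement per line), here is a small sketch. The function name `average` and the sample values are hypothetical, chosen only for illustration; the style guides themselves cover much more.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:
        # Each statement sits on its own line, indented four spaces per level.
        total += value
    return total / len(values)


# Harder to read: several statements crammed onto one line, for example
#     total = 0; count = 0; print(total)

result = average([2, 4, 6])
print(result)
```

The comment inside the loop explains why the code is laid out that way rather than restating what `total += value` does, which is the kind of comment that tends to help a reader most.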

Optional

Participation

Reminder about the between-class participation requirement.