Week 03 · Data and the outside world
Open · due in 3 weeks

Research data summary tool

What you'll build

A small tool that either summarizes an existing CSV with pandas, or calls a public API and saves the response to a CSV first. Either way, it prints three useful insights.

Requirements

The must-do parts. If any are missing, we'll ask you to take another pass.

  1. Pick one of two paths: (A) summarize a real CSV, or (B) fetch from a public API and save it to a CSV first.
  2. Use pandas: at minimum read_csv, head, one boolean filter, and one of sort_values, value_counts, or groupby.
  3. Print at least three findings in plain English (number of records, an average or count, a top category).
  4. End with a short paragraph of written insights — what surprised you in the data?
  5. Wrap the analysis in functions, not a single 200-line cell.
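
The pandas requirements above could be covered along these lines. This is only a sketch under stated assumptions: the inline sample data and the `amount` column are made up for illustration (only `country` appears in the scaffold), and the function names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data; "country" matches the scaffold, "amount" is assumed.
SAMPLE_CSV = """country,amount
Kenya,1200
India,800
Kenya,1500
Brazil,950
India,2000
Kenya,700
"""


def load_sample() -> pd.DataFrame:
    # read_csv accepts a file path or any file-like object.
    return pd.read_csv(io.StringIO(SAMPLE_CSV))


def large_awards(df: pd.DataFrame, minimum: int = 1000) -> pd.DataFrame:
    # Boolean filter: keep rows where the condition is True, largest first.
    return df[df["amount"] >= minimum].sort_values("amount", ascending=False)


def describe_findings(df: pd.DataFrame) -> list[str]:
    # Turn the numbers into three plain-English sentences.
    counts = df["country"].value_counts()
    mean_by_country = df.groupby("country")["amount"].mean()
    top = counts.index[0]
    return [
        f"The file has {len(df)} records.",
        f"The average amount is {df['amount'].mean():.0f}.",
        f"{top} appears most often ({counts.iloc[0]} rows), "
        f"averaging {mean_by_country[top]:.0f}.",
    ]


if __name__ == "__main__":
    df = load_sample()
    print(df.head())
    print(large_awards(df))
    for line in describe_findings(df):
        print(line)
```

Keeping each step in its own function also satisfies the last requirement, and makes it easy to swap the sample data for your real file.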

Bonus, if you're feeling brave

  • Plot one simple matplotlib chart.
  • Compare two slices of the data side by side.
  • Save the written insights to a report.md file.
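
Two of the bonus items could look roughly like this. It is a sketch, not a required design: `compare_slices` and `save_report` are hypothetical helper names, and the column names you pass in depend on your own data. The chart is left out here.

```python
from pathlib import Path

import pandas as pd


def compare_slices(
    df: pd.DataFrame, column: str, first: str, second: str, value: str
) -> pd.DataFrame:
    # Summarize the value column for each slice, then place the two
    # summaries side by side as columns of one table.
    a = df.loc[df[column] == first, value].describe()
    b = df.loc[df[column] == second, value].describe()
    return pd.concat([a, b], axis=1, keys=[first, second])


def save_report(insights: list[str], path: str = "report.md") -> Path:
    # Write the insights as a short Markdown bullet list.
    lines = ["# Insights", ""] + [f"- {line}" for line in insights]
    out = Path(path)
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

For example, `compare_slices(df, "country", "Kenya", "India", "amount")` would return one column of summary statistics per country, assuming your data has those columns.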

Where to start

Copy this scaffold into a new file. You don't have to use it — it's just a friendly nudge.

summary.py
import pandas as pd

def load_data() -> pd.DataFrame:
    return pd.read_csv("scholarships.csv")

def top_countries(df: pd.DataFrame, n: int = 5) -> pd.Series:
    return df["country"].value_counts().head(n)

if __name__ == "__main__":
    df = load_data()
    print("Records:", len(df))
    print(top_countries(df))
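
The scaffold above covers path A. For path B, the fetch-and-save step might be sketched as follows. The URL is a placeholder, not a real API, and the code assumes the endpoint returns a JSON array of flat objects; adjust both for the API you pick. The function names are hypothetical.

```python
import json
import urllib.request

import pandas as pd

# Placeholder endpoint: replace with the public API you choose.
API_URL = "https://example.com/api/records"


def fetch_records(url: str = API_URL) -> list[dict]:
    # Assumes the response body is a JSON array of flat objects.
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def save_records(records: list[dict], path: str) -> pd.DataFrame:
    # One row per record, one column per key; no index column in the file.
    df = pd.DataFrame.from_records(records)
    df.to_csv(path, index=False)
    return df
```

Calling `save_records(fetch_records(), "api_data.csv")` once gives you a CSV on disk, after which the rest of the analysis can start from `pd.read_csv` exactly as in path A.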

How we'll grade it

Four checks, four points. Three or above is passing — we'll ask you to revise anything we can't tick.

| Check | What we look for | Pt |
| --- | --- | --- |
| It runs | End-to-end on a fresh environment. Dependencies declared. | 1 |
| Uses pandas honestly | At least one filter and one summary operation, not just `head()`. | 1 |
| Findings are real | Three plain-English insights backed by the numbers you printed. | 1 |
| Reflection | One paragraph of what surprised you. | 1 |

Ready?
Hand it in
You can submit a draft and revise later if you're not done.