What you'll build
Either summarize a CSV with pandas, or call a public API and save the response to a CSV first. Either way, print three useful insights.
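If you take the API route, the "save it to CSV" step might look roughly like the sketch below. Treat it as one possible shape, not the required one: `fetch_records`, `save_records`, the file name `api_data.csv`, and the sample rows are all made up for illustration, and the sketch assumes the API returns a JSON list of flat objects. The fetch function is defined but not called, so the example runs offline.

```python
import json
import tempfile
import urllib.request
from pathlib import Path

import pandas as pd


def fetch_records(url: str, timeout: float = 10.0) -> list[dict]:
    """Download JSON from `url`. Assumes the response body is a list of objects."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def save_records(records: list[dict], csv_path: Path) -> pd.DataFrame:
    """Turn a list of dicts into a DataFrame and write it to CSV without the index."""
    df = pd.DataFrame(records)
    df.to_csv(csv_path, index=False)
    return df


# Made-up rows standing in for an API response (hypothetical fields).
sample = [
    {"name": "Grant A", "country": "Kenya", "amount": 1200},
    {"name": "Grant B", "country": "Peru", "amount": 800},
    {"name": "Grant C", "country": "Kenya", "amount": 1500},
]

# Written to a temporary folder here; in your project a normal path is fine.
csv_path = Path(tempfile.mkdtemp()) / "api_data.csv"
save_records(sample, csv_path)

# From this point on, the API path looks the same as the CSV path.
df = pd.read_csv(csv_path)
print("Records:", len(df))
```

In a real run you would replace `sample` with `fetch_records(...)` pointed at the API you chose. Saving first means the analysis can be re-run without calling the API again.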
Requirements
The must-do parts. If any are missing, we'll ask you to take another pass.
- Pick one of two paths: (A) summarize a real CSV, or (B) fetch from a public API and save it to a CSV first.
- Use pandas: at minimum `read_csv`, `head`, one boolean filter, and one of `sort_values`, `value_counts`, or `groupby`.
- Print at least three findings in plain English (number of records, an average or count, a top category).
- End with a short paragraph of written insights — what surprised you in the data?
- Wrap the analysis in functions, not a single 200-line cell.
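To make the list above concrete, here is a minimal sketch that touches each required piece: `read_csv`, `head`, a boolean filter, a `groupby` with `sort_values`, and three printed findings, all wrapped in small functions. The data is made up, and the `amount` column and the function names are assumptions for illustration only. Your dataset will have its own columns.

```python
import io

import pandas as pd

# Made-up rows; "amount" is a hypothetical column used only for illustration.
SAMPLE_CSV = """name,country,amount
Grant A,Kenya,1200
Grant B,Peru,800
Grant C,Kenya,1500
Grant D,India,2000
Grant E,Kenya,600
"""


def load_data(source) -> pd.DataFrame:
    """Read a CSV from a file path or a file-like object."""
    return pd.read_csv(source)


def large_awards(df: pd.DataFrame, minimum: int = 1000) -> pd.DataFrame:
    """Boolean filter: keep rows whose amount is at least `minimum`."""
    return df[df["amount"] >= minimum]


def average_by_country(df: pd.DataFrame) -> pd.Series:
    """groupby + sort_values: mean amount per country, largest first."""
    return df.groupby("country")["amount"].mean().sort_values(ascending=False)


def findings(df: pd.DataFrame) -> list[str]:
    """Build three plain-English findings from the data."""
    top_country = df["country"].value_counts().idxmax()
    return [
        f"The dataset has {len(df)} records.",
        f"The average award is {df['amount'].mean():.0f}.",
        f"The most common country is {top_country}.",
    ]


# An in-memory CSV keeps the sketch self-contained; pass a file path in practice.
df = load_data(io.StringIO(SAMPLE_CSV))
print(df.head())
print(large_awards(df))
print(average_by_country(df))
for line in findings(df):
    print(line)
```

Returning the findings as strings, instead of printing inside the function, makes them easy to reuse later in a written report.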
Bonus, if you're feeling brave
- Plot one simple matplotlib chart.
- Compare two slices of the data side by side.
- Save the written insights to a `report.md` file.
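Two of these bonus items could be sketched as follows: comparing two slices side by side, and writing insights to `report.md`. This is only one way to do it. The `year` and `amount` columns, the sample rows, and the helper names `compare_slices` and `write_report` are hypothetical, and the chart is left out to keep the example short.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows; "year" and "amount" are hypothetical columns.
df = pd.DataFrame(
    {
        "country": ["Kenya", "Peru", "Kenya", "India", "Peru", "India"],
        "year": [2022, 2022, 2023, 2023, 2023, 2022],
        "amount": [1200, 800, 1500, 2000, 600, 1000],
    }
)


def compare_slices(df: pd.DataFrame, first: int, second: int) -> pd.DataFrame:
    """Total amount per country for two years, as two columns side by side."""
    totals = {
        year: df[df["year"] == year].groupby("country")["amount"].sum()
        for year in (first, second)
    }
    # concat with axis=1 aligns the two Series on the country index;
    # the dict keys (the years) become the column names.
    return pd.concat(totals, axis=1).fillna(0).astype(int)


def write_report(insights: list[str], path: Path) -> None:
    """Write the insights as a Markdown bullet list."""
    lines = ["# Insights", ""] + [f"- {text}" for text in insights]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


comparison = compare_slices(df, 2022, 2023)
print(comparison)

# Written to a temporary folder here; in your project, Path("report.md") is fine.
report_path = Path(tempfile.mkdtemp()) / "report.md"
write_report(
    [f"Kenya received {comparison.loc['Kenya', 2023]} in 2023."],
    report_path,
)
print(report_path.read_text(encoding="utf-8"))
```

Building the insight sentence from the computed table, instead of typing the number by hand, keeps the report in sync with the data.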
Where to start
Copy this scaffold into a new file. You don't have to use it — it's just a friendly nudge.
summary.py

import pandas as pd


def load_data() -> pd.DataFrame:
    return pd.read_csv("scholarships.csv")


def top_countries(df: pd.DataFrame, n: int = 5) -> pd.Series:
    return df["country"].value_counts().head(n)


if __name__ == "__main__":
    df = load_data()
    print("Records:", len(df))
    print(top_countries(df))

How we'll grade it
Four checks, four points. Three or above is passing — we'll ask you to revise anything we can't tick.
| Check | What we look for | Pt |
|---|---|---|
| It runs | End-to-end on a fresh environment. Dependencies declared. | 1 |
| Uses pandas honestly | At least one filter and one summary operation, not just `head()`. | 1 |
| Findings are real | Three plain-English insights backed by the numbers you printed. | 1 |
| Reflection | One paragraph of what surprised you. | 1 |