# Pandas Missing Values Handling – 30 Exercises
---
### Dataset 1: Retail Sales (CSV with
Date
, Product
, Sales
, Discount
)1. Load dataset and count total missing values in each column.
df.isnull().sum()
2. Find rows where
Sales
is missing.df[df['Sales'].isna()]
3. Drop all rows where
Product
is missing.df.dropna(subset=['Product'])
4. Replace missing
Discount
values with 0
.df['Discount'].fillna(0)
5. Fill missing
Sales
values with the mean of Sales column.df['Sales'].fillna(df['Sales'].mean())
6. Forward-fill (
ffill
) missing Sales
based on previous day’s value.df['Sales'].fillna(method='ffill')
7. Backward-fill (
bfill
) missing Sales
.df['Sales'].fillna(method='bfill')
8. Replace missing
Product
names with "Unknown Product"
.df['Product'].fillna("Unknown Product")
9. Interpolate missing
Sales
values (linear interpolation).df['Sales'].interpolate()
10. Drop rows where both
Sales
and Discount
are missing.df.dropna(subset=['Sales','Discount'], how='all')
---
### Dataset 2: Healthcare Patient Records (
PatientID
, Age
, Weight
, BloodPressure
)11. Count missing values per patient.
df.isna().sum(axis=1)
12. Remove patients with more than 2 missing fields.
df.dropna(thresh=df.shape[1]-2)
13. Fill missing
Age
values with the median age.df['Age'].fillna(df['Age'].median())
14. Fill missing
Weight
with the most frequent value (mode).df['Weight'].fillna(df['Weight'].mode()[0])
15. Interpolate missing
BloodPressure
using time-based interpolation.df['BloodPressure'].interpolate(method='time')
---
### Dataset 3: Finance Transactions (
Date
, CustomerID
, Amount
, PaymentMethod
)16. Find all rows where
PaymentMethod
is missing. Replace with "Cash"
.17. Fill missing
Amount
with rounded mean.df['Amount'].fillna(round(df['Amount'].mean(), 2))
18. Drop all customers who have no Amount values at all.
df.dropna(subset=['Amount'], how='all')
19. Check if any entire column is missing. Drop such columns.
df.dropna(axis=1, how='all')
20. Replace missing
PaymentMethod
by using forward fill grouped by CustomerID.df.groupby('CustomerID')['PaymentMethod'].ffill()
---
### Dataset 4: IoT Sensor Data (
Timestamp
, Temperature
, Humidity
, Pressure
)21. Find percentage of missing values for each sensor column.
(df.isna().sum()/len(df))*100
22. Replace missing
Temperature
values with the rolling mean of last 3 readings.df['Temperature'].fillna(df['Temperature'].rolling(3).mean())
23. Forward-fill missing values but limit to 2 consecutive fills.
df['Humidity'].fillna(method='ffill', limit=2)
24. Fill missing
Pressure
values with interpolation using "spline"
method.df['Pressure'].interpolate(method='spline', order=2)
25. Detect if there are any missing values at start/end of dataset.
---
### Dataset 5: HR Employee Records (
EmpID
, Name
, Department
, Salary
, JoinDate
)26. Replace missing
Department
with "Not Assigned"
.27. Fill missing
Salary
with median grouped by Department.df.groupby('Department')['Salary'].transform(lambda x: x.fillna(x.median()))
28. Drop employees who have both Name and Department missing.
29. Convert missing
JoinDate
to today’s date.df['JoinDate'].fillna(pd.to_datetime('today'))
30. Create a report of employees with missing fields (count per employee).
df[df.isna().any(axis=1)]
---
Data Files
Retail Sales
0 Comments