# Pandas Missing Values Handling – 30 Exercises
---
### Dataset 1: Retail Sales (CSV with
Date, Product, Sales, Discount)1. Load dataset and count total missing values in each column.
df.isnull().sum()2. Find rows where
Sales is missing.df[df['Sales'].isna()]3. Drop all rows where
Product is missing.df.dropna(subset=['Product'])4. Replace missing
Discount values with 0.df['Discount'].fillna(0)5. Fill missing
Sales values with the mean of Sales column.df['Sales'].fillna(df['Sales'].mean())6. Forward-fill (
ffill) missing Sales based on previous day’s value.df['Sales'].fillna(method='ffill')7. Backward-fill (
bfill) missing Sales.df['Sales'].fillna(method='bfill')8. Replace missing
Product names with "Unknown Product".df['Product'].fillna("Unknown Product")9. Interpolate missing
Sales values (linear interpolation).df['Sales'].interpolate()10. Drop rows where both
Sales and Discount are missing.df.dropna(subset=['Sales','Discount'], how='all')---
### Dataset 2: Healthcare Patient Records (
PatientID, Age, Weight, BloodPressure)11. Count missing values per patient.
df.isna().sum(axis=1)12. Remove patients with more than 2 missing fields.
df.dropna(thresh=df.shape[1]-2)13. Fill missing
Age values with the median age.df['Age'].fillna(df['Age'].median())14. Fill missing
Weight with the most frequent value (mode).df['Weight'].fillna(df['Weight'].mode()[0])15. Interpolate missing
BloodPressure using time-based interpolation.df['BloodPressure'].interpolate(method='time')---
### Dataset 3: Finance Transactions (
Date, CustomerID, Amount, PaymentMethod)16. Find all rows where
PaymentMethod is missing. Replace with "Cash".17. Fill missing
Amount with rounded mean.df['Amount'].fillna(round(df['Amount'].mean(), 2))18. Drop all customers who have no Amount values at all.
df.dropna(subset=['Amount'], how='all')19. Check if any entire column is missing. Drop such columns.
df.dropna(axis=1, how='all')20. Replace missing
PaymentMethod by using forward fill grouped by CustomerID.df.groupby('CustomerID')['PaymentMethod'].ffill()---
### Dataset 4: IoT Sensor Data (
Timestamp, Temperature, Humidity, Pressure)21. Find percentage of missing values for each sensor column.
(df.isna().sum()/len(df))*10022. Replace missing
Temperature values with the rolling mean of last 3 readings.df['Temperature'].fillna(df['Temperature'].rolling(3).mean())23. Forward-fill missing values but limit to 2 consecutive fills.
df['Humidity'].fillna(method='ffill', limit=2)24. Fill missing
Pressure values with interpolation using "spline" method.df['Pressure'].interpolate(method='spline', order=2)25. Detect if there are any missing values at start/end of dataset.
---
### Dataset 5: HR Employee Records (
EmpID, Name, Department, Salary, JoinDate)26. Replace missing
Department with "Not Assigned".27. Fill missing
Salary with median grouped by Department.df.groupby('Department')['Salary'].transform(lambda x: x.fillna(x.median()))28. Drop employees who have both Name and Department missing.
29. Convert missing
JoinDate to today’s date.df['JoinDate'].fillna(pd.to_datetime('today'))30. Create a report of employees with missing fields (count per employee).
df[df.isna().any(axis=1)]---
Data Files
Retail Sales

0 Comments