Data Analytics On Tips Dataset 

In this project, I analyzed the popular “Tips” dataset from the Seaborn library. This dataset contains information about restaurant bills, tips, and customer details.

The goals were to:

  1. Explore and visualize tipping patterns.

  2. Discover factors that influence tip amounts.

  3. Build a simple predictive model to estimate tips based on bill size and other factors.



1. Dataset & Tools

    Dataset: tips (Seaborn built-in dataset)
    Tools Used: Python, Pandas, Seaborn, Matplotlib, Scikit-learn
    Skills Demonstrated: Data Cleaning, EDA, Data Visualization, Linear Regression

2. Load and Explore the Dataset

tips.head()
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
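
The loading step is not shown in this post. With Seaborn it would typically be `tips = sns.load_dataset('tips')`, which fetches the data online on first use. The sketch below is an offline stand-in: it rebuilds only the five rows displayed above (the name `sample` is illustrative, not from the original notebook) and runs a few common first-look checks.

```python
import pandas as pd

# Offline stand-in for sns.load_dataset("tips"): only the five rows shown above.
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No", "No", "No", "No", "No"],
    "day": ["Sun", "Sun", "Sun", "Sun", "Sun"],
    "time": ["Dinner"] * 5,
    "size": [2, 3, 3, 2, 4],
})

print(sample.shape)           # (rows, columns)
print(sample.dtypes)          # column types
print(sample.isna().sum())    # missing values per column
print(sample.describe())      # summary statistics for the numeric columns
```

On the full dataset the same four calls give a quick picture of size, types, gaps, and value ranges before any analysis starts.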

3. Apply a Lambda Function

tips['total_bill'].apply(lambda bill: bill * 0.1)
0      1.699
1      1.034
2      2.101
3      2.368
4      2.459
       ...  
239    2.903
240    2.718
241    2.267
242    1.782
243    1.878
Name: total_bill, Length: 244, dtype: float64
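
For simple arithmetic like this, the lambda is not strictly needed: multiplying the column directly gives the same values and is usually faster on large frames. A minimal sketch, using the first five bills shown earlier as a stand-in for the full column (`bills`, `with_apply`, and `vectorized` are illustrative names):

```python
import pandas as pd

# First five total_bill values from the dataset, used as a small stand-in.
bills = pd.Series([16.99, 10.34, 21.01, 23.68, 24.59], name="total_bill")

with_apply = bills.apply(lambda bill: bill * 0.1)   # calls the lambda once per row
vectorized = bills * 0.1                            # one operation on the whole column

print(vectorized)
```

`apply` with a lambda is still the right tool when the per-row logic cannot be written as a column expression.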

Applying Discounts

def discount(tot_bill):
    # 10% discount for bills over 10, otherwise 5%
    if tot_bill > 10:
        return 0.1
    return 0.05

tips['total_bill'].apply(discount)
0      0.1
1      0.1
2      0.1
3      0.1
4      0.1
      ... 
239    0.1
240    0.1
241    0.1
242    0.1
243    0.1
Name: total_bill, Length: 244, dtype: float64
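
The same two-tier rule can also be written without a helper function, using `numpy.where`. This is a sketch rather than the approach used above; the first three bills come from the dataset, and the 8.50 value is made up so that the 5% branch is visible.

```python
import numpy as np
import pandas as pd

# First three bills are from the dataset; 8.50 is a made-up value that hits the 5% branch.
bills = pd.Series([16.99, 10.34, 21.01, 8.50], name="total_bill")

# 10% where the bill is over 10, otherwise 5%
rates = pd.Series(np.where(bills > 10, 0.1, 0.05), index=bills.index, name="discount_rate")

print(rates.tolist())
```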

Processing a DataFrame through a Loop

for i, r in tips.iterrows():
    # Amount actually paid: bill plus tip
    total_paid = r.total_bill + r.tip

    # 20% discount for non-smokers paying more than 20 on a Sunday, otherwise 0.5%
    if total_paid > 20 and r.day == 'Sun' and r.smoker == 'No':
        discount = total_paid * 0.20
    else:
        discount = total_paid * 0.005

    print(total_paid, discount)
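
`iterrows()` is easy to read but slow on large frames, because it builds a Series for every row. The same rule can be expressed with a boolean mask over whole columns. A minimal sketch on three rows from the dataset (the names `df` and `big_sunday_nonsmoker` are illustrative):

```python
import numpy as np
import pandas as pd

# Three rows from the dataset, with only the columns the rule needs.
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "smoker": ["No", "No", "No"],
    "day": ["Sun", "Sun", "Sun"],
})

df["total_paid"] = df["total_bill"] + df["tip"]

# Same condition as the loop, evaluated for every row at once
big_sunday_nonsmoker = (df["total_paid"] > 20) & (df["day"] == "Sun") & (df["smoker"] == "No")
df["discount"] = np.where(big_sunday_nonsmoker, df["total_paid"] * 0.20, df["total_paid"] * 0.005)

print(df[["total_paid", "discount"]])
```

Only the third row pays more than 20 in total, so it is the only one that gets the 20% discount here.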
       



for i, r in tips.iterrows():
    total_paid = r.total_bill + r.tip

    # Print the rows where a female customer paid more than 25 in total
    if r.sex == 'Female' and total_paid > 25:
        print(i, r.day, r.time, total_paid)
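
A filter like this is also a natural fit for boolean indexing, which returns the matching rows as a DataFrame instead of printing them one by one. A sketch on three rows from the dataset (`df` and `selected` are illustrative names):

```python
import pandas as pd

# Three rows from the dataset, with only the columns the filter needs.
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 24.59],
    "tip": [1.01, 1.66, 3.61],
    "sex": ["Female", "Male", "Female"],
    "day": ["Sun", "Sun", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner"],
})

total_paid = df["total_bill"] + df["tip"]

# Keep female customers who paid more than 25 in total
selected = df.loc[(df["sex"] == "Female") & (total_paid > 25), ["day", "time"]]
selected = selected.assign(total_paid=total_paid)  # aligns on the row index

print(selected)
```

The result keeps the original row index, so it matches the `i` printed by the loop.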

Changing Data Types in a DataFrame

tips3_df['discount'] = tips3_df['discount'].astype('int')
tips3_df.dtypes
total_bill    float64
tip           float64
sex            object
smoker         object
day            object
time           object
size            int64
discount        int32
dtype: object
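
One thing to watch with `astype('int')` on a float column: it drops the fractional part rather than rounding. The sketch below uses made-up discount amounts to show the difference. It also asks for `int64` explicitly, since the `int32` in the output above probably reflects a platform default for `'int'` and may differ on other machines.

```python
import pandas as pd

# Made-up discount amounts, chosen to show what happens to the fractional part.
discounts = pd.Series([0.09, 4.902, 5.64], name="discount")

truncated = discounts.astype("int64")          # drops the fraction
rounded = discounts.round().astype("int64")    # rounds to the nearest whole number first

print(truncated.tolist(), rounded.tolist())
```

If the cents matter, keeping the column as `float64` is the safer choice.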
