Exploring Zomato Restaurant Insights in Indian Cities with Python

Chapter 1: Introduction to Zomato Restaurant Analysis

In this blog post, we analyze a dataset of 900 restaurants spread across 13 major Indian cities. Using Python and its visualization libraries, we look at dining and delivery ratings, customer preferences, popular cuisines, and prices. The aim is to turn the raw data into findings that food enthusiasts, analysts, and business owners can all use.

The first video provides an overview of how to locate top dining establishments in your city using the Zomato API and Python, showcasing practical applications of data science.

Section 1.1: Overview of the Dataset

This dataset from Kaggle encompasses a wide array of information about the restaurant landscape in 13 metropolitan areas of India. It includes customer ratings for dining and delivery services, reviews, the most popular dishes, and pricing data. This extensive dataset not only allows for an examination of dining trends but also facilitates comparisons among different restaurants and cuisines.

Section 1.2: Preparing for Data Analysis

To kickstart our analysis, we need to load several essential libraries:

import pandas as pd

import numpy as np

import datetime as dt

import matplotlib.pyplot as plt

import seaborn as sns

import warnings

warnings.filterwarnings('ignore')

Next, we'll load the Zomato dataset for our analysis:

df_zomato = pd.read_csv("zomato_dataset.csv", skipinitialspace=True)
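
The skipinitialspace=True argument tells read_csv to drop spaces that follow a delimiter, which matters when a file is written as "a, b, c". The sketch below uses an invented two-line CSV (not the real Zomato file) to show the difference; the column names are borrowed from later in this post.

```python
import io

import pandas as pd

# Invented sample with a space after every comma.
raw = "City, Cuisine, Prices\nMumbai, Pizza, 450\n"

# Without the flag, the leading spaces become part of names and values.
default = pd.read_csv(io.StringIO(raw))

# With the flag, spaces after the delimiter are skipped.
trimmed = pd.read_csv(io.StringIO(raw), skipinitialspace=True)

print(list(default.columns))
print(list(trimmed.columns))
```

Without the flag, a lookup such as df['Cuisine'] would fail because the column is actually named ' Cuisine'.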

Chapter 2: Data Exploration and Insights

In this chapter, we will explore the dataset, check for data types, and clean the data.

The second video dives into the analysis of the Zomato dataset using Python, demonstrating various programming techniques for data exploration.

Section 2.1: Data Cleaning and Preparation

First, we inspect the data to see each column's data type and to spot any missing values.

df_zomato.head()

df_zomato.info()
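
The info() output lists dtypes and non-null counts, but a per-column tally of missing values can be easier to scan. A minimal sketch on a small made-up frame (the rows are invented for illustration; only the column names follow this post):

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real data comes from zomato_dataset.csv.
sample = pd.DataFrame({
    "RestaurantName": ["A", "B", "C", "D"],
    "City": ["Hyderabad", "Mumbai", "Mumbai", "Chennai"],
    "DiningRating": [4.1, np.nan, 3.8, np.nan],
    "DeliveryRating": [np.nan, 4.0, 4.2, 3.9],
})

# One row per column: its dtype, how many values are missing, and the share missing.
missing_report = pd.DataFrame({
    "dtype": sample.dtypes.astype(str),
    "missing": sample.isna().sum(),
    "missing_pct": (sample.isna().mean() * 100).round(1),
})
print(missing_report)
```

The same three expressions should work on df_zomato in place of sample.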

We will replace any missing values in the 'DiningRating' and 'DeliveryRating' columns with zero:

df_zomato['DeliveryRating'] = df_zomato['DeliveryRating'].fillna(0)

df_zomato['DiningRating'] = df_zomato['DiningRating'].fillna(0)
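
One thing to keep in mind with this choice: after the fill, a restaurant with no rating is indistinguishable from one rated 0, so any average computed later is pulled down. The sketch below, with invented values, shows the size of that effect.

```python
import numpy as np
import pandas as pd

# Invented ratings: two real values and two missing ones.
ratings = pd.Series([4.0, np.nan, 3.0, np.nan], name="DeliveryRating")

# pandas skips NaN by default, so only the two real ratings are averaged.
mean_ignoring_missing = ratings.mean()

# After fillna(0), the two missing entries count as ratings of 0.
mean_after_zero_fill = ratings.fillna(0).mean()

print(mean_ignoring_missing, mean_after_zero_fill)
```

Zero-filling is convenient for the sums used later in this post, but it is worth remembering when reading any mean or ranking built on these columns.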

Section 2.2: Extracting Insights from the Data

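We will create a helper function that draws, for each city, a horizontal bar chart of the five highest-priced entries for a chosen column (for example, cuisine or restaurant name):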

def get_plots(df, criteria, title):
    # Draw one horizontal bar chart per city showing the five highest-priced rows.
    list_cities = list(df.City.unique())
    my_colors = ['#FA8072', '#6495ED', '#40E0D0', '#808080', '#28B463']
    # One subplot per city instead of a hard-coded 17, so any number of cities works.
    n_cities = len(list_cities)
    fig, axs = plt.subplots(n_cities, 1, figsize=(10, 3.5 * n_cities), facecolor='w', edgecolor='k', squeeze=False)
    fig.subplots_adjust(hspace=.5, wspace=.001)
    axs = axs.ravel()
    for i, city in enumerate(list_cities):
        # Sort into a new frame rather than sorting the filtered slice in place.
        df_filtered = df[df['City'] == city].sort_values('Prices', ascending=False).head(5)
        bars = axs[i].barh(df_filtered[criteria], df_filtered['Prices'], color=my_colors)
        axs[i].set_title(f'{title} {criteria} in {city}')
        axs[i].bar_label(bars)

Insight 1: Most Expensive Cuisine by City

df_city = df_zomato.groupby(['City', 'Cuisine'], as_index=False)['Prices'].max()

idx = df_city.groupby('City')['Prices'].idxmax()

df_city_max_price = df_city.loc[idx]
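
The two-step pattern above (aggregate, then use idxmax to pick one row per group) is easier to follow on a tiny example. The rows below are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Invented rows for illustration only.
toy = pd.DataFrame({
    "City": ["Mumbai", "Mumbai", "Mumbai", "Chennai", "Chennai"],
    "Cuisine": ["Pizza", "Biryani", "Pizza", "Dosa", "Biryani"],
    "Prices": [450.0, 300.0, 600.0, 120.0, 350.0],
})

# Step 1: highest price seen for each (City, Cuisine) pair.
by_cuisine = toy.groupby(["City", "Cuisine"], as_index=False)["Prices"].max()

# Step 2: within each city, idxmax gives the row label of the priciest cuisine.
idx = by_cuisine.groupby("City")["Prices"].idxmax()
top_cuisine = by_cuisine.loc[idx].reset_index(drop=True)
print(top_cuisine)
```

Here the result has one row per city: Biryani at 350 for Chennai and Pizza at 600 for Mumbai.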

Insight 2: Most Popular Restaurants

We will calculate a 'Total Rating' by summing the dining and delivery ratings:

df_zomato['Total_rating'] = df_zomato['DiningRating'] + df_zomato['DeliveryRating']

df_rating = df_zomato.groupby(['City', 'RestaurantName'], as_index=False)['Total_rating'].max()
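
To turn a table like df_rating into a ranking, one option is to sort by city and rating and keep the first few rows of each group. This is only a sketch on invented rows; the choice of two restaurants per city is arbitrary.

```python
import pandas as pd

# Invented rows for illustration; a rating of 0.0 stands for a zero-filled missing value.
toy = pd.DataFrame({
    "City": ["Pune", "Pune", "Pune", "Kolkata", "Kolkata"],
    "RestaurantName": ["R1", "R2", "R3", "R4", "R5"],
    "DiningRating": [4.0, 3.5, 0.0, 4.5, 4.0],
    "DeliveryRating": [3.0, 4.5, 4.0, 0.0, 4.5],
})
toy["Total_rating"] = toy["DiningRating"] + toy["DeliveryRating"]

# Sort so the best total comes first within each city, then keep two rows per city.
top2 = (
    toy.sort_values(["City", "Total_rating"], ascending=[True, False])
       .groupby("City")
       .head(2)
)
print(top2[["City", "RestaurantName", "Total_rating"]])
```

Note how R4, which has a high dining rating but no delivery rating, ranks below R5: summing zero-filled ratings favors restaurants that have both.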

Insight 3: Cities with the Most Restaurants

num_rest = df_zomato['City'].value_counts().nlargest(12).sort_values(ascending=False)

plt.figure(figsize=(12, 6))

ax = num_rest.plot(kind='barh', color='#6495ED')

plt.xlabel("Count of Restaurants")

plt.ylabel("City")

plt.title("Number of Restaurants in Various Cities", fontsize=8, weight='bold')

ax.invert_yaxis()

plt.tight_layout()

Insight 4: Best Sellers Analysis

df_bestseller = df_zomato['BestSeller'].value_counts().nlargest(5).sort_values(ascending=False)

plt.figure(figsize=(12, 6))

plt.pie(df_bestseller, labels=df_bestseller.index, autopct='%1.1f%%', pctdistance=0.85)

plt.title('% Share of Top 5 Best Sellers')

plt.show()

Insight 5: Top 5 Restaurants Post-Consolidation

Several Bangalore neighborhoods appear as separate values in the City column, so we relabel them as 'Bangalore' before counting restaurants:

BLR_areas = ['Banaswadi', 'Ulsoor', 'Malleshwaram', 'Magrath Road']

df_BLR = df_zomato.copy()

df_BLR['City'] = df_BLR['City'].replace(BLR_areas, 'Bangalore')

df_BLR_cnt = df_BLR.groupby(['City', 'RestaurantName'], as_index=False).size()

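# Note: get_restaurant_plots is not defined in this post; it presumably mirrors get_plots but plots counts instead of 'Prices'.
get_restaurant_plots(df_BLR_cnt, "Top 5 Restaurants in ")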
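
Since the post does not show the body of get_restaurant_plots, here is one way such a helper might look. Everything in it is an assumption: the count_col and top_n parameters, the 'size' column name (what groupby(...).size() produces), and the toy data are all hypothetical, not the author's implementation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def get_restaurant_plots(df, title, count_col="size", top_n=5):
    """Hypothetical helper: per city, bar chart of the most frequent restaurant names."""
    cities = list(df["City"].unique())
    fig, axs = plt.subplots(len(cities), 1, figsize=(10, 3.5 * len(cities)), squeeze=False)
    for ax, city in zip(axs.ravel(), cities):
        # Keep the top_n rows with the largest counts for this city.
        top = df[df["City"] == city].nlargest(top_n, count_col)
        bars = ax.barh(top["RestaurantName"], top[count_col], color="#6495ED")
        ax.set_title(f"{title}{city}")
        ax.bar_label(bars)
        ax.invert_yaxis()  # largest count at the top
    fig.tight_layout()
    return fig


# Invented rows: one row per outlet, with neighborhoods already merged into cities.
outlets = pd.DataFrame({
    "City": ["Bangalore", "Bangalore", "Bangalore", "Delhi"],
    "RestaurantName": ["Cafe A", "Cafe A", "Cafe B", "Cafe A"],
})
counts = outlets.groupby(["City", "RestaurantName"], as_index=False).size()
fig = get_restaurant_plots(counts, "Top 5 Restaurants in ")
```

Returning the figure lets the caller decide whether to show it with plt.show() or save it with fig.savefig().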

Chapter 3: Conclusion

Through this analysis, we have surfaced patterns in pricing, popular dishes, and customer ratings across several Indian cities. Food lovers and business owners alike can use findings like these to make data-driven decisions. We hope you found this exploration insightful and engaging!

Connect with me on LinkedIn, GitHub, or Medium for more insights on data science using Python and R.

arsalandywriter.com