Airbnb Data Exploration

A comprehensive exploration of the Airbnb dataset, focusing on dataset overview, data types, missing values, zero values, and value distributions.


Table of Contents

  1. Dataset Overview: A snapshot of the Airbnb dataset.
  2. Data Types Distribution: The data types present in the dataset.
  3. Columns with Missing Values: Columns that contain missing data.
  4. Columns with Zero Values: Columns that contain zeros.
  5. Distribution of `room_type`: The types of rooms available.
  6. Distribution of `burough`: The spread of listings across boroughs.

1. Dataset Overview

Overview: Summarizes the size, structure, and completeness of the dataset.

Explanation: This function gives a snapshot of the dataset: the number of rows and columns, duplicate rows, columns with missing or zero values, and a summary of the data types.

pp.overview(airbnb)

| Description | Count |
| --- | --- |
| Total Rows | 10,019 |
| Total Columns | 22 |
| Columns with Missing Values | 4 |
| Total Duplicate Rows | 13 |
| Most Frequent Data Type | object |
| Columns with Binary Values | 0 |
| Columns with Zero Values | 10 |
| Unique Data Types | 3 |
| Numeric Columns | 13 |
| Non-Numeric Columns | 9 |

The dataset has 10,019 rows and 22 columns (13 numeric, 9 non-numeric). Four columns contain missing values, ten contain zeros, and 13 rows are duplicates.
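The implementation of `pp.overview` is not shown here, so the following is only a rough pandas sketch of how similar summary figures could be computed. The helper name `overview_sketch` and the toy DataFrame are hypothetical. The zero check is assumed to apply to numeric columns only, and the binary-column count is left out because its definition is not given.

```python
import pandas as pd


def overview_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize a DataFrame's size, completeness, and dtypes (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")
    stats = {
        "Total Rows": len(df),
        "Total Columns": df.shape[1],
        "Columns with Missing Values": int(df.isna().any().sum()),
        "Total Duplicate Rows": int(df.duplicated().sum()),
        # Assumption: only numeric columns are checked for zeros.
        "Columns with Zero Values": int((numeric == 0).any().sum()),
        "Unique Data Types": int(df.dtypes.nunique()),
        "Numeric Columns": numeric.shape[1],
        "Non-Numeric Columns": df.shape[1] - numeric.shape[1],
    }
    return pd.DataFrame({"Description": list(stats), "Count": list(stats.values())})


# Toy data for illustration only; this is not the Airbnb dataset.
toy = pd.DataFrame({
    "name": ["Loft", None, "Studio", "Studio"],
    "price": [120.0, 0.0, 85.0, 85.0],
    "reviews": [3, 0, 7, 7],
})
summary = overview_sketch(toy)
print(summary.to_string(index=False))
```

Note that `DataFrame.duplicated()` flags only the second and later copies of a row, so a row that appears twice counts as one duplicate.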

2. Data Types Distribution

Overview: Shows how the dataset's columns are distributed across data types.

Explanation: This function counts the columns of each data type and reports each type's share as a percentage.

pp.datatypes(airbnb)

| Data Type | Column Count | % Distribution |
| --- | --- | --- |
| object | 8 | 50.0 |
| int64 | 4 | 25.0 |
| float64 | 4 | 25.0 |

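Half of the columns listed are of type object (8), with int64 and float64 accounting for 4 each. These counts sum to 16 columns rather than the 22 reported in the overview; the source does not explain the difference.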
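A comparable breakdown can be sketched in plain pandas by counting the entries of `DataFrame.dtypes`. This is an approximation under stated assumptions, not the source of `pp.datatypes`. The helper name `datatypes_sketch` and the toy data are hypothetical.

```python
import pandas as pd


def datatypes_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Count columns per dtype and express each count as a percentage."""
    # Convert dtype objects to their string names, then count occurrences.
    counts = df.dtypes.astype(str).value_counts()
    return pd.DataFrame({
        "Data Type": counts.index,
        "Column Count": counts.to_numpy(),
        "% Distribution": (counts / counts.sum() * 100).round(2).to_numpy(),
    })


# Toy data for illustration only; this is not the Airbnb dataset.
toy = pd.DataFrame({
    "name": ["Loft", "Studio"],
    "price": [120.0, 85.0],
    "revenue": [900.0, 400.0],
    "reviews": [3, 7],
})
dtype_table = datatypes_sketch(toy)
print(dtype_table.to_string(index=False))
```

`value_counts()` sorts by count in descending order, so the most common data type appears first.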

3. Columns with Missing Values

Overview: Identifies the columns that contain missing values.

Explanation: This function lists each column that has missing values, along with the number of missing entries and the percentage of rows they represent.

pp.missing(airbnb)

| Column Name | Missing Count | Missing % |
| --- | --- | --- |
| price | 238 | 2.0 |
| estimated_revenue | 238 | 2.0 |
| name | 5 | 0.0 |
| host_name | 2 | 0.0 |

`price` and `estimated_revenue` are each missing in 238 of the 10,019 rows (2.0% as reported), while `name` and `host_name` are missing in only 5 and 2 rows.
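A missing-value report of this kind can be approximated with `isna()`. The sketch below is an assumption about the general approach, not the code behind `pp.missing`. The helper name `missing_sketch` and the toy data are hypothetical, and it rounds to two decimals rather than matching the rounding in the table above.

```python
import pandas as pd


def missing_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """List columns that contain missing values, with counts and percentages."""
    counts = df.isna().sum()
    # Keep only columns with at least one missing value, largest first.
    counts = counts[counts > 0].sort_values(ascending=False)
    return pd.DataFrame({
        "Column Name": counts.index,
        "Missing Count": counts.to_numpy(),
        "Missing %": (counts / len(df) * 100).round(2).to_numpy(),
    })


# Toy data for illustration only; this is not the Airbnb dataset.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["A", None, "C", "D"],
    "price": [100.0, None, None, 80.0],
})
missing_table = missing_sketch(toy)
print(missing_table.to_string(index=False))
```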

4. Columns with Zero Values

Overview: Identifies the columns that contain zeros.

Explanation: This function lists each column that contains zeros, along with the number of zero entries and the percentage of rows they represent.

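pp.zeros(airbnb)

| Column Name | Zero Count | Zero % |
| --- | --- | --- |
| availability_365 | 3614 | 36.07 |
| number_of_reviews | 2075 | 20.71 |
| last_review | 2075 | 20.71 |
| reviews_per_month | 2075 | 20.71 |
| rating | 2075 | 20.71 |
| number_of_stays | 2075 | 20.71 |
| 5_stars | 2075 | 20.71 |
| occupancy | 290 | 2.89 |
| estimated_revenue | 288 | 2.87 |
| price | 2 | 0.02 |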

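`availability_365` has the most zeros (3,614 rows, 36.07%). Six columns, from `number_of_reviews` through `5_stars`, share the same zero count of 2,075 (20.71%), while `price` is zero in only 2 rows.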
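Zero counts can be sketched the same way by comparing values against 0. The helper `zeros_sketch` and the toy data are hypothetical. The sketch checks numeric columns only, which is an assumption; `pp.zeros` may treat non-numeric columns such as `last_review` differently.

```python
import pandas as pd


def zeros_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """List numeric columns that contain zeros, with counts and percentages."""
    # Assumption: only numeric columns are inspected.
    numeric = df.select_dtypes(include="number")
    counts = (numeric == 0).sum()
    # Keep only columns with at least one zero, largest first.
    counts = counts[counts > 0].sort_values(ascending=False)
    return pd.DataFrame({
        "Column Name": counts.index,
        "Zero Count": counts.to_numpy(),
        "Zero %": (counts / len(df) * 100).round(2).to_numpy(),
    })


# Toy data for illustration only; this is not the Airbnb dataset.
toy = pd.DataFrame({
    "availability": [0, 0, 0, 10],
    "reviews": [0, 5, 2, 1],
    "price": [100.0, 90.0, 80.0, 70.0],
})
zero_table = zeros_sketch(toy)
print(zero_table.to_string(index=False))
```

Missing values compare as unequal to 0, so they are not counted as zeros in this sketch.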

5. Distribution of `room_type`

Overview: Shows how listings are distributed across the values of the `room_type` column.

Explanation: This function counts the listings for each room type and reports each type's share as a percentage.

pp.distribution(airbnb, 'room_type')

| room_type | Count | % |
| --- | --- | --- |
| Entire Home/Apt | 5186 | 51.76 |
| Private Room | 4607 | 45.98 |
| Shared Room | 226 | 2.26 |

Entire homes and apartments (51.76%) and private rooms (45.98%) account for nearly all listings; shared rooms make up the remaining 2.26%.
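For a single categorical column, a similar count-and-percentage table can be built from `Series.value_counts()`. The helper `distribution_sketch` and the toy data are hypothetical, and this is a sketch of the general technique rather than the implementation of `pp.distribution`.

```python
import pandas as pd


def distribution_sketch(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Count each distinct value in a column and compute its percentage share."""
    # value_counts() sorts by frequency and ignores missing values by default.
    counts = df[column].value_counts()
    return pd.DataFrame({
        column: counts.index,
        "Count": counts.to_numpy(),
        "%": (counts / counts.sum() * 100).round(2).to_numpy(),
    })


# Toy data for illustration only; this is not the Airbnb dataset.
toy = pd.DataFrame({
    "room_type": ["Entire Home/Apt"] * 5 + ["Private Room"] * 4 + ["Shared Room"],
})
room_table = distribution_sketch(toy, "room_type")
print(room_table.to_string(index=False))
```

Because missing values are dropped before counting, the percentages here are shares of the non-missing entries. The same helper applies unchanged to any other categorical column, such as `burough`.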

6. Distribution of `burough`

Overview: Shows how listings are distributed across the values of the `burough` column.

Explanation: This function counts the listings in each borough and reports each borough's share as a percentage.

pp.distribution(airbnb, 'burough')

| burough | Count | % |
| --- | --- | --- |
| Manhattan | 4449 | 44.41 |
| Brooklyn | 4086 | 40.78 |
| Queens | 1182 | 11.80 |
| Bronx | 229 | 2.29 |
| Staten Island | 73 | 0.73 |

Manhattan (44.41%) and Brooklyn (40.78%) together account for about 85% of listings. Queens follows with 11.80%, while the Bronx and Staten Island together make up roughly 3%.