📊 Seaborn Technical Guide

Statistical Data Visualization Made Beautiful

Built on Matplotlib

Introduction to Seaborn

Seaborn is a Python data visualization library based on Matplotlib that provides a high-level interface for creating attractive and informative statistical graphics. It integrates seamlessly with pandas data structures and offers built-in themes for professional-looking plots.

🎨

Beautiful Defaults

Attractive default styles and color palettes

📈

Statistical Plots

Built-in support for complex visualizations

🔗

Pandas Integration

Works directly with DataFrames

High-Level API

Less code for complex visualizations

Installation & Setup

Install Seaborn

# Install via pip
pip install seaborn

# Install with conda
conda install seaborn

Import and Basic Setup

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set the default theme
sns.set_theme()

# Or set a specific style
sns.set_style("whitegrid")
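
Beyond the bare calls above, set_theme() accepts style, palette, and context arguments, and axes_style() can be used as a context manager for a temporary override. The sketch below is one possible combination, not a recommendation from this guide: the specific values are illustrative, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import seaborn as sns

# Global theme: affects every plot created afterwards
sns.set_theme(context="notebook", style="whitegrid", palette="deep")
grid_on_globally = plt.rcParams["axes.grid"]

# Temporary override: the "white" style (no grid) applies only inside the block
with sns.axes_style("white"):
    fig, ax = plt.subplots()
    grid_inside_block = plt.rcParams["axes.grid"]

# After the block, the global "whitegrid" settings are back in effect
grid_after_block = plt.rcParams["axes.grid"]
print(grid_on_globally, grid_inside_block, grid_after_block)
```

The context-manager form is convenient when a single figure needs a different look without disturbing the rest of a notebook or script.
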
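
💡 Built-in Datasets

Seaborn provides several example datasets that are well suited to learning and testing. Load one with sns.load_dataset("dataset_name"). The data is downloaded from seaborn's online data repository, so the first call needs an internet connection.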

Basic Plotting Concepts

Loading Data

# Load built-in datasets
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
titanic = sns.load_dataset("titanic")

# View the first few rows
print(tips.head())
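
load_dataset() returns an ordinary pandas DataFrame in long (tidy) form, with one row per observation and one column per variable. Because it fetches data over the network, a hand-built stand-in can be useful offline. The sketch below is a hypothetical substitute: the column names mirror a few of those used with tips in this guide, but every value is randomly generated for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 60

# Synthetic stand-in: one row per observation, one column per variable
demo = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n).round(2),
    "day": rng.choice(["Thur", "Fri", "Sat", "Sun"], n),
    "time": rng.choice(["Lunch", "Dinner"], n),
})
demo["tip"] = (demo["total_bill"] * rng.uniform(0.10, 0.25, n)).round(2)

# An ordered categorical column controls the order of categories on plot axes
demo["day"] = pd.Categorical(demo["day"], categories=["Thur", "Fri", "Sat", "Sun"])

print(demo.head())
print(demo.dtypes)
```

Any DataFrame shaped like this can be passed to the data= parameter used throughout the examples in this guide.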

Figure-Level vs Axes-Level Functions

Understanding Seaborn's Interface
  • Axes-level: functions such as scatterplot() and lineplot() draw onto a single Matplotlib Axes and return it
  • Figure-level: functions such as relplot() and catplot() create their own figure and can facet the data across multiple subplots

# Axes-level function
sns.scatterplot(data=tips, x="total_bill", y="tip")
plt.show()

# Figure-level function
sns.relplot(data=tips, x="total_bill", y="tip", hue="smoker", col="time")
plt.show()

Distribution Plots

Histograms and KDE

# Histogram
sns.histplot(data=tips, x="total_bill")

# Histogram with KDE overlay
sns.histplot(data=tips, x="total_bill", kde=True)

# KDE plot
sns.kdeplot(data=tips, x="total_bill")

# Multiple distributions
sns.kdeplot(data=tips, x="total_bill", hue="time")

# 2D KDE (bivariate)
sns.kdeplot(data=tips, x="total_bill", y="tip")

Distribution Plot (displot)

# Figure-level distribution plot
sns.displot(data=tips, x="total_bill", hue="time", kind="kde", fill=True)

# Histogram with facets
sns.displot(data=tips, x="total_bill", col="time", row="sex", kde=True)

# ECDF plot
sns.displot(data=tips, x="total_bill", kind="ecdf")

Rug Plot

# Add a rug plot to show individual observations
sns.kdeplot(data=tips, x="total_bill")
sns.rugplot(data=tips, x="total_bill")

Categorical Plots

Box and Violin Plots

# Box plot
sns.boxplot(data=tips, x="day", y="total_bill")

# Box plot with hue
sns.boxplot(data=tips, x="day", y="total_bill", hue="smoker")

# Violin plot (shows distribution)
sns.violinplot(data=tips, x="day", y="total_bill")

# Split violin plot
sns.violinplot(data=tips, x="day", y="total_bill", hue="sex", split=True)

Strip and Swarm Plots

# Strip plot (scattered points)
sns.stripplot(data=tips, x="day", y="total_bill")

# Swarm plot (non-overlapping points)
sns.swarmplot(data=tips, x="day", y="total_bill")

# Combine box plot with swarm plot
sns.boxplot(data=tips, x="day", y="total_bill")
sns.swarmplot(data=tips, x="day", y="total_bill", color="black", alpha=0.5)

Bar and Count Plots

# Bar plot (shows mean with confidence interval)
sns.barplot(data=tips, x="day", y="total_bill")

# Count plot (counts occurrences)
sns.countplot(data=tips, x="day")

# Count plot with hue
sns.countplot(data=tips, x="day", hue="sex")

Point Plot

# Point plot (shows mean and CI as points and lines)
sns.pointplot(data=tips, x="day", y="total_bill", hue="sex")

Categorical Plot (catplot)

# Figure-level categorical plot
sns.catplot(data=tips, x="day", y="total_bill", kind="box", col="time")

# Violin plot with facets
sns.catplot(data=tips, x="day", y="total_bill", kind="violin", hue="sex", col="time")

Relational Plots

Scatter Plots

# Basic scatter plot
sns.scatterplot(data=tips, x="total_bill", y="tip")

# With hue (color)
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")

# With size
sns.scatterplot(data=tips, x="total_bill", y="tip", size="size", hue="day")

# With style (marker)
sns.scatterplot(data=tips, x="total_bill", y="tip", style="smoker", hue="day")

Line Plots

# Load time series data
flights = sns.load_dataset("flights")

# Basic line plot
sns.lineplot(data=flights, x="year", y="passengers")

# Multiple lines with hue
sns.lineplot(data=flights, x="month", y="passengers", hue="year")

# Aggregated line plot with CI
fmri = sns.load_dataset("fmri")
sns.lineplot(data=fmri, x="timepoint", y="signal", hue="event")

Relational Plot (relplot)

# Figure-level relational plot
sns.relplot(data=tips, x="total_bill", y="tip", hue="smoker", col="time", row="sex")

# Line plot with facets
sns.relplot(data=flights, x="year", y="passengers", kind="line", col="month", col_wrap=4)

Matrix and Heatmaps

Heatmap

# Create correlation matrix
corr = tips.corr(numeric_only=True)

# Basic heatmap
sns.heatmap(corr)

# Heatmap with annotations
sns.heatmap(corr, annot=True, fmt=".2f", cmap