Create Scatter Plot with Linear Regression Line of Best Fit in Python
To create a scatter plot with a linear regression line of best fit in Python we can use seaborn's sns.lmplot(). The title is added with plt.title().
Steps
- Import libraries
- Prepare Data
- Select columns X and Y
- Map values to categories
- Select chart type
- Set Style
  - chart height
  - aspect
  - colors
  - points decoration
- Add title
- Show plot
More information can be found: Creating multiple subplots using
Data
We load the flights dataset that ships with seaborn. Additionally, we map each month to a season in order to get categorical data.
| | year | month | passengers | season |
|---|---|---|---|---|
| 0 | 1949 | Jan | 112 | 1 |
| 1 | 1949 | Feb | 118 | 1 |
| 2 | 1949 | Mar | 132 | 1 |
| 3 | 1949 | Apr | 129 | 2 |
| 4 | 1949 | May | 121 | 2 |
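The mapping step can be tried without downloading anything. The sketch below is a stand-in, not the real dataset: it rebuilds only the five rows shown above in a small DataFrame (the names flights_head and season_map are ours) and applies the same month-to-season mapping that the full example uses.

```python
import pandas as pd

# Stand-in for sns.load_dataset("flights"): only the five rows shown above.
flights_head = pd.DataFrame({
    "year": [1949, 1949, 1949, 1949, 1949],
    "month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "passengers": [112, 118, 132, 129, 121],
})

# Same mapping as in the full example: 1 = Oct to Mar, 2 = Apr to Sep.
season_map = {
    "Jan": 1, "Feb": 1, "Mar": 1, "Oct": 1, "Nov": 1, "Dec": 1,
    "Apr": 2, "May": 2, "Jun": 2, "Jul": 2, "Aug": 2, "Sep": 2,
}
flights_head["season"] = flights_head["month"].map(season_map)

print(flights_head)
```

With the real dataset the same .map() call is applied to flights['month'].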
Example
To plot a scatter plot with a line of best fit we use:
sns.lmplot()
- provide X and Y
- optionally set the Y axis label with plt.ylabel('Y')
Full example:
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import seaborn as sns; sns.set_theme()

# load the flights dataset and map each month to a season (1 = Oct-Mar, 2 = Apr-Sep)
flights = sns.load_dataset("flights")
flights['season'] = flights['month'].map({
    "Jan": 1, "Feb": 1, "Mar": 1, "Apr": 2, "May": 2, "Jun": 2,
    "Jul": 2, "Aug": 2, "Sep": 2, "Oct": 1, "Nov": 1, "Dec": 1,
})

# scatter plot with one regression line per season
# robust=True needs the statsmodels package
sns.set_style("white")
gridobj = sns.lmplot(x="year", y="passengers", hue="season", data=flights,
                     height=9, aspect=1.6, robust=True, palette='tab10',
                     scatter_kws=dict(s=100, linewidths=.9, edgecolors='black'))

plt.title("Scatterplot with line of best fit", fontsize=20)
plt.show()
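The plot shows the fitted lines but not their slope and intercept as numbers. One possible way to get them is a separate fit with np.polyfit(), sketched below. This is an assumption on our side, not part of the example above: fit_lines is a hypothetical helper, the data is made up for illustration, and the fit is ordinary least squares, so the values can differ from the lines drawn with robust=True.

```python
import numpy as np
import pandas as pd


def fit_lines(df, x, y, group):
    """Return {group value: (slope, intercept)} from a degree-1 least-squares fit."""
    lines = {}
    for key, part in df.groupby(group):
        slope, intercept = np.polyfit(part[x], part[y], deg=1)
        lines[key] = (float(slope), float(intercept))
    return lines


# Toy data, made up for illustration:
# group 1 follows y = 2x + 1, group 2 follows y = 3x.
toy = pd.DataFrame({
    "x": [0, 1, 2, 3, 0, 1, 2, 3],
    "y": [1, 3, 5, 7, 0, 3, 6, 9],
    "group": [1, 1, 1, 1, 2, 2, 2, 2],
})

for key, (slope, intercept) in fit_lines(toy, "x", "y", "group").items():
    print(f"group {key}: slope={slope:.2f}, intercept={intercept:.2f}")
```

For the flights data the call would be fit_lines(flights, "year", "passengers", "season").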