In data science, individual data points are often less informative than the overall distribution. Visualizing distributions helps us understand the central tendency, spread, and symmetry of our data. It also lets us spot outliers and check whether the data looks approximately normally distributed before we begin modeling.
1. Histograms & Density
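The histogram (histplot) groups quantitative data into 'bins' and counts the observations that fall into each one. It is effective, but the choice of bin width is critical: too few bins hide the shape of the data, while too many turn it into noise. For a smoother representation, we use kernel density estimation (kdeplot), which estimates the probability density function as a continuous curve.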
2. Joint & Bivariate Analysis
Sometimes we need to see how two variables relate to each other. jointplot draws the bivariate relationship between two variables in a central panel, and adds a univariate marginal plot for each variable along the top and right edges.
