This notebook contains an excerpt from the Python Data Science Handbook by Jake VanderPlas with additional 3D visualization examples. The text is released under the CC-BY-NC-ND license, and code is released under the MIT license.
Setup
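The original setup cell isn't shown here; a minimal sketch that covers the examples below (the seed value is arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)  # seeded generator for reproducible samples
```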
Generating 3D Gaussian Data
First, let’s create a 3D Gaussian distribution with a specific covariance structure:
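A sketch of one way to do this; the covariance entries and sample count are illustrative, not the original notebook's values:

```python
# A positive-definite covariance matrix with correlated components
cov = np.array([[2.0, 0.8, 0.6],
                [0.8, 1.0, 0.4],
                [0.6, 0.4, 0.5]])
mean = np.zeros(3)

# Draw 500 samples from the 3D Gaussian
X = rng.multivariate_normal(mean, cov, size=500)
print(X.shape)  # (500, 3)
```

Visualizing 3D Data

A quick static view with Matplotlib's 3D projection:

```python
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection='3d')
ax.scatter(X[:, 0], X[:, 1], X[:, 2], s=10, alpha=0.5)
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
ax.set_title('Samples from a correlated 3D Gaussian')
plt.show()
```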
Computing Principal Components via SVD
We can find the principal components by performing Singular Value Decomposition on the covariance matrix:
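A sketch, assuming the samples X from the cell above; for a symmetric covariance matrix the SVD coincides with the eigendecomposition, so the columns of U are the principal directions and S holds the variances along them:

```python
# Center the data and form the empirical covariance matrix
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

# SVD of the symmetric covariance matrix
U, S, Vt = np.linalg.svd(C)
print("principal directions (columns):\n", U)
print("variances along them:", S)
```

2D Projection

Projecting the centered data onto the first two principal directions gives the 2D representation (a sketch, continuing from the variables above):

```python
# Project onto the top two principal components
X_2d = Xc @ U[:, :2]

plt.figure(figsize=(5, 5))
plt.scatter(X_2d[:, 0], X_2d[:, 1], s=10, alpha=0.5)
plt.xlabel('PC 1'); plt.ylabel('PC 2')
plt.axis('equal')
plt.show()
```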
Verifying Decorrelation
A key property of PCA is that the projected data has uncorrelated components:
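One way to check this, using X_2d and S from the cells above: the off-diagonal entries of the projected covariance should be near zero, and the diagonal should match the leading singular values:

```python
# Covariance of the projected data: off-diagonals should be ~0
C_proj = np.cov(X_2d, rowvar=False)
print(np.round(C_proj, 3))

# Compare the diagonal against the top two singular values
print(np.round(S[:2], 3))
```

Visualizing the Principal Plane in 3D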
We can visualize how the 2D projection relates to the original 3D space:
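A sketch that overlays the plane spanned by the first two principal directions on the 3D scatter (the grid extent is chosen by eye):

```python
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection='3d')
ax.scatter(X[:, 0], X[:, 1], X[:, 2], s=10, alpha=0.3)

# Parametrize the plane spanned by the first two principal directions
grid = np.linspace(-3, 3, 10)
a, b = np.meshgrid(grid, grid)
plane = a[..., None] * U[:, 0] + b[..., None] * U[:, 1] + X.mean(axis=0)
ax.plot_surface(plane[..., 0], plane[..., 1], plane[..., 2],
                alpha=0.2, color='orange')
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
```

Interactive 3D Visualization with Plotly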
For interactive exploration, we can use Plotly:
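A minimal sketch with plotly.graph_objects (marker size and opacity are arbitrary choices):

```python
import plotly.graph_objects as go

fig = go.Figure(data=[go.Scatter3d(
    x=X[:, 0], y=X[:, 1], z=X[:, 2],
    mode='markers',
    marker=dict(size=2, opacity=0.6),
)])
fig.update_layout(scene=dict(xaxis_title='x', yaxis_title='y', zaxis_title='z'))
fig.show()
```

Using Scikit-Learn PCA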
We can also use scikit-learn’s PCA implementation:
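Something like the following; the rows of components_ should match the columns of U from the SVD above, up to sign:

```python
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

print("components (rows):\n", pca.components_)
print("explained variance:", pca.explained_variance_)
```

Visualizing Principal Axes

A sketch that draws the three principal axes on the 3D scatter, with arrow lengths of two standard deviations along each direction:

```python
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection='3d')
ax.scatter(X[:, 0], X[:, 1], X[:, 2], s=10, alpha=0.3)

center = X.mean(axis=0)
for var, direction in zip(S, U.T):
    # Arrow spanning ~2 standard deviations along this principal axis
    ax.quiver(*center, *(2 * np.sqrt(var) * direction),
              color='red', linewidth=2)
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
```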
Input vs Principal Components Comparison
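A side-by-side sketch contrasting two of the original (correlated) coordinates with the decorrelated principal-component coordinates, using X and X_pca from the cells above:

```python
fig, ax = plt.subplots(1, 2, figsize=(10, 4))

# Left: first two input coordinates, which are correlated
ax[0].scatter(X[:, 0], X[:, 1], s=10, alpha=0.4)
ax[0].set(xlabel='x', ylabel='y', title='input', aspect='equal')

# Right: the same data in the principal-component basis
ax[1].scatter(X_pca[:, 0], X_pca[:, 1], s=10, alpha=0.4)
ax[1].set(xlabel='PC 1', ylabel='PC 2',
          title='principal components', aspect='equal')

plt.show()
```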
Key Insights
- Decorrelation: PCA transforms correlated variables into uncorrelated principal components
- Variance Maximization: The first principal component captures the direction of maximum variance
- Orthogonality: Principal components are orthogonal to each other
- Dimensionality Reduction: We can project high-dimensional data onto a lower-dimensional subspace while preserving maximum variance

