Clustering & PCA Tutorial

Learn how to use clustering and principal component analysis to identify conformational states in MD trajectories.

Prerequisites

  • An MD trajectory loaded in CatGo

Step 1: Select Features

Choose the structural descriptor for clustering:

  • Atomic positions (after alignment)
  • Pairwise distances
  • Dihedral angles
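As an illustration of the pairwise-distance option, the sketch below turns each frame into a vector of interatomic distances. It does not use CatGo's own API: the function name pairwise_distance_features and the toy trajectory are hypothetical, and only the Python standard library is assumed.

```python
import math
from itertools import combinations

def pairwise_distance_features(frame):
    """Return all unique interatomic distances for one frame.

    `frame` is a list of (x, y, z) coordinates, one per selected atom.
    """
    return [math.dist(a, b) for a, b in combinations(frame, 2)]

# Hypothetical 3-atom trajectory with two frames (coordinates are made up).
trajectory = [
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 8.0, 0.0)],
]

# One feature row per frame: distances for atom pairs (0,1), (0,2), (1,2).
features = [pairwise_distance_features(frame) for frame in trajectory]
```

Distances do not change when the whole structure is rotated or translated, so this descriptor needs no prior alignment, whereas raw atomic positions do. The trade-off is that the number of features grows quadratically with the number of selected atoms.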

Step 2: Run PCA

Dimensionality Reduction

PCA projects the high-dimensional feature data onto a small set of orthogonal principal components, ordered by the amount of variance each one captures.
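To make the projection concrete, here is a minimal sketch of PCA using NumPy's singular value decomposition. It is not CatGo's implementation; the function pca_project and the synthetic feature matrix are assumptions made for illustration.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X (n_frames x n_features) onto the top principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal axes, sorted by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T  # coordinates of each frame in PC space
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    return scores, components, explained_variance

# Synthetic stand-in for a feature matrix: 200 frames, 6 features,
# with most of the variance along a single direction plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
direction = np.array([[1.0, 2.0, -1.0, 0.5, 0.0, 3.0]])
X = latent @ direction + 0.05 * rng.normal(size=(200, 6))

scores, components, variances = pca_project(X, n_components=2)
```

Here scores holds one row per frame, and variances holds the variance along every component, which is the quantity a scree plot displays.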

Scree Plot

Examine the variance explained by each component to choose how many components to keep.
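One common heuristic, alongside looking for a visual elbow in the scree plot, is to keep the smallest number of components whose cumulative explained variance passes a threshold. The sketch below assumes an 85% threshold and made-up variances; both are illustrative choices, not CatGo defaults.

```python
from itertools import accumulate

def n_components_for(variances, threshold=0.85):
    """Return (k, cumulative) where k is the smallest number of components
    whose cumulative explained-variance fraction reaches `threshold`."""
    total = sum(variances)
    cumulative = [running / total for running in accumulate(variances)]
    for k, fraction in enumerate(cumulative, start=1):
        if fraction >= threshold:
            return k, cumulative
    return len(variances), cumulative

# Made-up per-component variances, already sorted in decreasing order.
variances = [5.0, 3.0, 1.0, 0.6, 0.4]
k, cumulative = n_components_for(variances, threshold=0.85)
```

With these numbers the first two components explain 80% of the variance and the first three explain 90%, so three components are kept.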

Step 3: Cluster Frames

Algorithm Selection

Choose a clustering method: K-means, DBSCAN, or hierarchical clustering.

Number of Clusters

For K-means, select the number of clusters (use the elbow method for guidance).
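The elbow method can be sketched as follows: run K-means for a range of k, record the within-cluster sum of squares (inertia), and look for the k after which it stops dropping sharply. This is a plain NumPy version of Lloyd's algorithm with a deterministic farthest-first initialization, chosen here only to keep the example reproducible. It is an assumption for illustration, not the algorithm CatGo uses, and the data are synthetic.

```python
import numpy as np

def farthest_first_init(X, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, centers, inertia)."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    return labels, centers, float(d2.min(axis=1).sum())

# Synthetic PC-space data: three well-separated blobs of 50 points each.
rng = np.random.default_rng(1)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in blob_centers])

# Inertia for k = 1..6; the elbow is where the curve flattens.
inertias = {k: kmeans(X, k)[2] for k in range(1, 7)}
labels3, centers3, _ = kmeans(X, 3)
```

For data like this, the inertia falls steeply up to k = 3 and only marginally afterwards, which is the elbow. Real trajectories rarely separate this cleanly, so the elbow is a guide rather than a definitive answer.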

Step 4: Visualize Results

PCA Scatter Plot

Frames are projected onto PC1 vs PC2, colored by cluster assignment.

Representative Structures

View the centroid structure of each cluster in the 3D viewer.
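One way to obtain a representative structure is to pick, for each cluster, the actual trajectory frame that lies closest to the cluster centroid in feature space. Whether CatGo does this or averages coordinates is not stated here, so treat the sketch below, including the function name representative_frames, as an assumption.

```python
import numpy as np

def representative_frames(X, labels):
    """Map each cluster label to the index of the frame nearest its centroid."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    reps = {}
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)      # frames in this cluster
        centroid = X[idx].mean(axis=0)
        d2 = ((X[idx] - centroid) ** 2).sum(axis=1)
        reps[int(label)] = int(idx[d2.argmin()])   # original frame index
    return reps

# Made-up 2D feature rows for six frames in two clusters.
X = np.array([
    [0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
    [10.0, 10.0], [10.0, 12.0], [10.0, 14.5],
])
labels = np.array([0, 0, 0, 1, 1, 1])
reps = representative_frames(X, labels)
```

Choosing a real frame avoids the distorted geometry that can result from averaging atomic coordinates over many frames.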

Step 5: Export

Export cluster assignments, PCA coordinates, and representative frames.
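If the exported data are post-processed outside CatGo, a flat table with one row per frame is a convenient shape. The sketch below writes such a table as CSV using the standard library. The column names and the function write_cluster_table are hypothetical and do not describe CatGo's actual export format.

```python
import csv
import io

def write_cluster_table(handle, labels, scores):
    """Write one row per frame: frame index, cluster label, PC1, PC2."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["frame", "cluster", "pc1", "pc2"])
    for frame, (label, (pc1, pc2)) in enumerate(zip(labels, scores)):
        writer.writerow([frame, label, f"{pc1:.4f}", f"{pc2:.4f}"])

# Made-up cluster assignments and PCA coordinates for three frames.
labels = [0, 0, 1]
scores = [(-1.25, 0.5), (-1.0, 0.25), (2.25, -0.75)]

# An in-memory buffer stands in for a file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_cluster_table(buffer, labels, scores)
csv_text = buffer.getvalue()
```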

Released under the MIT License.