This repository contains a complete, end-to-end implementation of a fully unsupervised pipeline that clusters facial behavior (lip openness, eye openness, pupil gaze direction, and head pose) directly from unlabeled face images.
The pipeline was developed and validated on a subset of AffectNet-YOLO (25 266 faces at 96×96 px) and cross-validated on FER (48×48 px). It yields semantically meaningful clusters (e.g. closed vs. open mouth, eye openness levels, five pupil‐gaze directions, three head orientations) without any labels or pretrained embeddings.
├── README.md
├── shape_predictor_68_face_landmarks.dat # dlib pretrained model (not included here; download separately)
├── data/
│ ├── AFFECTNET-YOLO/ # Unlabeled images used for training (96×96 px)
│ └── FER/ # (Optional) 48×48 px images for cross-dataset validation
├── outputs/ # All intermediate and final results will be saved here
│ ├── Alignment/ # Nose-aligned images
│ ├── HeadAngleDisp_Images/ # Annotated images for angle & displacement
│ ├── Head_Clusters/ # Head-pose clustering results (plots, CSVs, cluster folders)
│ ├── LipClustering1/ # Lip ratio clustering (annotated images & plots)
│ ├── Eye_Emo/ # Eye ratio extraction & clustering
│ ├── Pupil_Annotated/ # Pupil detection & heuristic labeling
│ └── Pupil_Clusters/ # Pupil clustering results (plots, CSVs)
├── LipClusteringForSubfolderGreyImageSSHyper.py # Lip ratio extraction + clustering script
├── EyeClusteringForSubfolderGreyImageSSHyper.py # Eye ratio extraction + clustering script
├── PupilMoveClassification.py # Pupil localization & K-Means clustering
├── HeadNoseAlignment.py # Compute average nose tip → align images
├── HeadAngleDisp.py # Compute eye-line displacement & angle for each image
├── HeadClusteringAngleDisp.py # Perform K-Means on head pose features (angle, displacement)
└── requirements.txt # Python dependencies
pip install -r requirements.txt
The main dependencies are:
- dlib (v19.22 or compatible) – face detection & 68-point landmarks
- opencv-python (v4.5) – image I/O & processing
- numpy (v1.21) – numerical computations
- pandas (v1.3) – CSV handling
- scikit-learn (v1.0) – K-Means, StandardScaler, clustering metrics
- matplotlib (v3.4) – plotting
- seaborn (v0.11) – optional, used in head clustering plots

You also need the pretrained landmark model shape_predictor_68_face_landmarks.dat in this top-level directory. Download it from:
http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
and decompress it (it is bz2-compressed) so that shape_predictor_68_face_landmarks.dat is available.
- Place your unlabeled training images in data/AFFECTNET-YOLO/, preserving any subfolder structure.
- (Optional) Place 48×48 px validation images in data/FER/.
- All intermediate and final results are written to outputs/ (see Output & Folder Structure for details).

At a high level, the pipeline consists of three stages:

1. Alignment: translate every face so its nose tip matches the dataset-wide average nose tip.
2. Descriptor extraction: four geometric descriptors per face: the lip ratio r_lip, the eye ratio r_eye, the normalized pupil position (nx, ny), and the head-pose pair (d, Δθ).
3. Clustering: K-Means on each descriptor, with k chosen via the elbow method.

Below is a detailed breakdown of each stage.
Goal: Translate every face such that its nose tip aligns with the dataset-wide average nose-tip coordinate. This removes global translation, so that subsequent head-pose features only capture rotation & displacement relative to this global center.
HeadNoseAlignment.py
- Iterates over data/AFFECTNET-YOLO/ and detects the nose tip (landmark #34 in dlib’s 0-indexed 68-point model) for each face.
- Accumulates the (x, y) nose-tip coordinates across all images to compute:
avg_nose_tip = ( Σ p34_i ) / N
where p34_i = (xi, yi). For each image, the translation offsets are:
dx = avg_nose_tip.x − xi
dy = avg_nose_tip.y − yi
M = [[1, 0, dx],
[0, 1, dy]]
repositioned = warpAffine(image, M)
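The two-pass alignment above can be sketched in NumPy. This is a minimal stand-in that assumes the nose-tip landmarks are already extracted and replaces cv2.warpAffine with pure-NumPy slicing for the integer translation; the real script uses dlib and OpenCV as described.

```python
import numpy as np

def align_to_average_nose(images, nose_tips):
    """Translate each image so its nose tip lands on the dataset-average
    nose tip (a pure-NumPy equivalent of the warpAffine translation)."""
    tips = np.asarray(nose_tips, dtype=float)
    avg = tips.mean(axis=0)                  # avg_nose_tip = (sum p34_i) / N
    aligned = []
    for img, (x, y) in zip(images, tips):
        dx = int(round(avg[0] - x))          # dx = avg_nose_tip.x - xi
        dy = int(round(avg[1] - y))          # dy = avg_nose_tip.y - yi
        h, w = img.shape[:2]
        out = np.zeros_like(img)             # borders are zero-filled
        # destination window for the shifted copy, clipped to the frame
        y0, y1 = max(dy, 0), min(h, h + dy)
        x0, x1 = max(dx, 0), min(w, w + dx)
        if y1 > y0 and x1 > x0:
            out[y0:y1, x0:x1] = img[y0 - dy:y1 - dy, x0 - dx:x1 - dx]
        aligned.append(out)
    return aligned
```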
- Saves each translated image to outputs/Alignment/.
- outputs/Alignment/ contains all nose-aligned images, preserving original filenames (overwritten if they already exist).
- The script also reports the dataset-wide Average Nose Tip Coordinate (x, y): ….

After nose alignment, we extract four low-dimensional geometric descriptors from each aligned face:
LipClusteringForSubfolderGreyImageSSHyper.py
- Iterates over data/AFFECTNET-YOLO/ (or any input folder) and uses dlib to detect the first face and its 68 landmarks.
- Extracts the inner-lip landmarks:
  P61 = (landmarks.part(61).x, landmarks.part(61).y)
  P65 = (landmarks.part(65).x, landmarks.part(65).y)
  P63 = (landmarks.part(63).x, landmarks.part(63).y)
  P67 = (landmarks.part(67).x, landmarks.part(67).y)
- Computes:
  width = ||P61 − P65||
  height = ||P63 − P67||
  ratio = height / width
- Annotates the first 50 images with “Length: <width> Height: <height> Ratio: <ratio>” and saves them under outputs/LipClustering1/output/.
- Collects all ratio values in a NumPy array, then:
  - Scales them with StandardScaler.
  - Runs K-Means (n_clusters=5, or user-adjustable) on the scaled lip ratios.
  - Writes outputs/LipClustering1/results.txt (one line per image: Image: <path>, Cluster: <id>).
  - Saves plots under outputs/LipClustering1/plots/:
outputs/LipClustering1/
├── output/ # Annotated images (first 50 only)
├── plots/
│ ├── cluster_visualization.png
│ └── ratios_histogram.png
└── results.txt # One line per processed image with cluster ID
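The width/height/ratio computation can be isolated as a small function. A sketch, assuming `landmarks` is anything indexable by dlib's 0-based part numbers and yielding (x, y) tuples:

```python
import math

def lip_ratio(landmarks):
    """Inner-lip openness ratio, as computed by the lip clustering script."""
    p61, p65 = landmarks[61], landmarks[65]   # inner-lip left/right corners
    p63, p67 = landmarks[63], landmarks[67]   # inner-lip top/bottom points
    width = math.dist(p61, p65)               # width  = ||P61 - P65||
    height = math.dist(p63, p67)              # height = ||P63 - P67||
    return height / width                     # ratio  = height / width
```

For example, with a mouth opening half as tall as it is wide, the ratio is 0.5.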
Descriptor:

\[
r_{\mathrm{eye}} = \frac{\lVert P_{37} - P_{40} \rVert}{\left\lVert \dfrac{P_{38} + P_{39}}{2} - \dfrac{P_{41} + P_{42}}{2} \right\rVert}
\]

where the numerator is the horizontal distance between the outer eye corners and the denominator is the vertical distance between the eyelid midpoints. The ratio is inversely related to eye openness: a wider-open eye yields a smaller ratio.
EyeClusteringForSubfolderGreyImageSSHyper.py
- Iterates over data/AFFECTNET-YOLO/ (or any input folder), reads each image, and converts it to grayscale.
- Extracts the eye landmarks (0-indexed dlib parts):
  P37 = landmarks.part(36), P40 = landmarks.part(39),
  P38 = landmarks.part(37), P39 = landmarks.part(38),
  P41 = landmarks.part(40), P42 = landmarks.part(41).
- Computes:
  width = ||P37 − P40||
midpoint_top = (P38 + P39)/2
midpoint_bot = (P41 + P42)/2
height = ||midpoint_top − midpoint_bot||
ratio = width / height
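The descriptor above can be sketched as a small helper (hypothetical function name; `landmarks` is indexable by dlib's 0-based part numbers and yields (x, y) tuples):

```python
import math

def eye_ratio(landmarks):
    """Eye width/height ratio (r_eye) for one eye, per the steps above."""
    p37, p40 = landmarks[36], landmarks[39]   # eye corners (parts 36, 39)
    p38, p39 = landmarks[37], landmarks[38]   # upper-eyelid points
    p41, p42 = landmarks[40], landmarks[41]   # lower-eyelid points
    width = math.dist(p37, p40)
    mid_top = ((p38[0] + p39[0]) / 2, (p38[1] + p39[1]) / 2)
    mid_bot = ((p41[0] + p42[0]) / 2, (p41[1] + p42[1]) / 2)
    height = math.dist(mid_top, mid_bot)
    return width / height                     # ratio = width / height
```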
- Writes <Image Name, Height, Width, Ratio> rows to a CSV: outputs/Eye_Emo/Eye_ExpFull_landmark_ratios.csv.
- Standardizes the Ratio column (via StandardScaler), then:
  - Runs the elbow method and saves outputs/Eye_Emo/elbow_method_plot.png.
  - Runs the final K-Means with n_clusters=4 (optimal per the elbow plot).
  - Appends a Cluster column to the CSV and saves it as clustered_data_ratio.csv.
  - Creates outputs/Eye_Emo/Cluster_0/…/Cluster_3/ and moves each annotated image into its cluster folder.
- Saves plots under outputs/Eye_Emo/:
  - clusters_plot_ratio.png: scatter (index vs. Ratio) colored by K-Means cluster.
  - clusters_ratio_vs_mean_plot.png: ratio distribution per cluster with red centroids.

outputs/Eye_Emo/
├── Eye_ExpFull_landmark_ratios.csv # Raw ratios CSV
├── elbow_method_plot.png
├── clustered_data_ratio.csv # Ratios + cluster labels
├── Cluster_0/, Cluster_1/, Cluster_2/, Cluster_3/ # images with landmarks overlaid
├── clusters_plot_ratio.png
└── clusters_ratio_vs_mean_plot.png
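The standardize → elbow → final K-Means flow shared by the clustering scripts can be sketched with scikit-learn; the ratios here are synthetic stand-ins for the CSV's Ratio column:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def elbow_inertias(values, k_max=8, seed=42):
    """K-Means inertia for k = 1..k_max on standardized 1-D values;
    the 'elbow' in this curve suggests the final n_clusters."""
    X = StandardScaler().fit_transform(np.asarray(values, float).reshape(-1, 1))
    return [KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
            for k in range(1, k_max + 1)]

# synthetic ratios with 4 well-separated groups (illustrative only)
rng = np.random.default_rng(0)
ratios = np.concatenate([rng.normal(m, 0.05, 50) for m in (1.0, 2.0, 3.0, 4.0)])
inertias = elbow_inertias(ratios)     # plot these to get the elbow figure

# final clustering at the chosen k (4, as in the eye script)
X = StandardScaler().fit_transform(ratios.reshape(-1, 1))
labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)
```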
Descriptor: the normalized pupil position (nx, ny) within the eye bounding box [0,1]×[0,1], mapped into five regions:

- center (dead zone around (0.5, 0.5)).
- left if nx < 0.5, else right.
- top if ny < 0.5, else bottom.

We collect all (nx, ny) pairs, standardize them (mean = 0, var = 1), and run K-Means with n_clusters=5, selected via the elbow method or domain knowledge.

PupilMoveClassification.py

- Iterates over data/AFFECTNET-YOLO/ and detects faces and 68 landmarks for the first face per image.
- Computes the eye bounding box (x, y, w, h) via cv2.boundingRect(...).
- Thresholds the eye crop with THRESH_BINARY_INV at T=30 and locates the pupil centroid (cx, cy) in the crop.
- Normalizes to (nx, ny) and applies the heuristic (dead-zone margin 0.1): center inside the dead zone, otherwise left/right vs. top/bottom.
- Draws the label (center, left, right, top, bottom) above the eye and saves the annotated image to outputs/Pupil_Annotated/.
- Records one entry per image:
{
image_path, annotated_path,
label, norm_x, norm_y,
eye_x, eye_y, eye_w, eye_h,
pupil_x, pupil_y
}
- Collects all entries into results_df and saves it to outputs/Pupil_Annotated/clustering_results.csv.
- Saves plots under outputs/Pupil_Clusters/:
  - cluster_distribution.png: bar chart of the five heuristic label counts.
  - pupil_scatter.png: scatter of (norm_x, norm_y) color-coded by heuristic label.
- Runs K-Means on [(norm_x, norm_y)] with n_clusters=5:
  - Appends a kmeans_label column.
  - Re-saves the CSV (now including kmeans_label) as clustering_results.csv (overwriting).
  - Saves kmeans_pupil_scatter.png: scatter of (norm_x, norm_y) color-coded by K-Means cluster.

outputs/Pupil_Annotated/
├── <annotated images>.jpg
├── clustering_results.csv
└── outputs/Pupil_Clusters/
├── cluster_distribution.png
├── pupil_scatter.png
└── kmeans_pupil_scatter.png
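The five-region heuristic can be sketched as a small function. The dominant-axis tie-break between left/right and top/bottom is an assumption; the script's exact precedence may differ:

```python
def pupil_region(nx, ny, margin=0.1):
    """Map a normalized pupil position (nx, ny) in [0,1]x[0,1] to one of
    center / left / right / top / bottom (margin = dead-zone half-width)."""
    dx, dy = nx - 0.5, ny - 0.5
    if abs(dx) <= margin and abs(dy) <= margin:
        return "center"
    # assumption: the axis with the larger offset decides the label
    if abs(dx) >= abs(dy):
        return "left" if nx < 0.5 else "right"
    return "top" if ny < 0.5 else "bottom"
```

For example, `pupil_region(0.2, 0.5)` returns "left".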
We encode head pose with two features:

- Displacement d: the Euclidean distance between the midpoint of the actual eye line (p37 → p46) and the midpoint of the reference eye line (avg_37 → avg_46).
- Angle difference Δθ: between the per-image eye-line angle, angle_i = atan2(y46 − y37, x46 − x37), and the reference angle, angle_ref = atan2(ȳ46 − ȳ37, x̄46 − x̄37).

HeadAngleDisp.py
- Pass 1: iterates over data/AFFECTNET-YOLO/ to accumulate all p37 and p46 across detected faces, then computes
  avg_37 = Σ p37_i / N, avg_46 = Σ p46_i / N.
- Pass 2, for each image with p37 = (x37, y37) and p46 = (x46, y46):
  - actual_mid = ((x37 + x46)/2, (y37 + y46)/2).
  - ref_mid = ((avg_37.x + avg_46.x)/2, (avg_37.y + avg_46.y)/2).
  - displacement = ||actual_mid − ref_mid||.
  - angle_i = atan2(y46 − y37, x46 − x37); angle_ref = atan2(avg_46.y − avg_37.y, avg_46.x − avg_37.x).
  - angle_diff = (angle_i − angle_ref) × (180/π).
  - Draws the reference eye line (avg_37 → avg_46) and the actual eye line (p37 → p46) on the image.
  - Saves the annotated image to outputs/HeadAngleDisp_Images/.
  - Appends [filename, angle_diff, displacement] to a list, written at the end to outputs/HeadAngleDisp_Images/colds_AngleDisp_Data.csv
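The per-image angle/displacement computation reduces to a few lines; a sketch with a hypothetical helper taking plain (x, y) tuples:

```python
import math

def head_pose_features(p37, p46, avg37, avg46):
    """Return (angle_diff in degrees, displacement in pixels) for one face,
    given the eye-corner landmarks and their dataset-wide averages."""
    actual_mid = ((p37[0] + p46[0]) / 2, (p37[1] + p46[1]) / 2)
    ref_mid = ((avg37[0] + avg46[0]) / 2, (avg37[1] + avg46[1]) / 2)
    displacement = math.dist(actual_mid, ref_mid)
    angle_i = math.atan2(p46[1] - p37[1], p46[0] - p37[0])
    angle_ref = math.atan2(avg46[1] - avg37[1], avg46[0] - avg37[0])
    return math.degrees(angle_i - angle_ref), displacement
```

An eye line identical to the reference gives (0.0, 0.0); a 90°-rotated one gives an angle_diff of 90.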
HeadClusteringAngleDisp.py
- Loads outputs/HeadAngleDisp_Images/colds_AngleDisp_Data.csv.
- Builds three feature sets:
  X_angle = df[['Angle (degrees)']]
  X_displacement = df[['Displacement (pixels)']]
  X_combined = df[['Angle (degrees)', 'Displacement (pixels)']]
- Standardizes each with StandardScaler().fit_transform(...).
- Creates cluster folders:
  outputs/Head_Clusters/Angle/Cluster_0…Cluster_2
  outputs/Head_Clusters/Displacement/Cluster_0…Cluster_2
  outputs/Head_Clusters/Angle_Displacement/Cluster_0…Cluster_4
- Runs K-Means with n_clusters=3 for each single feature (angle only, displacement only) and n_clusters=5 for the combined (angle + displacement) features, using random_state=42, n_init=10, max_iter=500.
- Appends the cluster labels to the DataFrame (e.g. df['Angle Cluster']).
- Copies each image from data/AFFECTNET-YOLO/ into its corresponding cluster folder.
- Saves per-feature scatter plots (index vs. value) colored by cluster, plus a combined (angle, displacement) scatter colored by cluster.
- Writes:
  outputs/Head_Clusters/KMeans_Clustering_Results.csv
  outputs/Head_Clusters/Clustering_Evaluation_Metrics.csv
outputs/Head_Clusters/
├── Angle/
│ ├── Cluster_0/, Cluster_1/, Cluster_2/ # aligned images in each angle cluster
│ └── KMeans_Clustering_Angle.png
├── Displacement/
│ └── similarly structured
├── Angle_Displacement/
│ └── Cluster_0…Cluster_4, plots, CSV subsets
├── KMeans_Clustering_Results.csv # combined clusters
└── Clustering_Evaluation_Metrics.csv
After you run all scripts in sequence (alignment → descriptor extraction → clustering), you’ll see the following top-level outputs/ structure:
outputs/
├── Alignment/ # Nose-aligned images
│ └── *.jpg, *.png, …
├── HeadAngleDisp_Images/ # Head pose annotation & data
│ ├── <annotated images>.jpg
│ └── colds_AngleDisp_Data.csv
├── Head_Clusters/
│ ├── Angle/
│ │ ├── Cluster_0/, Cluster_1/, Cluster_2/
│ │ └── KMeans_Clustering_Angle.png
│ ├── Displacement/
│ │ ├── Cluster_0/, Cluster_1/, Cluster_2/
│ │ └── KMeans_Clustering_Displacement.png
│ ├── Angle_Displacement/
│ │ ├── Cluster_0/…/Cluster_4/
│ │ └── KMeans_Clustering_Angle_Displacement.png
│ ├── KMeans_Clustering_Results.csv
│ └── Clustering_Evaluation_Metrics.csv
├── LipClustering1/
│ ├── output/ # first 50 lip‐annotated images
│ ├── plots/
│ │ ├── cluster_visualization.png
│ │ └── ratios_histogram.png
│ └── results.txt
├── Eye_Emo/
│ ├── Eye_ExpFull_landmark_ratios.csv # raw height,width,ratio
│ ├── elbow_method_plot.png
│ ├── clustered_data_ratio.csv # ratio + cluster
│ ├── Cluster_0/…/Cluster_3/ # images with landmarks overlaid
│ ├── clusters_plot_ratio.png
│ └── clusters_ratio_vs_mean_plot.png
└── Pupil_Annotated/
├── <annotated images>.jpg
├── clustering_results.csv # “heuristic + kmeans” labels
└── Pupil_Clusters/
├── cluster_distribution.png
├── pupil_scatter.png
└── kmeans_pupil_scatter.png
Every script creates and populates its respective subfolder under outputs/. You can safely delete or re-run them in any order, but a recommended execution sequence is:
1. HeadNoseAlignment.py → outputs/Alignment/
2. HeadAngleDisp.py → outputs/HeadAngleDisp_Images/
3. HeadClusteringAngleDisp.py → outputs/Head_Clusters/
4. LipClusteringForSubfolderGreyImageSSHyper.py → outputs/LipClustering1/
5. EyeClusteringForSubfolderGreyImageSSHyper.py → outputs/Eye_Emo/
6. PupilMoveClassification.py → outputs/Pupil_Annotated/ and outputs/Pupil_Clusters/

For every clustering module (lips, eyes, pupils, head), the following metrics are computed and/or available:
| Module | k clusters | Silhouette | Calinski–Harabasz | Davies–Bouldin |
|---|---|---|---|---|
| Lips | 4 | 0.5781 | 76 789.41 | 0.5781 |
| Eyes | 4 | 0.6143 | 34 515.84 | 0.4879 |
| Pupils | 5 | 0.6500 (≈) | 18 000 (≈) | 0.5200 (≈) |
| Head | 3 (angle) | 0.5600 (≈) | 30 658 | 0.5600 (≈) |
| | 3 (disp) | 0.5500 (≈) | 37 520 | 0.5600 (≈) |
| | 5 (comb) | 0.3400 (≈) | 10 288 | 0.8700 (≈) |
Cross-dataset validation (AffectNet-YOLO vs. FER):

| Module | Dataset | Silhouette | Calinski–Harabasz | Davies–Bouldin |
|---|---|---|---|---|
| Lips | AffectNet-YOLO | 0.5781 | 76 789.41 | 0.5781 |
| | FER | 0.7067 | 20 988.36 | 0.4588 |
| Eyes | AffectNet-YOLO | 0.6143 | 34 515.84 | 0.4879 |
| | FER | 0.6020 | 5 587.74 | 0.5289 |
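The three metrics in the tables come straight from scikit-learn. A sketch on synthetic data (the values will not match the tables, which were computed on the real descriptors):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)

# synthetic 1-D "ratio" data with 4 clearly separated groups
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(m, 0.05, 100) for m in (1, 2, 3, 4)]).reshape(-1, 1)
labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

metrics = {
    "silhouette": silhouette_score(X, labels),                # in [-1, 1], higher is better
    "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
    "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
}
```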
Advanced Landmark Detectors
Replace dlib’s 68-point with a contrastive/self-supervised keypoint extractor [6, 7] to improve robustness under occlusion & extreme pose.
Temporal Dynamics
Extend to video streams: spatio-temporal clustering of descriptor trajectories → model micro-expressions, blink patterns, head-motion sequences.
Multimodal Fusion
Fuse depth, infrared, or thermal modalities to mitigate low-light & privacy‐sensitive failure modes.
Semi-/Weakly-Supervised Refinement
Incorporate a small set of annotations to map clusters to emotion or gaze labels automatically while retaining overall unsupervised learning benefits.
On-Device Optimization
Prune/quantize dlib model, use MiniBatch-KMeans, and optimize pre/postprocessing for edge/mobile real-time applications (assistive HCI, on-device biometrics).