UMPIRE: Unsupervised Temporal Action Localization via Deep Clustering
Developed a fully label-free TAL framework that learns spatio-temporal graph embeddings from 3D skeleton sequences using ASTGCN and transformer-based temporal pooling, followed by DBSCAN clustering with adaptive ε-estimation. Achieved state-of-the-art performance on the BABEL dataset (51.53% mAP@IoU, 37.2 F1), surpassing prior unsupervised and weakly supervised methods.
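
The clustering stage can be illustrated with a minimal sketch. The entry does not say how the DBSCAN neighbourhood radius (eps) is estimated, so the median k-th-nearest-neighbour distance used here is only an assumed stand-in, and the function names, parameters, and toy 2D "embeddings" are all hypothetical rather than taken from the UMPIRE code.

```python
import math


def estimate_eps(points, k=2, quantile=0.5):
    """Assumed heuristic: pick eps as a quantile (median by default) of
    each point's distance to its k-th nearest neighbour."""
    kth_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth_dists.append(dists[k - 1])
    kth_dists.sort()
    idx = min(len(kth_dists) - 1, int(quantile * len(kth_dists)))
    return kth_dists[idx]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reclaimed as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Toy stand-ins for per-segment embeddings: two tight groups and one outlier.
embeddings = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0),
    (30.0, 0.0),
]
eps = estimate_eps(embeddings, k=2)
labels = dbscan(embeddings, eps=eps, min_pts=3)
print(eps, labels)
```

In this sketch each cluster label would play the role of a discovered action class, and points labelled -1 would be treated as background or unassigned segments.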
