Publication Type:

Journal Article


IET Computer Vision, Institution of Engineering and Technology, Volume 10, pp. 392–397 (2016)



Keywords: action class, action video, bag-of-words framework, dimensionality reduction technique, feature projection, file sharing sites, Fisher vectors, high-dimensional feature vectors, HMDB51 dataset, human action recognition, human behaviour automatic analysis, large-class benchmark database, UCF101 dataset, untrimmed videos, YouTube


Abstract:

Automatic analysis of human behaviour in large collections of videos is rapidly gaining interest, even more so with the advent of file sharing sites such as YouTube. From one perspective, it can be observed that the size of feature vectors used for human action recognition from videos has increased enormously in the last five years, to the order of ∼100–500K dimensions. One possible reason is the growing number of action classes/videos and hence the need for discriminative features (which usually end up being higher-dimensional for larger databases). In this study, the authors review and investigate feature projection as a means to reduce the dimensionality of these high-dimensional feature vectors and show its effectiveness in terms of performance. They hypothesise that dimensionality reduction techniques often unearth latent structures in the feature space and are effective in applications such as the fusion of high-dimensional features of different types, and action recognition in untrimmed videos. All experiments are conducted in a Bag-of-Words framework for consistency, and results are presented on large-class benchmark databases, namely the HMDB51 and UCF101 datasets.
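As a minimal sketch of the kind of feature projection the abstract refers to (an illustration only, not the authors' exact pipeline), the following NumPy snippet projects high-dimensional per-video descriptors onto their top-k principal components via PCA; the array sizes are toy values chosen for brevity:

```python
import numpy as np

def pca_project(features, k):
    """Project row-wise feature vectors onto the top-k principal components.

    features: (n_samples, n_dims) array, e.g. one Fisher-vector-like
    descriptor per video clip. Returns an (n_samples, k) array.
    """
    centered = features - features.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Toy example: 100 clips with 2000-D descriptors reduced to 64-D
rng = np.random.default_rng(0)
fv = rng.standard_normal((100, 2000))
reduced = pca_project(fv, 64)
print(reduced.shape)  # → (100, 64)
```

In practice the descriptors discussed in the article are far larger (∼100–500K dimensions), where such a projection both shrinks storage and, as the authors hypothesise, can expose latent structure useful for fusion and recognition in untrimmed videos.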

Cite this Research Publication

R. Goecke and O. V. Ramana Murthy, “Dimensionality reduction of Fisher vectors for human action recognition”, IET Computer Vision, vol. 10, pp. 392–397, 2016.