Several techniques have been proposed for human action recognition from videos. Incorporating mid-level information (the human body) and/or high-level information (pose estimation) into the computation of low-level features (trajectories) has been observed to yield the best action recognition performance when the full body is presumed to be visible. However, for datasets with a large number of classes, where the full body may not be visible at all times, incorporating such mid- and high-level information remains unexplored. Moreover, when these stages are coupled, a change or development in any stage requires recomputing all low-level features. We decouple mid-level and low-level feature computation and evaluate on the benchmark action recognition datasets UCF50, UCF101 and HMDB51, which contain the largest number of action classes to date. Further, we employ a part-based model to detect human body parts statically, in individual frames, which also lets us investigate classes where the full body is not present. We also track dense regions around the detected human body parts using Hungarian particle linking, which suppresses most of the wrongly detected body parts and enriches the mid-level information.
Murthy, O. V. R., Radwan, I., and Goecke, R., “Dense body part trajectories for human action recognition”, in 2014 IEEE International Conference on Image Processing (ICIP), 2014.
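
The linking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows how detections in two consecutive frames might be matched by solving an assignment problem with the Hungarian algorithm. The function names `hungarian` and `link_detections`, the Euclidean distance cost, and the `max_dist` gate that discards implausible links are all assumptions made for illustration.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for a matrix with rows <= columns.

    Standard O(n^2 * m) Hungarian algorithm with potentials.
    Returns a list of (row, column) pairs, one per row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def link_detections(prev_pts, curr_pts, max_dist):
    """Link (x, y) detections across two frames.

    Returns sorted (prev_index, curr_index) pairs. Links longer than
    max_dist are dropped (an assumed gating rule, not from the paper),
    so isolated spurious detections stay unlinked.
    """
    if not prev_pts or not curr_pts:
        return []
    # The solver needs rows <= columns, so swap roles if necessary.
    swapped = len(prev_pts) > len(curr_pts)
    rows, cols = (curr_pts, prev_pts) if swapped else (prev_pts, curr_pts)
    cost = [[math.hypot(r[0] - c[0], r[1] - c[1]) for c in cols] for r in rows]
    links = []
    for r, c in hungarian(cost):
        if cost[r][c] <= max_dist:
            links.append((c, r) if swapped else (r, c))
    return sorted(links)


if __name__ == "__main__":
    previous = [(10, 10), (50, 50), (200, 200)]
    current = [(52, 49), (11, 12)]
    print(link_detections(previous, current, max_dist=20))
```

In this sketch a detection that finds no partner within `max_dist` is simply left unlinked. A tracker built on top of it could discard parts that never form a sufficiently long chain, which is one plausible way that linking reduces the influence of false detections.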