
Efficient Gait Recognition with SEGait-ConViT: A Vision Transformer Model Enhanced by Sparse Edge Information

Publication Type : Journal Article

Publisher : Elsevier BV

Source : Procedia Computer Science

Url : https://doi.org/10.1016/j.procs.2025.04.146

Keywords : CNN, Convolutional Vision Transformer, Gait Recognition, Gait Energy Image, Deep Learning

Campus : Amritapuri

School : School of Computing

Year : 2025

Abstract : Accurately identifying a person in crowded scenarios, such as surveillance camera feeds, is one of the most challenging tasks in computer vision. A person's gait is a reliable biometric trait, since it can be perceived from a target several meters away; gait analysis can recognize a person from 5 to 10 meters, depending on environmental factors. Vision Transformers, an alternative that can overcome the shortcomings of convolutional neural networks, typically require time-intensive and costly pre-training on massive datasets. This research proposes a Vision Transformer model for gait-based human recognition. The proposed model takes gait energy images as input, combined with edge information and sparse regions within those edges. SEGait-ConViT integrates random convolutions into the self-attention layers and leverages optimized hyperparameter values to improve performance. Gait3D and CASIA-B, two popular public datasets, were used to assess the model's efficacy. Empirical results show that the proposed method achieves state-of-the-art performance: 66.5% accuracy on Gait3D and 98.8% accuracy on CASIA-B.
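The gait energy images (GEIs) used as model input are conventionally formed by averaging aligned binary silhouettes over one gait cycle. A minimal NumPy sketch of that standard construction is shown below (the function name `gait_energy_image` and the toy frames are illustrative, not taken from the paper):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average a stack of aligned binary silhouettes into a GEI.

    Pixel values land in [0, 1]: bright pixels mark body regions that
    stay static across the gait cycle, while dimmer pixels capture
    limb motion.
    """
    stack = np.asarray(silhouettes, dtype=np.float64)  # (frames, H, W)
    if stack.ndim != 3:
        raise ValueError("expected a (frames, height, width) stack")
    return stack.mean(axis=0)

# Toy example: three 4x4 "silhouettes" with a static torso column
# and a limb pixel present in only one frame.
frames = np.zeros((3, 4, 4))
frames[:, :, 1] = 1.0   # column present in every frame -> GEI value 1.0
frames[0, :, 2] = 1.0   # column present in one frame  -> GEI value 1/3
gei = gait_energy_image(frames)
```

Edge maps and sparse edge regions, as described in the abstract, would then be derived from such GEIs before being fed to the transformer.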

Cite this Research Publication : Hrudya P, Prabaharan Poornachandran, Efficient Gait Recognition with SEGait-ConViT: A Vision Transformer Model Enhanced by Sparse Edge Information, Procedia Computer Science, Elsevier BV, 2025, https://doi.org/10.1016/j.procs.2025.04.146
