
YOLOX-TD-Plus: An accurate and fast text detection model

Publication Type : Journal Article

Publisher : Elsevier BV

Source : Systems and Soft Computing

Url : https://doi.org/10.1016/j.sasc.2026.200437

Keywords : YOLO, Scene text detection, ACE-CSP, PAFPN, CSPNet, Deep learning, Convolutional neural network, Attention mechanism

Campus : Coimbatore

School : School of Computing

Department : Computer Science and Engineering

Year : 2026

Abstract : The YOLO series of object detection algorithms has become a standard across a wide range of object detection applications, yet its application to text detection in the wild remains relatively unexplored. This paper presents a new convolutional neural network (CNN)-based model that aims to improve text detection performance by introducing a newly designed attention-concentrated enhanced cross-stage partial network (ACE-CSP) layer. The proposed model is built on the path aggregation feature pyramid network (PAFPN) architecture and incorporates ACE-CSP layer blocks, which we developed to improve information flow through the network and enhance its learning capability. The integration of channel and spatial attention in the ACE-CSP layers enables the network to focus more precisely on relevant text regions and to suppress irrelevant background activations, even in cluttered scenes. The design also helps reduce the imbalance in contributions from different feature pyramid layers, resulting in more consistent detection across varying text sizes. The proposed model, YOLOX-TD-Plus, improves text detection performance over its baseline. We evaluated it on the COCO-Text-v2.0 dataset, which includes multilingual and multi-oriented text instances. The experimental results show the effectiveness of the proposed architecture in addressing text detection challenges in real-world scenarios. Specifically, YOLOX-TD-Plus-t improves Average Precision (AP) from 0.136 to 0.186 (a 36.8% relative improvement), and YOLOX-TD-Plus-l reaches a top AP of 0.341, surpassing the baseline’s 0.317.
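
The relative improvement quoted in the abstract can be checked with a short calculation. The sketch below is only an illustration of that arithmetic and is not taken from the paper; the helper name `relative_improvement` is hypothetical, and the AP values are the ones reported in the abstract.

```python
def relative_improvement(baseline_ap: float, new_ap: float) -> float:
    """Return the relative AP gain over the baseline, in percent.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if baseline_ap <= 0:
        raise ValueError("baseline_ap must be positive")
    return (new_ap - baseline_ap) / baseline_ap * 100.0


# AP values as reported in the abstract.
tiny_gain = relative_improvement(0.136, 0.186)   # YOLOX-TD-Plus-t vs. its baseline
large_gain = relative_improvement(0.317, 0.341)  # YOLOX-TD-Plus-l vs. its baseline

print(f"YOLOX-TD-Plus-t: {tiny_gain:.1f}% relative AP gain")
print(f"YOLOX-TD-Plus-l: {large_gain:.1f}% relative AP gain")
```

The absolute gain for the tiny variant is 0.050 AP, which corresponds to the 36.8% relative figure. By the same formula, the large variant's gain of 0.024 AP works out to roughly 7.6% relative to its baseline.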

Cite this Research Publication : Deepak C.R., Padmavathi S., YOLOX-TD-Plus: An accurate and fast text detection model, Systems and Soft Computing, Elsevier BV, 2026, https://doi.org/10.1016/j.sasc.2026.200437
