YOLO-ETD: An Enhanced Detector for Text and Small Objects

Publication Type : Conference Paper

Publisher : Springer Nature Switzerland

Source : Lecture Notes in Networks and Systems

Url : https://doi.org/10.1007/978-3-032-14038-8_24

Campus : Coimbatore

School : School of Computing

Department : Computer Science and Engineering

Year : 2026

Abstract :

Real-time detectors such as the YOLO series typically use small input image resolutions to keep inference fast when detecting text and objects in the wild. Resizing the input image, along with other techniques that reduce spatial information to cut processing time, adversely affects the detection of small objects. This paper proposes YOLO-ETD, a lightweight convolutional neural network framework that serves as a faster, enhanced detector for text and small objects. The proposed architecture is designed to accommodate more spatial information from the input image while keeping inference time almost identical. Experiments show that the lightweight framework yields an approximately 8% improvement in AP@[IoU = 0.50] on the COCO-Text-v2 validation dataset, without any increase in FLOPs or model parameters compared to the baseline model. The proposed model also achieves a significant improvement in AP@[IoU = 0.50:0.95 | area = small] on the MS-COCO dataset compared to state-of-the-art YOLO series object detectors.
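As a rough illustration of why input resizing hurts small objects, the sketch below computes how many pixels a bounding box retains after an image is downscaled. It is not taken from the paper: the function names and the example sizes are hypothetical. The only external fact used is that COCO evaluation labels objects with an area below 32 × 32 pixels as "small". COCO applies that threshold to the annotated image, so applying it to the resized box here is purely illustrative.

```python
# Illustrative sketch only; names and example sizes are assumptions, not from the paper.

# COCO evaluation treats objects with area below 32*32 pixels as "small".
COCO_SMALL_MAX_AREA = 32 * 32


def resized_box_area(box_w, box_h, src_size, dst_size):
    """Return the pixel area of a box after resizing an image.

    src_size and dst_size are (width, height) tuples of the image
    before and after resizing.
    """
    scale_x = dst_size[0] / src_size[0]
    scale_y = dst_size[1] / src_size[1]
    return (box_w * scale_x) * (box_h * scale_y)


def is_small(area):
    """True if the area falls under the COCO 'small' threshold."""
    return area < COCO_SMALL_MAX_AREA


if __name__ == "__main__":
    # A 40x40 box in a 1280x1280 image, downscaled to a 640x640 network input.
    before = 40 * 40
    after = resized_box_area(40, 40, (1280, 1280), (640, 640))
    print(before, is_small(before))
    print(after, is_small(after))
```

Halving each image dimension quarters the box area, so an object that starts above the small-object threshold can fall well below it at the network input. This is the loss of spatial information the abstract refers to.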

Cite this Research Publication : C. R. Deepak, S. Padmavathi, YOLO-ETD: An Enhanced Detector for Text and Small Objects, Lecture Notes in Networks and Systems, Springer Nature Switzerland, 2026, https://doi.org/10.1007/978-3-032-14038-8_24
