Publication Type : Conference Paper
Publisher : Springer Nature Switzerland
Source : Lecture Notes in Networks and Systems
Url : https://doi.org/10.1007/978-3-032-14038-8_24
Campus : Coimbatore
School : School of Computing
Department : Computer Science and Engineering
Year : 2026
Abstract :
Real-time detectors such as the YOLO series, when used for text and object detection in the wild, typically operate on reduced input resolutions to keep inference fast. Resizing the input image, like other techniques that discard spatial information to cut processing time, degrades the detection of small objects. This paper proposes YOLO-ETD, a lightweight convolutional neural network framework that serves as a faster, enhanced detector for text and small objects. The proposed architecture is designed to accommodate more spatial information from the input image while keeping inference time almost identical. Experiments show that the lightweight framework yields an approximately 8% improvement in AP@[IoU = 0.50] on the COCO-Text-v2 validation dataset, with no increase in FLOPs or model parameters compared to the baseline model. The proposed model also achieves a significant improvement in AP@[IoU = 0.50:0.95 | area = small] on the MS-COCO dataset compared to state-of-the-art YOLO series object detectors.
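The abstract's point about input resizing can be illustrated with simple arithmetic: under an aspect-preserving resize, an object's pixel area shrinks with the square of the scale factor. The sketch below is not taken from the paper; the function name `scaled_box_area`, the 1280x720 image, and the 40x20 text box are hypothetical values chosen only for illustration.

```python
def scaled_box_area(box_w, box_h, img_w, img_h, target):
    """Pixel area of a box after an aspect-preserving resize.

    The image is scaled so that its longer side equals `target`
    (an assumed, commonly used resize policy; the paper's own
    preprocessing is not described in the abstract).
    """
    scale = target / max(img_w, img_h)
    return (box_w * scale) * (box_h * scale)


if __name__ == "__main__":
    # Hypothetical 40x20 px text box in a 1280x720 image.
    for target in (1280, 640, 320):
        area = scaled_box_area(40, 20, 1280, 720, target)
        print(f"longer side {target:4d} px -> box area {area:.0f} px^2")
```

Halving the input side leaves the box with a quarter of its original pixels, which is the kind of spatial information loss the abstract says harms small-object detection.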
Cite this Research Publication : C. R. Deepak, S. Padmavathi, YOLO-ETD: An Enhanced Detector for Text and Small Objects, Lecture Notes in Networks and Systems, Springer Nature Switzerland, 2026, https://doi.org/10.1007/978-3-032-14038-8_24