Detecting Plant Diseases at Scale: A Distributed CNN Approach with PySpark and Hadoop

Publication Type : Conference Paper

Publisher : Elsevier BV

Source : Procedia Computer Science

Url : https://doi.org/10.1016/j.procs.2024.04.099

Keywords : Convolutional Neural Networks (CNN), PySpark, Hadoop

Campus : Bengaluru

School : School of Computing

Department : Computer Science and Engineering

Year : 2024

Abstract : In modern agriculture, detecting and diagnosing plant diseases are pivotal for crop health and yield optimization. This transformative initiative leverages cutting-edge technology to reshape the field. Data integrity is paramount for our machine learning model's success. We meticulously collect verified data from Kaggle, then preprocess the dataset of 2000 diverse images. Uniformly resizing them to 150px and batching them in groups of 32 optimizes training speed and memory usage. The model's innovation lies in its Convolutional Neural Networks (CNNs) powered by PySpark for efficient distributed data processing. Integration of Tkinter bridges technology and user-friendliness, offering farmers intuitive disease detection solutions through a graphical interface. This triad of CNNs, PySpark, and Tkinter achieves an outstanding average accuracy of 95.76%, adept at distinguishing diseased from healthy leaves. Embracing this innovation promises to elevate agricultural practices and enhance crop productivity, marking a milestone in modern agriculture.
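The batching arithmetic in the abstract (2000 images in groups of 32) can be worked through with a short sketch. This is only an illustration under stated assumptions, not the authors' pipeline: images are stood in for by integer indices, and the helper name `batch_indices` is hypothetical. Resizing and the PySpark-based processing are not shown, since the abstract gives no implementation details for them.

```python
from typing import Iterator, List

NUM_IMAGES = 2000   # dataset size stated in the abstract
BATCH_SIZE = 32     # batch size stated in the abstract


def batch_indices(num_items: int, batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive lists of item indices, each at most batch_size long.

    Hypothetical helper: integer indices stand in for image files.
    """
    for start in range(0, num_items, batch_size):
        yield list(range(start, min(start + batch_size, num_items)))


batches = list(batch_indices(NUM_IMAGES, BATCH_SIZE))
full_batches = sum(1 for b in batches if len(b) == BATCH_SIZE)
last_batch_size = len(batches[-1])

print(len(batches), full_batches, last_batch_size)  # prints: 63 62 16
```

Because 2000 is not a multiple of 32, one epoch has 62 full batches plus a final batch of 16 images. The abstract does not say whether that partial batch is kept or dropped; the sketch keeps it.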

Cite this Research Publication : Vishwash Sharma, Srinidhi Kannan, Simhadri Tanya, Niharika Panda, Detecting Plant Diseases at Scale: A Distributed CNN Approach with PySpark and Hadoop, Procedia Computer Science, Elsevier BV, 2024, https://doi.org/10.1016/j.procs.2024.04.099