Publication Type : Conference Paper
Publisher : Elsevier BV
Source : Procedia Computer Science
Url : https://doi.org/10.1016/j.procs.2024.04.099
Keywords : Convolutional Neural Networks (CNN), PySpark, Hadoop
Campus : Bengaluru
School : School of Computing
Department : Computer Science and Engineering
Year : 2024
Abstract : In modern agriculture, detecting and diagnosing plant diseases is pivotal for crop health and yield optimization. This work applies a distributed deep learning approach to that task. Data integrity is paramount for the success of the machine learning model: verified data is collected from Kaggle, and the resulting dataset of 2000 diverse images is preprocessed. The images are uniformly resized to 150px and batched in groups of 32 to optimize training speed and memory usage. The model uses Convolutional Neural Networks (CNNs), with PySpark providing efficient distributed data processing. A Tkinter graphical interface makes the system accessible, offering farmers an intuitive disease detection tool. Together, CNNs, PySpark, and Tkinter achieve an average accuracy of 95.76% in distinguishing diseased from healthy leaves. The approach promises to improve agricultural practices and enhance crop productivity.
Cite this Research Publication : Vishwash Sharma, Srinidhi Kannan, Simhadri Tanya, Niharika Panda, Detecting Plant Diseases at Scale: A Distributed CNN Approach with PySpark and Hadoop, Procedia Computer Science, Elsevier BV, 2024, https://doi.org/10.1016/j.procs.2024.04.099
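
The preprocessing figures quoted in the abstract (2000 images, 150px, batches of 32) can be illustrated with a short, dependency-free sketch. This is not the authors' pipeline, which the abstract says is built on PySpark and CNNs. The helper names (`resize_nearest`, `batches`, `batch_count`), the nearest-neighbour method, the square 150 x 150 target, and the grayscale nested-list image format are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence

# Assumption: a grayscale image stored as rows of pixel intensities.
Image = List[List[int]]

TARGET_SIZE = 150  # pixels per side, from the abstract (square shape is assumed)
BATCH_SIZE = 32    # batch size, from the abstract


def resize_nearest(image: Image, size: int = TARGET_SIZE) -> Image:
    """Resize a 2D image to size x size with nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        # Map each target pixel back to the source pixel it falls inside.
        [image[(y * src_h) // size][(x * src_w) // size] for x in range(size)]
        for y in range(size)
    ]


def batches(items: Sequence, batch_size: int = BATCH_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def batch_count(n_items: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of batches needed to cover n_items (ceiling division)."""
    return -(-n_items // batch_size)


if __name__ == "__main__":
    tiny = [[1, 2], [3, 4]]
    resized = resize_nearest(tiny)
    print(len(resized), len(resized[0]))  # height and width after resizing
    print(batch_count(2000))              # batches per pass over 2000 images
```

With 2000 images and a batch size of 32, one pass over the dataset takes 63 batches: 62 full batches plus a final batch of 16 images. A real pipeline would more likely rely on an image library for resizing and on the framework's own data loader for batching.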