G L S VAISHNAVI REDDY

Posted on Apr 27

Analysis of Farmer Query Patterns for Crop Advisory

#python #mongodb #farming #machinelearning

By Vaishnavi Reddy, Siri Reddy, Hasini, and Joshi Gayatri. This project was developed under the guidance and mentorship of Professor Chanda Rajkumar.

The Idea Behind the Project:
What if thousands of farmers are asking similar questions every day—but no one is truly analyzing them?
This was the thought that led to our project.

Farmers frequently raise queries about crop diseases, fertilizers, irrigation, and weather conditions through helplines, apps, and messaging platforms. These queries often contain valuable insights, but they are usually scattered, unstructured, and underutilized.

Instead of approaching this with a highly complex solution, we focused on a simpler idea:

Can we design a system that automatically analyzes farmer queries and identifies patterns to provide smarter crop advisory support?

Why Do We Think This Problem Matters?

In real-world agricultural systems, a large amount of farmer interaction data exists as unstructured text—short questions, voice-to-text inputs, or incomplete descriptions.

Within this data:

Problems are often described vaguely
Local languages and mixed dialects are used
Critical issues like pest attacks or nutrient deficiencies may be hidden in simple sentences

Manually analyzing such data is time-consuming and does not scale.

As the number of farmers using digital platforms grows, it becomes essential to build systems that can:

Understand queries automatically
Identify recurring issues
Provide timely and relevant advisory

This project explores how a lightweight NLP-based system can help bridge this gap in a practical and scalable way.

How We Set Up Our Project

The goal of the project was to create a complete system that:

Accepts farmer queries as input
Cleans and processes the text
Extracts meaningful patterns
Classifies the type of query (e.g., pest, fertilizer, irrigation)
Stores and analyzes the data for future insights

Technology Stack

To keep the system simple yet effective, we used a lightweight and practical tech stack.

Python — Core Engine

The entire system is built using Python due to its strong support for NLP and data processing.

Data Handling
Pandas → Dataset processing
NumPy → Numerical operations

NLP with NLTK

We used NLTK to preprocess farmer queries by:

Removing stopwords
Cleaning text
Normalizing input
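
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `preprocess` and `load_stopwords` helpers and the small fallback stopword set are assumptions made for the example, and the regular expression assumes English text, so regional-language queries would need different handling.

```python
import re

# Tiny fallback list, used only if NLTK or its stopword corpus is unavailable.
# It is an illustrative assumption, not the list used in the project.
FALLBACK_STOPWORDS = {
    "why", "are", "my", "is", "the", "a", "an", "of", "in", "to", "for", "on", "and",
}


def load_stopwords():
    """Return NLTK's English stopwords if available, else the fallback set."""
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        # LookupError means the corpus has not been fetched yet;
        # it can be fetched once with nltk.download("stopwords").
        return FALLBACK_STOPWORDS


def preprocess(query, stop_words):
    """Lowercase, strip punctuation and digits, and drop stopwords."""
    text = query.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = text.split()                  # also collapses repeated spaces
    return [t for t in tokens if t not in stop_words]


stop_words = load_stopwords()
print(preprocess("Why are my crop leaves turning yellow?", stop_words))
# → ['crop', 'leaves', 'turning', 'yellow']
```

The cleaned tokens can be joined back into a string before vectorization, or passed on as a list, depending on how the feature extractor is configured.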

Machine Learning — Scikit-learn

Scikit-learn was used to build the classification model.
TF-IDF → Feature extraction
Logistic Regression → Query classification
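
As a rough sketch of how TF-IDF and Logistic Regression fit together in Scikit-learn, the two can be chained in a single pipeline. The training queries and labels below are invented for illustration and are not the project's dataset; the hyperparameters are also assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples for illustration; not the project's dataset.
train_queries = [
    "insects are eating the leaves of my cotton",
    "caterpillars attacking tomato plants",
    "which fertilizer is best for paddy",
    "how much urea should I apply to wheat",
    "when should I water my maize field",
    "drip irrigation schedule for sugarcane",
]
train_labels = ["pest", "pest", "fertilizer", "fertilizer", "irrigation", "irrigation"]

# TF-IDF turns each query into a weighted bag-of-words vector;
# Logistic Regression learns one set of weights per category.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(train_queries, train_labels)

new_query = "insects attacking my crop"
predicted = model.predict([new_query])[0]
print(predicted)
```

Wrapping both steps in one pipeline means the same vocabulary and weighting learned during training are applied automatically to every new query.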

Backend — Flask API

We developed a simple backend using Flask to make the model accessible.
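
A minimal sketch of what such a Flask endpoint might look like is shown below. The `/classify` route, the JSON field names, and the keyword-based `classify` stub are all assumptions for illustration; in a real service the stub would be replaced by a call to the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify(query):
    """Stand-in for the trained model; a real service would call model.predict."""
    text = query.lower()
    if "fertilizer" in text or "urea" in text:
        return "fertilizer"
    if "water" in text or "irrigation" in text:
        return "irrigation"
    if "insect" in text or "pest" in text:
        return "pest"
    return "unknown"


@app.route("/classify", methods=["POST"])
def classify_query():
    # Accept a JSON body such as {"query": "..."} and reject anything else.
    payload = request.get_json(silent=True)
    query = payload.get("query", "") if isinstance(payload, dict) else ""
    query = str(query).strip()
    if not query:
        return jsonify({"error": "field 'query' is required"}), 400
    return jsonify({"query": query, "category": classify(query)})


# Exercise the endpoint in-process with Flask's test client (no server needed).
client = app.test_client()
response = client.post("/classify", json={"query": "How much urea for wheat?"})
print(response.get_json())
```

Using the test client keeps the example self-contained; to serve real traffic the app would be started with the `flask run` command or a WSGI server instead.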

MongoDB Integration for Database

Farmer query data is highly unstructured and varies significantly in format.

We used MongoDB because:
It supports flexible schemas (no rigid tables)
It efficiently stores document-based data
It is scalable and suitable for real-time applications

The system workflow:
User submits a query
The model processes and classifies it
Results are stored in MongoDB for analysis

Data Stored in MongoDB

Each query is stored as a document containing:

Farmer query text
Predicted category (pest, irrigation, fertilizer, etc.)
Crop type (if identified)
Location (if available)
Timestamp
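
One way such a document could be assembled and stored is sketched below. The field names, the `build_query_document` and `save_query` helpers, the sample location, and the connection details in the comments are assumptions for illustration rather than the project's actual schema.

```python
from datetime import datetime, timezone


def build_query_document(text, category, crop=None, location=None):
    """Assemble one record; optional fields are left out when unknown."""
    doc = {
        "query_text": text,
        "predicted_category": category,
        "timestamp": datetime.now(timezone.utc),
    }
    if crop:
        doc["crop"] = crop
    if location:
        doc["location"] = location
    return doc


def save_query(collection, doc):
    """Insert the record; `collection` is expected to be a pymongo Collection."""
    return collection.insert_one(doc).inserted_id


doc = build_query_document(
    "Why are my crop leaves turning yellow?",
    "fertilizer",
    location="Warangal",  # hypothetical value
)
print(sorted(doc))  # field names present in this record

# With a MongoDB server running, the record could be stored like this
# (connection string, database and collection names are placeholders):
#   from pymongo import MongoClient
#   queries = MongoClient("mongodb://localhost:27017")["crop_advisory"]["queries"]
#   save_query(queries, doc)
```

Because MongoDB does not enforce a fixed schema, records with and without the crop or location fields can sit in the same collection.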

System Architecture

The system follows a modular pipeline:

1. Input

Farmer query (text or voice-converted text)

2. Preprocessing

Cleaning, normalization, stopword removal

3. Feature Extraction

TF-IDF vectorization

4. Prediction

Classification using Logistic Regression

5. Storage

Results stored in MongoDB

Pipeline Flow

Input → Preprocessing → Feature Extraction → Model → Output

System Demonstration

The system allows users to input a farmer query such as:

_"Why are my crop leaves turning yellow?"_

The model processes the query and predicts the category (e.g., nutrient deficiency).

The output is displayed as a category label, for example:

“Fertilizer-related issue”
“Pest-related issue”

A label like this indicates which kind of advisory the farmer needs, which is how the system can assist in decision-making.

Results and Insights:

From our analysis:

The most frequent queries were about pests and fertilizers
Query volumes showed clear seasonal trends
Similar problems were reported across different regions
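
Observations of this kind can be derived from the stored documents with a few Pandas aggregations. The sketch below uses made-up records purely to show the mechanics; the values are not project results, and the field names are assumptions about how the documents are shaped.

```python
import pandas as pd

# Hypothetical records in the same shape as the stored documents;
# the values are made up for illustration and are not project results.
records = [
    {"predicted_category": "pest", "location": "A", "timestamp": "2024-07-03"},
    {"predicted_category": "pest", "location": "B", "timestamp": "2024-07-19"},
    {"predicted_category": "fertilizer", "location": "A", "timestamp": "2024-07-21"},
    {"predicted_category": "irrigation", "location": "B", "timestamp": "2024-03-11"},
    {"predicted_category": "pest", "location": "A", "timestamp": "2024-08-02"},
]

df = pd.DataFrame(records)
df["timestamp"] = pd.to_datetime(df["timestamp"])
df["month"] = df["timestamp"].dt.month

# Most frequent categories overall.
category_counts = df["predicted_category"].value_counts()

# Category counts per month, to look for seasonal patterns.
monthly = pd.crosstab(df["month"], df["predicted_category"])

# Number of distinct regions reporting each category.
regions_per_category = df.groupby("predicted_category")["location"].nunique()

print(category_counts)
print(monthly)
print(regions_per_category)
```

The same counts could also be computed inside MongoDB with an aggregation pipeline; loading the data into a DataFrame is simply convenient for exploration.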

This shows that analyzing query patterns can help in:

Predicting upcoming issues
Providing proactive advisory
Improving agricultural decision-making

Future Improvements

Future enhancements can include:

Using advanced models like BERT for better accuracy
Supporting regional languages
Integrating weather and soil data
Building a real-time farmer advisory app

Conclusion

This project demonstrates how NLP and machine learning can transform unstructured farmer queries into meaningful insights.

By storing classified queries in MongoDB for later analysis, the system goes beyond a standalone classifier and provides a base for a scalable crop advisory solution.

Ultimately, this approach helps in:

Understanding farmer needs better
Delivering timely recommendations
Supporting smarter and more sustainable agriculture