A Guide to Sourcing Data for AgTech Computer Vision

Hanan Othman

Content Writer | 2022/12/7 | 7 min read

Traditional industrial farming supports livelihoods not only through food production, but also by managing and supplying raw materials to manufacturers, which contributes significantly to the global supply chain and economy.

With that wide-reaching impact in mind, these conventional methods are undergoing transformative adaptation to keep up with shifting market demands from one side of the world to the other.

Innovative technologies like AI systems and applications equipped with machine learning (ML) and computer vision (CV) capabilities play a starring role in how the agriculture industry plans to satisfy those demands.

The Data Needs of AgTech AI

For many agricultural specialists open to emerging technologies like CV and ML, adoption is worthwhile if it boosts one important metric: yield, or how much a farm can produce in a growing season, whether that means crops or livestock products.

There are plenty of obstacles to achieving a high yield, but most fit into two buckets:

  • Environmental factors.

  • Amplifying existing yield.

Environmental factors include weather, regional climate, soil composition, and the presence of pests or weeds that affect crop development and quality. The other bucket deals with how farmers can enhance crop yield by knowing what certain plants need and when the opportune time to harvest is.

This strategic approach is also commonly referred to as precision agriculture. To drive precision ag efforts, many prominent AI applications currently active in the field are designed to monitor and provide valuable, data-based insights on crop conditions to farmers.

Because both of those concerns tie directly to high-yield outcomes, the major computer vision applications in agriculture are trained on image data featuring a variety of crops and other plant matter in a broad range of conditions and scenarios.

Since these applications need to be prepped for a real-world setting, the data also needs to capture the unpredictable so that models continue to perform in a dynamic environment: one subject to unexpected weather patterns and to changing plant health needs as crops react to their surroundings and develop from seed to harvest-ready plant.

Sourcing Data for AgTech Applications

Computer vision models are ideally suited to crop recognition and analysis tasks because they enable applications like intelligent robots, drones, and automated farming equipment to relieve the heavy load of labor tasks, letting farmers focus on creating higher-quality products in less time.

In recent years, the use of CV in agriculture has made promising strides in being accepted and even welcomed by the industry, but there is still a major roadblock to be mindful of when acquiring data for AgTech builds. Many publicly available data sources or repositories don't reflect a production environment well, which raises concerns about performance once the model is deployed, especially when carrying out precision ag functions.

There's also a deeper layer to the issue: there might not be many available cases to pull data from yet, since a number of open-source datasets come from a small team of developers who collected images over a short period and from a specific, limited area. This makes it difficult to match data that wasn't collected for a particular model or project to that project's needs.

This brings us to the question of sourcing training data for custom CV model use cases. The answer becomes clearer when breaking down the most popular methods employed in the AgTech development community:

  • Manually collecting the data through UAVs and sensor systems.

  • Recruiting a data provider or service that specializes in AgTech AI.

  • Using a portion of the general data required for a build from public repositories (including more extensive and reputable archives that are provided by governmental authorities like the USDA).
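When the third option is combined with custom collection, one way to keep public imagery from dominating the training set is to cap its share. The following is a minimal sketch, not a prescribed workflow: the function name `mix_sources`, the 30% default cap, and the file paths in the usage note are all illustrative assumptions.

```python
import random


def mix_sources(custom, public, max_public_share=0.3, seed=0):
    """Combine custom and public samples, capping the public share.

    All custom samples are kept. A random subset of public samples is
    added so that public data makes up at most `max_public_share` of
    the combined set. The 0.3 default is an arbitrary illustration.
    """
    if not 0 <= max_public_share < 1:
        raise ValueError("max_public_share must be in [0, 1)")
    # public_n / (custom_n + public_n) <= share
    #   =>  public_n <= share * custom_n / (1 - share)
    cap = int(max_public_share * len(custom) / (1 - max_public_share))
    rng = random.Random(seed)
    chosen = rng.sample(list(public), min(cap, len(public)))
    mixed = list(custom) + chosen
    rng.shuffle(mixed)
    return mixed
```

For example, calling `mix_sources` with 70 hypothetical drone images from a project's own fields and 100 images from a public archive would keep all 70 custom images and add no more than 30 public ones.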

Using Drones for Agriculture Data Collection

In any discussion related to sourcing data in the ML agriculture industry, it’s a must to mention the vital role drones play in propelling that effort.

Each year, drones become more common in agriculture and a more viable way for agriculturalists to enhance and multiply their crop yields. With CV and ML capabilities integrated into these aerial devices, farms now have a convenient means to analyze and influence crop health and harvest potential.

Before the more recent prevalence of unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs), crops were surveyed from above with expensive and inefficient methods like helicopters. By comparison, lightweight and more energy-efficient equipment like drones is a much more practical option.

Although drones are, in a general sense, undeniably a great asset to furthering farming capabilities and particularly suited to delivering AI solutions, not every drone is built the same. They differ not only in physical appearance and design but also in function and in how effectively they perform that function.

Evaluating whether a drone is well designed and efficient means looking for features like high-resolution image and video capture that can be used for both manual and automated assessment, the ability to accurately phenotype crops, terrain-mapping capabilities, and, if needed for planting or spraying tasks, a build that accommodates both functions.

The following is a range of reputable UAV and mapping options that can be employed in various data-sourcing use cases in the agriculture industry.

1. Pix4D

Because mapping is an important benefit of technology in precision farming, Pix4D’s RGB mapping software provides valuable high-res views and lets users scout 200 acres in minutes. It gives farmers an accurate topographic representation of a field, whether they’re on-site or not.

2. DJI Agras MG-1

Widely considered one of the leading products on the market, the DJI Agras MG-1 is equipped with a radar sensing system that delivers reliable flight performance, as well as a spring system and flow sensor that ensure accurate application when spraying crops.

3. DJI Phantom 4

Designed specifically for agricultural use, DJI’s Phantom 4 makes field mapping as efficient and convenient as possible, with a flight duration of up to half an hour. It’s capable of capturing original satellite observation data as well as image position data, which reduces the time spent validating data and makes for a more efficient post-processing workflow with minimal manual adjustment.

4. Parrot Anafi Thermal

In terms of thermal imaging capability, the Parrot Anafi is top-of-the-line. This drone can be flown at night using an optional thermal module and is extremely portable. It can be ready to work in seconds and thanks to a quality optical sensor, can take the best picture possible in the dark.

5. DJI Mavic 2 Pro

DJI’s Mavic 2 puts safety first through its obstacle avoidance technology and is ideal for surveillance because of its transmission range of up to 18 kilometers. It features four sensors that keep it reliably stable and reduce the chances of crashes or damage to the product. With its flight time of half an hour, 45 mph speed, and range, it can get where it needs to be and take the aerial and mapping shots it needs with less time and risk.

Moving on from UAVs, below are some of the most common use cases for AgTech CV applications, along with specific recommendations on sourcing the quality, relevant data necessary to develop a powerful model: one capable of working alongside modern farmers as an invaluable tool for extracting insights toward a bigger and more sustainably productive yield.

Monitoring Crops and Livestock

The ability to monitor crops, livestock, and other vital farm assets that need to be kept in the best condition possible has game-changing potential for boosting production and agricultural yield. Manually monitoring crop fields and pastured cattle, after all, isn't just time-consuming; it's fairly inefficient.

When considering what type of data might be ideal for crop and livestock monitoring applications, aim for crop- and livestock-specific image data. Such data helps drones and other sensor systems pick up on the target objects and analyze them properly to detect problematic conditions like dehydration or illness, to name a few of the functions these applications might serve.

Evaluating Crop and Soil Health

The most accurate way to measure the core health of crops involves going right to the root and the soil surrounding it. For crop detection applications, it's often expected that they also provide readings that track the condition of the crop across its various growth stages and that they are capable of detecting issues that aren't surface-level.

Pest and Disease Prevention

Similar to the concept of monitoring and evaluating crops and providing insights to users, pest and disease prevention AgTech solutions should be designed to be evaluative. An example of an application for this use case is UAVs that can discern unhealthy plants from healthy ones.

Pest and disease prevention CV models should be intuitive enough to recognize the signs of an unwanted visitor in a field environment automatically, like a parasite or invasive weed, and apply the correct dosage of herbicide or insecticide to resolve or prevent further harm to crop development.
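As a rough illustration of the kind of pixel-level signal such a system might start from, the sketch below computes the Excess Green index (ExG = 2g − r − b on normalized color channels), a common RGB vegetation index for separating green plant matter from soil and background. It is a simple stand-in, not a pest or disease detector and not a method this article prescribes; the function names and the 0.1 threshold are assumptions chosen for the example, and a production system would typically rely on a model trained on labeled imagery.

```python
import numpy as np


def excess_green(rgb):
    """Return the Excess Green index (2g - r - b) for an H x W x 3 image."""
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    # Normalize each channel by the pixel's brightness; black pixels stay 0.
    chroma = np.divide(rgb, total, out=np.zeros_like(rgb), where=total > 0)
    r, g, b = chroma[..., 0], chroma[..., 1], chroma[..., 2]
    return 2.0 * g - r - b


def vegetation_fraction(rgb, threshold=0.1):
    """Share of pixels whose ExG exceeds `threshold` (an assumed cutoff)."""
    return float((excess_green(rgb) > threshold).mean())
```

Comparing `vegetation_fraction` across tiles of the same field over time could flag patches where green cover is dropping, which is a cue for closer inspection rather than a diagnosis.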

Sourcing data for these operations can come from a good mix of sources, depending on the weed and pest types that need to be identified and the crop variety that a client or application user will need to protect.

For more common or prevalent pests like cutworms, sawflies, midges, and grasshoppers, there are a number of public databases, like the USDA's Agricultural Research Service, that could be a practical resource for the most usual input data like images or video.

Predictive Analytics

Weather predictions and readings go a long way in aiding crop development, and forecasting and prediction CV software helps make that possible. These models are typically designed to evaluate, track, forecast, and predict the different environmental scenarios or events that may significantly impact agricultural yields.

They also should be able to guide planting decisions by making farmers aware of notable changes in weather patterns in their area and provide an analytical view of how those changes affect crop production.

Satellite or aerial imagery is a popular option for predictive analytics applications, not only because it fits the ideal input data requirements, but because it's also cost-effective, detailed, and accurate, and that accuracy makes all the difference to model performance.

Maximizing Yield Through Data Intelligence

A more productive and modernized agriculture industry is only possible through next-gen AgTech applications that work harmoniously alongside farmers to maximize their production and crop yields. It all starts with gathering the right type of data to equip those applications with the features most useful to human handlers.

Oftentimes, this data is only as beneficial to ML models as its heterogeneous details and contents: the time of year it was collected, the geographic location, and the effect of time and location on the objects featured within the data. These factors all contribute to the quality and accuracy of the data in use, and in turn to the evolutionary shift in farming practices.
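One way to check that heterogeneity before training is to count images per region and season and list the combinations with no coverage. The sketch below assumes each image has a metadata record with hypothetical `region` and `captured` fields, and it uses northern-hemisphere meteorological seasons; both are assumptions for illustration rather than a standard schema.

```python
from collections import Counter
from datetime import date

SEASONS = ("winter", "spring", "summer", "autumn")


def season_of(day):
    """Map a date to a northern-hemisphere meteorological season."""
    return SEASONS[(day.month % 12) // 3]


def coverage(records):
    """Count images per (region, season) pair."""
    return Counter((r["region"], season_of(r["captured"])) for r in records)


def gaps(records, regions, min_images=1):
    """List (region, season) pairs with fewer than `min_images` images."""
    counts = coverage(records)
    return sorted(
        (region, season)
        for region in regions
        for season in SEASONS
        if counts[(region, season)] < min_images
    )


# Hypothetical metadata records, invented for this example.
records = [
    {"region": "north", "captured": date(2022, 6, 1)},
    {"region": "north", "captured": date(2022, 7, 15)},
    {"region": "south", "captured": date(2022, 12, 7)},
]
print(gaps(records, ["north", "south"]))
```

Any pair reported by `gaps` points to a time and place the model has never seen, which is where additional collection or a supplementary public dataset would be most useful.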
