On April 5th, 2023, Meta AI released its Segment Anything Model (SAM) for masking objects in images. SAM stands out for its strong “zero-shot” generalization: it can accurately segment objects that it has not seen or been specifically trained on.
The most obvious use case for SAM, given Meta’s focus on AR and VR applications, is detecting objects in real-time video feeds so that interface visualizations can be overlaid on objects in the user’s field of view.
Fundamentally, a mask just marks the pixels that belong to a distinct object, much like a polygon drawn around it. Once the object is segmented from the rest of the image, further processing is needed to make the result useful.
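As a small illustration of that further processing, the sketch below summarizes a binary mask as a pixel area and a bounding box. It assumes the mask is a plain grid of booleans; the `mask_summary` helper is hypothetical and written only for this example. In practice the mask would more likely be a NumPy array.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]  # rows of pixels; True marks the segmented object


def mask_summary(mask: Mask) -> Optional[Tuple[int, Tuple[int, int, int, int]]]:
    """Return (pixel_area, (x, y, width, height)), or None for an empty mask."""
    # Row indices that contain at least one object pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column index of every object pixel (one entry per pixel).
    cols = [c for row in mask for c, value in enumerate(row) if value]
    area = len(cols)
    x0, x1 = min(cols), max(cols)
    y0, y1 = rows[0], rows[-1]
    return area, (x0, y0, x1 - x0 + 1, y1 - y0 + 1)


example = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, False, False],
]
summary = mask_summary(example)
```

Pixel area only becomes a real-world measurement once the ground resolution of the imagery is known, which is one reason masking alone is not the end of the pipeline.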
The team at Lofty has been testing SAM in our target industries of Agriculture, AEC, and Energy to understand how the model can be leveraged on imagery collected via remote sensing, such as satellite imagery and aerial photography, and we’ve found the following three use cases to be promising.
Agricultural Field Boundary Digitization
Precision agriculture and agricultural analytics tools fundamentally rely on digitizing the boundaries of specific fields. This is because agricultural operations tend to be defined at the field level: plantings, soil enrichment, crop rotations, and improvements (like leveling and field shaping) are all managed on a field-by-field basis.
Remote sensing data can be used as inputs to these applications and models, but in order to do anything with the information, you have to delineate which field is which.
Digitizing field boundaries is most commonly done by hand, drawing field boundaries point-by-point using satellite imagery as a reference.
We’ve begun testing Segment Anything to delineate field boundaries using commercially available satellite imagery with promising results. The photo below shows naive results from the Segment Anything automatic masking algorithm without any pre-processing or parameter tuning:
While some fields were not detected by SAM’s default mask generator, subsequent tests give us high confidence that the detection rate can be raised to capture each field. Note that non-agricultural plots will also be segmented, so real-world use cases require further processing of each masked component to determine whether it is a field, a stand of trees, a homestead, and so on.
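A first pass at that further processing might look like the sketch below. It assumes each mask is a dictionary with `area`, `bbox` (x, y, width, height), and `predicted_iou` keys, as returned by SAM's automatic mask generator. The `keep_candidate_fields` helper and all of its thresholds are hypothetical values chosen for illustration, not tuned results.

```python
from typing import Any, Dict, List


def keep_candidate_fields(
    masks: List[Dict[str, Any]],
    min_area_px: int = 5000,          # assumed: drop small objects such as buildings
    min_predicted_iou: float = 0.9,   # assumed: drop low-quality masks
    max_aspect_ratio: float = 6.0,    # assumed: drop long strips such as roads
) -> List[Dict[str, Any]]:
    """Coarsely pre-filter mask records, largest first."""
    kept = []
    for record in masks:
        _, _, width, height = record["bbox"]
        if width <= 0 or height <= 0:
            continue
        aspect = max(width, height) / min(width, height)
        if (
            record["area"] >= min_area_px
            and record["predicted_iou"] >= min_predicted_iou
            and aspect <= max_aspect_ratio
        ):
            kept.append(record)
    return sorted(kept, key=lambda record: record["area"], reverse=True)


sample = [
    {"area": 12000, "bbox": [0, 0, 120, 100], "predicted_iou": 0.95},
    {"area": 300, "bbox": [5, 5, 20, 15], "predicted_iou": 0.98},      # too small
    {"area": 9000, "bbox": [0, 0, 900, 10], "predicted_iou": 0.97},    # road-like strip
    {"area": 20000, "bbox": [10, 10, 200, 100], "predicted_iou": 0.92},
]
candidates = keep_candidate_fields(sample)
```

A geometric filter like this only narrows the candidates. Telling a field from a stand of trees still needs spectral or contextual information beyond the mask's shape.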
Infrastructure Masking in Civil Engineering
Collecting imagery from commercial UAVs, or colloquially “drones”, is a rapidly expanding technique for analysis and modeling in the AEC industry, particularly Civil Engineering.
High-quality imagery of infrastructure assets can be fed into statistical and machine learning models to make smart decisions about infrastructure health. Even with high-quality imagery, assets need to be isolated from their backgrounds so that algorithms can filter out the noise of adjacent and background objects.
Lofty Senior Software Engineer Alan Fraley has achieved high-quality results using SAM to isolate bridge deck surfaces in RGB raster images. Other imagery, such as thermal imaging, can be geo-referenced to the original photo, and the same mask can then be applied to that imagery without recomputing it.
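Reusing a mask on a second raster can be sketched as follows, assuming the thermal image has already been resampled onto the same pixel grid as the RGB photo. The `masked_values` helper and the sample temperatures are hypothetical; plain lists are used to keep the example self-contained, though NumPy boolean indexing would be the more practical choice.

```python
from statistics import mean
from typing import List


def masked_values(raster: List[List[float]], mask: List[List[bool]]) -> List[float]:
    """Return the raster values at pixels where the mask is True."""
    same_shape = len(raster) == len(mask) and all(
        len(r_row) == len(m_row) for r_row, m_row in zip(raster, mask)
    )
    if not same_shape:
        raise ValueError("raster and mask must have identical dimensions")
    return [
        value
        for r_row, m_row in zip(raster, mask)
        for value, keep in zip(r_row, m_row)
        if keep
    ]


# Made-up thermal readings (degrees C) and a mask computed once on the RGB photo.
thermal = [[20.0, 21.0], [30.0, 31.0]]
deck_mask = [[False, True], [True, True]]

deck_temps = masked_values(thermal, deck_mask)
mean_deck_temp = mean(deck_temps)
```

The shape check matters: if the two rasters are not aligned pixel for pixel, the mask selects the wrong locations without raising any other error.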
Fast Dataset Labeling for Defect Detection
One of the biggest challenges we’ve seen in applying computer vision in our target industries has been the absence of high-quality labeled datasets.
Many firms have identified opportunities to automate defect detection and classification, be it in a parking lot, a solar array, or a bridge deck surface. The first challenge these firms run into is that training an algorithm requires a massive number of defect images, all labeled by hand.
The good news is that forward-looking firms have plenty of Subject Matter Experts who can expertly classify images. The bad news is that these SMEs’ time comes at a premium, and spending months drawing boxes on photos is often a non-starter.
One of the primary use cases highlighted by the Meta AI team has been the speed at which labels can be generated using SAM.
Tools designed for efficient annotation are not a new concept. Roboflow, for example, introduced a “Smart Polygon” tool to its annotation platform last September.
The new benefit here is highlighted by the SAM team’s benchmarks, which report that SAM-assisted mask annotation is 2-6.5x faster than the prior annotation approaches they compare against. And, with a generalized zero-shot approach, SAM is likely to be able to generate masks in a very diverse set of applications.
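To make the labeling workflow concrete, the sketch below converts a mask's bounding box into one line of a YOLO-style label file, leaving the SME to supply only the class. It assumes a pixel-space (x, y, width, height) box like the one SAM reports for each mask. The `yolo_label` helper and the class id are hypothetical, and YOLO is just one possible target format.

```python
from typing import Tuple


def yolo_label(
    class_id: int,
    bbox_xywh: Tuple[float, float, float, float],
    image_width: int,
    image_height: int,
) -> str:
    """Format a pixel-space box as 'class cx cy w h', normalized to the image size."""
    x, y, w, h = bbox_xywh
    center_x = (x + w / 2) / image_width
    center_y = (y + h / 2) / image_height
    return (
        f"{class_id} {center_x:.6f} {center_y:.6f} "
        f"{w / image_width:.6f} {h / image_height:.6f}"
    )


# Hypothetical example: class 2 might stand for "crack" in an SME-defined label set.
line = yolo_label(2, (100, 50, 200, 100), image_width=1000, image_height=500)
```

In a workflow like this, the expert's time goes to choosing the class rather than drawing the geometry by hand.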
What's Next for Segment Anything at Lofty?
The team at Lofty is constantly working with experts in AEC, Agriculture, and Energy to build powerful products around their expertise. These products often rely on remote sensing and geospatial analysis tools, so when we saw SAM we immediately knew how we might use it in some of our daily work with our partners.
So far the results are promising. It’s important to note that masking features is often only one of many steps in producing actionable results. That said, we’re always on the lookout for opportunities to deliver results faster for our clients, and SAM may be an option that provides significant acceleration within our clients’ products.
We are continuing our research with SAM, and other cutting-edge tools, to find additional use cases and refined approaches to deliver big results for Lofty’s partners.