Segment anything is not always perfect: an investigation of SAM on different real-world applications

July 10, 2024
in Agriculture

Foundation models have attracted growing interest in recent years, owing to their extensive pre-training on web-scale datasets and their superior ability to generalize to various downstream tasks. ChatGPT, powered by the GPT foundation model, soon became a great commercial success thanks to its real-time, coherent language generation and user interaction. In the vision realm, by contrast, the exploration of foundation models is still in its infancy. The pioneering work on contrastive language-image pre-training (CLIP) effectively combines the image and text modalities, enabling zero-shot generalization to novel visual concepts. However, its generalization ability on vision tasks remains unsatisfactory because abundant training data is scarce, unlike in natural language processing (NLP).

Results of segment anything model (SAM) on various real-world applications

Credit: Beijing Zhongke Journal Publishing Co. Ltd.

More recently, Meta AI Research released the promptable segment anything model (SAM). Given a single prompt through its user interface, SAM can segment any object in any image or video without additional training, an ability often referred to as zero-shot transfer in the vision community. As the authors describe, SAM's capabilities are driven by a vision foundation model trained on the massive SA-1B dataset, which contains more than 11 million images and over one billion masks. The authors have also released an impressive online demo to showcase SAM's capabilities. SAM is designed to produce a valid segmentation result for any prompt, where a prompt can be foreground/background points, a rough box or mask, free-form text, or any other information indicating what to segment in an image. The released project offers three prompt modes: click mode, box mode, and everything mode. Click mode lets users segment objects with one or more clicks, either including points in or excluding them from the object. Box mode segments an object from a roughly drawn bounding box, optionally refined with additional click prompts. Everything mode automatically identifies and masks all objects in an image.

 

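The click and box prompts described above map directly onto array inputs in Meta's open-source `segment-anything` package. The sketch below is illustrative only: the prompt coordinates are made-up values, and the commented-out model calls assume a locally downloaded checkpoint, so it shows how prompts are encoded rather than a definitive pipeline.

```python
import numpy as np

# Click mode: each click is an (x, y) pixel coordinate paired with a label,
# where 1 means "include this point in the object" and 0 means "exclude it".
point_coords = np.array([[320, 240], [400, 260]], dtype=np.float32)
point_labels = np.array([1, 0], dtype=np.int32)  # include first click, exclude second

# Box mode: a rough bounding box given as (x_min, y_min, x_max, y_max).
box = np.array([100, 80, 520, 400], dtype=np.float32)

# Running the actual model requires torch and a downloaded SAM checkpoint
# (file name here is hypothetical); the calls would look like:
#   from segment_anything import SamPredictor, sam_model_registry
#   sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")
#   predictor = SamPredictor(sam)
#   predictor.set_image(image)  # image: HxWx3 uint8 RGB array
#   masks, scores, _ = predictor.predict(
#       point_coords=point_coords, point_labels=point_labels, box=box)
```

Everything mode corresponds to the package's automatic mask generator, which densely samples point prompts over the image grid instead of taking user input.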
The emergence of SAM has demonstrated strong generalization across various images and objects, opening up new possibilities and avenues for intelligent image analysis and understanding, such as augmented reality and human-computer interaction. Some practitioners from both industry and academia have gone so far as to assert that “segmentation has reached its endpoint” and “the computer vision community is undergoing a seismic shift”. In practice, however, a single pre-training dataset can hardly encompass the vast array of unusual real-world scenarios and imaging modalities that computer vision must handle: images captured under diverse conditions (e.g., low light, bird's-eye view, fog, rain), acquired through various input modalities (e.g., depth, infrared, event, point cloud, CT, MRI), and serving numerous real-world applications. It is therefore of great practical interest to investigate how well SAM can infer or generalize across different scenarios and applications.

 

This motivates the present study, which examines SAM's performance across a diverse range of real-world segmentation applications, as illustrated in Fig. 1. Specifically, the researchers apply SAM in various practical scenarios, including natural images, agriculture, manufacturing, remote sensing, and healthcare, and discuss SAM's benefits and limitations in practice. Based on these studies, they make the following observations:

 

1) Excellent generalization on common scenes. Experiments on various images validate SAM's effectiveness across different prompt modes, demonstrating that it generalizes well to typical natural-image scenarios, especially when target regions stand out prominently from their surroundings. This underscores the strength of SAM's promptable model design and of its massive, diverse training data source.

 

2) Strong prior knowledge is often required. While using SAM, the researchers observe that complex scenes, e.g., crop segmentation and fundus image segmentation, demand more manual prompts informed by prior knowledge, which can degrade the user experience. They also notice that SAM tends to favor selecting the foreground mask: when applied to the shadow detection task, its performance remains poor even with a large number of click prompts. This may stem from a strong foreground bias in its pre-training dataset, which hinders its ability to handle such scenarios effectively.

 

3) Less effective in low-contrast applications. Segmenting objects from visually similar surroundings is a challenging scenario, especially for transparent or camouflaged objects that are “seamlessly” embedded in their environment. The experiments reveal considerable room for exploring and enhancing SAM's robustness in complex scenes with low-contrast elements.

 

4) Limited understanding of professional data. Applying SAM to real-world medical and industrial scenarios, the researchers find that it produces unsatisfactory results on professional data, particularly in box mode and everything mode, revealing SAM's limitations in understanding these practical scenarios. Moreover, even in click mode, both the user and the model need a degree of domain-specific knowledge and understanding of the task at hand.

 

5) Smaller and irregular objects can pose challenges for SAM. Remote sensing and agriculture present additional challenges, such as irregular buildings and small streets captured by aerial imaging sensors. These complexities make it difficult for SAM to produce complete segmentations. How to design effective strategies for SAM in such cases remains an open issue.

 

This study examines SAM's performance across various scenarios and offers observations and insights toward advancing foundation models in the vision realm. While the researchers have tested many tasks, not all downstream applications are covered, and a multitude of fascinating segmentation tasks and scenarios remain to be explored in future research.

 

 

See the article:

Ji, W., Li, J., Bi, Q., et al. Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications. Machine Intelligence Research (2024).



Journal: Machine Intelligence Research

DOI: 10.1007/s11633-023-1385-0

Article Title: Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications

Article Publication Date: 12-Apr-2024
