This article explores challenges and innovations in medical image retrieval, focusing on dataset imbalance, organ size and shape biases, and recall accuracy interpretation. It highlights a novel application of ColBERT-inspired re-ranking, demonstrating its feasibility in refining CBIR results by incorporating context such as user behavior and medical relevance. While no strong link was found between anatomical region size and retrieval recall, the study opens new pathways for improving image retrieval systems, balancing computational costs, and enhancing real-world usability.

How Dataset Imbalances Shape Medical Image Retrieval Accuracy


Abstract and 1. Introduction

  2. Materials and Methods

    2.1 Vector Database and Indexing

    2.2 Feature Extractors

    2.3 Dataset and Pre-processing

    2.4 Search and Retrieval

    2.5 Re-ranking retrieval and evaluation

  3. Evaluation and 3.1 Search and Retrieval

    3.2 Re-ranking

  4. Discussion

    4.1 Dataset and 4.2 Re-ranking

    4.3 Embeddings

    4.4 Volume-based, Region-based and Localized Retrieval and 4.5 Localization-ratio

  5. Conclusion, Acknowledgement, and References

4 Discussion

4.1 Dataset

As depicted in Figure 6, the labels in the database and query subsets (derived from the TS train and test sets, respectively) are not balanced. This imbalance is expected to resemble what future real-world image retrieval scenarios will exhibit. It should nevertheless be kept in mind when reading and interpreting the recall values in the provided result tables.

Additionally, the size and shape of an organ can affect the probability of correctly predicting its label by chance. For example, smaller organs are less likely to collect "by-chance" true positive predictions than larger organs. Similarly, organs with elongated shapes aligned with the slice-wise sampling direction are more likely to produce "by-chance" hits. A volume- and shape-adjusted representation of recall values does not seem reasonable and was therefore not performed in this work. However, the organ volumes shown in Figure 7 and Figure 8 should be considered when interpreting the result tables.
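The "by-chance" effect can be made concrete with a small sketch. It is not taken from the paper: it assumes, as a simplification, that a retrieved slice is drawn uniformly at random along the sampling axis, so the chance of hitting an organ equals the fraction of slices the organ spans. The function names and the slice counts in the usage lines are hypothetical.

```python
def chance_hit_probability(organ_extent_slices: int, total_slices: int) -> float:
    """Probability that one uniformly sampled slice intersects the organ.

    Assumption (not from the paper): slices are sampled uniformly along
    the axis, and the organ occupies `organ_extent_slices` of them.
    """
    if total_slices <= 0:
        raise ValueError("total_slices must be positive")
    if not 0 <= organ_extent_slices <= total_slices:
        raise ValueError("organ extent must lie between 0 and total_slices")
    return organ_extent_slices / total_slices


def chance_hit_in_top_k(p_single: float, k: int) -> float:
    """Probability of at least one chance hit among k independent random draws."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p_single) ** k


# Hypothetical numbers: a small organ spanning 5 of 200 slices versus an
# elongated organ spanning 120 of 200 slices along the sampling direction.
p_small = chance_hit_probability(5, 200)
p_long = chance_hit_probability(120, 200)
print(p_small, p_long)
print(chance_hit_in_top_k(p_small, 10), chance_hit_in_top_k(p_long, 10))
```

Under these assumptions the elongated organ collects chance hits far more often than the small one, which is why raw recall values are best read alongside organ volume and shape.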

Figure 9 and Figure 10 present an overview of the mean recall of each retrieval method (all models) versus the mean anatomical region size for 29 and 104 classes, respectively. No pattern suggests a correlation between the size of an anatomical region and the average retrieval recall.
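The paper reports this observation from the plots. A reader who wants to quantify such a relationship on their own per-class results could compute a correlation coefficient, as in the following sketch. It is an illustration only: the helper names are hypothetical, the input arrays are made-up placeholders and not values from the paper, and the rank function breaks ties by position instead of averaging them.

```python
import numpy as np


def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def ordinal_ranks(values) -> np.ndarray:
    """Ordinal ranks (0 = smallest). Ties are broken by position (simplification)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(x, y) -> float:
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    return pearson(ordinal_ranks(x), ordinal_ranks(y))


# Placeholder inputs (NOT results from the paper): mean region size per
# class and mean recall per class, in the same class order.
region_size = [120.0, 15.0, 870.0, 42.0, 300.0]
mean_recall = [0.81, 0.79, 0.84, 0.88, 0.77]
print(pearson(region_size, mean_recall), spearman(region_size, mean_recall))
```

A coefficient near zero would be consistent with the absence of a pattern. With few classes the estimate is noisy, so it complements the plots and does not replace them.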

Figure 6: Distribution of the classes in database (a) and query (b) volumes.

4.2 Re-ranking

For the first time, we successfully adopted ColBERT-inspired re-ranking for an image retrieval task and showed its feasibility. In principle, this shows that CBIR results can be subjected to context-aware re-ranking. This is important because it provides a conceptual entry point for using the information available to a future real-world retrieval solution. Concretely, observations such as user behavior in a graphical user interface, and temporal or medical relevance, can be "factored in" to adjust the search results. Further research will study the advantages and disadvantages of ColBERT-inspired re-ranking. Future work will also share insights into balancing computational cost in the context of latency-accuracy trade-offs.
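For readers unfamiliar with ColBERT, its core scoring rule is late interaction ("MaxSim"): every query vector is matched to its most similar candidate vector, and those maxima are summed. The sketch below shows that rule for re-ranking a shortlist. It is not the authors' implementation: it assumes each query and each candidate is represented by a set of embedding vectors (for example one per slice, which is an assumption here), and the function names and candidate identifiers are hypothetical.

```python
import numpy as np


def l2_normalize(matrix: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.clip(norms, 1e-12, None)


def maxsim_score(query_vecs, cand_vecs) -> float:
    """ColBERT-style late interaction: sum over query vectors of the
    maximum cosine similarity to any candidate vector."""
    q = l2_normalize(np.asarray(query_vecs, dtype=float))
    c = l2_normalize(np.asarray(cand_vecs, dtype=float))
    similarity = q @ c.T  # shape: (n_query_vecs, n_candidate_vecs)
    return float(similarity.max(axis=1).sum())


def rerank(query_vecs, candidates: dict) -> list:
    """Re-order a shortlist {candidate_id: vectors} by descending MaxSim score."""
    scored = [(cid, maxsim_score(query_vecs, vecs)) for cid, vecs in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy shortlist with hypothetical identifiers and 2-D embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
shortlist = {
    "volume_a": [[1.0, 0.0]],              # matches only the first query vector
    "volume_b": [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors
}
print(rerank(query, shortlist))
```

Because the score is computed only for the shortlist returned by the first-stage search, the extra cost grows with the shortlist size and the number of vectors per item, which is where the latency-accuracy trade-off arises. Contextual signals such as user behavior would have to be added as further terms in the score; how to do so is not specified in the text.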


:::info Authors:

(1) Farnaz Khun Jush, Bayer AG, Berlin, Germany ([email protected]);

(2) Steffen Vogler, Bayer AG, Berlin, Germany ([email protected]);

(3) Tuan Truong, Bayer AG, Berlin, Germany ([email protected]);

(4) Matthias Lenga, Bayer AG, Berlin, Germany ([email protected]).

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
