Metrics matter: why we need to stop using silhouette in single-cell benchmarking

Authors

  • P. Rautenstrauch
  • U. Ohler

Journal

  • bioRxiv

Citation

  • bioRxiv

Abstract

  • Current-day single-cell studies comprise complex data sets affected by nested batch effects caused by technical and biological factors, and rely on advanced integration methods. Silhouette is an established metric for assessing clustering results, comparing within-cluster cohesion to between-cluster separation, and adaptations of it have emerged as the dominant choice to evaluate the success of these integration methods. However, silhouette's assumptions are often violated in single-cell data integration scenarios. We demonstrate that silhouette-based metrics can reliably assess neither batch effect removal nor biological signal conservation and are thus inherently unsuitable for data with (nested) batch effects. We propose alternative, robust evaluation strategies that enable accurate integration method assessment, and we call for benchmarking practices to be updated.
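
For readers unfamiliar with the metric the abstract refers to, the following is a minimal sketch of the standard silhouette definition, s(i) = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster (cohesion) and b is the mean distance to the members of the nearest other cluster (separation). It is not the authors' code or any of the adapted metrics they evaluate. The function name `silhouette_samples`, the use of Euclidean distance, the zero score for singleton clusters, and the toy coordinates are all assumptions made for illustration.

```python
import math


def silhouette_samples(points, labels):
    """Return the silhouette score s(i) = (b - a) / max(a, b) for each point.

    Illustrative sketch only: Euclidean distance and a score of 0.0 for
    singleton clusters are assumed conventions, not taken from the paper.
    """
    # Group point indices by their cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: score set to 0 by convention
            continue
        # a: mean distance to the other members of the same cluster (cohesion).
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster (separation).
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical toy data: two tight, well-separated groups of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
matching_labels = [0, 0, 1, 1]  # labels agree with the geometry
mixed_labels = [0, 1, 0, 1]     # labels cut across the geometry

matching_scores = silhouette_samples(toy_points, matching_labels)
mixed_scores = silhouette_samples(toy_points, mixed_labels)
mean_matching = sum(matching_scores) / len(matching_scores)
mean_mixed = sum(mixed_scores) / len(mixed_scores)
```

In this toy case the mean score is high when the labels follow the geometric groups and negative when they cut across them. In an integration benchmark the labels would be cell types or batches, and the abstract's point is that the assumptions behind this computation are often violated there.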


DOI

doi:10.1101/2025.01.21.634098