Similarity Search Results Guide


This guide provides an overview of the similarity search algorithm developed in our recent research, currently available as a pre-print. For a detailed description, please refer to [1].


How It Works

When interpreting similarity scores, a score greater than 1 (λ > 1) indicates that two datasets are significantly different. A score less than 1 (λ < 1), however, does not necessarily imply that the datasets are highly similar; it means only that their differences are not statistically significant. When similarity scores are compared across multiple datasets to establish a ranking, the relative ordering remains informative even where some scores fall below 1: it ranks the datasets by their degree of similarity or dissimilarity.
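The interpretation rule above can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: the function names, dataset names, and score values are hypothetical. It assumes that a lower score means greater similarity, and that a score of exactly 1 (which the guide does not address) is treated as not significant.

```python
def interpret_score(lam: float) -> str:
    """Label a score using the threshold of 1 described in this guide.

    Assumption: a score of exactly 1 is treated as not significant.
    """
    if lam > 1:
        return "significantly different"
    return "not significantly different"


def rank_by_similarity(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Sort datasets from lowest to highest score.

    Assumption: a lower score means the dataset is more similar to the query.
    """
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical scores for three datasets compared against one query dataset.
scores = {"dataset_a": 0.42, "dataset_b": 1.70, "dataset_c": 0.91}

ranking = rank_by_similarity(scores)
for name, lam in ranking:
    print(f"{name}: score = {lam:.2f} ({interpret_score(lam)})")
```

Here dataset_a and dataset_c both fall below 1, so neither differs significantly from the query, yet their ordering still indicates that dataset_a is the closer match.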


Thinned vs. Unthinned Data

Dissimilarity is computed both before and after thinning each dataset to a uniform density of 100 localisations/μm², allowing users to choose between the unthinned and thinned comparisons.
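For orientation, thinning to a target density can be sketched as random subsampling. This is a minimal illustration under stated assumptions, not the method used by the tool: the guide does not say how thinning is performed, so uniform random subsampling is assumed. The function name `thin_to_density`, its parameters, and the example data are hypothetical, and the sketch assumes that datasets already at or below the target density are left unchanged.

```python
import random

TARGET_DENSITY = 100.0  # localisations per square micrometre, as stated in the guide


def thin_to_density(points, area_um2, target_density=TARGET_DENSITY, seed=None):
    """Randomly subsample points so that their density is at most target_density.

    Assumption: thinning is uniform random subsampling without replacement.
    """
    if area_um2 <= 0:
        raise ValueError("area_um2 must be positive")
    max_points = int(target_density * area_um2)
    if len(points) <= max_points:
        # Already at or below the target density: return an unchanged copy.
        return list(points)
    rng = random.Random(seed)
    return rng.sample(list(points), max_points)


# Hypothetical data: 1000 localisations in a 2 x 2 micrometre field (250 per square micrometre).
data_rng = random.Random(0)
points = [(data_rng.uniform(0, 2), data_rng.uniform(0, 2)) for _ in range(1000)]

thinned = thin_to_density(points, area_um2=4.0, seed=1)
print(len(points), "->", len(thinned))  # 1000 -> 400
```

Bringing every dataset down to the same density removes localisation density itself as a source of difference, which is presumably why both the thinned and unthinned results are reported.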

Important Considerations

Comparing Different Datasets


References

  1. Shirgill, Sandeep, et al. "Nano-org, a functional resource for single-molecule localisation microscopy data." bioRxiv (2024): 2024-08.
  2. Massey Jr, Frank J. "The Kolmogorov-Smirnov test for goodness of fit." Journal of the American Statistical Association 46.253 (1951): 68-78.