Ensure that deduplication operates on disjoint sets

ale requested to merge stable-dedupe into main

Remove the possibility of loops in the deduplication algorithm by first merging all overlapping duplicate sets into a series of disjoint sets. This ensures that the duplication graph contains no loops or multi-level edges.

As an additional improvement, pick the element with the greatest metadata source confidence from each duplication set.
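
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this merge request: it assumes duplicate sets are given as collections of hashable identifiers and that each identifier has a numeric confidence score. The names `merge_into_disjoint_sets`, `pick_canonical`, and the `confidence` mapping are hypothetical, and union-find is just one possible way to do the merge.

```python
from collections import defaultdict


def merge_into_disjoint_sets(duplicate_sets):
    """Merge overlapping duplicate sets into disjoint sets (union-find sketch)."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for group in duplicate_sets:
        items = list(group)
        if not items:
            continue
        find(items[0])  # make sure singleton sets are registered too
        for other in items[1:]:
            union(items[0], other)

    # Group every item under its root; each resulting set is disjoint
    # from all the others, so no item can point at two canonical elements.
    merged = defaultdict(set)
    for item in list(parent):
        merged[find(item)].add(item)
    return list(merged.values())


def pick_canonical(group, confidence):
    """Pick the item with the greatest confidence (hypothetical scoring).

    Ties are broken by the item identifier so the choice is deterministic;
    that tie-break rule is an assumption of this sketch.
    """
    return max(group, key=lambda item: (confidence[item], item))


if __name__ == "__main__":
    sets = [{"a", "b"}, {"b", "c"}, {"d", "e"}, {"f"}]
    confidence = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.4, "e": 0.7, "f": 0.1}
    for group in merge_into_disjoint_sets(sets):
        print(sorted(group), "->", pick_canonical(group, confidence))
```

Because every element ends up in exactly one merged set, each duplicate maps directly to a single canonical element, which is what rules out chains (A duplicates B, B duplicates C) and cycles in the resulting graph.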
