Perceptual Duplicate Finder: Catch Visually Similar Images
The duplicates URL matching cannot see
Removing duplicate URLs is easy: two identical links, one gets dropped. But the same image often lives at many different URLs — a CDN copy, a resized version, a re-hosted upload, a slightly recompressed variant. To URL-level dedup, those are all unique. To your eyes, they are the same picture. The Perceptual Duplicate Finder in Bulk Image Downloader From URL List catches that second kind: visual duplicates that share pixels, not addresses.
It works by comparing what the images actually look like, so near-identical shots get grouped even when their URLs have nothing in common.
How the scan works
Open the Duplicate Finder tab and choose your scope: a single task or all tasks. Then hit Scan for Duplicates. The finder evaluates more than fifteen signals — perceptual hashes (pHash, dHash), histograms, and texture patterns among them — to decide how similar two images are, and groups the visual matches together. For large lists, it processes up to six hundred URLs per pass.
The multi-signal approach is what makes it robust. A single hash can be fooled by a crop or a recompression; combining perceptual hashes with color histograms and texture analysis gives a much more reliable read on whether two images are really the same.
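To make the multi-signal idea concrete, here is a minimal Python sketch that blends just two signals: a difference hash (dHash) and a brightness histogram. It is an illustration, not the finder's actual implementation. The function names, the 60/40 weighting, and the four histogram bins are all assumptions, and the images are assumed to be already downscaled to the same small grayscale grid.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values, same size for both images


def dhash_bits(img: Gray) -> List[int]:
    """Difference hash: 1 where a pixel is brighter than its right-hand neighbor."""
    return [1 if row[x] > row[x + 1] else 0
            for row in img
            for x in range(len(row) - 1)]


def hamming_similarity(a: List[int], b: List[int]) -> float:
    """Fraction of hash bits that agree (1.0 means identical hashes)."""
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)


def histogram(img: Gray, bins: int = 4) -> List[float]:
    """Normalized brightness histogram; the bin count is an arbitrary choice."""
    counts = [0] * bins
    total = 0
    for row in img:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_intersection(h1: List[float], h2: List[float]) -> float:
    """Overlap between two normalized histograms, from 0.0 to 1.0."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def combined_similarity(a: Gray, b: Gray,
                        w_hash: float = 0.6, w_hist: float = 0.4) -> float:
    """Weighted blend of the two signals; the weights are illustrative only."""
    hash_score = hamming_similarity(dhash_bits(a), dhash_bits(b))
    hist_score = histogram_intersection(histogram(a), histogram(b))
    return w_hash * hash_score + w_hist * hist_score


original = [
    [10, 50, 90, 130],
    [200, 160, 120, 80],
    [30, 30, 220, 220],
    [250, 10, 250, 10],
]
# Simulate a mild recompression: every pixel drifts slightly brighter.
recompressed = [[min(v + 3, 255) for v in row] for row in original]

print(combined_similarity(original, recompressed))
```

Because dHash only records whether each pixel is brighter than its neighbor, a small uniform brightness drift leaves the hash unchanged, and the histogram acts as a second opinion on overall tone. A real pipeline would add more signals and tune the weights against labeled examples.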
Sensitivity presets
Before you scan, you set sensitivity, which controls how eager the finder is to call two images a match:
- Strict — only flags very close matches. Fewer groups, high confidence each is a true duplicate.
- Balanced — a sensible middle ground for most lists.
- Aggressive — casts a wider net and catches looser similarities, at the cost of more borderline groupings to review.
Which one you want depends on the job. Cleaning a product catalog where false matches would be costly? Lean strict. Hunting down every near-copy in a messy scrape? Aggressive will surface more.
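One way to picture the presets is as cutoffs on a similarity score between 0 and 1. The sketch below assumes exactly that; the threshold numbers and the is_match helper are hypothetical and are not taken from the tool itself.

```python
# Hypothetical cutoffs: the real presets may use different values
# or a different scoring scheme entirely.
PRESET_THRESHOLDS = {
    "strict": 0.95,      # only very close matches pass
    "balanced": 0.88,    # middle ground
    "aggressive": 0.80,  # looser similarities pass too
}


def is_match(similarity: float, preset: str = "balanced") -> bool:
    """Return True when a pair's similarity clears the preset's cutoff."""
    return similarity >= PRESET_THRESHOLDS[preset]


# The same pair can be a match under one preset and not under another.
for preset in PRESET_THRESHOLDS:
    print(preset, is_match(0.90, preset))
```

Under these assumed numbers, a pair scoring 0.90 is rejected by strict but accepted by balanced and aggressive, which is the trade-off described above: lower cutoffs surface more candidates and more borderline groups to review.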
Reading the results
The scan produces grouped results, where each group holds images the finder judged visually similar. You can sort groups by size or confidence and collapse or expand them — all at once, or one group at a time — so a long report stays manageable. Seeing the similar images lined up side by side makes it obvious at a glance which group members are genuine duplicates and which slipped in.
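For readers curious how pairwise matches could turn into sortable groups, here is a small sketch using a union-find structure. It is an assumption about one reasonable approach, not a description of the finder's internals. In particular, scoring a group's confidence as its weakest pairwise match is a choice made for this example, and the file names are made up.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str, float]  # (image A, image B, similarity score)


def group_matches(pairs: List[Pair]) -> List[dict]:
    """Merge matched pairs into groups: if A~B and B~C, all three share a group."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, _ in pairs:
        parent[find(a)] = find(b)

    groups: Dict[str, dict] = {}
    for a, b, score in pairs:
        group = groups.setdefault(find(a), {"members": set(), "scores": []})
        group["members"].update((a, b))
        group["scores"].append(score)

    # Assumption: a group is only as trustworthy as its weakest link.
    return [{"members": sorted(g["members"]), "confidence": min(g["scores"])}
            for g in groups.values()]


def sort_groups(groups: List[dict], by: str = "size") -> List[dict]:
    """Sort groups largest-first or most-confident-first."""
    if by == "size":
        return sorted(groups, key=lambda g: len(g["members"]), reverse=True)
    if by == "confidence":
        return sorted(groups, key=lambda g: g["confidence"], reverse=True)
    raise ValueError(f"unknown sort key: {by}")


pairs = [
    ("a.jpg", "b.jpg", 0.97),
    ("b.jpg", "c.jpg", 0.91),
    ("x.png", "y.png", 0.99),
]
groups = group_matches(pairs)
print(sort_groups(groups, by="size")[0]["members"])
print(sort_groups(groups, by="confidence")[0]["members"])
```

Sorting by size puts the three-image group first, while sorting by confidence puts the tighter two-image group first. Chained matches are also how a borderline image can slip into a group, which is why reviewing the members side by side matters.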
Use it alongside URL dedup
The cleanest mental model is two complementary passes. The side panel’s deduplication handles matching URLs. The Perceptual Duplicate Finder handles matching pixels. Run URL dedup to clear the obvious exact-link repeats, then run the perceptual scan to catch the same images hiding behind different addresses. Together they get your list down to genuinely distinct images before you commit to a download. Deciding what to do with each group, from keep rules to bulk removal, is the next step the finder is built to handle.
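The two-pass model can be sketched in a few lines of Python. Everything here is illustrative: the URLs are placeholders, the 8-bit fingerprints stand in for real perceptual hashes, and max_distance is an assumed tolerance rather than a setting the tool exposes.

```python
from typing import Dict, List


def dedupe_urls(urls: List[str]) -> List[str]:
    """Pass 1: drop exact-link repeats while keeping first-seen order."""
    return list(dict.fromkeys(urls))


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def dedupe_pixels(urls: List[str], fingerprints: Dict[str, int],
                  max_distance: int = 2) -> List[str]:
    """Pass 2: keep a URL only if its fingerprint is far from every kept one."""
    kept: List[str] = []
    for url in urls:
        if all(hamming(fingerprints[url], fingerprints[k]) > max_distance
               for k in kept):
            kept.append(url)
    return kept


urls = [
    "https://a.example/cat.jpg",
    "https://a.example/cat.jpg",          # exact repeat: caught by pass 1
    "https://cdn.example/cat_small.jpg",  # resized copy: caught by pass 2
    "https://a.example/dog.jpg",
]
# Placeholder fingerprints; a real scan would compute these from pixels.
fingerprints = {
    "https://a.example/cat.jpg": 0b10110010,
    "https://cdn.example/cat_small.jpg": 0b10110011,
    "https://a.example/dog.jpg": 0b01001100,
}

unique_links = dedupe_urls(urls)
distinct_images = dedupe_pixels(unique_links, fingerprints)
print(len(urls), len(unique_links), len(distinct_images))
```

Running the cheap URL pass first means the more expensive pixel comparison has fewer candidates to work through.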
