v4.1.0
Major enhancements and new features:
- Enhancement: provides a fix for #80. difPy now comes with an improved algorithm for handling larger datasets in order to be more memory efficient, see Using difPy with Large Datasets. As part of this enhancement, two new parameters were added:
- Enhancement: difPy comes with improved performance due to major improvements in the comparison algorithms. As part of this enhancement, a new parameter `lazy` was added to `difPy.search`, which allows difPy to search more efficiently for exact duplicates (i.e. two exact file copies). By default, `lazy` is set to `True` and should only be turned off when searching for images that are not exact duplicates (i.e. having different dimensions, different file types, etc.). Read more here.
- Enhancement: the default value of the `similarity` parameter was reduced from 50 to 5.
- Enhancement: the progress bar has been improved.
- New feature: `difPy.search` now supports the `rotate` parameter. If set to `False`, images will not be rotated on comparison, which can significantly reduce comparison times. Read more here.
- New feature: the output structure of difPy has been adjusted for improved user-friendliness: the structure of `search.result` is now simpler, with fewer levels of depth, and `search.lower_quality` now comes as a `list`. When invoked via the CLI, the lower_quality output file will now be in `.txt` format.
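
The new `lazy` and `rotate` parameters and the adjusted output attributes can be combined roughly as in the sketch below. It is an illustration, not an excerpt from the official documentation: the helper name `find_duplicates` is hypothetical, the use of `difPy.build` to index a folder before searching is assumed from the difPy v4 workflow, and the `rotate=True` default is inferred from the description above.

```python
def find_duplicates(directory, lazy=True, rotate=True):
    """Hypothetical helper: search a folder for duplicate images with difPy.

    lazy=True   favours the faster search for exact duplicates (per the notes above).
    rotate=True is an assumed default; pass rotate=False to skip rotated
                comparisons and reduce comparison time.
    """
    # Imported inside the function so this sketch can be loaded
    # even where difPy is not installed.
    import difPy

    # Assumption: difPy.build indexes the images found in the folder.
    dif = difPy.build(directory)

    # lazy and rotate are the parameters described in these release notes.
    search = difPy.search(dif, lazy=lazy, rotate=rotate)

    # search.result holds the matches; search.lower_quality is now a list.
    return search.result, search.lower_quality


# Example call (the path is a placeholder):
# result, lower_quality = find_duplicates("path/to/images", rotate=False)
# for path in lower_quality:
#     print(path)
```

Setting `lazy=False` is only worth the extra time when near-duplicates with different dimensions or file types need to be found.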
See the difPy usage guide for more details. Happy deduplicating! 🎉