docs/usage.rst
+1-36
@@ -179,39 +179,4 @@ Supported File Types
179
179
180
180
difPy supports most popular image formats. However, because it relies on the Pillow library for image decoding, the supported formats are restricted to those listed in the `Pillow Documentation`_. Unsupported file types will be marked as invalid and included in the :ref:`invalid_files` output.
🆕 difPy >= v3.x supports the Fast Search Algorithm (FSA).
192
-
193
-
difPy's Fast Search Algorithm (FSA) can significantly improve performance and reduce time complexity when searching for duplicates.
194
-
195
-
FSA can be enabled or disabled with the :ref:`fast_search` parameter.
196
-
197
-
About FSA
198
-
^^^^^^^^^^
199
-
200
-
With the classic difPy algorithm, each image is compared to every successive image (in the order in which the images are found in the directories). Comparing every pair of images is very precise, but leads to high time complexity. When searching for duplicates, FSA reduces this cost: once difPy finds a duplicate of an image, that duplicate is classified as such and excluded from all subsequent comparisons, which lowers the average number of comparisons.
201
-
202
-
*Example: in the first round, difPy searches for duplicates of imageA and finds imageB and imageC. In the following rounds, the search for duplicates of imageB and imageC is skipped: both are already known to be duplicates of imageA, so no further comparison is required.*
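The skipping behaviour described above can be illustrated with a minimal sketch. This is not difPy's actual implementation: the function names ``classic_search`` and ``fast_search``, the ``contents`` dictionary, and the string-equality check are all hypothetical stand-ins chosen only to show how excluding known duplicates lowers the number of comparisons.

```python
def classic_search(names, is_duplicate):
    """Compare every image with every successive image (illustrative only)."""
    matches, comparisons = {}, 0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            comparisons += 1
            if is_duplicate(a, b):
                matches.setdefault(a, []).append(b)
    return matches, comparisons


def fast_search(names, is_duplicate):
    """Skip images already classified as duplicates (illustrative only)."""
    matches, comparisons = {}, 0
    skipped = set()  # images already known to duplicate an earlier image
    for i, a in enumerate(names):
        if a in skipped:
            continue  # no new search round for a known duplicate
        for b in names[i + 1:]:
            if b in skipped:
                continue  # known duplicates are excluded from later rounds
            comparisons += 1
            if is_duplicate(a, b):
                matches.setdefault(a, []).append(b)
                skipped.add(b)
    return matches, comparisons


# Hypothetical stand-in for image data: equal strings mean "duplicate".
contents = {"imageA": "cat", "imageB": "cat", "imageC": "cat", "imageD": "dog"}
names = list(contents)


def same_content(a, b):
    return contents[a] == contents[b]


classic_matches, classic_count = classic_search(names, same_content)
fast_matches, fast_count = fast_search(names, same_content)
print(classic_count, fast_count)  # 6 3
```

In this toy input, the classic pass performs all six pairwise comparisons, while the skipping pass needs only three, because imageB and imageC never start a search round of their own.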
203
-
204
-
Due to its nature, FSA is very efficient for duplicate searches, but it is **not advised when searching for similar images**, as the results might be inaccurate. **When searching for similar images, difPy's classic algorithm should be used by setting** :ref:`fast_search` **to** ``False``.
205
-
206
-
*Example: imageA might be similar to imageB and imageC, but this does not imply that imageB is similar to imageC. Nevertheless, FSA would assume imageB and imageC to be equally similar and would therefore potentially return wrong results.*
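The reason similarity breaks the skipping shortcut is that it is not transitive. The sketch below shows this with made-up numbers: each image is reduced to a single hypothetical score, and two images count as similar when their scores differ by at most a hypothetical ``THRESHOLD``. Neither the scores nor the threshold reflect how difPy actually measures similarity.

```python
# Hypothetical one-number "fingerprints"; real image comparison is richer.
scores = {"imageA": 10, "imageB": 16, "imageC": 4}
THRESHOLD = 8  # assumed tolerance, chosen only for this illustration


def is_similar(a, b):
    """Two images are 'similar' if their scores differ by at most THRESHOLD."""
    return abs(scores[a] - scores[b]) <= THRESHOLD


a_b = is_similar("imageA", "imageB")  # difference of 6
a_c = is_similar("imageA", "imageC")  # difference of 6
b_c = is_similar("imageB", "imageC")  # difference of 12
print(a_b, a_c, b_c)  # True True False
```

A search that skips imageB and imageC after matching them to imageA would report all three as one group, even though imageB and imageC are not similar to each other under this threshold.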
207
-
208
-
**When searching for similar images, difPy automatically disables FSA** to ensure accurate search results. This applies when :ref:`similarity` is set to ``'similar'`` **or** if :ref:`similarity` is manually set to a value ``> 0``.
209
-
210
-
Make difPy Faster
211
-
-----------------
212
-
213
-
difPy's processing speed depends on which parameter configurations are used. Speeding up the comparison process is especially useful when comparing a large number of images (more than 1,000). The following configurations can make difPy's processing faster:
214
-
215
-
* Enable :ref:`fast_search` when searching for duplicates
216
-
* Enable :ref:`limit_extensions`
217
-
* Set :ref:`px_size` <= 50. Note: the lower the ``px_size``, the less precise the comparison will be. It is not recommended to go below a ``px_size`` of 20.