Commit 76380a8

Merge pull request #270 from ashvardanian/main-dev
Multi-threading in Python 3.13t
2 parents 9a4d325 + 529b0dd commit 76380a8

20 files changed: +694 -424 lines changed

.github/workflows/prerelease.yml

Lines changed: 14 additions & 6 deletions
@@ -209,7 +209,7 @@ jobs:
     runs-on: ubuntu-22.04
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     needs: [test_python]
     steps:
       - name: Checkout
@@ -225,6 +225,7 @@ jobs:
         env:
           # Matches: `cp37-manylinux_x86_64` and `cp38-musllinux_x86_64`
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: x86_64

@@ -235,7 +236,7 @@ jobs:
     runs-on: ubuntu-24.04-arm
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     needs: [test_python]
     steps:
       - name: Checkout
@@ -262,6 +263,7 @@ jobs:
         env:
           # Matches: `cp37-manylinux_aarch64` and `cp38-musllinux_aarch64`
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: aarch64

@@ -270,7 +272,7 @@ jobs:
     runs-on: macos-13
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     needs: [test_python]
     steps:
       - name: Checkout
@@ -285,6 +287,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: macos
           CIBW_ARCHS: x86_64

@@ -293,7 +296,7 @@ jobs:
     runs-on: macos-14
     strategy:
       matrix:
-        python-version: ["38", "39", "310", "311", "312", "313"] #! Python 3.7 isn't supported on ARM macOS
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"] #! Python 3.7 isn't supported on ARM macOS
     needs: [test_python]
     steps:
       - name: Checkout
@@ -308,6 +311,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: macos
           CIBW_ARCHS: arm64

@@ -316,7 +320,7 @@ jobs:
     runs-on: windows-2022
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
         architecture: [AMD64] # List ARM64 separately and avoid 32-bit
         #! ARM64 isn't supported for Python 3.7 and 3.8
         include:
@@ -330,6 +334,8 @@ jobs:
             architecture: ARM64
           - python-version: "313"
             architecture: ARM64
+          - python-version: "313t"
+            architecture: ARM64
     needs: [test_python]
     steps:
       - name: Checkout
@@ -348,14 +354,15 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-win_${{ matrix.architecture }}
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: windows

   build_wheels_other:
     name: Build Python Wheels (Other Platforms)
     runs-on: ubuntu-22.04
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     needs: [test_python]
     steps:
       - name: Checkout
@@ -373,5 +380,6 @@ jobs:
         env:
           # https://cibuildwheel.pypa.io/en/stable/options/#archs
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: ppc64le s390x i686 #! `armv7l` not worth the trouble
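
The `CIBW_BUILD` value in these jobs is a glob over build identifiers, so the new `313t` matrix entry expands to `cp313t-*`. The sketch below uses Python's `fnmatch` to illustrate that selection. The identifier list is a small hand-picked sample and `select_builds` is a hypothetical helper, not cibuildwheel's own implementation.

```python
from fnmatch import fnmatch

# Hand-picked sample of build identifiers in cibuildwheel's naming style.
# The real list is much longer and depends on the cibuildwheel version.
SAMPLE_IDENTIFIERS = [
    "cp312-manylinux_x86_64",
    "cp313-manylinux_x86_64",
    "cp313-musllinux_x86_64",
    "cp313t-manylinux_x86_64",
    "cp313t-musllinux_x86_64",
]


def select_builds(python_version, identifiers=SAMPLE_IDENTIFIERS):
    """Return the identifiers matched by a `cp<version>-*` glob (illustrative helper)."""
    pattern = f"cp{python_version}-*"
    return [name for name in identifiers if fnmatch(name, pattern)]


if __name__ == "__main__":
    for version in ("313", "313t"):
        print(version, select_builds(version))
```

Because `cp313-*` requires a hyphen right after `313`, it does not also pick up the `cp313t` identifiers, which is consistent with the free-threaded build getting its own matrix entry.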

.github/workflows/release.yml

Lines changed: 14 additions & 6 deletions
@@ -75,7 +75,7 @@ jobs:
     needs: versioning
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     steps:
       - uses: actions/checkout@v4
         with:
@@ -90,6 +90,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: x86_64
       - name: Upload wheels
@@ -107,7 +108,7 @@ jobs:
     needs: versioning
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     steps:
       - uses: actions/checkout@v4
         with:
@@ -133,6 +134,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: aarch64
       - name: Upload wheels
@@ -148,7 +150,7 @@ jobs:
     needs: versioning
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     steps:
       - uses: actions/checkout@v4
         with:
@@ -163,6 +165,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: macos
           CIBW_ARCHS: x86_64
       - name: Upload wheels
@@ -179,7 +182,7 @@ jobs:
     strategy:
       matrix:
         # 3.7 not supported on macOS ARM
-        python-version: ["38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     steps:
       - uses: actions/checkout@v4
         with:
@@ -194,6 +197,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: macos
           CIBW_ARCHS: arm64
       - name: Upload wheels
@@ -209,7 +213,7 @@ jobs:
     needs: versioning
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
         architecture: [AMD64]
         include:
           - python-version: "39"
@@ -222,6 +226,8 @@ jobs:
             architecture: ARM64
           - python-version: "313"
             architecture: ARM64
+          - python-version: "313t"
+            architecture: ARM64
     steps:
       - uses: actions/checkout@v4
         with:
@@ -240,6 +246,7 @@ jobs:
           cibuildwheel --output-dir wheelhouse
         env:
           CIBW_BUILD: cp${{ matrix.python-version }}-win_${{ matrix.architecture }}
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: windows
       - name: Upload wheels
         uses: actions/upload-artifact@v4
@@ -254,7 +261,7 @@ jobs:
     needs: versioning
     strategy:
       matrix:
-        python-version: ["37", "38", "39", "310", "311", "312", "313"]
+        python-version: ["38", "39", "310", "311", "312", "313", "313t"]
     steps:
       - uses: actions/checkout@v4
         with:
@@ -272,6 +279,7 @@ jobs:
         env:
           # e.g. ppc64le, s390x, i686, etc.
           CIBW_BUILD: cp${{ matrix.python-version }}-*
+          CIBW_ENABLE: cpython-freethreading # No-GIL 3.13t builds
           CIBW_PLATFORM: linux
           CIBW_ARCHS: ppc64le s390x i686
       - name: Upload wheels

CMakeLists.txt

Lines changed: 3 additions & 2 deletions
@@ -16,7 +16,7 @@ set(CMAKE_CXX_STANDARD 17)
 set(CMAKE_CXX_STANDARD_REQUIRED YES)
 set(CMAKE_CXX_EXTENSIONS NO)

-# Determine if SimSIMD is built as a subproject (using `add_subdirectory`) or if it is the main project
+# Determine if SimSIMD is built as a sub-project (using `add_subdirectory`) or if it is the main project
 set(SIMSIMD_IS_MAIN_PROJECT OFF)

 if (CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR)
@@ -128,7 +128,8 @@ if (SIMSIMD_BUILD_SHARED)
         PRIVATE_HEADER
         PUBLIC_HEADER
         RESOURCE
-        RUNTIME)
+        RUNTIME
+    )
 endif ()

 install(DIRECTORY ./include/ DESTINATION /usr/include/)

CONTRIBUTING.md

Lines changed: 15 additions & 9 deletions
@@ -96,24 +96,30 @@ export BLIS_NUM_THREADS=1 # for BLIS
 ## Python

 Python bindings are implemented using pure CPython, so you wouldn't need to install SWIG, PyBind11, or any other third-party library.
-Still, you need a virtual environment, and it's recommended to use `uv` to create one.
-
-```sh
-uv venv --python 3.11 # Or your preferred Python version
-source .venv/bin/activate # To activate the virtual environment
-uv pip install -e . # To build locally from source
-```
-
-Testing:
+Still, you need a virtual environment.
+If you already have one:

 ```sh
+pip install -e . # build locally from source
 pip install pytest pytest-repeat tabulate # testing dependencies
 pytest scripts/test.py -s -x -Wd # to run tests

 # to check supported SIMD instructions:
 python -c "import simsimd; print(simsimd.get_capabilities())"
 ```

+Alternatively, use `uv` to create the virtual environment.
+
+```sh
+uv venv --python 3.13t # or your preferred version
+source .venv/bin/activate # activate the environment
+uv pip install -e . # build locally from source
+
+# to run GIL-related tests in a free-threaded environment:
+uv pip install pytest pytest-repeat tabulate numpy scipy
+PYTHON_GIL=0 python -m pytest scripts/test.py -s -x -Wd -k gil
+```
+
 Here, `-s` will output the logs.
 The `-x` will stop on the first failure.
 The `-Wd` will silence overflows and runtime warnings.
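
Before running the GIL-related tests, it can help to confirm that the interpreter in the environment really is a free-threaded build with the GIL off. The following is a minimal sketch using only the standard library. `describe_gil` is a hypothetical helper that is not part of SimSIMD or its test suite, and it assumes `sysconfig` reports `Py_GIL_DISABLED` on your platform.

```python
import sys
import sysconfig


def describe_gil():
    """Report whether this interpreter is a free-threaded build and whether the GIL is active."""
    # `Py_GIL_DISABLED` is 1 on free-threaded builds such as 3.13t, and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # `sys._is_gil_enabled()` exists from Python 3.13 on; older interpreters always run with the GIL.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(probe()) if probe is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(describe_gil())
```

On a free-threaded build the interpreter may re-enable the GIL at runtime when an extension module that does not declare free-threading support is imported. Setting `PYTHON_GIL=0`, as in the command above, keeps it disabled.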

README.md

Lines changed: 35 additions & 2 deletions
@@ -491,7 +491,7 @@ slow_output_color = (alpha * light_intensity * diffuse_component + beta * specul

 By default, computations use a single CPU core.
 To override this behavior, use the `threads` argument.
-Set it to `0` to use all available CPU cores.
+Set it to `0` to use all available CPU cores and let the underlying C library manage the thread pool.
 Here is an example of dealing with large sets of binary vectors:

 ```py
@@ -507,9 +507,42 @@ distances = simsimd.cdist(matrix1, matrix2,
 )
 ```

+Alternatively, when using free-threading Python 3.13t builds, one can combine single-threaded SimSIMD operations with Python's `concurrent.futures.ThreadPoolExecutor` to parallelize the computations.
 By default, the output distances will be stored in double-precision `float64` floating-point numbers.
 That behavior may not be space-efficient, especially if you are computing the hamming distance between short binary vectors, that will generally fit into 8x smaller `uint8` or `uint16` types.
-To override this behavior, use the `dtype` argument.
+To override this behavior, use the `out_dtype` argument, or consider pre-allocating the output array and passing it to the `out` argument.
+A more complete example may look like this:
+
+```py
+from multiprocessing import cpu_count
+from concurrent.futures import ThreadPoolExecutor
+from simsimd import cosine
+import numpy as np
+
+# Generate large dataset
+vectors_a = np.random.rand(100_000, 1536).astype(np.float32)
+vectors_b = np.random.rand(100_000, 1536).astype(np.float32)
+distances = np.zeros((100_000,), dtype=np.float32)
+
+def compute_batch(start_idx, end_idx):
+    batch_a = vectors_a[start_idx:end_idx]
+    batch_b = vectors_b[start_idx:end_idx]
+    cosine(batch_a, batch_b, out=distances[start_idx:end_idx])
+
+# Use all CPU cores with true parallelism (no GIL!)
+num_threads = cpu_count()
+chunk_size = len(vectors_a) // num_threads
+
+with ThreadPoolExecutor(max_workers=num_threads) as executor:
+    futures = []
+    for i in range(num_threads):
+        start_idx = i * chunk_size
+        end_idx = (i + 1) * chunk_size if i < num_threads - 1 else len(vectors_a)
+        futures.append(executor.submit(compute_batch, start_idx, end_idx))
+
+    # Collect results from all threads
+    results = [future.result() for future in futures]
+```

 ### Helper Functions
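
The README example added in this hunk gives every thread `len(vectors_a) // num_threads` rows and hands the remainder to the last thread. As an alternative sketch, the remainder can be spread one row at a time across the first few chunks. `chunk_bounds` is a hypothetical helper for illustration and is not part of SimSIMD.

```python
def chunk_bounds(total, parts):
    """Split range(total) into contiguous (start, end) slices whose sizes differ by at most one."""
    # Never create more chunks than there are items; always return at least one slice.
    parts = max(1, min(parts, total))
    base, extra = divmod(total, parts)
    bounds = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    print(chunk_bounds(10, 3))
```

Each `(start, end)` pair could then be passed to a worker such as `compute_batch` in place of the manually computed indices.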

c/lib.c

Lines changed: 5 additions & 1 deletion
@@ -255,7 +255,11 @@ SIMSIMD_DYNAMIC simsimd_capability_t simsimd_capabilities(void) {
     // with dummy inputs:
     simsimd_distance_t dummy_results_buffer[2];
     simsimd_distance_t *dummy_results = &dummy_results_buffer[0];
-    void *x = 0;
+
+    // Passing `NULL` as `x` will trigger all kinds of `nonnull` warnings on GCC.
+    typedef double largest_scalar_t;
+    largest_scalar_t dummy_input[1];
+    void *x = &dummy_input[0];

     // Dense:
     simsimd_dot_i8((simsimd_i8_t *)x, (simsimd_i8_t *)x, 0, dummy_results);

include/simsimd/simsimd.h

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@
  * @brief SIMD-accelerated Similarity Measures and Distance Functions.
  * @author Ash Vardanian
  * @date March 14, 2023
- * @copyright Copyright (c) 2023
  *
  * References:
  * x86 intrinsics: https://www.intel.com/content/www/us/en/docs/intrinsics-guide

javascript/lib.c

Lines changed: 0 additions & 1 deletion
@@ -4,7 +4,6 @@
  * @author Ash Vardanian
  * @date October 18, 2023
  *
- * @copyright Copyright (c) 2023
  * @see NodeJS docs: https://nodejs.org/api/n-api.html
  */
