
Commit 6601e9a

Reduce default VRAM used by sub-quad attention
This does not seem to cause a significant change in performance, but should significantly lower the amount of RAM used in many cases.
Parent: ae6299b


modules/sd_hijack_optimizations.py

Lines changed: 1 addition & 1 deletion
@@ -279,7 +279,7 @@ def sub_quad_attention(q, k, v, q_chunk_size=1024, kv_chunk_size=None, kv_chunk_
     qk_matmul_size_bytes = batch_x_heads * bytes_per_token * q_tokens * k_tokens
 
     if chunk_threshold is None:
-        chunk_threshold_bytes = max(int(get_available_vram() * 0.9), 1073741824) if q.device.type == 'mps' else int(get_available_vram() * 0.7)
+        chunk_threshold_bytes = 536870912
     elif chunk_threshold == 0:
         chunk_threshold_bytes = None
     else:
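
For reference, the new default of 536870912 bytes is 512 MiB, replacing a budget that scaled with the VRAM reported by get_available_vram() (90% with a 1 GiB floor on MPS, 70% elsewhere), so the peak size of the attention score block no longer grows with GPU size. Below is a minimal sketch of the general technique this threshold serves, assuming a budget-driven key/value chunk size and a streaming softmax; the function name, tensor shapes, and chunk-size arithmetic are illustrative and are not the code in sd_hijack_optimizations.py.

import torch

def chunked_attention(q, k, v, chunk_threshold_bytes=536870912):
    """Attention whose peak score-block memory is capped by a byte budget.

    q, k, v: tensors of shape (batch * heads, tokens, channels).
    """
    batch_x_heads, q_tokens, channels = q.shape
    _, k_tokens, _ = k.shape

    # Memory the score block costs per key token; dividing the budget by this
    # gives the largest key/value chunk that stays under chunk_threshold_bytes.
    bytes_per_key_token = q.element_size() * batch_x_heads * q_tokens
    kv_chunk_size = max(1, min(k_tokens, chunk_threshold_bytes // bytes_per_key_token))

    scale = channels ** -0.5
    acc = torch.zeros_like(q)                             # running weighted-value sum
    weight_sum = q.new_zeros(batch_x_heads, q_tokens, 1)  # running softmax denominator
    running_max = q.new_full((batch_x_heads, q_tokens, 1), float("-inf"))

    for start in range(0, k_tokens, kv_chunk_size):
        k_chunk = k[:, start:start + kv_chunk_size]
        v_chunk = v[:, start:start + kv_chunk_size]
        scores = torch.bmm(q, k_chunk.transpose(1, 2)) * scale

        # Streaming softmax: rescale the previous accumulators to the new
        # maximum so the result matches an unchunked softmax over all keys.
        new_max = torch.maximum(running_max, scores.amax(dim=-1, keepdim=True))
        correction = torch.exp(running_max - new_max)
        exp_scores = torch.exp(scores - new_max)

        acc = acc * correction + torch.bmm(exp_scores, v_chunk)
        weight_sum = weight_sum * correction + exp_scores.sum(dim=-1, keepdim=True)
        running_max = new_max

    return acc / weight_sum

# Example: with float32 tensors of this size the 512 MiB default fits the whole
# sequence in one chunk, while a smaller budget forces several smaller chunks.
q = k = v = torch.randn(8, 1024, 64)
out = chunked_attention(q, k, v, chunk_threshold_bytes=2 ** 20)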
