Commit db0b975

misc: fix kv-layout doc references (#1009)
1 parent 73bf334 commit db0b975

6 files changed, +16 -16 lines changed

flashinfer/cascade.py

+2 -2
@@ -568,7 +568,7 @@ class BatchDecodeWithSharedPrefixPagedKVCacheWrapper:
 of requests. The shared-prefix KV-Cache was stored in a standalone tensors, and the
 unique KV-Cache of each request was stored in a paged KV-Cache data structure.
 
-Check :ref:`our tutorial<page-layout>` for page table layout.
+Check :ref:`our tutorial<kv-layout>` for page table layout.
 
 Warning
 -------
@@ -807,7 +807,7 @@ class BatchPrefillWithSharedPrefixPagedKVCacheWrapper:
 r"""Wrapper class for prefill/append attention with shared-prefix paged kv-cache for
 batch of requests.
 
-Check :ref:`our tutorial<page-layout>` for paged kv-cache layout.
+Check :ref:`our tutorial<kv-layout>` for paged kv-cache layout.
 
 Warning
 -------

flashinfer/decode.py

+2 -2
@@ -509,7 +509,7 @@ class BatchDecodeWithPagedKVCacheWrapper:
 r"""Wrapper class for decode attention with paged kv-cache (first proposed in
 `vLLM <https://arxiv.org/abs/2309.06180>`_) for batch of requests.
 
-Check :ref:`our tutorial<page-layout>` for page table layout.
+Check :ref:`our tutorial<kv-layout>` for page table layout.
 
 Examples
 --------
@@ -1187,7 +1187,7 @@ class CUDAGraphBatchDecodeWithPagedKVCacheWrapper(BatchDecodeWithPagedKVCacheWra
 because we won't dispatch to different kernels for different batch sizes/sequence lengths/etc
 to accommodate the CUDAGraph requirement.
 
-Check :ref:`our tutorial<page-layout>` for page table layout.
+Check :ref:`our tutorial<kv-layout>` for page table layout.
 
 Note
 ----

flashinfer/gemm.py

+1 -1
@@ -449,7 +449,7 @@ def run(
 y[i] = x[i] \times W[\text{weight_indices}[i]]
 
 We use Ragged Tensor to represent the input tensor :attr:`x` and the output tensor :attr:`y`, and each x[i]
-is a segment of the concatenated tensor. Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details.
+is a segment of the concatenated tensor. Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details.
 We use a ``seg_len`` or ``seg_indptr`` tensor (either would work) to indicate the start and end of each segment,
 where the ``seg_indptr`` is the cumulative sum of the ``seg_lens`` tensor (with an additional 0 at the beginning):
 
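Reviewer note on the docstring in this hunk: the relationship between segment lengths and ``seg_indptr`` can be sketched in plain Python. This is only an illustration under stated assumptions; it uses Python lists instead of tensors, and the helper name `build_seg_indptr` is hypothetical and not part of FlashInfer.

```python
from itertools import accumulate


def build_seg_indptr(seg_lens):
    """Return the cumulative sum of seg_lens with an extra 0 in front.

    Hypothetical helper for illustration; not a FlashInfer API.
    """
    return [0, *accumulate(seg_lens)]


seg_lens = [2, 3, 1]  # three segments of 2, 3 and 1 rows
seg_indptr = build_seg_indptr(seg_lens)

# Segment i occupies rows seg_indptr[i]:seg_indptr[i + 1] of the
# concatenated tensor, so consecutive differences recover seg_lens.
recovered = [b - a for a, b in zip(seg_indptr, seg_indptr[1:])]
```

Either representation carries the same information, which is consistent with the docstring's remark that either would work.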

flashinfer/pod.py

+1 -1
@@ -78,7 +78,7 @@ class PODWithPagedKVCacheWrapper:
 r"""Wrapper class for POD-Attention with paged kv-cache (first proposed in
 `<https://arxiv.org/abs/2410.18038>`_) for batch of requests.
 
-Check :ref:`our tutorial<page-layout>` for page table layout.
+Check :ref:`our tutorial<kv-layout>` for page table layout.
 
 Examples
 --------

flashinfer/prefill.py

+2 -2
@@ -887,7 +887,7 @@ class BatchPrefillWithPagedKVCacheWrapper:
 r"""Wrapper class for prefill/append attention with paged kv-cache for batch of
 requests.
 
-Check :ref:`our tutorial <page-layout>` for page table layout.
+Check :ref:`our tutorial <kv-layout>` for page table layout.
 
 Example
 -------
@@ -1722,7 +1722,7 @@ class BatchPrefillWithRaggedKVCacheWrapper:
 r"""Wrapper class for prefill/append attention with ragged (tensor) kv-cache for
 batch of requests.
 
-Check :ref:`our tutorial <ragged-layout>` for ragged kv-cache layout.
+Check :ref:`our tutorial <kv-layout>` for ragged kv-cache layout.
 
 Example
 -------

flashinfer/rope.py

+8 -8
@@ -298,7 +298,7 @@ def apply_rope_inplace(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -384,7 +384,7 @@ def apply_rope_pos_ids_inplace(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -446,7 +446,7 @@ def apply_llama31_rope_inplace(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -553,7 +553,7 @@ def apply_llama31_rope_pos_ids_inplace(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -629,7 +629,7 @@ def apply_rope(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -738,7 +738,7 @@ def apply_rope_pos_ids(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -810,7 +810,7 @@ def apply_llama31_rope(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
@@ -931,7 +931,7 @@ def apply_llama31_rope_pos_ids(
 segment the query of the i-th segment is ``q[indptr[i]:indptr[i+1]]`` and the key of the
 i-th segment is ``k[indptr[i]:indptr[i+1]]``, the first element of :attr:`indptr` is always
 0 and the last element of :attr:`indptr` is the total number of queries/keys in the batch.
-Please see :ref:`Ragged Tensor tutorial <ragged-layout>` for more details about the
+Please see :ref:`Ragged Tensor tutorial <kv-layout>` for more details about the
 ragged tensor.
 
 Parameters
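
Reviewer note on the ``indptr`` convention repeated in the rope.py docstrings: a minimal sketch of how the offsets slice a concatenated batch into per-request segments. It assumes plain Python lists in place of the real query/key tensors, and `split_ragged` is a hypothetical name introduced only for this illustration.

```python
def split_ragged(rows, indptr):
    """Split concatenated rows into per-segment lists using indptr offsets.

    Hypothetical helper for illustration; not a FlashInfer API.
    """
    # The docstrings state these two invariants: indptr starts at 0 and
    # ends at the total number of rows in the batch.
    assert indptr[0] == 0 and indptr[-1] == len(rows)
    return [rows[indptr[i]:indptr[i + 1]] for i in range(len(indptr) - 1)]


q = ["q0", "q1", "q2", "q3", "q4"]  # five query rows across the whole batch
indptr = [0, 2, 5]                  # segment 0 is rows 0..1, segment 1 is rows 2..4
segments = split_ragged(q, indptr)
```

The same ``indptr`` is applied to both ``q`` and ``k`` in the docstrings, so a batch with N segments always has N + 1 offsets.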
