
Commit 62576a9

Update READMEs badges and links (#1)
* init * Update README.md * Update README.md * Update README.md
1 parent 0df5b6b commit 62576a9

File tree: 4 files changed, +12 −12 lines

README.md

Lines changed: 1 addition & 1 deletion

@@ -80,7 +80,7 @@ PyPi Package (Linx) | [![](https://github.com/google-ai-edge/ai-edge-torch/ac
 * Python versions: 3.9, 3.10, 3.11
 * Operating system: Linux
 * PyTorch: ![torch](https://img.shields.io/badge/torch-2.4.0.dev20240429-blue)
-* TensorFlow: [![tf-nightly](https://img.shields.io/badge/tf--nightly-2.17.0.dev20240430-blue)](https://pypi.org/project/tf-nightly/)
+* TensorFlow: [![tf-nightly](https://img.shields.io/badge/tf--nightly-2.17.0.dev20240509-blue)](https://pypi.org/project/tf-nightly/)

 <!-- requirement badges are updated by ci/update_nightly_versions.py -->

ai_edge_torch/generative/README.md

Lines changed: 5 additions & 5 deletions

@@ -57,13 +57,13 @@ Once you re-author the model and validate its numerical accuracy, you can conver
 For example, in `generative/examples/test_models/toy_model_with_kv_cache.py`, you can define inputs for both signatures:

 Sample inputs for the `prefill` signature:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/1791dec62f1d3f60e7fe52138640d380f58b072d/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L105-L108
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L105-L108

 Sample inputs for the `decode` signature:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/1791dec62f1d3f60e7fe52138640d380f58b072d/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L111-L114
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L111-L114

 Then export the model to TFLite with:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/1791dec62f1d3f60e7fe52138640d380f58b072d/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L133-L139
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/test_models/toy_model_with_kv_cache.py#L133-L139

 Please note that using the `prefill` and `decode` method conventions is required for easy integration into the MediaPipe LLM Inference API.
 <br/>
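To make the two-signature convention concrete, here is a framework-free sketch of how a runtime might drive `prefill` and `decode` against a KV cache. The `ToyKVModel` class, its cache layout, and the `generate` helper are all invented for illustration; they are not the API of `toy_model_with_kv_cache.py`, whose real inputs are linked above.

```python
# Hypothetical sketch of the prefill/decode calling convention. The "model"
# just echoes token arithmetic; a real model would run transformer layers.

class ToyKVModel:
    def __init__(self, max_seq_len=8):
        # One cache slot per position, standing in for per-layer K/V tensors.
        self.kv_cache = [None] * max_seq_len

    def prefill(self, tokens, input_pos):
        # Process the whole prompt in one call, filling the cache.
        for tok, pos in zip(tokens, input_pos):
            self.kv_cache[pos] = tok
        # Stand-in for "logits at the last prompt position".
        return tokens[-1] + 1

    def decode(self, token, input_pos):
        # Process a single new token, reusing everything cached so far.
        self.kv_cache[input_pos] = token
        return token + 1


def generate(model, prompt, num_steps):
    # Prefill once over the prompt, then decode one token at a time.
    next_tok = model.prefill(prompt, list(range(len(prompt))))
    out = [next_tok]
    pos = len(prompt)
    for _ in range(num_steps - 1):
        next_tok = model.decode(next_tok, pos)
        out.append(next_tok)
        pos += 1
    return out
```

The point of the convention is visible in `generate`: the prompt is consumed in one wide `prefill` call, while each generated token is a narrow `decode` call at a single position.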
@@ -78,7 +78,7 @@ The user needs to implement the entire LLM Pipeline themselves, and call TFLite

 This approach gives users the most control. For example, they can implement streaming, gain finer control over system memory, or add advanced features such as constrained grammar decoding, speculative decoding, etc.

-A very simple text generation pipeline based on a decoder-only transformer is provided [here](https://github.com/google-ai-edge/ai-edge-torch-archive/blob/main/ai_edge_torch/generative/examples/c%2B%2B/text_generator_main.cc) for reference. Note that this example serves as a starting point, and users are expected to implement their own pipelines based on their model's specific requirements.
+A very simple text generation pipeline based on a decoder-only transformer is provided [here](https://github.com/google-ai-edge/ai-edge-torch/blob/main/ai_edge_torch/generative/examples/c%2B%2B/text_generator_main.cc) for reference. Note that this example serves as a starting point, and users are expected to implement their own pipelines based on their model's specific requirements.

 #### Use MediaPipe LLM Inference API
@@ -105,7 +105,7 @@ model-explorer 'gemma-2b.tflite'

 <img width="890" alt="Gemma-2b visualization demo" src="screenshots/gemma-tflite.png">

-For an end-to-end example showing how to author, convert, quantize, and execute, please refer to the steps [here](https://github.com/google-ai-edge/ai-edge-torch-archive/blob/main/ai_edge_torch/generative/examples/README.md).
+For an end-to-end example showing how to author, convert, quantize, and execute, please refer to the steps [here](https://github.com/google-ai-edge/ai-edge-torch/blob/main/ai_edge_torch/generative/examples/README.md).
 <br/>

 ## What to expect

ai_edge_torch/generative/examples/README.md

Lines changed: 5 additions & 5 deletions

@@ -22,10 +22,10 @@ For each of the example models, we have a model definition file (e.g. tiny_llama
 Here we use `TinyLlama` as an example to walk you through the authoring steps.

 #### Define model's structure
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/e54638dd4a91ec09115f9ded1bd5540f3f1a4e68/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L43-L74
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L46-L77

 #### Define model's forward function
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/e54638dd4a91ec09115f9ded1bd5540f3f1a4e68/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L79-L101
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L79-L104

 Now you will have an `nn.Module` named `TinyLlama`; the next step is to restore the weights from the original checkpoint into the new model.
@@ -37,12 +37,12 @@ place to simplify the `state_dict` mapping process (`utilities/loader.py`).
 The user needs to provide a layer-name template (`TensorNames`) for the source
 model. This template is then used to create an updated `state_dict` that works
 with the mapped model. The tensor map includes the following fields:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/3b753d80fdf00872baac523dc727b87b3dc271e7/ai_edge_torch/generative/utilities/loader.py#L120-L134
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/utilities/loader.py#L94-L109

 The fields that have a default value of `None` are optional and should only be
 populated if they are relevant to the model architecture. For `TinyLlama`, we
 will define the following `TENSOR_NAMES`:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/e54638dd4a91ec09115f9ded1bd5540f3f1a4e68/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L27-L40
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py#L30-L43

 With the `TensorNames` defined, a user can simply use the loading utils to load
 an instance of the mapped model. For instance:
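The idea behind the template-based mapping can be sketched without PyTorch at all: format a per-layer name template with each block index, look the result up in the source checkpoint, and emit the re-authored model's key. The template strings and the `remap_state_dict` helper below are invented for illustration; the real implementation lives in `ai_edge_torch/generative/utilities/loader.py`.

```python
# Hypothetical sketch of template-based state_dict remapping.
# "{}" in a template stands for the transformer block index.
TENSOR_TEMPLATES = {
    "model.layers.{}.self_attn.q_proj.weight": "blocks.{}.attn.q.weight",
    "model.layers.{}.mlp.up_proj.weight": "blocks.{}.ff.up.weight",
}

def remap_state_dict(src_state_dict, templates, num_blocks):
    """Build a state_dict keyed by the mapped model's names.

    Missing source keys are skipped, mirroring how optional fields in the
    tensor map are only populated when relevant to the architecture.
    """
    new_state = {}
    for idx in range(num_blocks):
        for src_tmpl, dst_tmpl in templates.items():
            src_key = src_tmpl.format(idx)
            if src_key in src_state_dict:
                new_state[dst_tmpl.format(idx)] = src_state_dict[src_key]
    return new_state
```

In the real loader the values would be weight tensors and the templates would cover attention, feed-forward, embedding, and normalization layers; the mechanism is the same string-formatting lookup shown here.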
@@ -59,7 +59,7 @@ using a few input samples before proceeding to the conversion step.

 ### Model conversion
 In this step, we use ai_edge_torch's standard multi-signature conversion API to convert the PyTorch `nn.Module` into a single TFLite flatbuffer for on-device execution. For example, in `tiny_llama/convert_to_tflite.py`, we use this Python code to convert the `TinyLlama` model to a multi-signature TFLite model:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/3b753d80fdf00872baac523dc727b87b3dc271e7/ai_edge_torch/generative/examples/tiny_llama/convert_to_tflite.py#L22-L53
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/examples/tiny_llama/convert_to_tflite.py#L26-L61

 Once converted, you will get a `.tflite` model ready for on-device execution. Note that the generated `.tflite` model uses static shapes. Inside it, two signatures are defined (two entrypoints to the model):
 1) `prefill`: takes two tensor inputs, `prefill_tokens` and `prefill_input_pos`, with shapes `(BATCH_SIZE, PREFILL_SEQ_LEN)` and `(PREFILL_SEQ_LEN)`.
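Because the signatures use static shapes, a caller has to fit every prompt into the fixed prefill length before invoking the model. The sketch below shows one plausible way to do that; `PREFILL_SEQ_LEN`, `PAD_ID`, and `make_prefill_inputs` are illustrative values and names, not constants or helpers from this repository.

```python
# Hypothetical sketch: preparing inputs for a static-shape `prefill` signature.
PREFILL_SEQ_LEN = 8  # illustrative; the real value is fixed at conversion time
PAD_ID = 0           # illustrative padding token id

def make_prefill_inputs(prompt_tokens):
    """Pad (or truncate) a prompt to the fixed prefill length and build the
    matching position ids, as a static-shape signature requires."""
    tokens = prompt_tokens[:PREFILL_SEQ_LEN]
    tokens = tokens + [PAD_ID] * (PREFILL_SEQ_LEN - len(tokens))
    input_pos = list(range(PREFILL_SEQ_LEN))
    return tokens, input_pos
```

A real pipeline would wrap these lists into tensors of shape `(BATCH_SIZE, PREFILL_SEQ_LEN)` and `(PREFILL_SEQ_LEN)` before calling the `prefill` signature.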

ai_edge_torch/generative/layers/README.md

Lines changed: 1 addition & 1 deletion

@@ -43,4 +43,4 @@ Currently, the library provides the following configuration class for you to cus

 ## High-Level function boundary for performance
 We introduce the High-Level Function Boundary (HLFB) as a way of annotating performance-critical pieces of the model (e.g. `scaled_dot_product_attention` or `KVCache`). HLFB allows the converter to lower the annotated blocks to performant TFLite custom ops. The following is an example of applying HLFB to `SDPA`:
-https://github.com/google-ai-edge/ai-edge-torch-archive/blob/3b753d80fdf00872baac523dc727b87b3dc271e7/ai_edge_torch/generative/layers/attention.py#L74-L122
+https://github.com/google-ai-edge/ai-edge-torch/blob/853301630f2b2455bd2e2f73d8a47e1a1534c91c/ai_edge_torch/generative/layers/attention.py#L74-L122
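The mechanics of HLFB are converter-internal, but the core idea, tagging a region of the model with a name so the whole region can later be lowered as one unit, can be sketched with plain Python. The decorator and boundary name below are invented for illustration and are not the ai_edge_torch API; see `attention.py` at the link above for the real annotation.

```python
# Hypothetical sketch of the *idea* behind HLFB: tag a performance-critical
# function with a name so a downstream compiler could treat the whole region
# as a single op. This decorator is invented; it is not the ai_edge_torch API.
RECORDED_BOUNDARIES = []

def high_level_boundary(name):
    def wrap(fn):
        def inner(*args, **kwargs):
            RECORDED_BOUNDARIES.append(name)  # a converter would see one op here
            return fn(*args, **kwargs)
        return inner
    return wrap

@high_level_boundary("odml.scaled_dot_product_attention")
def sdpa(q, k, v):
    # Stand-in for the real attention math.
    return [qi + ki + vi for qi, ki, vi in zip(q, k, v)]
```

In the real flow, the boundary is recorded during export rather than at call time, and everything between the marked inputs and outputs is lowered to a performant TFLite custom op.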
