Commit 9eabe76: Update README.md (1 parent 5fc3ff9)

1 file changed: README.md (+6 additions, 0 deletions)
@@ -13,6 +13,12 @@ cd RWKV-v5/
 ./demo-training-run.sh
 (you may want to log in to wandb first)
 ```
+Your loss curve should look almost exactly the same as this, with the same ups and downs (if you use the same bsz & config):
+
+![RWKV-v5-minipile](RWKV-v5-minipile.png)
+
+You can run your model using https://pypi.org/project/rwkv/ (use "rwkv_vocab_v20230424" instead of "20B_tokenizer.json")
+
 ## RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced as "RwaKuv", from 4 major params: R W K V)
 
 RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free. You only need the hidden state at position t to compute the state at position t+1. You can use the "GPT" mode to quickly compute the hidden state for the "RNN" mode.
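The RNN/GPT-mode equivalence can be sketched with a toy linear recurrence. This is not the real RWKV formulation (no learned R/W/K/V projections, token shift, or per-channel decays); `decay`, `k`, and `v` here are made-up stand-ins chosen only to show that a sequential state update and a parallel weighted sum over the past produce identical states:

```python
import numpy as np

# Toy attention-free recurrence (illustrative only, not RWKV's actual math).
T, D = 8, 4
rng = np.random.default_rng(0)
k = rng.standard_normal((T, D))   # stand-in "key" channels
v = rng.standard_normal((T, D))   # stand-in "value" channels
decay = 0.9                       # stand-in time-decay factor

# "RNN" mode: the state at step t+1 needs only the state at step t.
state = np.zeros(D)
rnn_out = []
for t in range(T):
    state = decay * state + k[t] * v[t]
    rnn_out.append(state.copy())
rnn_out = np.array(rnn_out)

# "GPT" mode: compute every state in parallel as a decayed sum over the past:
# out[t] = sum_{s <= t} decay^(t-s) * k[s] * v[s]
W = np.tril(decay ** (np.arange(T)[:, None] - np.arange(T)[None, :]))
gpt_out = W @ (k * v)

# Both modes yield the same hidden states.
assert np.allclose(rnn_out, gpt_out)
```

The parallel form is what makes training efficient on GPUs, while the recurrent form gives constant-memory inference; the assertion confirms the two views compute the same states.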
