Commit 0adb5b8

Fix ffn link (#892)
* Fix ffn link
* Apply suggestion from @rasbt
* Apply suggestion from @rasbt
1 parent 7ca7c47 · commit 0adb5b8

File tree

2 files changed, +2 -2 lines changed


ch04/07_moe/README.md

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ uv run plot_memory_estimates_moe.py \
 
 ## MoE Code Examples
 
-The [gpt_with_kv_moe.py](gpt_with_kv_moe.py) and [gpt_with_kv_moe.py](gpt_with_kv_moe.py) scripts in this folder provide hands-on examples for comparing the regular FFN and MoE memory usage in the context of a GPT model implementation. Note that both scripts use [SwiGLU](https://arxiv.org/abs/2002.05202) feed-forward modules as shown in the first figure of this page (GPT-2 traditionally uses GELU).
+The [gpt_with_kv_ffn.py](gpt_with_kv_ffn.py) and [gpt_with_kv_moe.py](gpt_with_kv_moe.py) scripts in this folder provide hands-on examples for comparing the regular FFN and MoE memory usage in the context of a GPT model implementation. Note that both scripts use [SwiGLU](https://arxiv.org/abs/2002.05202) feed-forward modules as shown in the first figure of this page (GPT-2 traditionally uses GELU).
 
 **Note: The model is not trained and thus generates nonsensical text. You can find a trained MoE in the bonus materials at [../../ch05/11_qwen3/standalone-qwen3-moe-plus-kvcache.ipynb](../../ch05/11_qwen3/standalone-qwen3-moe-plus-kvcache.ipynb).**

reasoning-from-scratch
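
For context on the fix above: the corrected paragraph points to two scripts that compare the memory usage of a regular (dense) FFN against a mixture-of-experts (MoE) variant, both built from SwiGLU feed-forward modules. Below is a minimal, self-contained sketch of these two building blocks, not the code from the scripts themselves; the class names, dimensions, and simple top-1 routing are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUFFN(nn.Module):
    # SwiGLU feed-forward block (https://arxiv.org/abs/2002.05202):
    # silu(fc1(x)) * fc2(x), projected back to emb_dim by fc3.
    def __init__(self, emb_dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(emb_dim, hidden_dim, bias=False)
        self.fc2 = nn.Linear(emb_dim, hidden_dim, bias=False)
        self.fc3 = nn.Linear(hidden_dim, emb_dim, bias=False)

    def forward(self, x):
        return self.fc3(F.silu(self.fc1(x)) * self.fc2(x))


class MoEFFN(nn.Module):
    # Toy mixture-of-experts layer with top-1 routing (illustrative; real
    # MoE layers typically use softmax-weighted top-k routing): a linear
    # router sends each token to exactly one SwiGLU expert. All experts
    # stay in memory, but only one expert's weights are used per token.
    def __init__(self, emb_dim, hidden_dim, num_experts):
        super().__init__()
        self.router = nn.Linear(emb_dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            [SwiGLUFFN(emb_dim, hidden_dim) for _ in range(num_experts)]
        )

    def forward(self, x):
        top_expert = self.router(x).argmax(dim=-1)   # (batch, seq_len)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_expert == i                   # tokens routed to expert i
            if mask.any():
                out[mask] = expert(x[mask])
        return out


def num_params(module):
    return sum(p.numel() for p in module.parameters())


x = torch.randn(2, 8, 64)                            # (batch, seq_len, emb_dim)
dense, moe = SwiGLUFFN(64, 256), MoEFFN(64, 256, 8)
print(dense(x).shape, moe(x).shape)                  # both: torch.Size([2, 8, 64])
print(num_params(dense), num_params(moe))            # MoE holds ~8x the FFN weights
```

The parameter counts illustrate the trade-off the scripts measure: the MoE layer's total (memory-resident) parameter count grows with the number of experts, while the per-token compute stays close to that of a single dense FFN.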
