Commit 4d74ac0
fix: add warning for TinyStories n_ctx mismatch
TinyStories models were trained with sequence length 512, but the
HuggingFace config reports n_ctx=2048. Output quality degrades on
sequences longer than 512 tokens. Added a warning to alert users to
this limitation.
Note: We cannot change n_ctx in the config because the pretrained weights
have positional embeddings for 2048 positions. Changing n_ctx would break
weight loading.
Fixes #4921
parent 7df72ff, commit 4d74ac0
1 file changed
Lines changed: 9 additions & 0 deletions
@@ -1970,6 +1970,15 @@
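A minimal sketch of the kind of check the commit message describes, not the code from this commit. The helper name `warn_if_tinystories_ctx_mismatch`, the constant `TINYSTORIES_TRAINED_CTX`, and the model-name substring test are all hypothetical. Only the values 512 and 2048 come from the commit message.

```python
import warnings

# Assumption: taken from the commit message, which states that TinyStories
# models were trained with sequence length 512.
TINYSTORIES_TRAINED_CTX = 512


def warn_if_tinystories_ctx_mismatch(model_name: str, n_ctx: int) -> bool:
    """Warn when a TinyStories config advertises more context than was trained.

    Returns True if a warning was issued. n_ctx is deliberately left unchanged,
    because the pretrained positional embeddings are sized for the configured
    value and shrinking it would break weight loading.
    """
    # Hypothetical detection rule: match on the model name.
    is_tinystories = "tinystories" in model_name.lower()
    if is_tinystories and n_ctx > TINYSTORIES_TRAINED_CTX:
        warnings.warn(
            f"{model_name} reports n_ctx={n_ctx} but was trained with sequence "
            f"length {TINYSTORIES_TRAINED_CTX}; expect degraded output beyond "
            f"{TINYSTORIES_TRAINED_CTX} tokens.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Called once at load time with the configured `n_ctx`, a check like this leaves the config and weights untouched and only tells the user where the trained range ends.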