yarn

YaRN: Efficient Context Window Extension of Large Language Models

jquesnelle
1472 stars
121 forks
Python

MCP Relevance Analysis

Relevance Score: 40/100 - Related

Summary

yarn is a project of related relevance to the Model Context Protocol. It has 1472 stars and 121 forks on GitHub.

Key Features

  • MCP integration capabilities
  • AI context management
  • Language model communication
  • Structured data processing

Use Cases

  • Enhancing LLM context handling
  • Improving model response quality
  • Building more effective AI applications

README

YaRN

This repo contains the code and data for the YaRN context window extension method.
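
For orientation, the method combines an "NTK-by-parts" rescaling of the RoPE frequencies with an attention temperature term. The sketch below follows the formulas in the paper; the function name and defaults (alpha=1, beta=32, as reported for the Llama family) are ours, not part of this repo's API.

python
# Minimal sketch of YaRN-style RoPE rescaling, assuming a head dimension `dim`,
# an original context window `original_max_len`, and an extension factor `scale`.
import math
import torch

def yarn_rope_frequencies(dim, base=10000.0, original_max_len=4096,
                          scale=16.0, alpha=1.0, beta=32.0):
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # How many full rotations each channel completes over the original context
    rotations = original_max_len / (2 * math.pi / inv_freq)
    # Ramp: 0 -> interpolate by 1/scale (low-frequency dims), 1 -> keep as-is (high-frequency dims)
    gamma = torch.clamp((rotations - alpha) / (beta - alpha), 0.0, 1.0)
    yarn_inv_freq = (1 - gamma) * inv_freq / scale + gamma * inv_freq
    # Attention temperature: queries/keys are scaled by sqrt(1/t) = 0.1 * ln(scale) + 1
    mscale = 0.1 * math.log(scale) + 1.0
    return yarn_inv_freq, mscale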

Paper

Paper (ICLR 2024): YaRN: Efficient Context Window Extension of Large Language Models
Old Preprint (arXiv)

Models

LLaMA

We publish variants of Llama 2 fine-tuned with YaRN at 32K, 64K, and 128K context window lengths. They are available under the Llama 2 license on 🤗 Hugging Face.

We also publish 8K context window versions of Llama 2 7B fine-tuned with NTK-aware interpolation and with YaRN (Table 1 in the conference paper).
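
For concreteness, loading one of these checkpoints with transformers looks roughly like the following; the repo id is an assumption (substitute the 32K/64K/128K variant you need), and older transformers releases required trust_remote_code=True for the custom YaRN rotary embedding.

python
# Hedged example: load a published YaRN Llama 2 checkpoint and generate from a long prompt.
# The repo id below is an assumption, not a value taken from this README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Llama-2-7b-64k"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # needed on older transformers for the custom YaRN RoPE code
)

prompt = "..."  # a long prompt, up to the extended context window
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))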

Mistral

With the release of v2 of our paper, we are also publishing 64K and 128K context window variants of Mistral 7B v0.1.

SOLAR

The SOLAR 10.7B v1.0 model uses depth up-scaling to add layers to Mistral 7B v0.1, which may improve long-context performance on a per-parameter basis. We publish 32K and 64K variants.

Reproduction

We strongly believe in open science, and thus publish all code and data to reproduce the results in our paper. To reproduce, clone the repository and perform a local installation.

sh
git clone https://github.com/jquesnelle/yarn
cd yarn
pip install -e .

Training

To train the models, run accelerate config and enable DeepSpeed acceleration. deepspeed/zero3.json was the configuration file used for training.

sh
./train.sh

The tokenized training data is available on 🤗 Hugging Face and was derived from the pg19 dataset. For the Mistral models, a mix of the pretrain and fine-tune splits of Long-Data-Collections was used, and the tokenized dataset is also available on 🤗 Hugging Face.
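
As an illustration of how such a corpus can be turned into fixed-length training sequences, the sketch below tokenizes and chunks a long-text dataset with the datasets library; the dataset and tokenizer ids and the 64K chunk length are assumptions, not this repo's exact recipe, and the linked tokenized datasets are the ones actually used.

python
# Illustrative only: tokenize a long-text corpus and chunk it into fixed-length sequences.
# Dataset/tokenizer ids and the 64K chunk length are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")  # assumed tokenizer id
dataset = load_dataset("deepmind/pg19", split="validation")              # assumed dataset id

chunk_len = 65536  # 64K tokens per training sequence

def tokenize_and_chunk(batch):
    docs = tokenizer(batch["text"])["input_ids"]
    flat = [tok for doc in docs for tok in doc]          # concatenate all documents
    chunks = [flat[i:i + chunk_len]                      # split into fixed-length chunks
              for i in range(0, len(flat) - chunk_len + 1, chunk_len)]
    return {"input_ids": chunks}

tokenized = dataset.map(tokenize_and_chunk, batched=True,
                        remove_columns=dataset.column_names)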

Evaluation

To reproduce the evaluations, install lm-evaluation-harness with pip install git+https://github.com/EleutherAI/lm-evaluation-harness and then run the two provided scripts.

sh
./eval.sh
./eval-harness.sh
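
As a rough sanity check outside the provided scripts, long-context perplexity can be estimated with transformers along the lines below; the model id, input file, and sequence length are placeholders, and a full-length forward pass requires substantial GPU memory.

python
# Rough long-context perplexity check; model id, input file, and sequence length are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Llama-2-7b-64k"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto", trust_remote_code=True
)

text = open("long_document.txt").read()  # any sufficiently long document
input_ids = tokenizer(text, return_tensors="pt").input_ids[:, :16384].to(model.device)

with torch.no_grad():
    loss = model(input_ids, labels=input_ids).loss  # mean next-token cross-entropy
print("perplexity:", torch.exp(loss).item())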

Citation

@inproceedings{
      peng2024yarn,
      title={Ya{RN}: Efficient Context Window Extension of Large Language Models},
      author={Bowen Peng and Jeffrey Quesnelle and Honglu Fan and Enrico Shippole},
      booktitle={The Twelfth International Conference on Learning Representations},
      year={2024},
      url={https://openreview.net/forum?id=wHBfxhZu1u}
}