K2H'log

QLoRA

Efficient Finetuning of Quantized LLMs

Mamba

Linear-Time Sequence Modeling with Selective State Spaces

Mistral

Mistral 7B

LLM.int8()

8-bit Matrix Multiplication for Transformers at Scale

LLaMA 2

Open Foundation and Fine-Tuned Chat Models