Exllama



Collection time: 2024-09-02




ExLlama is a memory-efficient rewrite of the Hugging Face transformers implementation of LLaMA for use with quantized weights. It targets fast inference on modern GPUs with low memory usage and supports a range of hardware configurations.
