Exllama



Collection time: 2024-09-02
Exllama is a memory-efficient rewrite of the Hugging Face transformers implementation of LLaMA, built for running models with quantized weights. It targets fast inference on modern GPUs while keeping memory usage low, and it supports a range of hardware configurations.
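To give a rough sense of why quantized weights reduce memory usage, the sketch below estimates the memory needed to hold a model's weights at 16 bits versus 4 bits per weight. It is plain arithmetic, not part of Exllama's API: the helper `weight_memory_gib` and the 7-billion-parameter count are assumptions chosen for illustration, and the estimate ignores quantization metadata (scales, zero points), activations, and the key/value cache.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    # bits -> bytes (divide by 8), bytes -> GiB (divide by 2**30)
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical 7-billion-parameter model, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # half-precision weights
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f" 4-bit weights: {int4_gib:.1f} GiB")
print(f"reduction factor: {fp16_gib / int4_gib:.1f}x")
```

Under these assumptions the weights shrink by a factor of four, which is what lets a model that would not fit in a single consumer GPU at 16 bits fit once quantized. Real savings are somewhat smaller because of the overheads the sketch leaves out.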
