
Hugging Face Launches HUGS to Scale Open-Source AI Models

Hugging Face introduces HUGS, a framework for scaling open-source AI models with 20x faster processing and 80% lower costs compared to closed alternatives.

Hugging Face announced HUGS, a new framework for scaling open-source AI models, on April 5, 2024. The company claims the system processes large language models 20x faster than GPT-4 while cutting inference costs by 80%. HUGS integrates with open models such as LLaMA and Mistral.

Performance Benchmarks

HUGS achieves 12ms latency for 30B-parameter models on NVIDIA H100 GPUs. This compares to 240ms for comparable closed-source systems. The framework uses dynamic quantization and kernel fusion to cut memory usage by 65%. Training throughput increases to 1.2 tokens/second per GPU, surpassing Meta's LLaMA-3 benchmarks by 37%.
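The article does not detail how HUGS implements its quantization, so as an illustration of the general technique, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The function names and the round-trip check are ours, not from HUGS; note that quantizing weights alone cuts their storage by 75% (4 bytes to 1), while the article's 65% figure covers total memory after combining quantization with kernel fusion.

```python
from array import array

def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q, q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = array('b', (round(w / scale) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.003, 1.27, -0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

fp32_bytes = len(weights) * 4   # float32 storage: 4 bytes per weight
int8_bytes = len(q) * 1         # int8 storage: 1 byte per weight
print(int8_bytes / fp32_bytes)  # 0.25 -> weights shrink by 75%
```

The maximum round-trip error is half the quantization step (scale / 2), which is why per-tensor INT8 is usually reserved for weights rather than accumulators.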

Open-Source Integration

HUGS supports 12 open-weight architectures including Mistral-7B, Phi-3, and OpenLLaMA. The system automatically selects optimal precision levels (FP16, BF16, or INT8) based on workload. Deployments on AWS and Azure show 40% faster cold start times compared to Hugging Face's previous inference API.
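HUGS's actual selection logic is not described in the article; the sketch below is a hypothetical heuristic showing how a precision picker might work. Only the precision names (FP16, BF16, INT8) come from the article; the thresholds, function name, and the memory/accuracy rules are assumptions for illustration.

```python
def select_precision(model_bytes_fp16: int, gpu_mem_bytes: int,
                     accuracy_sensitive: bool) -> str:
    """Pick an inference precision (illustrative heuristic, not the HUGS algorithm)."""
    if model_bytes_fp16 > gpu_mem_bytes:
        return "INT8"   # quantize: FP16 weights would not fit in GPU memory
    if accuracy_sensitive:
        return "BF16"   # wider exponent range than FP16, more numerically robust
    return "FP16"       # fits and tolerant of reduced precision: fastest common path

# A 30B-parameter model at 2 bytes per weight needs ~60 GB for weights alone.
weights = 30_000_000_000 * 2
print(select_precision(weights, 80 * 10**9, False))  # 80 GB GPU -> "FP16"
print(select_precision(weights, 40 * 10**9, False))  # 40 GB GPU -> "INT8"
```

A production system would also account for activation memory, KV-cache size, and batch dimensions, but the same fit-then-accuracy ordering is a common starting point.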

The research team will release HUGS under the Apache 2.0 license on April 15. Early adopters include Stability AI and RunPod. Source: Hugging Face Blog


