VentureBeat October 17, 2024
Shubham Sharma

Zyphra Technologies, the company working on a multimodal agent system combining advanced research in next-gen state-space model architectures, long-term memory, and reinforcement learning, just released Zyda-2, an open pretraining dataset comprising 5 trillion tokens.

While Zyda-2 is five times larger than its predecessor and covers a vast range of topics, what truly sets it apart is its unique composition. Unlike many open datasets available on Hugging Face, Zyda-2 has been distilled to retain the strengths of the top existing datasets while eliminating their weaknesses.

This gives organizations a way to train language models that achieve high accuracy on a given parameter budget, even when running on edge and consumer devices. The company trained its Zamba2 small language model using this dataset.
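
As a rough illustration of what dataset distillation of this kind involves, the sketch below merges several source corpora, keeps only documents that pass a simple quality score, and drops exact duplicates. This is a minimal sketch under assumed conventions: the function names, heuristic, and threshold are illustrative and are not Zyphra's actual pipeline.

```python
# Minimal sketch of dataset "distillation": merge source corpora, keep only
# documents that pass a quality threshold, and remove exact duplicates.
# All names, heuristics, and thresholds here are illustrative assumptions,
# not Zyphra's actual pipeline.
import hashlib


def quality_score(doc: str) -> float:
    """Toy quality heuristic: favor longer documents with a high share of
    alphabetic characters (a real pipeline would use a trained classifier)."""
    if not doc:
        return 0.0
    alpha_ratio = sum(c.isalpha() for c in doc) / len(doc)
    length_bonus = min(len(doc) / 2000, 1.0)
    return 0.5 * alpha_ratio + 0.5 * length_bonus


def distill(sources: dict[str, list[str]], threshold: float = 0.6) -> list[str]:
    """Combine source datasets, filter by quality, deduplicate by content hash."""
    seen: set[str] = set()
    kept: list[str] = []
    for _name, docs in sources.items():
        for doc in docs:
            digest = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
            if digest in seen:  # exact-duplicate removal
                continue
            if quality_score(doc) < threshold:  # quality filtering
                continue
            seen.add(digest)
            kept.append(doc)
    return kept


if __name__ == "__main__":
    sample = "A long, informative article about state-space models... " * 40
    corpora = {
        "dataset_a": [sample],
        "dataset_b": [sample, "spam!!! $$$"],  # duplicate plus a low-quality doc
    }
    print(len(distill(corpora)))  # -> 1: duplicate and low-quality docs are dropped
```

In a real pretraining setup the quality score would come from a learned classifier and deduplication would typically be fuzzy (for example, MinHash over shingles) rather than exact hashing, but the filter-then-deduplicate structure is the same.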
