VentureBeat July 26, 2024
Michael Nuñez

Salesforce AI Research quietly released MINT-1T this week, a mammoth open-source dataset containing one trillion text tokens and 3.4 billion images. This multimodal interleaved dataset, which combines text and images in a format mimicking real-world documents, is ten times the size of previous publicly available datasets.

The sheer scale of MINT-1T matters tremendously in the AI world, particularly for advancing multimodal learning, a frontier where machines aim to understand both text and images in tandem, much like humans do.

“Multimodal interleaved datasets featuring free-form interleaved sequences of images and text are crucial for training frontier large multimodal models,” the researchers explain in their paper published on arXiv. They add, “Despite the rapid progression of open-source LMMs [large multimodal...
