MIT Technology Review, December 18, 2024
Melissa Heikkilä, Stephanie Arnett

New findings show how the sources of data are concentrating power in the hands of the most powerful tech companies.

AI is all about data. Reams and reams of data are needed to train algorithms to do what we want, and what goes into the AI models determines what comes out. But here’s the problem: AI developers and researchers don’t really know much about the sources of the data they are using. AI’s data collection practices are immature compared with the sophistication of AI model development. Massive data sets often lack clear information about what is in them and where it came from.

The Data Provenance Initiative, a group of over 50 researchers from both academia and industry, wanted...

Topics: AI (Artificial Intelligence), Big Data, Technology
