Google Slapped With Lawsuit Over Data Used To Train Its AI

The seemingly sudden explosion of publicly available chatbots built on highly capable large language models (LLMs) has raised uncomfortable questions about the nature of copyright and how creators can be properly involved in (or, at least, compensated for) the AI-training process. At the heart of the matter are the datasets used to train various AI models, which can include everything from content scraped from random blogs to scientific journals, libraries of published books, social media platforms, and more. Some companies that hold vast quantities of human-generated content, such as Reddit and Twitter, have scrambled to ensure they’re paid for that data.

While big companies fight it out with lawsuits, many people indirectly swept up in the matter don’t have the resources to individually challenge tech giants, which is where class action lawsuits may come into play. It’s no surprise, then, that Google is facing a proposed class action suit that seeks, among other things, to have the company hit pause on providing commercial access to its AI models.

The legal action comes from Clarkson Law Firm. One of the attorneys on the case, Tim Giordano, explained the reasoning in a statement to CNN: “Google needs to understand that ‘publicly available’ has never meant free to use for any purpose. Our personal information and our data is our property, and it’s valuable, and nobody has the right to just take it and use it for any purpose.” Alphabet, Google, and DeepMind had not commented on the lawsuit at the time of writing.