How do we know what information fuels the generative chatbots that are revolutionizing our industry?
Well, in many cases, we simply don’t.
This week, we’ll look at some people who are trying to make the data that feeds our voracious robot overlords more transparent and accessible – even if they must go to court to do it.
Stanford University created a ranking to evaluate how transparent large language models (LLMs) are, the New York Times reported.
The general answer when it comes to the major players – OpenAI, Meta, Google – is “not very.”
As the Times’ Kevin Roose wrote:
These firms generally don’t release information about what data was used to train their models, or what hardware they use to run them. There are no user manuals for A.I. systems, and no list of everything these systems are capable of doing, or what kinds of safety testing have gone into them. And while some A.I. models have been made open-source — meaning their code is given away for free — the public still doesn’t know much about the process of creating them, or what happens after they’re released.
If you want to use the most transparent LLM, you’ll want Meta’s LLaMA 2, though it earned a score of just 54% — still a failing grade under basically any rubric.
One former professor, who founded a startup later sold to Apple, is now creating what he hopes will be a truly transparent generative AI tool, with its data set and code freely available to all. But some experts worry that this kind of openness can go too far, opening a digital Pandora’s box we can never close.
Courts continue to grapple with the thorny questions raised by the AI revolution.
Google moved to dismiss a lawsuit accusing the giant of mass data scraping that violated the rights of internet users and copyright holders.
The lawsuit, filed in San Francisco back in July, is a sweeping condemnation of Google’s AI training procedures, arguing that Google has only been able to build its AI models by way of “secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans.” AI is data-hungry, and as the lawsuit tells it, Google’s decision to feed its AI models using web-scraped data — the vast majority of it created by millions, if not billions, of everyday netizens — amounts to nothing short of theft.
For its part, Google sounds alarmed. In the search giant’s dismissal motion, filed this week, it denounces the accusations, arguing not only that it’s done absolutely nothing wrong, but that the lawsuit also undermines the generative AI field as a whole.
In other words: High stakes here. A court case to watch.
“Uptown Funk” is also at the center of a lawsuit over the use of AI – among hundreds of other popular songs. According to Reuters, AI company Anthropic stands accused of using the songs to train AI bot Claude, violating copyright laws in the process. The music field joins visual artists and authors in suing over the alleged improper use of copyrighted materials to train LLMs.
Soon, we’ll find out if uptown funk will, indeed, funk you up.
There is one constant question hanging over generative AI: Is this a bubble? Is this doomed to become a new dot-com bust or is it here to stay?
Axios reported that some of the grandiose statements made by investors make this sound like a big old balloon – one estimated that AI will perform “80% of 80% of all the jobs we know of today” in the next decade, which is a bold claim, to put it lightly. Another claimed that the global stock market will triple by the end of the decade, on the strength of AI.
AI is obviously a big deal; that’s why we’re talking about it so much. But let’s temper those expectations a bit.
Still, one thing is true: AI is moving into jobs that once were thought to be automation-proof. More reporting from Axios reveals that AI is being used heavily in chain restaurants, where it’s being called upon to “man” the drive-thru, fry tortilla chips and prep salads.
A long list of global governing bodies has released suggestions to consider when world governments draft AI guidelines. Now the World Health Organization has thrown in its two cents, with an emphasis on the field of healthcare.
WHO’s considerations include:
- Transparency and documentation
- Risk management practices
- External validation of data and clarity on intended use of AI
- Data quality
- Privacy and data protection
- Collaboration between patients, providers, governments, tech companies and more
Adobe is beefing up its AI capabilities in popular tools like Photoshop and Premiere, Yahoo News reported. These can do everything from helping remove the sky from images to removing “digital artifacts” for a smoother look to automatically creating video highlight reels.
This next one isn’t technically a communications tool, but it’s just so darn cool: AI was used to help decode an ancient scroll entombed in the volcanic mud that buried Herculaneum nearly 2,000 years ago, CNN reported. The scroll was too delicate to be unrolled, but X-rays and AI helped image and digitally flatten it enough to read its first word: porphyras, Greek for purple.
Amid all the fear and worry about AI, it’s heartening to see such an innovative use of technology to connect us to our ancient past.