Artificial Intelligence AI Data and Insight

Data weaponization: The war AI ignited

Creativ Strategies

Open Mic article

This content is produced by a publishing partner of Open Mic.

Open Mic is the self-publishing platform for the marketing industry, allowing members to publish news, opinion and insights on thedrum.com.

January 18, 2024 | 6 min read

AI’s rapid rise revealed the value of big data, says Wes Morton (CEO, Creativ Strategies)

A secret war is afoot, ignited by the promise of artificial intelligence (AI).

In the well-cushioned boardrooms of Silicon Valley, executives plot, litigate, and jockey to possess the fuel of the future. That fuel is not the AI or machine learning systems so fawned over in the media, but the oceans of data that feed them.

Media moguls, tech entrepreneurs, and financial titans have emerged to viciously guard their newly discovered data goldmines.

At stake is a multi-trillion-dollar market, driven by the promise of the insights, advantages, and riches that new forms of machine learning can distill from vast troves of proprietary data.

The battleground has been set in 2024. The winners will be the competitors that can combine the most valuable datasets with the cutting-edge artificial intelligence needed to mine them for information.

In the information economy, data has become the arsenal and AI the gun. Welcome to data weaponization.

Data weaponization

In a digital world, every action you take is tracked, logged, and stored. Over the last decade, digital tracking and collection have allowed companies to amass monstrous amounts of data.

The information includes your consumption and entertainment preferences, spending habits, driving history, credit history, medical history, search history, social media history, travel history, criminal record, voting record, financial records, employment record, and even your DNA.

Despite the mass tracking, the sheer volume of ‘big data’ proved unwieldy for corporations and governments to manage. The 2023 explosion of AI onto the collective consciousness of the business community changed that. AI generated a visceral epiphany in the boardrooms of global corporations about the value of their fallow data repositories. Once-static mountains of collected data suddenly gained immense value as a fuel for the digitized future.

The branch of AI known as ‘machine learning’ combines statistics, algorithms, feedback, and massive computing power to mine vast troves of data in order to distill trends, correlations, and patterns. At the same time, executives began to realize that the semi-magical systems known as AI are as useful as a newborn baby without ample datasets to train on.

OpenAI, Midjourney, Stable Diffusion, DreamUp, Bard, DALL-E, Anthropic, and other now billion-dollar brands built their models on both closed and open datasets that their systems mined and trained on. Unfortunately, much of that open data was being mined, unknowingly, in other companies’ backyards.

The collective awakening to the data gold, excavated and pillaged using machine learning, has ignited a mad rush of lawsuits, positioning, and strategic alliances as companies square off to protect their newly discovered fuel of the future: big data.

The war spills into the public sphere

The distinction at the heart of recent AI corporate strife lies in data ownership.

To build a force, you need both firepower and the means to dispense it. Data and AI act as arsenal and gun. You can’t fire a gun without bullets, and bullets are useless without a gun. One has no value without the other. The most valuable datasets produce the most firepower for the corresponding AI delivery mechanisms.

Some of those datasets are closed, owned, and only accessible to certain companies. The emails you write on Gmail, the reCAPTCHA images you identified, and the searches you input on Google, which together amount to vast volumes of consumer data, are the owned and operated property of Alphabet. You have consented, by accepting terms of service you didn’t read, to the use of your data to train its AI models. The service is ‘free’ because it extracts value from your human input, filters your inputs through machine learning models, and then sells you things based on the model outputs.

Meanwhile, open data has begun to fuel lawsuits and corporate positioning. Most notably, the New York Times sued OpenAI and its backer Microsoft for allegedly using millions of the publication’s articles to develop chatbots that now compete with it. The New York Times claims massive copyright infringement. Authors whose novels were scraped for training data have filed similar class action lawsuits.

On the generative art front, in which AI trains on billions of images to recombine them into originals, a federal judge delivered a mixed ruling against a group of artists who claimed their pieces were used to train the AIs of Midjourney and DeviantArt, but allowed an infringement claim to move forward against Stability AI, the maker of Stable Diffusion.

For publishers, an antitrust suit against Alphabet claims damages from Google’s use of search AI, which scrapes publisher content to display it in Google-owned products. Products such as the ‘Knowledge Panel’ and ‘Featured Snippets’ deliver quick answers to questions by surfacing content from third-party sites. Google makes ad revenue from search ads while the creators of the content receive no traffic to their sites, and thus no compensation.

Similar battles are unfolding across courts in America. According to John Kell at Fortune, “There are over 100 AI-related lawsuits working their way through the legal system”. In addition, state and federal governments have begun holding hearings on the new technology in anticipation of regulation. These actions threaten to stifle AI’s development as companies lock up valuable troves of data that models rely on and governments regulate the application of AI.

Ramifications for consumers

The tension between AI models and the publisher, medical, financial, and consumer data that provide their fuel will dramatically affect the technology brought to market.

The public war coming into focus poses fundamental questions that beg answering.

Do we feel comfortable using services that train their AI models on our behavior and inputs?

What constitutes ownership of creative work, when AI generates derivatives?

Should AI companies be forced to compensate writers, journalists, and artists, and if so, how?

What sensitive information are we comfortable sharing and making accessible to build better products and experiences?

How will new AI systems be audited to show how and why actions were taken towards individuals?

What limits should we place on AI systems?

The fundamental determination of AI’s future will crystallize in a stark binary. Will we aim our vast arsenals of data and AI firepower at solving society’s problems, or turn their sights on ourselves for self-exploitation?

As consumers, we have the power to vote with our dollars, as we have with products and services before. The mass collection of our personal data for AI’s commercial use has already begun. The train has left the station with us and our data on board. Where we collectively steer this runaway train is up to us all.

*Header image designed by Cole de Brito*
