Imagine a world where the articles you read, the financial reports that inform major investments, and even the data labeling that trains the very algorithms we rely on are not the products of human intellect but of artificial intelligence. This is not a glimpse into a distant future; it’s the reality of today’s digital ecosystem. What are the strategic implications of this ever-growing parallel world (and we are not talking about the metaverse here)?
A parallel world where machines interact with machines
Online news. Take BuzzFeed’s bold move to employ ChatGPT for content creation, which led to a surge in its stock value. Similarly, some private equity funds that own news websites seized the opportunity to automate content production with software. This is emblematic of a broader shift toward automated content production. Machines are now adept at crafting articles that not only engage readers but also satisfy the intricate requirements of search algorithms. The result is an almost fully automated cycle: a machine generates content, another algorithm ranks it, and the revenue from automated display ads or affiliate links flows in. This self-sustaining ecosystem captures real money, but what does it mean for the value of human-generated content?
In finance, we have long seen algorithms parse corporate communications to predict stock performance. According to research, nearly 80% of the readers of 8-K filings are machines. As businesses catch on, they’ve begun to craft their disclosures to appeal to these non-human readers, avoiding negative terms and emphasizing positive sentiment. With generative AI, companies can more easily produce content at scale to feed the algorithm. As with news, the reading algorithm is reverse-engineered in order to design a second algorithm that produces content tailored to it. The first algorithm then analyses that content and issues recommendations or even executes automated trades on the stock market.
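To make the mechanism concrete, here is a minimal sketch of how a word-list tone scorer might read a disclosure, and why rewording moves the score. The word lists, the function name `tone_score` and the two sample sentences are invented for illustration; real systems rely on much larger finance-specific dictionaries or on trained models.

```python
import re

# Hypothetical, deliberately tiny word lists (assumptions for this sketch).
NEGATIVE_TERMS = {"loss", "decline", "risk", "impairment", "litigation"}
POSITIVE_TERMS = {"growth", "improve", "strong", "gain", "opportunity"}


def tone_score(text: str) -> float:
    """Return (positive - negative) / matched terms, in [-1, 1]; 0.0 if nothing matches."""
    words = re.findall(r"[a-z]+", text.lower())
    positive = sum(word in POSITIVE_TERMS for word in words)
    negative = sum(word in NEGATIVE_TERMS for word in words)
    matched = positive + negative
    return 0.0 if matched == 0 else (positive - negative) / matched


# Two made-up sentences: one blunt, one written with the scorer in mind.
blunt = "Revenue decline and litigation risk led to a loss this quarter."
polished = "We see strong growth and an opportunity to improve margins."

print(tone_score(blunt), tone_score(polished))
```

A writer, human or machine, who knows which terms the scorer counts can shift the result without changing the underlying facts, which is exactly the incentive described above.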
In software development, some artificial intelligence algorithms require labelled data to be trained. This implies a massive amount of human effort to label data at scale (for example, labelling millions of pictures to train an image recognition algorithm). Recently, researchers from EPFL concluded that 33-46% of crowd workers assigned to label data used Large Language Models when completing the task. In this case, machine-produced content (the labelled data) is used to train another algorithm. Another recent study found that the use of model-generated content in training causes irreversible defects in the resulting models, where the tails of the original content distribution disappear.
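The disappearance of the tails can be illustrated with a toy simulation. This is not the method of the study mentioned above; it is a simplified sketch that assumes a "model" is nothing more than the word frequencies counted in a finite sample of the previous model's output. The token names, sample size and number of generations are arbitrary choices.

```python
import random
from collections import Counter


def next_generation(probs, sample_size, rng):
    """Sample from the current model, then refit it by counting frequencies."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    sample = rng.choices(tokens, weights=weights, k=sample_size)
    counts = Counter(sample)
    # A token that was never sampled gets no entry: it is gone for good.
    return {token: count / sample_size for token, count in counts.items()}


def simulate(generations=20, sample_size=200, seed=0):
    """Return the number of distinct tokens the model still knows at each generation."""
    rng = random.Random(seed)
    # Zipf-like starting distribution over 50 hypothetical tokens: few common, many rare.
    raw = {f"token_{i}": 1.0 / (i + 1) for i in range(50)}
    total = sum(raw.values())
    probs = {token: weight / total for token, weight in raw.items()}
    support_sizes = [len(probs)]
    for _ in range(generations):
        probs = next_generation(probs, sample_size, rng)
        support_sizes.append(len(probs))
    return support_sizes


sizes = simulate()
print(sizes)
```

Because each generation can only reproduce tokens that appeared in the previous generation's sample, the count of known tokens can never go back up: rare items that are missed once are lost permanently, which is the irreversibility the paragraph describes.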
Implications for Datanomics
The three previous cases are just examples of a much bigger phenomenon, and one that is accelerating. How many résumés and cover letters are now produced by machines instructed to satisfy the algorithms that HR departments use to select applicants for interview? How many students use software to write assignments? How many professors use similar software to analyse and grade that work?
Quality of data has never been more crucial. As AI-generated content becomes ubiquitous, discerning and ensuring the human element in data labeling and content creation is paradoxically both more challenging and more valuable. Here’s why:
- Quality Data as a Competitive Edge: When everyone uses similar AI tools, the unique, high-quality data that only humans can provide becomes a rare commodity. At the same time, as automated content reaches good quality, the value of human-produced content falls for commoditized content but rises sharply for more specialized work.
- Content Creation vs. Algorithm Understanding: Companies that understand the underlying algorithms will have a leg up. It’s a race between content creators and algorithm developers—a race where knowledge equals power. The more a company knows about how algorithms work, the more chances it has to strike a favourable balance, using the terms valued by the algorithm. It means building both human and technical capabilities.
- Transparency as a Brand Differentiator: In an age of automation, transparent operations can significantly enhance brand reputation, standing in stark contrast to those who might use AI unethically.
What should you do?
To navigate and thrive in this new automated environment, where machines produce content for other machines, the following steps might prove useful:
- Invest in AI and Data Literacy: Ensure that the leadership team and decision-makers undergo regular training sessions in AI and data literacy. This will empower them to make informed decisions and strategize effectively in the AI-driven content landscape.
- Prioritize Ethical AI Practices: Establish an ethics committee or board focused on AI practices. This body can provide guidance on the ethical use of AI in content generation and review processes to ensure transparency, accuracy, and fairness.
- Identify External Algorithm Exposure and Dependency: Go through your current processes and map which of your written output is analysed by an external algorithm, and what impact that algorithm has on your operations and performance.
- Define the Relevant Scope for Automation and Human Action: Identify the tasks where human contribution can make a difference and those where automation is appropriate. In particular, discuss the opportunities to build algorithms that analyse external data to make decisions.
- Allocate R&D Budgets: Invest in internal research and development teams focused on studying, developing, and refining AI-driven content strategies. These teams can also explore potential risks and challenges in this domain.
- Collaborate with Academia and Research Institutions: Forge partnerships with universities and research centers. These collaborations can provide insights into the latest research, trends, and best practices in AI and algorithmic analysis.
The interplay of AI systems is not just a technical evolution; it’s a strategic revolution. As machines grow more adept at creating content for each other, the roles of data, content, and algorithms in value creation and competitive dynamics will be redefined. The organizations that will thrive are those that view these advancements as a catalyst for innovation and a call to elevate their ethical standards, not as a mere cost-cutting measure.
Photo by Joanna Kosinska on Unsplash