Business moats have traditionally been understood as the competitive advantages that companies build to protect their market position and ensure long-term profitability. These moats can take various forms, such as patents, brand reputation, or cost efficiencies, but an emerging perspective in the digital age treats data itself as a business moat. This viewpoint is particularly relevant for companies operating in the generative Artificial Intelligence (AI) sector. Why? Because the power of generative AI lies in its ability to create new, useful content or predictions from the data it’s trained on, making data a potentially invaluable asset. But the question remains: can data truly serve as a sustainable business moat in this competitive, fast-evolving industry?
Data as the main input for Generative AI
In the realm of artificial intelligence, data is akin to fuel for an engine. It powers the algorithms that drive AI systems, enabling them to learn, adapt, and improve. The significance of data becomes even more pronounced in generative AI, where diverse and high-quality data can lead to more creative, accurate, and useful outputs. Large volumes of data allow these models to discern nuanced patterns and make fine-grained predictions or generate innovative creations. Consider the success of Google’s language translation service: it is powered by immense amounts of multilingual text data, allowing it to make accurate translations across numerous languages. Similarly, companies like DeepMind and OpenAI use vast and diverse data sets to train their AI models, leading to breakthroughs like AlphaGo and GPT-3. GPT-3, the model family from which ChatGPT was derived, was reportedly trained on roughly 300 billion tokens, with about 570 gigabytes of filtered text in its largest data source. This suggests a strong relationship between the quality and quantity of data and the performance of AI models. Consequently, a company’s access to and effective use of data could feasibly create a powerful competitive advantage or ‘moat’ in the market.
For companies leveraging generative AI, the distinctiveness and scale of their data can indeed create a robust moat, providing a competitive edge that is hard for others to overcome. When a company possesses proprietary data, meaning information that is unique and exclusive to it, it enjoys a significant advantage. This could be data from unique user interactions, proprietary databases, or exclusive partnerships. For example, BloombergGPT is trained partly on the proprietary data accumulated by Bloomberg, and for finance tasks it performs better than generalist tools. Furthermore, companies that possess large volumes of data can train more sophisticated AI models, as they’re able to discern more subtle patterns and make more accurate predictions. For instance, a company like Google, with its access to vast troves of search and user behaviour data, is better positioned to refine its generative AI algorithms. Additionally, businesses that have developed innovative ways to process and manage data efficiently may build a ‘technological’ moat alongside a ‘data’ moat. By mastering cutting-edge data processing techniques, they can maximize the utility of their data, creating AI solutions that are more effective and impactful.
Thus, data and its effective use can form a formidable business moat for companies leveraging generative AI: the more data, the more sophisticated the model; the more exclusive the data, the more distinctive the model’s performance.
Data uniqueness is hard to defend
Despite the compelling reasons to consider data as a potential business moat, it’s crucial to address some counterarguments and challenges associated with it. One of the key limitations is data replicability. While proprietary data can provide a significant competitive edge, much of the data used to train AI models, particularly publicly available data, can be accessed and used by competitors. Thus, the exclusivity and protectability of a data moat might be weaker than those of traditional moats like patents. As a result, many generative AI applications struggle to differentiate their products: they rely on similar models, which are themselves undifferentiated because they are trained on similar datasets.
Another concern revolves around data privacy and ethical considerations. As regulations like GDPR and CCPA become more stringent, the way companies collect, store, and use data is becoming more restricted. This may limit the extent to which data can be used as a competitive advantage.
Additionally, the hypothesis of data as a moat assumes a linear relationship between data volume and AI performance. However, after a certain point, the incremental benefits of more data may diminish unless complemented by algorithmic improvements or more computing power. Moreover, for certain use cases, the improvement in model performance brought by additional data may be too small for the people using the application to notice.
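To make the idea of diminishing returns concrete, here is a minimal sketch. It assumes, purely for illustration, that model error follows a power law in dataset size and flattens toward a floor. The function names and every constant (scale, exponent, floor) are hypothetical and not measurements of any real model.

```python
# Illustrative only: the power-law form and every constant below are
# assumptions chosen for the example, not measurements of a real model.

def model_error(num_examples: float, scale: float = 10.0,
                exponent: float = 0.3, floor: float = 0.05) -> float:
    """Hypothetical error curve: error falls as a power law of data size
    and flattens toward an irreducible floor."""
    return scale * num_examples ** (-exponent) + floor


def gains_per_doubling(start: float, doublings: int) -> list[float]:
    """Error reduction obtained each time the dataset doubles in size."""
    gains = []
    size = start
    for _ in range(doublings):
        gains.append(model_error(size) - model_error(2 * size))
        size *= 2
    return gains


if __name__ == "__main__":
    for step, gain in enumerate(gains_per_doubling(1_000, 8), start=1):
        print(f"doubling {step}: error drops by {gain:.4f}")
```

Under these assumptions, each doubling of the dataset buys a smaller error reduction than the one before it, which is why a competitor with less data may end up closer in quality than the raw gap in data volume would suggest.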
Therefore, while data can be a valuable asset in the realm of generative AI, its effectiveness as a sustainable business moat is limited by data replicability, regulations on personal data and the diminishing returns to scale of data.
Combining data with other moats
While data can indeed serve as a compelling business moat for companies leveraging generative AI, it should not be viewed in isolation. Data, even big data, has no value per se: its value is revealed only when it is used, and using it requires other assets and capabilities. Additionally, the most resilient and effective business moats often combine multiple assets, because one source of competitive advantage is the combination of different resources and capabilities.
Following this reasoning, one way to leverage data and build a moat is to combine data with other assets or capabilities. Technological innovation, for instance, is a crucial complement to a data moat. Companies that consistently innovate and improve their algorithms can extract more value from their data and maintain a technical edge over their competitors. A talent moat, characterized by a team of skilled AI researchers and engineers, can also significantly bolster a data moat by driving these technological advancements. Furthermore, brand recognition can be another reinforcing moat. If a company is recognized as a leader in AI, it can attract more users, partnerships, or data sources, thereby enriching its data moat. Consider the likes of OpenAI and DeepMind: they combine data with groundbreaking research, top-notch talent, and strong brand recognition to maintain their leadership in the AI space. Thus, while data is undoubtedly a powerful asset for companies using generative AI, it’s the synergistic blend of data with other business moats that can yield a truly robust and sustainable competitive advantage.
Three questions for using data to build a moat with generative AI
Data is necessary and useful for building a moat, but it is far from sufficient. To use data to build or sustain a moat with generative AI, you might work through these three questions:
- What are the key elements of your defensibility? Beyond data, which elements, when combined, are key to your competitive advantage (patents, talent, service, scale, …)? For inspiration, you might use this typology of moats.
- How could proprietary data reinforce other moats and increase defensibility? Is your proprietary data significant and valuable? How could you use it to support existing services or operations?
- How could other moats reinforce the impact of the data moat? Which other resources and capabilities, combined with proprietary data, could reduce competitors’ ability to imitate you (distribution network, talent, in-house architecture, exclusive partnerships, …)?
Data is no magic wand for ensuring defensibility but can serve as a reinforcement for other moats.