The degeneration of Generative AI

A python snaking around to chew its own tail: that is the daydream the nineteenth-century German chemist August Kekulé said he had while contemplating how the six carbon atoms in the backbone of the benzene molecule are connected.

Something very similar is happening in the AI space today: Python (of the computer code kind) is at play in an AI-eat-AI world. Generative Artificial Intelligence has enabled AI models to create remarkably authentic and creative outputs in art, music and human-like text, but the shine is beginning to wear off. The degeneration of Generative AI, aided by model collapse, is here. Any mishandling could kill the tech’s potential; expertise is absolutely essential.

Backstory incoming

AI is quite an all-encompassing term. Machine learning, which gives computer systems the ability to learn from examples, is a major component of AI today. Neural networks are one widely used family of models that learn this way. One chief mode of learning is to supply the machine with plenty of labeled examples. By processing many well-tagged images of automobiles, the network learns to distinguish the defining features of an automobile from everything else in an image.

Language models are another type of neural network, trained on large volumes of text so that they can predict which word is statistically likely to come next. ChatGPT, the beloved baby of the Generative AI tribe, is reported to have been trained on about 45 terabytes of text, roughly one fourth of the Library of Congress.
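To make "predicting the statistically likely next word" concrete, here is a minimal sketch of a bigram counter. It is a toy under stated assumptions, not how ChatGPT or any production model works: real language models use neural networks over far longer contexts. The function names `train_bigram` and `predict_next` and the sample corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if the word was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Illustrative corpus (an assumption, not data from the article).
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

The prediction is nothing more than a frequency lookup, which is why the quality of the training text matters so much: the model can only echo the statistics of what it was fed.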

Cue the confetti, it’s Generative AI time

Generative AI describes algorithms that aid in the generation of content: text, images, audio, video, code and more. A generative model takes what it has learned from the examples it has been fed and creates something new based on that information: hence the word “generative!” Large language models (LLMs) are one type of generative AI, since they generate novel combinations of text in the form of natural-sounding language. OpenAI’s ChatGPT and Alphabet’s Bard were originally trained using predominantly human-generated text scraped from the Internet, and fine-tuned using further human input.

But can quality content be created in such a reductionist manner, by mimicking “patterns and relationships between different types of data?” Content that is generated as an average of its mathematical inputs will be just that: average. This is the same reason music, movies and other creative media are flush with me-too content that imitates industry hits in a formulaic fashion; they all look and feel about the same. Replicating a blockbuster reflects the aspiration to produce a hit, yet somewhere off the radar, innovative and emotion-evoking content breaks through to lead the pack. The rest just jump on the bandwagon and perpetuate the cycle.

Digital incest

Increasingly, online content is being created by the AI models themselves, leading to (1) the production of inaccurate data and content and (2) the feeding of this incorrect data into the training of future models. When AI models learn from machine-generated rather than human-created data, major degradation happens within just a few iterations, even when some of the original data is preserved, reports a recent European study. Errors in optimization, limited models and finite data all contribute to low-quality synthetic data. Over time, mistakes snowball, distorting models’ perception of reality.
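The snowballing effect can be illustrated with a deliberately simple simulation. This is a sketch under stated assumptions, not the method of the study mentioned above: the "model" here is just a normal distribution described by a mean and a standard deviation, each generation is fitted only to a small finite sample drawn from the previous generation, and no original data is kept. The function name `simulate_collapse` and its default parameters are hypothetical.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a normal distribution to samples drawn from its own previous fit.

    Returns the standard deviation of every generation, starting with the
    original ("human") distribution at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for human-created data
    history = [std]
    for _ in range(generations):
        # Generate synthetic data from the current model...
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # ...then train the next model on nothing but that synthetic data.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history


history = simulate_collapse()
print(f"spread at generation 0: {history[0]:.3f}")
print(f"spread at generation {len(history) - 1}: {history[-1]:.3g}")
```

Because each generation sees only a finite sample, the estimated spread tends to shrink a little every round, and the rare values in the tails are the first to vanish. In this toy setting the distribution narrows toward a single point, which is one simple way to picture a model losing touch with the variety in the original data.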

Benzene, whose cyclic structure Kekulé’s snake-eats-its-tail dream helped him elucidate, was later found to be a cancer-causing agent. The AI-eat-AI metastasis that’s beginning to play out now is just as carcinogenic to creativity, if not monitored and mentored.

Malignancy’s debut: A grim tango

Researchers from Stanford and UC Berkeley have exposed model drift in ChatGPT, reporting a stark decline in some of its abilities between March 2023 and June 2023. The performance and behavior of GPT-3.5 and GPT-4 varied widely in solving math problems, answering sensitive questions, generating executable code and performing visual reasoning.

AI’s redemption dance

Several approaches can be envisioned to solve this problem in training LLMs. The first-mover advantage approach stresses that access to the original human-generated data source must be preserved. Data tainted with errors infuses flaws into models’ learning processes, resulting in a skewed understanding of reality. As time progresses, these misinterpretations amplify, enfeebling the utility of AI – a thought that resonates deeply with our core beliefs and practices at Vue.ai.

Differentiating AI-generated data from human-produced data is difficult. Hence, a community-wide coordination approach must also be considered to ensure that the different entities involved in LLM creation and deployment share the information needed to determine the source of data. It’s also important to ensure that the minority groups from the original data are fairly represented in subsequent datasets, not merely in terms of quantity, but also in terms of their distinctive attributes.
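One part of that representation check, the quantity side, can be sketched in a few lines. The example below is an illustration under stated assumptions: each record is reduced to a single group label, and a group counts as underrepresented when its share in the new dataset falls below half of its original share. The function names and the `tolerance` threshold are hypothetical, and the sketch says nothing about whether a group's distinctive attributes are preserved, which needs a richer comparison.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's fraction of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(original, candidate, tolerance=0.5):
    """List groups whose share in `candidate` is below `tolerance` times their original share."""
    before = group_shares(original)
    after = group_shares(candidate)
    return sorted(
        group
        for group, share in before.items()
        if after.get(group, 0.0) < tolerance * share
    )


# Illustrative labels (assumptions, not data from the article).
original_data = ["majority"] * 90 + ["minority"] * 10
next_generation = ["majority"] * 99 + ["minority"] * 1
print(underrepresented(original_data, next_generation))
```

A check like this could run each time a new training set is assembled, so that a shrinking group is noticed before it disappears from the data altogether.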

The shadows of model collapse loom large as AI models devour machine-generated content. Robust data collection, accurate annotation and community-wide coordination are urgently needed to fend off the haunting abyss and salvage this transformative technology.

Tags: Generative AI, Model collapse

2 years ago

Vue.ai
