“Degenerative A.I.”: Business Owners Should Avoid This Pitfall in Training A.I.

The Five Key Takeaways from This Blog Post

A recent article in The New York Times describes, with examples, a potential problem in the realm of artificial intelligence: training A.I. on A.I.-generated data, a practice that could lead to what the article refers to as “degenerative A.I.”
The potential issue is that A.I.-generated content is generally less reliable, with “reliable” in the sense of offering accurate information. In the case of something like an image- or video-generating A.I., “less reliable” can simply mean that the output does a poor job by our metrics. (For an example of this latter type of unreliability, just check out how terribly off-base A.I. can be in generating a realistic approximation of what it would look like if Will Smith were eating spaghetti.)
On a technical level, what happens in these cases is that the A.I.’s range of possible outputs narrows considerably, so that it becomes more reliably unreliable. If you want to generate accurate images of trees and the A.I. is trained on too many shoddy A.I.-generated representations of trees, then the range of possible non-shoddy outputs becomes slimmer, while the probability of generating shoddy outputs becomes higher. The New York Times article gives a stark example: an A.I. image generator that made pictures of seemingly handwritten numbers was trained on its own output until every number it produced was a blurry symbol that belongs neither to the Arabic numeral system nor, as far as this writer knows, to any other numeral system.
So, when A.I.-generated content is used to train A.I. systems, business owners trying to custom-tailor an A.I. system to their businesses could end up in a situation where the ignorant (A.I.-generated content) leads the ignorant (the A.I. trained on that content).
The sunny side: if this becomes a general problem for A.I. systems, then it could lessen the impact of deepfakes, as deepfake attempts will become more identifiably deeply fake.
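
The narrowing described in the third takeaway can be illustrated with a toy simulation. This is only a minimal sketch under loud assumptions: a simple bell-curve "model" stands in for a real A.I. system, and the function name and parameters (simulate_self_training, generations, sample_size) are made up for illustration. It is not the experiment from The New York Times article. Each generation fits the model to the previous generation's output and then produces new output from that fit, with no fresh real data.

```python
import random
import statistics


def simulate_self_training(generations=200, sample_size=20, seed=0):
    """Toy model: refit a bell curve to its own samples, generation after generation.

    Returns the fitted spread (standard deviation) at each generation.
    """
    rng = random.Random(seed)
    # Generation 0 is "real" data, drawn from a bell curve with spread 1.0.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        # "Train" the model: estimate the centre and spread of the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # The next generation sees only the model's own output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


if __name__ == "__main__":
    spreads = simulate_self_training()
    print("spread at first generation:", spreads[0])
    print("spread at last generation: ", spreads[-1])
```

In this toy setup the fitted spread tends to shrink over the generations, which is the "range of possible outputs narrows" effect in miniature. Real image or text models are far more complicated, so treat this as an intuition aid rather than a prediction.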

Contagious Hallucination in Machine Learning

A.I. is known to “hallucinate”, even mildly, which is one of the chief issues here. Instead of machine learning, the A.I. begins machine-unlearning what it knows, such as how to accurately depict human fingers.

This potential training issue could lead to something resembling contagious hallucination within an A.I. system, to the point where it hallucinates consistently.

So, for business owners, what can be done here?

The Best Mitigation Tip for Business Owners

This writer will lead this section with the tip: continually feed A.I. new, real raw data to counterbalance any A.I.-generated data coming in.
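
To make the tip concrete, here is a minimal sketch that assumes a toy bell-curve "model" in place of a real A.I. system. The function name final_spread and the parameter real_fraction are hypothetical, chosen only for this illustration. Each generation's training set mixes a chosen share of fresh real data with the model's own output, so the two cases (no real data versus half real data) can be compared.

```python
import random
import statistics


def final_spread(real_fraction, generations=200, sample_size=20, seed=0):
    """Toy model: return the spread of the training data after many generations.

    real_fraction is the share of each generation's training data that is
    fresh real data; the rest is the model's own (synthetic) output.
    """
    rng = random.Random(seed)
    n_real = round(sample_size * real_fraction)
    n_synthetic = sample_size - n_real
    # Start from real data: a bell curve with spread 1.0.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    for _ in range(generations):
        # "Train" the model on the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        # Build the next training set from fresh real data plus model output.
        fresh_real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_synthetic)]
        data = fresh_real + synthetic
    return statistics.pstdev(data)


if __name__ == "__main__":
    print("no fresh real data:  ", final_spread(real_fraction=0.0))
    print("half fresh real data:", final_spread(real_fraction=0.5))
```

In this sketch, the run with no fresh real data tends to collapse toward a very narrow range of outputs, while the run that keeps receiving real data stays close to the original spread. The right share of real data for an actual A.I. system is not something this toy can tell you.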

More and more business owners will be using customizable A.I. systems for which the company itself provides the data.

Think, for instance, of an A.I. system that a company wants to use to create business communications.

The A.I. has a better shot of conforming to the company’s “house style” if the company provides plenty of examples of communications written in that style.

But of course, most A.I. systems, especially machine-learning systems, are trained on far more content than that. The problem with the available data sets being increasingly flooded with second-rate A.I.-generated “slop” and quasi-slop is that it will become more difficult for business owners to get A.I. that actually performs its task with dependable competency.

The Final Key Takeaway

When it becomes more difficult to train A.I., companies will be hit with higher costs for using it, since the tech companies creating the A.I. will be shoveling more resources into training it.

Overall, it is in everyone’s interest to ensure that tech companies observe the “garbage in, garbage out” maxim regarding the training of A.I.