Artificial hallucinations and AI-generated plagiarism
By Dr Fazal Ali
AI has re-imagined films, used the voices of
deceased musicians to create new tracks, and can
now make the sound of a brand “liquid”.
Using AI, we can now create a patentable “sonic
identity” for any product or service. Moreover, a
“sonic identity” allows a brand to take any shape
according to the desired customer experience.
AI enables sound to shift its shape in our high-speed, digitally connected world.
Amp is an AI-centred music company. It was recently acquired
by Landor & Fitch, which is part of the WPP
advertising group.
Amp generates all kinds of sounds: short bursts
of noise when an app launches, the tone that
confirms a credit card transaction,
compositions for podcasts, and sounds for social media
posts.
Using machine learning, the “DNA” of the
composition is verified to ensure that it does
not resemble an arrangement already in use.
The algorithm then checks whether the signature
patterns in the composition are likely to be
memorable.
Once the DNA is established, the AI allows the
owner to generate endless remixes
from this base DNA, varying tempo,
duration, mood and rests depending on the
context. Brands are no longer mute.
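Amp’s actual system is proprietary, but a minimal sketch can illustrate the kind of pipeline described above. Everything in it, the embeddings, the similarity threshold, and the remix parameters, is a hypothetical stand-in, not Amp’s method.

```python
# A minimal sketch of the pipeline described above. Amp's actual system
# is proprietary: the embeddings, threshold, and remix parameters below
# are hypothetical stand-ins, not Amp's method.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "DNA": a fixed-length embedding of the candidate composition.
candidate = rng.standard_normal(128)
# Stand-in catalogue of embeddings for arrangements already in use.
catalogue = rng.standard_normal((1000, 128))


def cosine_similarity(vector: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one vector and each row of a matrix."""
    return (matrix @ vector) / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector)
    )


# Step 1: verify the DNA does not resemble an arrangement already in use.
similarities = cosine_similarity(candidate, catalogue)
is_original = similarities.max() < 0.9  # hypothetical threshold

# Step 2, the memorability check, would score the composition's signature
# patterns; a trained classifier would sit at this point in a real system.

# Step 3: remixes vary surface parameters while the base DNA stays fixed.
remix = {"tempo_bpm": 96, "duration_s": 4.5, "mood": "calm"}
print(is_original, remix)
```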
Recently, using AI techniques similar to those
behind the artificial voices on the viral
track “Heart on My Sleeve”, the Canadian lo-fi
singer Grimes created a digital likeness of her
voice that anyone can use to make new
music under a liberal licensing scheme.
“Over the Bridge” has created an album of AI-generated “lost tracks” by musicians who passed
away at a young age. The tracks are not remixed
versions of previous songs; they are new tracks
created by computers and gifted audio engineers.
This new AI-generated album, “Lost Tapes of the
27 Club”, features four tracks in the musical
styles of Kurt Cobain, Amy Winehouse, Jimi
Hendrix, and Jim Morrison. Using Magenta,
“Over the Bridge” analysed MIDI files of selected
works of the artists.
AI sifted each artist’s melodies, harmonies,
and rhythmic choices into synthesised
recreations; small snippets were then
isolated and woven into new songs.
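As an illustration of the analysis step, MIDI parsing of the kind described here can be sketched with note_seq, the companion library to Google’s Magenta. “Over the Bridge” has not published its exact pipeline; the statistics below, and the song.mid path, are stand-ins for how melodic and rhythmic choices might be fingerprinted.

```python
# A deliberately simple sketch of MIDI analysis with note_seq, the
# companion library to Google's Magenta. "Over the Bridge" has not
# published its exact pipeline; the statistics below are stand-ins for
# how melodic and rhythmic choices might be fingerprinted.
from collections import Counter

import note_seq


def style_fingerprint(midi_path: str) -> dict:
    """Extract crude melodic and rhythmic statistics from one MIDI file."""
    sequence = note_seq.midi_file_to_note_sequence(midi_path)
    notes = sorted(sequence.notes, key=lambda n: n.start_time)

    # Intervals between consecutive pitches approximate melodic contour.
    intervals = [b.pitch - a.pitch for a, b in zip(notes, notes[1:])]
    # Note durations approximate rhythmic choices.
    durations = [round(n.end_time - n.start_time, 2) for n in notes]

    return {
        "interval_histogram": Counter(intervals),
        "duration_histogram": Counter(durations),
    }


# "song.mid" is a placeholder path; in practice this would run over the
# selected works of each artist to surface their recurring patterns.
print(style_fingerprint("song.mid"))
```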
It is becoming increasingly critical to
identify whether content is AI-generated.
Shadow libraries are now at the heart of
copyright infringement lawsuits over the
unauthorised use of content to train some AI
platforms. Shadow libraries are online databases
that offer access to articles and texts that are out
of print, hard to obtain, or paywalled.
Shadow libraries are also called “pirate
libraries” because they often infringe on
copyrighted work. This has opened schools,
universities, think tanks, and policy labs to the
threat of AI-generated plagiarism.
There will likely be instances
where AI is used to produce content while,
unbeknownst to the user, it draws upon
copyrighted treatises, dissertations, books, and
manuscripts housed in shadow libraries.
Governments have taken action against shadow
libraries and those involved have been charged
with criminal copyright infringement, wire fraud,
and money laundering. But as the sites are taken
down, mirrors appear.
In an authoritative peer-reviewed medical
journal, a reference in the bibliography offered
a hyperlink with a Uniform Resource Locator,
or URL, the address of a particular resource
on the World Wide Web.
On clicking the URL, which carried a recognisable domain
name, the startling response was “not found”.
Probing deeper revealed that the
first author cited in the publication, who is
presently an established academic at a
recognised university, had no knowledge of the
research or its findings.
Likewise, the sixth author in the citation was
befuddled. The citation contained in the
bibliography of the publication was totally
fabricated: an example of an artificial
hallucination. Librarians and examination
syndicates are highlighting the
production of fake citations when AI is used as a
research assistant.
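A simple first check against such fabrications is to test whether a reference’s URL resolves at all. The sketch below does this with Python’s standard library; the reference list is a hypothetical placeholder, and a “not found” response is a signal to investigate, not proof of fabrication.

```python
# A minimal sketch of dead-link checking for a bibliography, using only
# Python's standard library. A "not found" response does not prove a
# citation is fabricated, but it is a cheap first signal to verify the
# reference by hand against a bibliographic database.
import urllib.error
import urllib.request

# Hypothetical reference URLs; substitute the bibliography being audited.
references = [
    "https://example.com/journal/article-123",
]

for url in references:
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            print(f"{url}: HTTP {response.status}")
    except urllib.error.HTTPError as err:
        # An HTTP 404 here is the startling "not found" described above.
        print(f"{url}: HTTP {err.code}")
    except urllib.error.URLError as err:
        print(f"{url}: unreachable ({err.reason})")
```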
In New York, an advocate of 30 years’ standing
provided the court with submissions that
contained at least six fake judicial decisions,
complete with false quotes and bogus internal
citations, according to an order by the judge
presiding over the matter in the Southern
District of New York. Several of the purported
cases cited in the pleadings did not appear to
exist to either the judge or the defence team.
It is now possible for human actors to come upon
an artificial hallucination, also described as a
confabulation or delusion, presented as the
confident output of an AI system.
A hallucination is an output that is unfaithful to
the provided source content. These responses
cannot be justified by the training data used to
build the AI platform. An AI hallucination is
presently described as the pointless embedding
of plausible but random falsehoods.
AI hallucinations can result from insufficient
training data in some cases, but it is also likely
that some “incorrect” AI responses classified by
humans as “hallucinations” may, in fact, be
correct answers that human reviewers are
unable to recognise or comprehend.
The architects of generative AI have erected a few
guardrails to avoid the worst of these
hallucinations.
However, many in the field remain unsure
whether AI hallucinations are a solvable
problem.
Dr Fazal Ali completed his Master’s in Philosophy
at the University of the West Indies and was a
Commonwealth Scholar at Hughes Hall,
University of Cambridge. He has served as provost of
the University of T&T and as acting president
and chairman of the Teaching Service
Commission. He is presently a consultant with
the IDB.