WEBINAR
GigaOm Radar Report for PTaaS: How to Make a Smarter Investment in Pentesting
WEBINAR
GigaOm Radar Report for PTaaS: How to Make a Smarter Investment in Pentesting

When Generative AI Goes Wrong: Security Lessons from 8 Top Artificial Intelligence Incidents

Like any major technology, generative AI holds both promise of great benefits and potential for great risks. As generative AI applications have become more prevalent, so have news stories of disastrous artificial intelligence incidents.

Beyond making for interesting read, these stories can help security teams gain insight into AI risks and suggest mitigation strategies. In this article, we'll review 8 top AI incidents that illustrate the risks of generative AI and the issues today's security teams must address. Furthermore, you can explore the risks associated with AI systems with the AI Incident Database for a more comprehensive view of different incident reports related to AI.

With that in mind, let’s take a look at real-world examples of AI chatbots going astray and explore how security teams are responding to these emerging AI technologies.

  1. Researchers expose Microsoft Copilot vulnerabilities
  2. Gemini accused of accessing files without permission
  3. Amazon Rufus exposed to jailbreaking
  4. Amazon driverless taxis crash
  5. Air Canada sued for chatbot misinformation
  6. Microsoft Azure AI tells New Yorkers to break the law
  7. Los Angeles Unified School District educational chatbot implodes
  8. Deepfakes go phishing

1. Researchers Expose Microsoft Copilot Vulnerabilities

In February 2023, Microsoft launched Bing Chat, later rebranded as Copilot, now used by 50,000 businesses. This August, at the Black Hat USA 2024 conference, former Microsoft senior security architect Michael Bargury revealed 15 Ways to Break Your Copilot, illustrating how Microsoft's chatbot could be manipulated to impersonate users, bypass authentication checks, enable prompt injections, exfiltrate data, and wreak other mischief. He followed up by showing how attackers could exploit Copilot to locate sensitive data, exfiltrate it without leaving logs, and launch phishing attacks without victims reading emails or clicking links.

Focusing on the risk of prompt injections, Bargury showed how an offensive security toolset called power-pwn could jailbreak Copilot and alter a parameter or instruction. For example, an attacker could add an HTML tag to an email to swap bank account numbers without leaving any visible trace of compromise.

Bargury used his presentation to highlight the need for what he called "anti-promptware" security tools to mitigate prompt injection vulnerabilities. While he noted Microsoft was working to mitigate Copilot risks, he stated that no out-of-the-box solution for detecting prompt injections currently exists. Meanwhile, 50,000 organizations using Copilot remain vulnerable to a disaster waiting to happen.

2. Gemini Accused of Accessing Files without Permission

In another incident involving Gemini, Facebook privacy policy director Kevin Bankston complained on X that Gemini summarized a tax document stored in Google Docs without him requesting it. Bankston was unable to find a setting to disable Gemini integration with Google Drive and unable to get Gemini to identify the setting. Another X user helped Bankston find the setting, but it turned out it was already disabled.

In response, Google insisted to TechRadar Pro that Gemini requires explicit user activation to enable it in Google Workspace and that content generated by prompts is not stored without permission. Bankston and others have speculated about the explanation for his experience, with no definitive answer.

Bankston's experience illustrates the risk of generative AI tools accessing data without explicit authorization. Large language models typically have a degree of autonomy to execute functions and interact with other apps, creating a risk of excessive agency when the model performs undesired actions. Excessive agency can be mitigated by steps such as limiting functionality to necessary operations, restricting permissions, performing authentication checks, and requiring human approval.

3. Amazon Rufus Exposed to Jailbreaking

In February 2024, Amazon launched Rufus, a generative AI shopping assistant. Rufus was intended to answer shopping questions, make recommendations, and locate products. Netflix senior software engineer Jay Phelps disclosed on X that he could get Rufus to answer JavaScript coding questions. His post inspired Lasso Security researchers to investigate Rufus vulnerabilities, leading to the discovery that the chatbot's prompt and security controls could be bypassed by accessing their instructions.

Amazon Rufus illustrates how LLM systems are vulnerable to jailbreaking by designing prompts to bypass security measures. For example, users may manipulate LLMs by positing hypothetical scenarios, insisting on purely logical deductions without practical considerations, or posing as users with higher access permissions.

Preventing AI jailbreaking requires careful architecture planning and strong security measures. Mitigation strategies include access controls, authentication checks, and offensive security tests.

4. Amazon Driverless Taxis Crash

In April 2024, as Amazon prepared to roll out its Zoox automated driving system (ADS) taxis, two test Zoox vehicles braked suddenly in front of motorcyclists in separate incidents in San Francisco and Las Vegas suburb Spring Valley. The incidents prompted a National Highway Traffic Safety Administration (NHTSA) investigation.

Zoox combines vehicle sensors with machine learning to perceive objects, predict trajectories of vehicles and surrounding objects, and plan driving decisions. In these incidents, Zoox made poor decisions. Both collisions involved Toyota Highlander vehicles and trailing motorcycles. Both occurred in daylight under conditions within Zoox operational design limits. Both motorcyclists initially sustained minor injuries, while the Zoox operator in Spring Valley reported lower back pain and tightness.

These incidents illustrate a larger concern with AI-powered vehicles. Autonomous vehicles were involved in 3,979 reported collisions between June 2019 and June 2024, including 473 in 2024, according to a study by Craft Law Firm. Most incidents involved Tesla vehicles, and most occurred in California. Injuries have resulted in 10% of incidents, and fatalities in 2%, with 83 fatalities as of June 17, 2024.

Researchers led by State University of New York engineering professor Chunming Qiao recently probed vulnerabilities in AI driverless cars and found significant security concerns. Qiao's team discovered that AI models could be deceived by using physical hacks such as foil radar masks to pollute data being collected from vehicle sensors. This represents a unique type of LLM training data poisoning attack that driverless vehicle manufacturers must find ways to address.

5. Air Canada Sued for Chatbot Misinformation

AI hallucinations triggered legal trouble when the Air Canada Chabot gave a customer false information about a discount. Traveling after the death of his grandmother, Vancouver resident Jake Moffatt asked if the airline offered a bereavement discount. The chatbot told him that he could book a full-fare flight and apply for a bereavement discount up to 90 days later.

Unfortunately, the chatbot misrepresented Air Canada's actual policy, which required pre-flight submission of a discount request. The airline refused to honor the discount. Moffatt took the case to court, where Air Canada argued that its chatbot was a separate legal entity liable for its own actions.

The British Columbia Civil Resolution Tribunal wasn't impressed with this defense, finding that it was "obvious" Air Canada was responsible for information on its own site. Moffatt was awarded $812.02 in damages and fees. This finding was consistent with other rulings in Canada and the United States establishing a precedent that businesses are liable for automated tools acting as their agents.

6. Microsoft Azure AI Tells New Yorkers to Break the Law

New York City's MyCity app, powered by the Microsoft Azure AI platform, helps residents access services and benefits. Users can check eligibility status, submit applications, track services, and store personal data and documents. However, MyCity also has been found to give business owners legal misinformation.

For example, in response to a query, MyCity advised New York City landlords that they're not required to accept tenants on rental assistance. In fact, the law prohibits New York City landlords from discriminating against tenants who receive government assistance. Other legally inaccurate advice included advising bosses they were entitled to a cut of workers' tips, counseling store owners they didn't need to accept cash, and telling business owners they had to serve violent customers.

MyCity's problems appear to represent another example of AI hallucinations, but it's difficult to know the underlying cause because Microsoft and New York City officials have not responded to critics with detailed explanations. The MyCity portal currently advises visitors, "As a beta product still being tested, it may occasionally provide incomplete or inaccurate responses. Verify information with links provided after the response or by visiting MyCity Business and NYC.gov. Do not use its responses as legal or professional advice nor provide sensitive information to the Chatbot."

7. Los Angeles Unified School District Educational Chatbot Implodes

In March, the Los Angeles Unified School District (LAUSD) launched a district-wide chatbot, Ed, promising to provide students with round-the-clock customized support in dozens of languages. Ed was developed by LAUSD in conjunction with Harvard Innovation Labs under the venture AllHere, whose CEO Joanna Smith-Griffin promised students and families that Ed would serve as a "trusted co-pilot for their educational journeys."

In April, AllHere senior director of software engineering Chris Whiteley blew the whistle to LAUSD officials on privacy concerns with Ed's platform. Whiteley disclosed that prompts containing student information were being shared with third-party companies and processed through overseas servers, violating AllHere's contract.

Three months and $3 million after Ed's launch, Smith-Griffin was gone as CEO, AllHere was laying off workers, parts of AllHere's website were not there, and the company was up for sale. Los Angeles canceled its contact with AllHere and began looking to acquire the company or find an alternative chatbot.

The swift rise and fall of Ed illustrates how AI privacy protection failures can cripple a company. For AI privacy best practices, OWASP recommends using a "need to know" principle to limit sensitive data access to stakeholders who need it.

8. Deepfakes Go Phishing

Deepfake video, audio, and text have been around for several years, and over the last year, their impact has been increasingly felt. For instance, in August 2023, custom software developer Retool saw one of its clients lose $15 million to a phishing attack employing deep fakes.

The attack started with an SMS email campaign that lured one employee into visiting a malicious website. After the employee supplied their credentials, the perpetrators used a deep fake voice cloning call to pose as an IT team representative and gain access to multi-factor authentication codes. This set the stage for the attackers to add a device to the employee's account, gain access to the company's Google account, escalate privileges, change usernames and passwords, and control one-time passwords.

Retool traced the issue to a Google update that changed a multifactor authentication setting to single-factor authentication. This attack illustrates both the potential for malicious use of generative AI and the need to monitor contracts with third-party contractors for updates to policies and terms and conditions.

The case of Retool points to the growing role of deep lakes in cybersecurity incidents. Deloitte’s analysis of FBI data projects that fraud assisted by deepfakes and generative AI could cost victims $40 billion annually by 2027. CEOs and financial services providers such as banks are primary targets. To combat this, companies should work with security professionals to integrate deepfake detection tools into their anti-money laundering (AML) and know-your-customer (KYC) procedures.

Avoid AI Disasters with Cobalt Pentesting

Some AI disasters are inevitable, but many can be avoided by following AI security best practices. For example, the Open Worldwide Application Security Project (OWASP) has developed guidelines for mitigating the top 10 LLM and Generative AI vulnerabilities.

The Cobalt platform makes it easy for security teams to collaborate with our network of professional pentesters and quickly identify security gaps before attackers discover them. Connect with Cobalt to discuss how we can help you secure your AI environment and keep disasters from disrupting your business.

Back to Blog
About Andrew Obadiaru
Andrew Obadiaru is the Chief Information Security Officer at Cobalt. In this role Andrew is responsible for maintaining the confidentiality, integrity, and availability of Cobalt's systems and data. Prior to joining Cobalt, Andrew was the Head of Information Security for BBVA USA Corporate Investment banking, where he oversaw the creation and execution of Cyber Security Strategy. Andrew has 20+ years in the security and technology space, with a history of managing and mitigating risk across changing technologies, software, and diverse platforms. More By Andrew Obadiaru