Across the security industry, we’re seeing the effects of advances in AI technology, and Anthropic’s recent Claude Code Security announcement is no exception. The market, volatile as it is these days, seems to read this as a turning point: several major SaaS security stocks fell on the day of the announcement.
This use case for AI isn’t new. However, there is an army of new companies and products that aim to use LLMs for scanning code, attacking applications, and recommending fixes, all in the same pipeline. Where does the hype meet reality? Let’s dig in.
First, it’s important to understand that the dip in security stocks was not direct cause and effect: Claude Code won’t replace all your security investments. Anthropic’s new offering is not a direct competitor to CrowdStrike or Palo Alto Networks, yet those stocks still fell after the announcement. More likely, this is about available capital and how it will be allocated. We’ve been seeing budget compression across all of security over the last few years, and investors appear to be waking up to potential convergence, which is driving that compression even further.
AI-powered scanning tools like those from Anthropic, Aikido, XBOW, and every other highly valued startup that will be hitting the RSA floor in just a couple of weeks have one thing in common: They are all using LLMs to attempt to replicate, with machines, the human creativity and novelty inherent in vulnerability research.
These AI tools are great at finding known vulnerabilities because they were trained on that data. But they can also find classes of vulnerabilities they’ve been trained on, both in source code and through DAST-like fuzzing. In my own experience, these technologies are great at uncovering the low-hanging-fruit vulnerabilities that don’t require context to understand.
I was playing with the Claude Code Security implementation, and once a scan finishes, it runs a second skill or agent that reviews the output and checks for issues such as hallucinations and false positives: essentially, the AI checks its own homework. When it works, it works really well, but there are disadvantages inherent in how these tools are built.
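The scan-then-self-review flow described above can be sketched roughly like this. To be clear, `scan_pass` and `review_pass` are hypothetical stand-ins for the real LLM calls (and the finding names and fields are made up); they are stubbed with canned data here so the control flow itself is runnable:

```python
# Hypothetical sketch of a two-pass "scan, then self-review" pipeline.
# scan_pass and review_pass stand in for actual LLM calls; both are
# stubbed with canned data so the control flow can run as-is.

def scan_pass(source: str) -> list[dict]:
    """First pass: produce candidate findings (stubbed LLM call)."""
    return [
        {"id": "F1", "rule": "sql-injection", "line": 42,
         "evidence": "query = base + user_input"},
        # No supporting evidence: a likely hallucination for the
        # review pass to catch.
        {"id": "F2", "rule": "hardcoded-secret", "line": 7,
         "evidence": None},
    ]

def review_pass(source: str, findings: list[dict]) -> list[dict]:
    """Second pass: an independent agent re-checks each finding and
    drops any it cannot substantiate (stubbed as an evidence check)."""
    return [f for f in findings if f["evidence"] is not None]

source = "..."  # code under review (placeholder)
verified = review_pass(source, scan_pass(source))
print([f["id"] for f in verified])  # only substantiated findings survive
```

The design point is simply that the reviewer is a separate pass with its own prompt and context, so it can reject output from the first pass rather than rubber-stamp it.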
Where these types of AI scanning products start to fall apart is in repeatability. By the nature of LLMs, these tools are non-deterministic: every time you run them, they may approach the problem differently and produce different results. One researcher documented this by running a scan over 50 times and plotting which vulnerabilities each run identified. Long story short, sometimes the tool found all the issues, and other times it missed some here and there. Humans aren’t much different, but we’ve learned to balance the art and science of vulnerability research through standardized methodologies, coverage checklists, and peer review. This is also why traditional SAST, DAST, and IAST vendors are not going anywhere: Their value-add is showing progress over time in a predictable, repeatable way. It’s also important to understand the unpredictable cost of AI-powered tools, because token consumption is just as non-deterministic as the results themselves.
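A repeatability experiment like the one described above can be sketched in a few lines. The scanner here is simulated: `run_scanner` randomly misses findings at made-up per-finding rates, standing in for invoking the real tool on each iteration, and the finding names are illustrative only:

```python
# Sketch of measuring run-to-run repeatability for a non-deterministic
# scanner. run_scanner simulates a tool that misses findings at
# (made-up) per-finding rates; a real harness would invoke the actual
# scanner on each iteration and parse its report.
import random
from collections import Counter

KNOWN_FINDINGS = ["sqli-login", "xss-search", "idor-orders", "ssrf-webhook"]

def run_scanner(seed: int) -> set[str]:
    """One simulated scan: each finding is detected probabilistically."""
    rng = random.Random(seed)
    rates = {"sqli-login": 0.95, "xss-search": 0.90,
             "idor-orders": 0.60, "ssrf-webhook": 0.40}
    return {f for f in KNOWN_FINDINGS if rng.random() < rates[f]}

RUNS = 50
hits = Counter()
for i in range(RUNS):
    hits.update(run_scanner(i))

# Per-finding detection rate across all runs: anything well below
# RUNS/RUNS is a finding the tool only catches sometimes.
for finding in KNOWN_FINDINGS:
    print(f"{finding}: detected in {hits[finding]}/{RUNS} runs")
```

Tracking detection rate per finding, rather than a single pass/fail per run, is what makes the inconsistency visible.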
So can we have our cake and eat it too? Using these tools helps us cover more bases, find more bugs, and fix faster—but we have to be able to show real risk reduction to leadership, not just some low-severity bugs fixed.
Combining the power of AI with that of traditional scanners, along with the human ingenuity and context of highly skilled pentesters, is the way forward. In fact, our research shows that elite pentesters like those found in our Cobalt Core are remarkably good at finding the complex vulnerabilities that automated pentesting misses: 54% of these human pentesters have discovered a zero-day in their career. That helps explain why just 1% of our Core believes that autonomous/AI pentesting is the most effective model for finding the highest-risk flaws, while 58% say PTaaS is the gold standard for finding those vulnerabilities, many times the share who picked bug bounties or AI.
Thankfully, that’s what we do at Cobalt with our human-led, AI-powered offensive security platform and services.