Across the security industry, we’re seeing the effects of advances in AI technology, and Anthropic’s recent Claude Code Security announcement is no exception. The market, volatile as it is these days, seems to read this as a turning point: several major SaaS security stocks fell on the day of the announcement.
This use case for AI isn’t new. However, there is an army of new companies and products that aim to use LLMs for scanning code, attacking applications, and recommending fixes, all in the same pipeline. Where does the hype meet reality? Let’s dig in.
First, it’s important to understand that the dip in security stocks was not direct cause and effect: Claude Code won’t replace all your security investments. Anthropic’s new offering is not a direct competitor to CrowdStrike or Palo Alto Networks, yet those stocks still fell after the announcement. More likely, this is about available capital and how it will be allocated. We’ve been seeing budget compression across all of security over the last few years, and investors appear to be waking up to potential convergence, which is driving that compression even further.
AI-powered scanning tools like those from Anthropic, Aikido, XBOW, and every other highly valued startup that will be hitting the RSA floor in just a couple of weeks have one thing in common: They are all using LLMs to attempt to replicate, with machines, the human creativity and novelty inherent in vulnerability research.
These AI tools are great at finding known vulnerabilities because they were trained on that data. But they can also find classes of vulnerabilities they’ve been trained on, both in source code and through DAST-like fuzzing. In my own experience, these technologies are great at uncovering the low-hanging-fruit vulnerabilities that don’t require context to understand.
I was playing with the Claude Code Security implementation, and once a scan finishes, it runs a second skill or agent that reviews the output and checks for issues such as hallucinations and false positives: essentially, the AI checks its own homework. When it works, it works really well, but there are disadvantages inherent in how these tools are built.
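The scan-then-self-review flow described above can be sketched roughly like this. To be clear, `scan_pass` and `review_pass` are hypothetical stand-ins for the real LLM calls (and the finding names and fields are made up); they are stubbed with canned data here so the control flow itself is runnable:

```python
# Hypothetical sketch of a two-pass "scan, then self-review" pipeline.
# scan_pass and review_pass stand in for actual LLM calls; both are
# stubbed with canned data so the control flow can run as-is.

def scan_pass(source: str) -> list[dict]:
    """First pass: produce candidate findings (stubbed LLM call)."""
    return [
        {"id": "F1", "rule": "sql-injection", "line": 42,
         "evidence": "query = base + user_input"},
        # No supporting evidence: a likely hallucination for the
        # review pass to catch.
        {"id": "F2", "rule": "hardcoded-secret", "line": 7,
         "evidence": None},
    ]

def review_pass(source: str, findings: list[dict]) -> list[dict]:
    """Second pass: an independent agent re-checks each finding and
    drops any it cannot substantiate (stubbed as an evidence check)."""
    return [f for f in findings if f["evidence"] is not None]

source = "..."  # code under review (placeholder)
verified = review_pass(source, scan_pass(source))
print([f["id"] for f in verified])  # only substantiated findings survive
```

The design point is simply that the reviewer is a separate pass with its own prompt and context, so it can reject output from the first pass rather than rubber-stamp it.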
Where these types of AI scanning products start to fall apart is in repeatability. By the nature of LLMs, these tools are non-deterministic: every time you run them, they may approach the problem differently and produce different results. One researcher documented this by running a scan over 50 times and plotting which vulnerabilities each run identified. Long story short, sometimes the tool found all the issues, and other times it missed some here and there. Humans aren’t much different, but we’ve learned to balance the art and science of vulnerability research through standardized methodologies, coverage checklists, and peer review. This is also why traditional SAST, DAST, and IAST vendors are not going anywhere: Their value-add is showing progress over time in a predictable, repeatable way. It’s also important to understand the unpredictable cost of AI-powered tools, because token consumption is just as non-deterministic as the results themselves.
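A repeatability experiment like the one described above can be sketched in a few lines. The scanner here is simulated: `run_scanner` randomly misses findings at made-up per-finding rates, standing in for invoking the real tool on each iteration, and the finding names are illustrative only:

```python
# Sketch of measuring run-to-run repeatability for a non-deterministic
# scanner. run_scanner simulates a tool that misses findings at
# (made-up) per-finding rates; a real harness would invoke the actual
# scanner on each iteration and parse its report.
import random
from collections import Counter

KNOWN_FINDINGS = ["sqli-login", "xss-search", "idor-orders", "ssrf-webhook"]

def run_scanner(seed: int) -> set[str]:
    """One simulated scan: each finding is detected probabilistically."""
    rng = random.Random(seed)
    rates = {"sqli-login": 0.95, "xss-search": 0.90,
             "idor-orders": 0.60, "ssrf-webhook": 0.40}
    return {f for f in KNOWN_FINDINGS if rng.random() < rates[f]}

RUNS = 50
hits = Counter()
for i in range(RUNS):
    hits.update(run_scanner(i))

# Per-finding detection rate across all runs: anything well below
# RUNS/RUNS is a finding the tool only catches sometimes.
for finding in KNOWN_FINDINGS:
    print(f"{finding}: detected in {hits[finding]}/{RUNS} runs")
```

Tracking detection rate per finding, rather than a single pass/fail per run, is what makes the inconsistency visible.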
So can we have our cake and eat it too? Using these tools helps us cover more bases, find more bugs, and fix faster—but we have to be able to show real risk reduction to leadership, not just some low-severity bugs fixed.
Combining the power of AI with that of traditional scanners, along with the human ingenuity and context of highly skilled pentesters, is the way forward. In fact, our research shows that elite pentesters like those found in our Cobalt Core are remarkably good at finding the complex vulnerabilities that automated pentesting misses: 54% of these human pentesters have discovered a zero-day in their career. That helps explain why just 1% of our Core believes that autonomous/AI pentesting is the most effective model for finding the highest-risk flaws, while 58% say PTaaS is the gold standard for finding those vulnerabilities, many times the share who picked bug bounties or AI.
Thankfully, that’s what we do at Cobalt with our human-led, AI-powered offensive security platform and services.