Back to Basics: How to Build Resilient Blue Teams

Building strong defensive cyber capabilities isn’t easy. Even established organizations often struggle to maintain effective security programs with the resources available. It’s even harder to build and scale a defensive cyber capability that can ‘keep up’ with the needs of an organization going through large-scale change. Coping with big business pivots can be done, though — with careful planning and execution. In this guide, we’ll explain how to strengthen your defensive cyber capabilities (otherwise known as blue teams) so they are up to the task of building resilience and maintaining high security standards. But first, a definition:

‘A blue team is a group of individuals who perform an analysis of information systems to ensure security, identify security flaws, verify the effectiveness of each security measure, and to make certain all security measures will continue to be effective after implementation.’ — Wikipedia

The term blue team originates from the military, as does its opposite, the red team. The blue team/red team approach was historically used for physical security, but as with many military concepts, it translates well to cybersecurity.

Blue teams work to identify threats and defend against them. In a cyber context, blue teams manage everything from preventative technologies like firewalls and EDR tools to vulnerability management, incident response, security operations, and more.

Red teams emulate an attacker or threat to test the efficacy of defensive measures. This is known as ‘force preparedness’ — testing prevention and response capabilities to make sure they fully address a given threat.

When blue teams and red teams are brought together, you get… a purple team.

A purple team is an exercise that involves both teams working closely together to maximize each other’s strengths and capabilities to drive improvement. It’s a constant feedback loop. Knowledge transfer makes both teams better and more efficient. But that’s a subject for another time. First, you need a blue team that can safeguard your organization — both now and in the future.

In times of business transformation, going back to basics can keep your team grounded and focused on what matters most in the current moment. While rapid change can push teams to “wing it,” that approach cannot scale and withstand pressure for prolonged periods of time.

In this section, we’ll look at the components of a strong blue team foundation, and where to focus if your team is getting swept up in too many business shifts:

Ticketing System
Preventative Security
Vulnerability Management
Incident Response
Operational Visibility
Security Monitoring Playbooks

Components of a Strong Blue Team

1. Ticketing System

Many small teams work without a formal ticketing system. This is a big mistake. It leaves them with no way to track or evidence workload, and makes it hard to identify and learn from mistakes. A ticketing system will help your team:

Log and track all incoming work. This is critical for resource management and planning.
Track the origin and content of requests, so you know where the workload is coming from.
Define categories of requests and incidents. This helps identify the time spent on each activity, and what value it creates.

As you record and track this information, you can also use it to make decisions about future actions. How does this help you scale? In two important ways:

Demonstrating ROI — Technology investment is a natural part of growth for a blue team. With a ticketing system, it becomes much easier to prove the ROI of investments because you can demonstrate precisely how much resource time is saved.

Identifying Resource Issues — A ticketing system provides insight into how much work is being done and how much is outstanding. Most teams don’t have enough time to do everything, and metrics are essential to evidence the need for increased resources as the organization grows.

Workload metrics also go a long way to demonstrating maturity in the security function. The more ‘on the ball’ your team appears, the more seriously the organization will take you.
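To make the idea of workload metrics concrete, here is a minimal sketch of the kind of summary a ticket export could feed. The `Ticket` fields, the helper names, and the sample data are all assumptions made up for illustration; a real ticketing system will have its own schema and reporting tools.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Ticket:
    """One unit of logged work. All field names here are illustrative."""
    source: str      # who raised the request, e.g. "helpdesk"
    category: str    # request or incident type, e.g. "phishing"
    hours: float     # analyst time spent
    is_open: bool    # True while the work is still outstanding


def hours_by(tickets, field_name):
    """Total analyst hours grouped by a ticket attribute ("source" or "category")."""
    totals = defaultdict(float)
    for ticket in tickets:
        totals[getattr(ticket, field_name)] += ticket.hours
    return dict(totals)


def open_backlog(tickets):
    """Number of tickets still outstanding."""
    return sum(1 for ticket in tickets if ticket.is_open)


# Made-up sample data, not real figures.
tickets = [
    Ticket("engineering", "access request", 1.5, False),
    Ticket("helpdesk", "phishing", 0.5, False),
    Ticket("helpdesk", "phishing", 1.0, True),
    Ticket("sales", "vendor review", 4.0, True),
]

print(hours_by(tickets, "category"))  # where the time goes
print(hours_by(tickets, "source"))    # where the workload comes from
print(open_backlog(tickets))          # how much is outstanding
```

Tracked month over month, totals like these are the evidence behind both an ROI claim (hours in a category before and after a tool rollout) and a request for more headcount (a backlog that keeps growing).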

2. Preventative Security

Preventative security is important for any blue team. For smaller teams that don’t have the capacity to chase down every threat, it’s crucial.

Take a pragmatic approach to preventative security controls. Remember: you can’t buy everything, so you need maximum bang for your buck. In business terms, that means being able to demonstrate genuine ROI.

When faced with a shiny new tool, the question to ask is: “Does this address a genuine threat that our organization faces?” Unless the answer is “Absolutely, definitely,” it’s a bad investment.

For most organizations, strong preventative security controls include:

Email Threat Detection — Phishing is still the #1 data breach and network intrusion threat across all industries. For a small team that can’t respond to every malicious email that hits a user inbox, filtering is vital.
Endpoint Protection — You know the drill. There are dozens of antivirus, firewalls, and EDR technologies on the market. Identify a combination that meets the specific needs of your organization… and doesn’t break the bank.
Multi-Factor Authentication (MFA) — Controlling access to cloud and on-premise environments using something stronger than passwords will disarm a lot of threats. MFA is light on resource costs and drastically improves security maturity.

Remember, you must consider which threats are native to your industry, vertical, and physical location. While some threats are universal, others aren’t — and you must be prepared for the threats you’re most likely to face.

How does this work as you scale? Simple: When investing in a new tool, make sure you’re fully operationalizing it before moving on. Equally, take the time to track and report on the ROI of your investment. Do this consistently and each time you expand your preventative security capabilities, you’ll maximize the benefit gained.

One final note: Avoid the pitfall of security through tool purchase. It’s difficult for small teams to handle lots of tools, and they can easily become a burden. Be specific and methodical about investing resources.

3. Vulnerability Management

In Cobalt’s State of Pentesting 2022 report, we analyzed data from over 2,300 pentests. For the fifth year in a row, misconfiguration of systems and services remained the top vulnerability category.

Not everything in cloud and data center environments is configured correctly. A vulnerability management (VM) program must be able to pick this up. A strong VM program should include:

Asset scanning to ensure the team is aware of everything they should be monitoring.
Vulnerability scanning to identify known issues in hardware and software assets.
Pentesting web and mobile applications, APIs, networks, etc. to make sure everything is watertight.

Of course, everybody knows they need these three functions. What people tend to forget is that VM programs need one more critical component: stakeholder buy-in and SLAs for patching and remediation.

Vulnerabilities won’t patch themselves. Once the VM team has identified issues that need fixing, they must have relationships and communication channels in place to make sure the remediation work gets done. For VM, scaling is all about automation. Automating components of stakeholder tickets and asset discovery can save engineers a lot of work, and will also improve outcome consistency.
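As one way to picture remediation SLAs, the sketch below computes a due date for each finding from its severity and lists the open findings that have missed it. The SLA windows, the finding fields, and the sample findings are hypothetical; the real values should come from whatever you agree with stakeholders.

```python
from datetime import date, timedelta

# Hypothetical remediation windows in days; agree the real values with stakeholders.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def due_date(found_on, severity):
    """Date by which a finding should be remediated under the SLA."""
    return found_on + timedelta(days=SLA_DAYS[severity])


def overdue(findings, today):
    """Return the open findings whose SLA due date has passed."""
    return [
        f for f in findings
        if not f["fixed"] and due_date(f["found_on"], f["severity"]) < today
    ]


# Made-up findings for illustration.
findings = [
    {"id": "F-1", "severity": "critical", "found_on": date(2024, 1, 1), "fixed": False},
    {"id": "F-2", "severity": "high", "found_on": date(2024, 1, 1), "fixed": False},
    {"id": "F-3", "severity": "critical", "found_on": date(2024, 1, 1), "fixed": True},
]

late = overdue(findings, today=date(2024, 1, 15))
print([f["id"] for f in late])
```

A check like this is a natural candidate for automation: run it on a schedule and open or update stakeholder tickets for anything it returns.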

4. Incident Response

Security incidents are inevitable. It’s not about IF they happen. It’s about WHEN they happen, and HOW you respond and communicate.

Getting incident response right as a growing blue team boils down to four things:

1. Process and Documentation

For incident response, consistency is key — incidents must be handled in precisely the same way every time. The most important steps to achieving this are:

Document your IR policy
Have detailed handler checklists
Train all incident handlers thoroughly

2. Post Mortem

Never let a security incident go to waste. Whether your response effort was successful or not, learn from every incident and work to identify and address the root cause. Don’t rely on ‘band-aid’ solutions that only address the immediate problem.

3. Test Response Capabilities

Table-top exercises are an excellent way to determine the effectiveness of your IR capabilities. Bring all stakeholders together, provide a fictional security incident, and ask them to work through it. This technique will help identify misunderstandings between stakeholders and clear up any bad assumptions about who is responsible for each stage of the response effort.

4. Notification and Communication

Even the best processes fall apart if communication is lacking. To combat this, have a formal procedure for communicating with stakeholders and notifying them of predetermined events. Your procedure should leave no room for interpretation: if a stakeholder needs to know something, they should be notified immediately.
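A notification procedure that leaves no room for interpretation can be as simple as a lookup table. The sketch below is one possible shape, with event names and recipient groups that are purely hypothetical.

```python
# Hypothetical mapping of predetermined events to the stakeholders who must be told.
NOTIFY = {
    "customer_data_exposed": ["legal", "ciso", "customer_support"],
    "production_outage": ["engineering_oncall", "ciso"],
    "employee_account_compromised": ["it_helpdesk"],
}

# Fallback for any event type that has no explicit rule yet.
DEFAULT_RECIPIENTS = ["security_oncall"]


def recipients_for(event_type):
    """Return who must be notified for an event; unknown events go to the fallback."""
    return NOTIFY.get(event_type, DEFAULT_RECIPIENTS)


print(recipients_for("production_outage"))
print(recipients_for("something_unexpected"))
```

Keeping the mapping in one reviewed place means handlers never have to decide mid-incident who should be told.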

Scaling incident response is mainly about adding automation to the triage and response activities. Responding to many incidents will initially require a lot of manual, repetitive work, and replacing this process with automation frees up resources and drives maturity within the security function.

5. Operational Visibility

Visibility is an essential precursor to incident response. If you can’t identify when something is wrong in your environment, you can’t respond to it.

The best way to ensure operational visibility is to work backward. You can do this by:

Determining which threats are most pressing for your business.
Identifying the data sources most likely to indicate an incident is occurring (e.g., a system is compromised, or data is being exfiltrated out of your environment).

The table below shows four common sources of security data and maps them to the specific security incidents they could indicate.

Data Source	Environment	Category	Value
AWS CloudTrail/GCP Cloud Audit Logs	Production	Product Environment	Resource/account abuse
Host Logs	Production & Corporate	Product Environment	System compromise/abuse
G Suite/Salesforce	Corporate	Critical Employee Services	Account compromise/data loss prevention
EDR / IDS	Production & Corporate	Security	Security event monitoring
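Working backward from a threat to its data sources can be sketched as a simple lookup over the table above. The rows are copied from the table; the `sources_for` helper and its keyword matching are an illustrative assumption, not a prescribed method.

```python
# Rows copied from the table above: (data source, environment, category, value).
DATA_SOURCES = [
    ("AWS CloudTrail/GCP Cloud Audit Logs", "Production", "Product Environment",
     "Resource/account abuse"),
    ("Host Logs", "Production & Corporate", "Product Environment",
     "System compromise/abuse"),
    ("G Suite/Salesforce", "Corporate", "Critical Employee Services",
     "Account compromise/data loss prevention"),
    ("EDR / IDS", "Production & Corporate", "Security",
     "Security event monitoring"),
]


def sources_for(keyword):
    """Data sources whose 'Value' column mentions the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [name for name, _env, _cat, value in DATA_SOURCES if keyword in value.lower()]


print(sources_for("compromise"))
print(sources_for("abuse"))
```

If a threat you care about returns an empty list, that is a visibility gap worth closing before an incident exposes it.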

Turning data into action takes more than an understanding of your environment: it requires a centralized solution to collect and analyze data. For most organizations, this is a Security Information and Event Management (SIEM) tool.

A SIEM is a critical tool for blue team scaling because it’s the only realistic way to aggregate and learn from the huge volume of security data being produced. To get maximum value from your investment, take the time to study and understand the data you’re collecting and what you can do with it.

6. Security Monitoring Playbooks

Every security team has piles of data and alerts. That’s not enough. What you need is a consistent approach and response to it. A security monitoring playbook should go well beyond a standard process document. Instead of focusing purely on process, it should focus on:

Expected results — what outcome do you want to achieve?
Data and queries used — which dataset will be used, and which specific query will be run?
What needs to be done — What actions will ensure the desired result every time?

Once you have a playbook in place, it should be routinely monitored for effectiveness and tuned accordingly. To help you scale, automate repetitive manual tasks to free up resources.
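One way to keep playbooks consistent is to give them a fixed structure and check that every element is filled in. The sketch below assumes a hypothetical `Playbook` record built around the three elements listed above; the example playbooks, dataset name, and query text are placeholders for whatever your SIEM actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Playbook:
    """One monitoring playbook. Field names are illustrative, not a standard."""
    name: str
    expected_result: str = ""   # the outcome you want to achieve
    dataset: str = ""           # which data the query runs against
    query: str = ""             # the specific query to run
    actions: list = field(default_factory=list)  # steps that ensure the result

    def missing_fields(self):
        """Names of required fields that are still empty."""
        required = ("expected_result", "dataset", "query", "actions")
        return [name for name in required if not getattr(self, name)]


# Hypothetical examples; the query is plain-language pseudocode, not SIEM syntax.
complete = Playbook(
    name="Impossible-travel logins",
    expected_result="Compromised accounts are locked within one hour",
    dataset="sso_login_events",
    query="logins from two countries by the same user within 60 minutes",
    actions=["Confirm with the user", "Reset credentials", "Review session activity"],
)
draft = Playbook(name="Unusual data export", expected_result="Exfiltration is caught early")

print(complete.missing_fields())
print(draft.missing_fields())
```

A completeness check like this is easy to run across every playbook during a periodic review, which helps surface drafts that were never finished.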

But what if you don’t have the capacity for all this? What if you’re so busy fighting fires you don’t have time to develop strong playbooks?

Many teams find themselves in this situation, and the answer is: If you lack resources, find an MSSP to temporarily hold down the fort while you get your house in order.

What about Compliance?

Compliance is often a driver (although generally not the driver) for blue team development. For example, a customer might insist that your organization get SOC 2 or ISO 27001 certified before they’ll work with you.

Depending on the nature of your organization, you may even need to be compliant with multiple frameworks. Fortunately, there’s a lot of overlap between frameworks, so this shouldn’t add too much additional burden. Beyond this, there are two things you need to realize about compliance:

Choosing and adhering to a major framework is essential, and not just for legal reasons; and,
Being compliant is NOT the same thing as being secure.

Once you’re compliant with a major framework, you’ve proven you can conform to a set of rules and you have some basic controls in place. Once you’ve hit this stage, you’ll need to take things up a level before you can really consider your organization secure.

OpSec: Protect What’s Most Vulnerable

As the leader of a blue team, you must know where your organization is most vulnerable. Start by identifying the critical information and systems you need to protect and work backward from there.

To do this, use threat modeling — a process for identifying and enumerating threats, and finding ways to mitigate them. A number of frameworks are available that can aid this process:

OWASP can help identify threats, vulnerabilities, and risks.
STRIDE is useful for classifying attacker objectives.
DREAD helps to rate, compare, and prioritize risks based on severity.
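To illustrate how DREAD supports prioritization, here is a minimal sketch that averages the five DREAD ratings (Damage, Reproducibility, Exploitability, Affected users, Discoverability) and ranks threats by the result. The 1 to 10 scale is a common convention assumed here, and the threats and ratings are invented for the example.

```python
from statistics import mean

DREAD_FACTORS = (
    "damage",
    "reproducibility",
    "exploitability",
    "affected_users",
    "discoverability",
)


def dread_score(ratings):
    """Average of the five DREAD ratings (assumed to be on a 1-10 scale)."""
    missing = [factor for factor in DREAD_FACTORS if factor not in ratings]
    if missing:
        raise ValueError(f"missing DREAD ratings: {missing}")
    return mean(ratings[factor] for factor in DREAD_FACTORS)


# Invented threats and ratings, for illustration only.
threats = {
    "SQL injection in login form": {
        "damage": 9, "reproducibility": 8, "exploitability": 7,
        "affected_users": 9, "discoverability": 7,
    },
    "Verbose error pages": {
        "damage": 3, "reproducibility": 9, "exploitability": 5,
        "affected_users": 4, "discoverability": 9,
    },
}

# Highest-scoring threats first.
ranked = sorted(threats, key=lambda name: dread_score(threats[name]), reverse=True)
for name in ranked:
    print(name, dread_score(threats[name]))
```

The ratings themselves are judgment calls, so the value of the exercise lies less in the exact numbers than in forcing a consistent comparison across threats.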

Stakeholder Management

Blue teams don’t exist in isolation. They need support to get things done, and that requires buy-in from stakeholders. After all, those vulnerabilities you found aren’t going to fix themselves. To help this process along, do your best to identify stakeholders preemptively.

Find out what they are responsible for and what their challenges are. Even better, find a way to help them achieve their objectives. The more you can turn stakeholder relationships into mutually supportive arrangements, the more chance your team will have of achieving its objectives.

Other tips for building and maintaining strong stakeholder relationships include:

Be clear about what’s needed and when. Don’t leave room for interpretation, and ask stakeholders to be very clear about what they need from your team.
Understand resource planning cycles. Don’t expect stakeholders to drop everything to accommodate your needs. Instead, have your needs included in routine resource planning. E.g., make sure stakeholders are aware of when routine vulnerability scans are run, and when you’ll be asking them for assistance to remediate.
Come armed with realistic estimates. Stakeholders want to know the bottom line — what effort and resources are needed to fulfill your requests? Be open and realistic.
Have commitments and SLAs with stakeholders. This is a two-way arrangement. Having commitments in place makes it much more likely you’ll consistently receive the support you need.
Agree on reporting and success metrics. Sharing the workload means sharing the credit. Sharing successes aids stakeholder relationships, and helps all parties demonstrate value to leadership.

Structuring Blue Teams for Resilience

So long as they have stakeholder buy-in, small, flat security teams work well early on. As the organization and team grow, you should consider splitting up teams to maintain focus and effectiveness. And physical structure isn’t the only consideration. Here are three more things to keep in mind:

Create alignment between your goals and the organization’s goals.

You do not want to be seen as a cost center or a hindrance. Good stakeholder management is important, but it’s equally important to be seen to advance the organization’s objectives instead of holding them back.

Have a strategy to train and keep your security talent.

If you’ve ever hired a cybersecurity professional, you’ll be painfully aware of how scarce they are. Once you have a team in place, it’s vital that you provide training and keep them engaged, challenged, and happy.

Accountability is critical to a growing team’s success.

One or two underperforming members can damage the success of a blue team. Holding people to account might seem unpleasant (and it can be), but it’s critical to the success of the team that you identify and address shortcomings before they can undermine your operations.

Measure and Report on Progress

In the beginning, evidencing improvements in a blue team is easy. It’s all about net-new capabilities — you implement a new system or process, and you get a drastically improved outcome. Sadly, this type of exponential improvement doesn’t last very long. Once your team starts to scale, improvements become gradual. They require activities such as:

Careful collection and analysis of data to identify areas for improvement.
Tracking maturity of existing capabilities and processes.
Assessing cyber risk at least annually, and reporting on risk burn-down.

While less dramatic than the wins you celebrated early on, these activities demonstrate to leadership that you have a mature security function. And as you track the performance of your blue team, you’ll keep identifying areas for improvement.

Key Takeaways

If you don’t have time to read the entire guide (or you need a quick refresher), here are the main learning points:

Blue teams are the lifeblood of cybersecurity. But successfully scaling a blue team to meet the needs of a rapidly growing organization is far from easy.

A focus on security basics provides the foundation for future growth. Security basics are critical no matter how your organization changes.

Take a methodical and pragmatic approach to security to avoid fatigue and ineffectiveness. You can’t do everything, so don’t try.

Invest in data collection and analysis to help you make informed decisions on program effectiveness, staffing, and ROI.

Invest time and budgetary resources in automation to free up blue team members for more important work.

Nurture relationships with stakeholders. Your blue team will be nothing without them.

Learn more about the benefits of a Pentest as a Service (PtaaS) platform to empower your security team.

About Caroline Wong

Related readings

Never miss a story

This is a title

This is a title

This is a title