MSSP Metrics: Key Risk Indicators (KRIs)

Key risk indicators (KRIs) are used to measure future adverse impacts of events and activities. They are widely used in areas such as healthcare, operations, and disaster risk management. KRIs use existing system and security sensor data to calculate residual risk due to IT operations.

The inputs are similar to a combination of SIEM, GRC, and threat intelligence systems; the output is continuous, objective, actionable metrics. With easy-to-understand and security-posture relevant metrics, technology leaders can design measurable goals and communicate the status and health of security operations to business leaders for decision making purposes.

A platform that supplies KRIs approaches risk measurement differently from traditional systems:

  • Observed Behaviors Across the Enterprise. The platform collects information on actual events observed in the enterprise – not theoretical “possibilities” based on predictive analytics. The concept of false positives does not exist because only real, live events reported by sensors are observed and go into the calculation of a risk indicator.
  • Residual Risk. Even with all cyber defenses in optimal configuration, risk factors ebb and flow throughout enterprise systems. The platform collects evidence and calculates risk factors, tracking and reporting on residual risk. This is the risk that really matters, and not, for example, theoretical risk due to “intelligence” of activity on the internet.
  • Normalization, Quantification, and Context. The platform applies machine learning and advanced statistical analysis to determine what is normal and what is important, and it provides context around both in the form of calculations or reports.
  • Continuous, Objective. Audit results tell a story of compliance around a point in time. Manually compiled reports are unreliable. KRIs measure risk and activities in real time directly from sensor outputs, offering a continuous and consistent view of activities independent of interpretations, audit schedules or quarterly reports.
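One small, concrete piece of the normalization step is putting severities from different sensors on a common scale. The sketch below is only an illustration: the sensor names, their native scales and the values assigned to text labels are all assumptions, not something a particular platform prescribes.

```python
# Hypothetical native severity ranges per sensor type (assumed for illustration).
SEVERITY_SCALES = {"ids": (1, 5), "scanner": (0.0, 10.0)}

# Hypothetical mapping for sensors that report text labels instead of numbers.
LABELS = {"low": 0.25, "medium": 0.5, "high": 0.75, "critical": 1.0}


def normalize_severity(sensor, raw):
    """Map a sensor-specific severity onto a shared 0.0 to 1.0 scale."""
    if isinstance(raw, str):
        return LABELS[raw.lower()]
    low, high = SEVERITY_SCALES[sensor]
    return (raw - low) / (high - low)


readings = [("ids", 3), ("scanner", 7.5), ("edr", "High")]
normalized = [normalize_severity(sensor, raw) for sensor, raw in readings]
print(normalized)
```

Once every reading sits on the same scale, counts and averages across sensors become comparable, which is what makes a rolled-up indicator possible.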

KRIs can “roll up” the stream – which is more like a fire-hose – of sensor data into easy-to-understand metrics:


Vulnerabilities

What vulnerabilities are you susceptible to, and what steps do you need to take to resolve those issues?

The key here is not to bludgeon or overwhelm your customers with problems. There might be dozens of new vulnerabilities being discovered on a daily basis, but only one or two of them are actually relevant to the customer’s critical IT environment. (On the other hand, if all of them really are noteworthy, then you have bigger issues to deal with.)
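As a rough sketch of that filtering, the example below reduces a feed of newly published vulnerabilities to the ones that affect software running on a customer's critical assets. The identifiers, product names, hostnames and the severity cut-off of 7.0 are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    vuln_id: str      # hypothetical identifier, not a real CVE
    product: str
    severity: float   # assumed 0 to 10 scale


@dataclass(frozen=True)
class Asset:
    hostname: str
    products: frozenset
    critical: bool


def relevant_vulnerabilities(feed, assets, min_severity=7.0):
    """Return (vulnerability, affected critical hostnames) pairs worth reporting."""
    results = []
    for vuln in feed:
        if vuln.severity < min_severity:
            continue
        hosts = sorted(
            a.hostname for a in assets if a.critical and vuln.product in a.products
        )
        if hosts:
            results.append((vuln, hosts))
    return results


feed = [
    Vulnerability("VULN-001", "webserver-x", 9.8),
    Vulnerability("VULN-002", "pdf-reader-y", 8.1),
    Vulnerability("VULN-003", "webserver-x", 4.3),
]
assets = [
    Asset("billing-db-01", frozenset({"database-z"}), critical=True),
    Asset("web-01", frozenset({"webserver-x"}), critical=True),
    Asset("kiosk-07", frozenset({"pdf-reader-y"}), critical=False),
]
report = relevant_vulnerabilities(feed, assets)
for vuln, hosts in report:
    print(vuln.vuln_id, hosts)
```

Of the three new vulnerabilities in this made-up feed, only one reaches the customer report: the other two are either low severity or sit on a noncritical host.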

Threat Intelligence

What threats in the wild are you susceptible to, and what steps do you need to take to resolve those issues?

Your approach to threat intelligence should be very similar to how you approach vulnerabilities. Customers should be able to easily answer questions such as:

  • Is there any evidence that I have been breached by one of the well-known threats?
  • Does my MSSP regularly conduct threat-hunting missions in my environment? How many of these missions have occurred in the past week or month?
  • Is my MSSP finding evidence of data breaches or attacks against my industry peers?

New Threats

What are the most recent threats to appear in the cyber security landscape?

One extremely important IT security metric is the number of new threats that the customer has faced recently. If this figure is steadily increasing or has seen a rapid spike from normal levels, it is a strong indication that the customer is on the receiving end of a targeted attack, or will be in the near future.

New threats are often more dangerous because clients and MSSPs likely do not have a prescription ready to handle them.
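Here is a minimal sketch of how you might flag either pattern, a rapid spike or a steady climb, from weekly counts of new threats. The three-sigma rule, the floor on the spread and the four-week run length are assumptions chosen for illustration; tune them to your own data.

```python
from statistics import mean, pstdev


def is_spike(history, latest, k=3.0):
    """True if the latest count sits well above the historical baseline."""
    baseline = mean(history)
    # Floor the spread at 1.0 so a very flat history does not flag tiny changes.
    spread = max(pstdev(history), 1.0)
    return latest > baseline + k * spread


def is_steadily_increasing(counts, min_run=4):
    """True if the last `min_run` counts are strictly increasing."""
    recent = counts[-min_run:]
    return len(recent) == min_run and all(b > a for a, b in zip(recent, recent[1:]))


weekly_new_threats = [3, 4, 2, 5, 3, 4]
print(is_spike(weekly_new_threats, latest=12))
print(is_spike(weekly_new_threats, latest=5))
print(is_steadily_increasing([2, 3, 5, 8]))
```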

Defense Effectiveness

Are attempts to shut down a threat or eliminate a vulnerability effective?

When you try to patch a security flaw, you need to know that you have resolved the problem and that it will not continue to resurface every few days or weeks. Just like treating an illness, successfully handling cyber security issues means identifying and removing the root cause of the matter, instead of just addressing the symptoms.
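One way to put a number on this is a recurrence rate: of the findings you have resolved, what share came back? The sketch below assumes a simple, hypothetical event log of (finding_id, status) pairs in chronological order; real ticketing or scanner data will need mapping into that shape.

```python
def recurrence_rate(events):
    """Share of resolved findings that were detected again after resolution.

    `events` is a chronological iterable of (finding_id, status) pairs,
    where status is "detected" or "resolved" (an assumed format).
    """
    resolved_once = set()
    recurred = set()
    for finding_id, status in events:
        if status == "resolved":
            resolved_once.add(finding_id)
        elif status == "detected" and finding_id in resolved_once:
            recurred.add(finding_id)
    if not resolved_once:
        return 0.0
    return len(recurred) / len(resolved_once)


log = [
    ("F1", "detected"), ("F1", "resolved"),
    ("F2", "detected"), ("F2", "resolved"),
    ("F1", "detected"),   # F1 resurfaces: a symptom was treated, not the cause
    ("F3", "detected"),
]
rate = recurrence_rate(log)
print(rate)
```

A rate that stays high suggests fixes are addressing symptoms rather than root causes.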

Severity and Velocity

Is your environment getting better or worse in terms of the pace and intensity of threats?

Of course, any company above a certain size will be the target of probes and attempted attacks from malicious actors, and many of them will already have fallen victim to a breach. However, an increase in the leading indicators of attack activity, such as the pace or severity of events, is a clear sign of a targeted attack or a campaign focused on the client’s industry.

Metrics that show severity and velocity will allow you to easily pinpoint the concentration of this kind of attack, allowing you to react quickly.
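As an illustration, the sketch below compares two back-to-back time windows and reports how velocity (events per hour) and average severity have changed. The event format, the 24-hour window and the 1 to 5 severity scale are assumptions made for the example.

```python
def window_metrics(events, start, end):
    """Return (events per hour, mean severity) for events with start <= t < end."""
    severities = [sev for t, sev in events if start <= t < end]
    velocity = len(severities) / (end - start)
    avg_severity = sum(severities) / len(severities) if severities else 0.0
    return velocity, avg_severity


def trend(events, now, window=24):
    """Compare the most recent window with the one immediately before it."""
    prev_velocity, prev_severity = window_metrics(events, now - 2 * window, now - window)
    curr_velocity, curr_severity = window_metrics(events, now - window, now)
    return {
        "velocity_change": curr_velocity - prev_velocity,
        "severity_change": curr_severity - prev_severity,
    }


# (hour offset, severity on an assumed 1 to 5 scale)
events = [(1, 2), (10, 2), (20, 3), (25, 4), (30, 4), (40, 5), (44, 5), (47, 4)]
result = trend(events, now=48)
print(result)
```

Positive values for both changes mean the environment is getting worse on pace and intensity at the same time, which is the pattern worth escalating.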

Surface Area

Is there a dramatic increase or decrease in the number of hosts showing activity on your network? Are there any major critical or high events against your critical monitored assets?

A significant spike in the number of hosts showing defense activity, such as a tenfold increase from 100 to 1,000, could indicate that attackers are broadening their scope. Of course, it could also be due to a benign event, such as the acquisition of a new company and the merger of the two networks. Whatever the reason, such an anomaly needs to be identified and examined by security experts.

Another consideration for surface area is the location of your critical assets. You would expect that parts of the network such as the DMZ are more susceptible to would-be attackers trying to expose issues and create holes. These events are definitely important, but far more important would be something like a sudden increase in the number of high and critical events against your customer’s financial data. In this case, you should look into the problem immediately and find out what’s going on.
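Both questions can be sketched in a few lines. In the example below, the hostnames, the review thresholds (a doubling or halving of active hosts) and the severity labels are hypothetical; the point is only to show the shape of the two checks.

```python
from collections import Counter


def surface_area_change(previous_hosts, current_hosts):
    """Return (ratio of active hosts now vs. before, list of newly active hosts)."""
    prev, curr = set(previous_hosts), set(current_hosts)
    ratio = len(curr) / len(prev) if prev else float("inf")
    return ratio, sorted(curr - prev)


def critical_event_counts(events, critical_assets, levels=("high", "critical")):
    """Count high and critical events per critical asset."""
    hits = Counter(
        host for host, level in events if host in critical_assets and level in levels
    )
    return dict(hits)


last_week = [f"host-{i}" for i in range(100)]
this_week = [f"host-{i}" for i in range(1000)]
ratio, new_hosts = surface_area_change(last_week, this_week)
needs_review = ratio >= 2.0 or ratio <= 0.5   # assumed thresholds

events = [
    ("finance-db", "critical"), ("finance-db", "high"),
    ("dmz-web", "high"), ("finance-db", "low"),
]
counts = critical_event_counts(events, critical_assets={"finance-db"})
print(ratio, len(new_hosts), needs_review, counts)
```

The check cannot tell an attack from a merger; it only makes sure the anomaly reaches a security expert who can.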

Architectural Maturity

Are your tools working to identify, detect, protect, respond and recover as you need them to?

MSSPs might fill out an OWASP Cyber Defense Matrix for each client, to keep track of how well that customer’s security architecture is performing and how fully its compliance framework is being covered. This will give you a clear picture of where you are, what you are missing and how you will resolve it along the way.

Why FourV and OPAQ Joined Forces

I’m pleased to be writing this post as a member of OPAQ Networks, following the announcement today that FourV Systems has become part of the OPAQ family. Our two companies share a common focus — empowering MSPs and MSSPs with security automation to help them gain greater visibility and control while substantially simplifying the management of their customers’ network and security architecture.

FourV’s patented GreySpark solution provides continuous security metrics, compliance monitoring and reporting. And the OPAQ security-as-a-service platform integrates comprehensive enterprise-grade security capabilities with a private software-defined network backbone. Together, we’re delivering the single most effective and efficient tool that MSPs and MSSPs can use to:

  • Identify what security controls should be prioritized;
  • Manage and enforce best-of-breed network security controls; and
  • Demonstrate and communicate the value of security services to technical and non-technical decision makers

Beyond this natural technology “fit”, several other factors convinced FourV’s management that we could achieve goals more quickly as part of OPAQ.

OPAQ’s platform is built to address a market that we at FourV also believed is both underserved and critically important – the midsize enterprise. These companies often find challenges in applying the personnel and financial resources needed to acquire, deploy, and manage the type of security infrastructure required to properly fend off today’s advanced threats. OPAQ’s cloud platform levels the playing field, packaging their best-of-breed security platform in a way that is accessible for midsize enterprises while also making it simple for service providers to manage.

OPAQ’s leadership team and support teams are also extremely experienced in our space. Glenn Hazard and Ken Ammon certainly ‘get it’ when it comes to the intersection of business and technology needs of service providers and the midsize enterprises they support.

The FourV solution serves as a complementary addition to the OPAQ cloud platform. An assessment of the security operations performance and compliance maturity is often the first step MSPs and MSSPs need to take with their clients in order to provide trusted recommendations to reduce risk and exposure. We could not be happier that we are now a part of an organization whose platform enables those MSPs and MSSPs to meet the needs of their clients by giving them the ability to instantly deploy and manage enterprise grade security.


KPI v. KRI v. KCI: Key Cyber Security Indicators

Companies that have spent significant resources and money on managing their cyber security environment understandably want to know the results of all this expenditure. As such, it is important for Managed Security Service Providers (MSSPs) to be able to provide customers with some visibility into those results. However, results only tell you half the story. For instance, they may demonstrate that there was a breach, but, without significant forensic effort, will not necessarily provide the sequence of events or failures which led up to the compromise.

Organizations are complex and have many performance measures. Most have designated key performance indicators (KPIs) at various levels of the organization, which business management agrees are the most important metrics to monitor. They are designed to be leading indicators of business performance. Key risk indicators (KRIs) are similar in that they are leading indicators; however, rather than signal performance, they signal increased probability of events that have a negative impact on business performance. Then there are key control indicators (KCIs), which are closely related to KRIs in that they measure the effectiveness of risk controls.

Business managers use KPIs to show where things are going well or poorly and KRIs to indicate when the probability of the latter is increasing. KCIs are a measure of how well risk controls are performing. MSSPs can, and should, do the same using security data which is commonly available for most of their clients.

More on KPIs, KRIs and KCIs

You may hear these terms used interchangeably; however, they are distinct and should be treated differently in order to make them understandable.

  • Key performance indicator (KPI): Shows how the business is performing based on the goals and objectives leadership has set as well as the progress that is being made toward those goals. For security operations, this metric might be used in an effort to resolve open items or tackle a backlog of unresolved security investigations.
  • Key risk indicator (KRI): Measures the company’s level of risk, and how its risk profile changes over time. An example for security operations is to use metrics that measure the severity of threats and vulnerabilities being reported by sensors. Another example is to look at where in the security defensive chain events are happening (e.g. end-point-based events are more “risky” than firewall or WAF events). Finally, make sure you have a good understanding of the business role that the assets involved play. Security events that occur on critical assets present more risk than those on noncritical ones.
  • Key control indicator (KCI): Indicates how much control a company has over its environment and its level of risk, or how effectively a particular control is working. Putting this in context with IT security operations, a question to ask is whether you have the necessary controls across all areas of the business – for example, the NIST Cyber Security Framework functional areas (identify, protect, detect, respond and recover). Knowing that these functions have sufficient coverage throughout your defense in depth (devices, applications, networks, data and users) gives you a degree of confidence in your controls.
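To make the KRI description concrete, here is a minimal sketch that weights each event by its severity, by where in the defensive chain it was seen, and by the criticality of the asset involved. The weight values and field names are assumptions for illustration only; the one thing taken from the text is the ordering (end-point events count for more than firewall or WAF events, critical assets for more than noncritical ones).

```python
# Assumed weights: only their relative ordering comes from the text above.
SENSOR_WEIGHT = {"firewall": 1.0, "waf": 1.0, "endpoint": 3.0}
CRITICALITY_WEIGHT = {"noncritical": 1.0, "critical": 3.0}


def event_risk(event):
    """Weight one event by severity, sensor position and asset criticality."""
    return (
        event["severity"]
        * SENSOR_WEIGHT[event["sensor"]]
        * CRITICALITY_WEIGHT[event["asset_class"]]
    )


def kri_score(events):
    """Roll a batch of events up into a single risk indicator."""
    return sum(event_risk(e) for e in events)


events = [
    {"severity": 5, "sensor": "firewall", "asset_class": "noncritical"},
    {"severity": 5, "sensor": "endpoint", "asset_class": "critical"},
]
score = kri_score(events)
print([event_risk(e) for e in events], score)
```

Two events with the same raw severity end up a factor of nine apart, which is the kind of context a raw sensor count cannot convey.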

How to Use These Metrics

The interplay between the performance, risk and control metrics is the key feedback that an organization needs in order to be confident that investments in cyber security are appropriate. Now that we have defined the appropriate use for the individual metrics, let’s see some examples of how to apply them:

  • Risk is the probability of bad things happening multiplied by the business cost if they do. You can calculate an estimate of the probability by looking at the number, place (where in the defense in depth model) and severity of events measured by sensors. For the impact, or real cost, look at which hosts are involved. Are they where the crown jewels are kept, or more of an extra store-room full of old furniture? Faced with so much data, organizations can be afflicted with “analysis paralysis,” so simplify these measures into risk metrics everyone can understand.
  • Performance metrics are meant to show how efficient an organization is at accomplishing its mission. In cyber security, the mission happens to be risk mitigation. So performance is how well you manage your backlog of open security cases, time to resolution, etc. with respect to the staff and systems you have. There are significant parallels to customer support metrics in this category.
  • Controls mitigate risks and enable performance. In cyber security, technical (security sensors) and process controls are your bread and butter. They also generate the data that drive risk metrics and allow you to optimize performance. Compliance measures are your friend here. Measure your degree of coverage against a framework such as NIST CSF.
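The performance bullet above mentions backlog and time to resolution. A minimal sketch of both follows, assuming a hypothetical list of security cases with `opened` and `closed` timestamps (`closed` is `None` while a case is still open).

```python
from datetime import datetime
from statistics import mean


def hours_to_resolution(case):
    """Elapsed hours between opening and closing a case."""
    return (case["closed"] - case["opened"]).total_seconds() / 3600


def mean_time_to_resolution(cases):
    """Average resolution time in hours over closed cases, or None if none closed."""
    closed = [hours_to_resolution(c) for c in cases if c["closed"] is not None]
    return mean(closed) if closed else None


def open_backlog(cases, as_of):
    """Number of cases that were open at the given moment."""
    return sum(
        1
        for c in cases
        if c["opened"] <= as_of and (c["closed"] is None or c["closed"] > as_of)
    )


cases = [
    {"opened": datetime(2024, 1, 1, 9), "closed": datetime(2024, 1, 1, 13)},
    {"opened": datetime(2024, 1, 2, 9), "closed": datetime(2024, 1, 3, 9)},
    {"opened": datetime(2024, 1, 3, 10), "closed": None},
]
mttr = mean_time_to_resolution(cases)
backlog = open_backlog(cases, as_of=datetime(2024, 1, 4))
print(mttr, backlog)
```

As with customer support metrics, these numbers mean most when tracked over time and read against the staff and systems available.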

Generating the metrics here seems like a daunting task at first. But, once you start simplifying and categorizing the measures, you will find that you can come to a reasonable set quickly. Then you need to automate their calculation. With experience, you will learn whether you’ve chosen the right KPIs and KRIs, and you can make adjustments as necessary. Getting started can be a challenge for MSSPs, but it’s 80 percent of the battle.

The most important thing to remember is that the statistics coming out of your cyber security systems are not KPIs, KRIs or KCIs. They are just data. Decide what performance, risk or control measures you need in order to clearly explain metrics of security operations to the business you support.

Test these on business managers to make sure they resonate, adjust and go again. The more consistent and transparent your measures, the more confidence your clients will have in their security investments.

Putting KPIs, KRIs and KCIs into Practice

On one hand, you have a large amount of security data – the proverbial big data problem. On the other hand, you need actionable output – a list of what to do now to transform your clients’ security programs into a high performance business driver. Metrics will guide your path to success, but generating consistent and reliable information security metrics is hard. So here are a few steps to get you started.

Step 1: Understand your Coverage, Operations, and Compliance Challenges

Security operations involves a set of functions being performed across a set of assets. The NIST Cyber Security Framework (CSF) provides a core list of the functions and the Cyber Defense Matrix from OWASP does a fine job of aligning those functions against a representative set of assets. Categorizing the deployed security products or processes in your client’s environment within the matrix will establish coverage and identify gaps in the program’s architecture.

Operationalizing the matrix by collecting, identifying and assigning the output data from your security products to each cell in the matrix shows evidence of operations and serves as your first step in addressing the ‘big security data’ problem.  Gaps between what you thought you had deployed and what actually shows up as evidence of operations will provide you with an immediate ‘to-do list’.  
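A minimal sketch of this step: build the matrix from what you believe is deployed, list the empty cells, then list the products that are deployed on paper but have produced no data. The five functions and five asset classes follow the lists used in this article; the product names and the `products_with_data` set are hypothetical.

```python
from itertools import product

FUNCTIONS = ["identify", "protect", "detect", "respond", "recover"]
ASSET_CLASSES = ["devices", "applications", "networks", "data", "users"]


def build_matrix(deployed):
    """Map each (function, asset class) cell to the products meant to cover it."""
    matrix = {cell: set() for cell in product(FUNCTIONS, ASSET_CLASSES)}
    for name, cells in deployed.items():
        for cell in cells:
            matrix[cell].add(name)
    return matrix


def coverage_gaps(matrix):
    """Cells with no product assigned: gaps in the architecture."""
    return sorted(cell for cell, names in matrix.items() if not names)


def silent_products(deployed, products_with_data):
    """Products believed deployed that show no evidence of operations."""
    return sorted(set(deployed) - set(products_with_data))


deployed = {
    "edr-agent": [("detect", "devices"), ("respond", "devices")],
    "firewall": [("protect", "networks")],
    "dlp-tool": [("protect", "data")],
}
matrix = build_matrix(deployed)
gaps = coverage_gaps(matrix)
silent = silent_products(deployed, products_with_data={"edr-agent", "firewall"})
print(len(gaps), silent)
```

The empty cells and the silent products together form the first to-do list.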

Applying a control framework (such as CIS Top 20, GDPR, or FFIEC) adds depth to each of the intersections by mapping specific security controls to both deployed security products and your client’s assets.  The resultant overlay identifies gaps in your compliance effort and your second ‘to-do list’.  When combined with your operational to-dos, the entire list can be mapped to a 30, 60, 90-day plan of action with key milestones. Wash, rinse and repeat for each of your lines of business or departments, and you now have a path for your journey.

Step 2: Measure your Efficacy 

With security products and processes deployed and more on the way as you move down your path, it is time to measure the effectiveness of each action and ensure its alignment with the business.  Recall that operationalizing the matrix served as the first step in solving the big data challenge by categorizing the data and applying business context through the assets in the matrix and each line of business or department.

Enriched with this context, the security data can now be normalized and analyzed to produce key metrics, or as we called them earlier, KPIs, KRIs and KCIs. Examples include the speed of new threats or vulnerabilities for KRIs, the treatment of symptoms or root causes for KPIs, or the reduction in defensive workload for KCIs.

With metrics in place, each to-do on your journey can be seen as a resultant change in one or more metrics. What’s more, the value of fixing operational to-dos or implementing a specific control can be measured and communicated specific to the business context it affects. At each milestone on the journey, thresholds for metrics can be set to determine success or identify needed adjustments in the plan.
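Checking metrics against milestone thresholds can be as simple as the sketch below. The metric names, values and targets are invented for illustration; each target is marked as either a ceiling ("max") or a floor ("min").

```python
def evaluate_milestone(metrics, targets):
    """Return {metric name: True if the target is met} for one milestone.

    `targets` maps a metric name to (kind, limit), where kind is "max"
    (value must not exceed the limit) or "min" (value must reach it).
    """
    results = {}
    for name, (kind, limit) in targets.items():
        value = metrics[name]
        results[name] = value <= limit if kind == "max" else value >= limit
    return results


# Hypothetical 30-day milestone.
metrics = {"kri_score": 42.0, "mean_hours_to_resolution": 18.0, "matrix_coverage": 0.60}
targets = {
    "kri_score": ("max", 50.0),
    "mean_hours_to_resolution": ("max", 12.0),
    "matrix_coverage": ("min", 0.50),
}
status = evaluate_milestone(metrics, targets)
needs_adjustment = sorted(name for name, met in status.items() if not met)
print(status, needs_adjustment)
```

Anything in `needs_adjustment` is a prompt to revisit the plan before the next milestone.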

Final Thought

It’s all about the journey. A successful information security program is not an end-state, but a continually monitored and adjusted compilation of people, process and technologies. Mapping the program’s functions with your client’s assets and required controls provides you with the steps needed to mature your program, while metrics will keep you honest about how well the program is performing.

Considering Compliance in the Cloud

Gates Marshall is Director of Cyber Services at CompliancePoint. He has many years of experience in information security consulting with expertise across secure architectural design, vulnerability and penetration testing, OWASP, forensics, incident response, GDPR, FISMA, MARS-E, and cryptographic control design and implementation.

OPAQ: What exactly do we mean these days by “cloud compliance” versus other security and compliance topics?

GM: In some respects, there is not a big difference between on-premise and the cloud. HIPAA or PCI standards don’t make special exceptions for the cloud. The rules apply the same everywhere. There are also some cloud-specific compliance solutions out there like CloudeAssurance or CSA Star Certification, which allow organizations to achieve a quantifiable rating on compliance. Yet for a lot of things, being compliant in the cloud is not much different than having a data center somewhere or a colocation provider.

A significant problem is that when people sign on with a cloud service provider (CSP), they sometimes think they are outsourcing the due diligence aspect of compliance. Google, Microsoft and Amazon have a number of certifications, but these are to certify their own services. They are not certifying that their merchants and other customers are compliant in any specific client-level implementation.

OPAQ: There are some differences, though, right?

GM: The way you can configure systems in the cloud is different than a traditional on-premise installation. For instance, take PCI DSS, which is a fairly prescriptive standard for merchants. It calls for having a demilitarized zone (DMZ) separate from your LAN to isolate and protect credit card data with a firewall. CSPs may support other mechanisms, like AWS security groups, to facilitate similar functionality; however, doing so still doesn’t meet all of the compliance requirements for a DMZ. So organizations are using these new cloud services, but they are missing some of the requirements as they relate to architecture controls and/or logical segmentation.

OPAQ: How would you describe the level of security and compliance support at the major cloud providers?

GM: They do quite a bit to reduce the burden of compliance. Most of them produce good documentation to declare what we call a service provider controls responsibility matrix.  It shows what the provider is doing around compliance and that helps because it both reduces the burden on the customer and declares where the customer’s remaining responsibilities begin. Security at the large CSPs has improved a lot, for instance with services like Amazon CloudWatch for monitoring. All the major providers now have good auditing capabilities for the management interface and offer multifactor authentication. These developments give customers more confidence in the cloud.

OPAQ: Is security protection in the cloud as good as or better than an enterprise on-premise environment?

GM: We tend to have an affinity toward legacy configurations in the on-premise world. By that, I mean we set it up and it works and we never change it. It’s security via obscurity. When you go through the transformation process to become a cloud-first organization, you need to fix all those legacy issues that were acceptable in the LAN environment. You can’t be so sloppy. Cloud environments may be less secure than on-premise ones, however, because you’re letting someone else manage the Layer 1 infrastructure. The physical addressing and networking and storage configurations now fall on the CSP. They may have weaknesses that you don’t know about and the customer has to depend on third-party attestations. Hypervisor hopping has been a concern for a while. If a CSP’s hypervisor technology has a flaw, a malicious actor could jump between different customers’ VM guests through the hypervisor. There aren’t any disclosed examples of this happening, but it’s always a risk in a multi-tenant environment.

OPAQ: Yet most if not all of the massive breaches in recent years have been in on-premise environments, right?

GM: While this is true, many of these breaches could have taken place in the cloud. Equifax had a real problem with inventory because they didn’t have visibility into the software that should have been patched. That scenario could have also occurred with a CSP. Vulnerability management is critical in any implementation. Accenture did have an issue in the cloud recently, which could have been disastrous. In October, it was discovered that the global consulting firm had left an AWS S3 storage location unsecured, leaving over 100GB of customer data accessible without authentication by anyone on the Internet with the correct S3 URL. The insecure configuration of Amazon S3 could also apply to on-premise technologies. No matter where your data sits, IT needs to secure the location against exploitable configurations and software flaws.
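As an illustration of the kind of configuration check this implies, the sketch below scans an S3 bucket policy document for statements that allow access to any principal. It is a minimal sketch, not an audit tool: the bucket name, account ID and Sid values are made up, and it deliberately ignores Condition blocks, ACLs and account-level public access settings, all of which affect whether a bucket is really exposed.

```python
import json


def anonymous_allow_statements(policy_json):
    """Return the Sids of Allow statements whose principal is the wildcard "*"."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):   # a single statement may appear unwrapped
        statements = [statements]
    flagged = []
    for stmt in statements:
        principal = stmt.get("Principal")
        aws = principal.get("AWS") if isinstance(principal, dict) else None
        is_anyone = (
            principal == "*"
            or aws == "*"
            or (isinstance(aws, list) and "*" in aws)
        )
        if stmt.get("Effect") == "Allow" and is_anyone:
            flagged.append(stmt.get("Sid", "<no Sid>"))
    return flagged


# Hypothetical policy for a made-up bucket.
example_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamWrite", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})
flagged = anonymous_allow_statements(example_policy)
print(flagged)
```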

OPAQ: Do you foresee more regulation in the area of cloud compliance and security?

GM: Yes. The EU’s General Data Protection Regulation (GDPR) has huge potential to change a lot of things in tech. It goes into enforcement in 2018, and may become a global standard for privacy. GDPR applies to any organization that uses the data of people who are in the EU at the time of data collection. Two key principles of GDPR are that companies and organizations should use data minimization to keep the smallest amount of data possible and use consent mechanisms to ensure they’re authorized to hold or use that data. If you have 10 million customer records, but determine that you only need to keep two million records and purge the rest, your risks go down. If a breach occurs, there is less data loss and lower costs to mitigate the impacts of the loss. Information privacy is the next frontier. The large CSPs realize that if they don’t get in front of this, they will lose business. This will require that CSPs look closely at the leading cyber risk rating mechanisms, and adopt one or two of them. I think we’ll also see more CSPs provide guidance on how to meet global data security and privacy requirements in an effort to help customers help themselves.