2021 DBIR Master's Guide


  • Hello first-time reader, and welcome to the 2021 Data Breach Investigations Report (DBIR). We have been creating this report for a while now, and we appreciate that all the verbiage we use can be a bit obtuse at times. We use very deliberate naming conventions, terms and definitions and spend a lot of time making sure that we are consistent throughout the report. Hopefully this section will help make all of those more familiar.

    VERIS resources

    The terms "threat actor," "action" and "variety" will be referenced often. These are part of the Vocabulary for Event Recording and Incident Sharing (VERIS), a framework designed to allow for a consistent, unequivocal collection of security incident details. Here is how they should be interpreted:

    Threat actor: Who is behind the event? This could be the external “bad guy” who launches a phishing campaign or an employee who leaves sensitive documents in their seat back pocket.

    Action: What tactics (actions) were used to affect an asset? VERIS uses seven primary categories of threat actions: Malware, Hacking, Social, Misuse, Physical, Error, and Environmental. High-level examples are hacking a server, installing malware or influencing human behavior through a social attack.

    Variety: More specific enumerations of higher-level categories, e.g., classifying the external “bad guy” as an organized criminal group or recording Hacking action as SQL injection or brute force.
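To make the relationship between the three terms concrete, here is a minimal sketch of how one event might be recorded. It is not the actual VERIS schema: the class names, the `origin` field and the overall layout are assumptions made for illustration. Only the seven action categories and the example varieties come from the definitions above.

```python
from dataclasses import dataclass
from enum import Enum


class ActionCategory(Enum):
    """The seven primary threat-action categories listed above."""
    MALWARE = "Malware"
    HACKING = "Hacking"
    SOCIAL = "Social"
    MISUSE = "Misuse"
    PHYSICAL = "Physical"
    ERROR = "Error"
    ENVIRONMENTAL = "Environmental"


@dataclass(frozen=True)
class Actor:
    origin: str   # hypothetical field: "external" or "internal"
    variety: str  # more specific label, e.g. "organized criminal group"


@dataclass(frozen=True)
class Action:
    category: ActionCategory
    variety: str  # more specific label, e.g. "SQL injection"


@dataclass(frozen=True)
class EventRecord:
    actor: Actor
    actions: tuple  # one event can involve several actions


# An external organized criminal group using SQL injection.
record = EventRecord(
    actor=Actor(origin="external", variety="organized criminal group"),
    actions=(Action(ActionCategory.HACKING, "SQL injection"),),
)
```

The point of the split is that the category stays coarse enough to chart, while the variety keeps the detail needed for closer analysis.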

    Learn more here:

    Incident vs. breach

    We talk at length about incidents and breaches and we use the following definitions:

    Incident: A security event that compromises the integrity, confidentiality or availability of an information asset.

    Breach: An incident that results in the confirmed disclosure—not just potential exposure—of data to an unauthorized party.
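The two definitions above nest: every breach is an incident, but not every incident is a breach. A small sketch of that rule follows. The `SecurityEvent` fields and the `"event"` fallback label are hypothetical names chosen for this example, not terms from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityEvent:
    # Hypothetical flags describing what was observed.
    compromises_integrity: bool = False
    compromises_confidentiality: bool = False
    compromises_availability: bool = False
    disclosure_confirmed: bool = False  # data confirmed to reach an unauthorized party


def classify(event: SecurityEvent) -> str:
    """Return "breach", "incident" or "event" using the definitions above."""
    is_incident = (
        event.compromises_integrity
        or event.compromises_confidentiality
        or event.compromises_availability
    )
    if not is_incident:
        return "event"  # assumed label for anything below the incident bar
    if event.disclosure_confirmed:
        return "breach"
    return "incident"


# An outage compromises availability but discloses nothing.
outage = SecurityEvent(compromises_availability=True)

# Data was exposed, but disclosure was never confirmed.
exposure = SecurityEvent(compromises_confidentiality=True)

# Data is confirmed to have reached an unauthorized party.
leak = SecurityEvent(compromises_confidentiality=True, disclosure_confirmed=True)
```

Under these assumptions `classify(outage)` and `classify(exposure)` both return `"incident"`, and only `classify(leak)` returns `"breach"`.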

    Industry labels

    We align with the North American Industry Classification System (NAICS) standard to categorize the victim organizations in our corpus. The standard uses two- to six-digit codes to classify businesses and organizations. Our analysis is typically done at the two-digit level, and we will specify NAICS codes along with an industry label. For example, a chart with a label of Financial (52) is not indicative of 52 as a value. “52” is the code for the Finance and Insurance sector. The overall label of "Financial" is used for brevity within the figures. Detailed information on the codes and classification system is available here:
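The labeling convention described above can be sketched in a few lines. The function names and the `SECTOR_LABELS` table are hypothetical, and the table contains only the one label mentioned in the text. The six-digit code in the usage note is used purely as an example of a longer code that begins with 52.

```python
# Short figure labels keyed by two-digit NAICS sector code.
# Only the pairing given in the text is included; others would be added here.
SECTOR_LABELS = {"52": "Financial"}


def sector_code(naics_code: str) -> str:
    """Reduce a two- to six-digit NAICS code to its two-digit level."""
    code = naics_code.strip()
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"not a two- to six-digit code: {naics_code!r}")
    return code[:2]


def figure_label(naics_code: str) -> str:
    """Build a label in the 'Label (code)' form used in the figures."""
    sector = sector_code(naics_code)
    label = SECTOR_LABELS.get(sector, "Unlabeled")  # assumed fallback
    return f"{label} ({sector})"
```

For instance, `figure_label("522110")` and `figure_label("52")` both produce `Financial (52)`, where 52 identifies the sector and is not a measured value.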


    Being confident of our data

    Starting in 2019 with slanted bar charts, the DBIR has tried to make the point that the only certain thing about information security is that nothing is certain. Even with all the data we have, we’ll never know anything exactly. However, instead of throwing our hands up and complaining that it is impossible to measure anything in a data-poor environment, or worse, simply making stuff up, we get to work. This year we continue to represent uncertainty throughout the report figures.

    Figures 1, 2, 3 and 4 all convey the range of realities that could credibly be true. Whether it be the slant of the bar chart, the threads of the spaghetti chart, the dots of the dot plot, or the color of the violin chart, they all convey the uncertainty of our industry in their own special way.

    The slant on the bar chart represents the uncertainty of that data point to a 95% confidence level (which is quite standard for statistical testing). In layman’s terms, if the slants of two (or more) bars overlap, you can’t really say one is bigger than the other without angering the math gods (and their wrath is terrible).
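The overlap rule can be tried out with a short calculation. The sketch below uses a Wilson score interval for a proportion, which is an assumption made for illustration: the text does not say which method produces the slants in the report's figures. The function names and the sample counts are likewise invented.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def wilson_interval(successes: int, total: int, z: float = Z_95):
    """Return (low, high) bounds of a Wilson score interval for a proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def intervals_overlap(a, b) -> bool:
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented counts: 30 of 100 versus 35 of 100 cases.
close_pair = intervals_overlap(wilson_interval(30, 100), wilson_interval(35, 100))

# Invented counts: 10 of 100 versus 50 of 100 cases.
far_pair = intervals_overlap(wilson_interval(10, 100), wilson_interval(50, 100))
```

With these counts the first pair of ranges overlaps, so neither bar can be called bigger, while the second pair does not.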

    Dot plots are also frequently used, and the trick to understanding this chart is that the dots represent organizations. For example, if there are 200 dots (like in Figure 3), each dot represents 0.5% of organizations. This is a much better way of understanding how something is distributed among organizations and provides considerably more information than an average or a median. We added many more colors and callouts to make these even more informative this year.
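The arithmetic behind the dots is simple: 100% divided by the number of dots. The sketch below also shows one possible way to choose where the dots sit, by taking evenly spaced quantiles of the per-organization values. That placement rule is an assumption for illustration, not a description of how the report's figures are built, and both function names are hypothetical.

```python
def share_per_dot(n_dots: int) -> float:
    """Percentage of organizations that each dot stands for."""
    if n_dots <= 0:
        raise ValueError("n_dots must be positive")
    return 100 / n_dots


def dot_values(values, n_dots: int = 200):
    """Pick n_dots evenly spaced quantiles so each dot covers an equal share."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    last = len(ordered) - 1
    # Sample the midpoint of each of the n_dots equal-sized slices.
    return [
        ordered[min(last, int((i + 0.5) / n_dots * len(ordered)))]
        for i in range(n_dots)
    ]
```

`share_per_dot(200)` gives 0.5, matching the example above. Because the dots preserve the shape of the distribution, they can reveal clusters and long tails that a single average or median would hide.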

    Our newcomers this year are spaghetti and violin charts. They attempt to capture uncertainty in a similar way to slanted bar charts but are more suited for, respectively, data visualized over time and proportions of changes over a specific time period. For these charts, the darker area is more likely to be the correct value.

    Let us know what you think of them.1 We hope they make your journey through this complex dataset a little less daunting.

    Credit where credit is due

    Turns out folks enjoy citing the report, and we often get asked how they should go about doing it.

    You are permitted to include statistics, figures and other information from the report, provided that (a) you cite the source as “Verizon 2021 Data Breach Investigations Report” and (b) the content is not modified in any way. Exact quotes are permitted, but paraphrasing requires review. If you would like to provide people a copy of the report, we ask that you provide them a link to verizon.com/dbir/ rather than the PDF.

  • Figure 1
  • Figure 3
  • Figure 2
  • Figure 4
    • Questions? Comments? Upset there is no AR/VR version of the DBIR?2

      Let us know! Drop us a line at dbir@verizon.com, find us on LinkedIn, tweet @VerizonBusiness with #dbir. Got a data question? Tweet @VZDBIR!

  • 1 But only if you like them. Our figures guy is really thin skinned.

    2 We REALLY want to make it happen!

Let's get started.