Issue 1 | Bayesian thinking, building base rates with IRIS 2025 & cyber prediction markets
How to Start with What You Have and Get Better Over Time
In This Month’s Issue:
📖 Book update: progress on data collection, vetting, and the three-source model for CRQ
🧠 Why every risk analyst should think like a Bayesian (and how to start)
📊 How to use IRIS 2025 to establish base rates and build sector-specific priors
🛠️ Step-by-step GenAI prompts for applying IRIS data in your own models
📗 Book Excerpt: My Experiment with a Prediction Market
📚 What I’m reading: favorite picks on the risk sin eater, security-first controls, and financial discipline in CRQ
🗂 From the archives: two blog posts on pushback in risk analysis and decision framing
🙋 Reader Q&A: what to do when your org has 1,500 risks in the register
Hey there,
Welcome to Issue 1! Thanks for being part of this journey from the very beginning. This newsletter accompanies my upcoming book "From Heatmaps to Histograms: A Practical Guide to Cyber Risk Quantification" (Apress, early 2026). Each month, I share practical techniques, behind-the-scenes insights from the book writing process, and field-tested CRQ tactics that help you build better risk models.
If you missed my launch post (Issue 0), you can check it out here. I shared the full abstract of the upcoming book as well as a guide to your first quantitative cyber risk assessment.
📖 Book Update: Progress on the Big Middle
First, a quick update on the book project. The deal is signed, and we're targeting early 2026 for publication. Pre-order links should be available later this year (likely Q4 2025), and I'll share those with you as soon as they're live.
I'm currently focused on what I thought would be the most challenging section: data collection, vetting, and preparation for quantitative analysis. I tackled this middle chunk first because getting the foundation right makes everything else click into place.
The Three-Source Framework
One of the core principles I'm building the book around is what I call the "three essential data sources" for cyber risk quantification:
External data provides industry context and base rates - breach frequencies, cost studies, enforcement patterns. Your "what typically happens to organizations like ours" baseline.
Internal data grounds everything in your specific reality - incident logs, actual recovery times, real costs from past events. Your "what actually happens here" update.
Subject matter expert (SME) input bridges past and future with forward-looking judgment about current threats, new controls, and changing conditions. Your "what's likely to happen next" forecast.
The magic happens when you systematically combine all three. Each source has blind spots, but together they create something far more reliable than any single input.
Getting Unstuck
People get stuck on data collection for three main reasons: the "perfect data myth," not knowing where to look, and analysis overwhelm when facing messy information. I'm addressing each with practical methods that work with real-world constraints.
Vetting and Blending
The book includes a simple framework for evaluating any data source in minutes, plus time-tested quality adjustment methods borrowed from fields like climate science, nuclear safety, and actuarial science that specialize in high-stakes decisions under uncertainty. When you find that vendor survey claiming "average breach cost is $2.1M," you'll know exactly how to convert it into an honest range that reflects what you actually know.
For combining sources, I use a math-free Bayesian approach - start with your best external baseline, then systematically update it with internal evidence and expert judgment. Like checking traffic before heading to the airport, but structured for risk data.
This foundation work enables everything else - scenarios, simulations, and actual business decisions. Getting the data piece right makes the rest straightforward.
🧠 Why Every Risk Analyst Should Think Like a Bayesian
Speaking of Bayesian thinking… let me share a tool that's transformed how I approach risk analysis, and I believe it can do the same for you. It’s more than a tool, though; it’s a way of thinking.
You don't need to be a statistics wizard to benefit from Bayesian thinking. Yes, there are sophisticated mathematical applications that belong in any serious risk analysis toolkit, but today I want to focus on Bayesian reasoning as a mental model. This shift in mindset can revolutionize how you frame risk analyses, keep projects focused, and avoid the dreaded "boiling the ocean" syndrome.
More importantly, if you embrace this thinking early, you'll sidestep the common CRQ myths that paralyze many analysts:
Believing you need to collect ALL the data before starting
Thinking a risk analysis is invalid without perfect information
Getting stuck in analysis paralysis
Bayesian thinking also acts as a natural defense against cognitive biases we all carry: overconfidence, anchoring, and the IKEA effect (overvaluing something because we built it ourselves).
Bayesian Thinking as a Mental Model
Prior: Your initial belief or estimate, based on current knowledge, experience, or data.
Evidence: New information, observations, or data relevant to the belief.
Posterior: Your updated belief after factoring in the new evidence. It becomes the new prior for next time.
The Process:
Start with a belief: based on what you already know (your prior).
Gather evidence: actively look for new data or observations.
Update your belief: revise your estimate in light of the evidence.
Repeat: treat your updated belief as the new starting point for the next cycle.
The beauty is in the cycle: today's posterior becomes tomorrow's prior. You're always learning, always updating, never claiming to have the final answer.
A Real-World Example
Let's say you're assessing the risk of a data breach at your company.
Your Prior: Based on industry reports, you estimate a 15% chance of a significant breach in the next year.
New Evidence: Your security team reports they've detected 3x more phishing attempts this quarter, and a recent vulnerability scan found several unpatched systems.
Your Posterior: This evidence suggests higher risk. You update to around 25% chance of a breach.
Next Cycle: A month later, phishing training shows dramatic improvement in employee click rates, and all critical patches are applied. Your posterior (25%) becomes your new prior, and with this positive evidence, you adjust down to around 18%.
Notice what happened? You started with imperfect information, made decisions, gathered more data, and refined your thinking. You never stopped to collect "all possible data." You started with what you had and improved from there.
This is exactly how effective risk analysis works in the real world.
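If it helps to see the mechanics, here's a minimal sketch of the cycle in Python, using the odds form of Bayes' rule. The likelihood ratios are illustrative assumptions I picked to roughly reproduce the numbers above; in practice you'd derive them from data or structured expert judgment:

```python
def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Odds-form Bayesian update: posterior odds = prior odds * likelihood ratio."""
    posterior_odds = (prior / (1 - prior)) * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

p = 0.15                    # prior: industry base rate for a significant breach
p = bayes_update(p, 1.9)    # phishing spike + unpatched systems (assumed LR ~1.9)
print(f"After negative evidence: {p:.0%}")   # ~25%
p = bayes_update(p, 0.66)   # training gains + patches applied (assumed LR ~0.66)
print(f"After positive evidence: {p:.0%}")   # ~18%
```

Each call's output feeds the next call as the prior, which is exactly the posterior-becomes-prior cycle described above.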
The key insight: Without this mindset, beginners often try to "get it perfect" before they start and never actually do a risk analysis. Bayesian thinking gives you permission to start with what you have and get better over time. You're not ignoring uncertainty, you're structuring it. That's what makes it so powerful for real-world risk.
📊 Using IRIS 2025 Data as Your Risk Analysis Starting Point
Now that you're thinking like a Bayesian, let's put it into practice. The Cyentia Institute's IRIS 2025 report is a treasure trove of data for establishing your base rates - those crucial priors that anchor your risk analysis.
Here's the unlock: instead of starting with a wild guess (or an adjective like “high”) about your organization's cyber risk, you can begin with research-backed probabilities. This is exactly how Bayesian thinking transforms risk analysis from guesswork into systematic reasoning.
Four Ways to Use IRIS 2025 Right Now
Use Case 1: Establish Your Organization’s Base Rate
Let’s start by estimating a base rate for a hypothetical healthcare organization with $800 million in annual revenue. This base rate reflects the annual probability of experiencing a significant security incident - one that requires public disclosure.
Step-by-Step Exercise
Find your revenue tier: Go to Page 12, Figure 7 in IRIS 2025.
Locate your category: Find the "$100M to $1B" revenue band for our healthcare org.
Read the chart: Follow the line to the rightmost point (2024).
Record your base rate: You’ll see an approximate annual probability of 8–9%.
This becomes your Bayesian prior for modeling significant cyber incidents.
⚠️ These rates are for publicly disclosed events, not all security incidents. I recommend using a range to reflect chart-reading uncertainty and real-world variability.
🔍 GenAI Prompt:
[upload the PDF into the prompt]
See the attached PDF for the IRIS 2025 report, Page 12, Figure 7. Help me extract the incident probability range for an organization with $800M annual revenue. Walk me through reading the chart and give me a reasonable range rather than a precise percentage. These are probabilities for significant cyber incidents requiring disclosure.
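If you want to carry this prior into later steps programmatically, a tiny sketch is enough. The key, per the caution above, is to record it as a range rather than a point estimate:

```python
# Prior recorded as a range to reflect chart-reading uncertainty
# (IRIS 2025, Figure 7, p. 12, rightmost 2024 point, $100M-$1B tier)
prior = {
    "annual_incident_probability": (0.08, 0.09),
    "scope": "significant incidents requiring public disclosure",
    "source": "IRIS 2025, Figure 7, p. 12",
}
print(prior["annual_incident_probability"])  # (0.08, 0.09)
```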
Use Case 2: Build Sector-Specific Frequency Models
Now let’s tailor that base rate to your specific industry - in our case, Healthcare.
Here we use two pieces of information from IRIS 2025:
Base annual incident probability by revenue tier (Figure 7, p. 12) — the starting point for your organization’s likelihood of a significant, publicly reportable incident.
Sector-specific likelihood multipliers (Appendix A3, p. 34) — important: these are different from the loss multipliers used in Use Case 3. Appendix A3 clearly labels which multiplier is for likelihood vs. loss.
Step-by-Step Exercise
Start with your base rate
For organizations with $100M–$1B in annual revenue (Figure 7, p. 12), the base probability is approximately 8–9%.
Find your sector-specific likelihood multiplier
Appendix A3 lists Healthcare’s likelihood multiplier as 1.34.
Note: The Healthcare loss multiplier is different (1.19) and is only used in Use Case 3.
Apply the likelihood multiplier
Lower bound: 8% × 1.34 ≈ 10.7%
Upper bound: 9% × 1.34 ≈ 12.1%
Your Result:
For a Healthcare organization with $800M in annual revenue, the adjusted probability of a significant cyber incident is ~11–12% per year.
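If you'd rather keep this arithmetic in a script you can rerun when the report updates, here's a minimal sketch using the figures from the steps above:

```python
# Base annual incident probability range (IRIS 2025, Figure 7, p. 12,
# $100M-$1B revenue tier) and the Healthcare likelihood multiplier
# (Appendix A3, p. 34)
base_rate = (0.08, 0.09)
likelihood_multiplier = 1.34

low, high = (p * likelihood_multiplier for p in base_rate)
print(f"Adjusted annual incident probability: {low:.1%} to {high:.1%}")
# Adjusted annual incident probability: 10.7% to 12.1%
```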
🔍 GenAI Prompt:
Using IRIS 2025, help me calculate sector-specific incident probability ranges. My organization is in the Healthcare sector with $800M annual revenue.
From Figure 7 on p. 12, extract the base annual incident probability range for organizations with $100M–$1B in revenue.
From Appendix A3 on p. 34, find the likelihood multiplier (not the loss multiplier) for Healthcare.
Apply that multiplier to both ends of the base probability range, show each calculation step, and clarify these are for significant cyber incidents requiring disclosure.
Use Case 3: Calculate Sector- and Revenue-Specific Loss Magnitudes
Now let’s estimate what a significant cyber incident might cost in your sector, using both typical and high-end outcomes.
Here we use two different types of information from IRIS 2025:
Loss amounts by revenue tier (Table 1, p. 16) — these are actual dollar figures for the median (50th percentile) and extreme (95th percentile) incidents.
Sector-specific loss multipliers (Appendix A3, p. 34) — note: these are different from the likelihood multipliers in Figures A1/A2. Appendix A3 clearly labels which multiplier applies to loss magnitude.
Step-by-Step Exercise
Find your base loss amounts
For organizations with $100M–$1B in revenue (Table 1, p. 16):
Median (50th percentile): $466.7K
High-end (95th percentile): $12.3M
Get your sector-specific loss multiplier
Appendix A3 shows that the Healthcare sector has a loss multiplier of 1.19
Important: Healthcare’s likelihood multiplier is higher (1.34), but that applies to Use Cases 1–2, not here.
Apply the loss multiplier
Median: $466.7K × 1.19 ≈ $555K
95th percentile: $12.3M × 1.19 ≈ $14.6M
Your Result:
For a Healthcare organization with $800M in annual revenue, a significant incident (publicly reportable) has:
Typical loss: ~$555K
Extreme loss: ~$14.6M
These figures are sector-adjusted and based on the most recent IRIS 2025 data, giving you realistic inputs for scenario modeling and decision-making.
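Here's the same calculation as a small Python sketch, using the figures from the steps above:

```python
# Base loss amounts (IRIS 2025, Table 1, p. 16, $100M-$1B revenue tier)
median_loss = 466_700        # 50th percentile
extreme_loss = 12_300_000    # 95th percentile

loss_multiplier = 1.19       # Healthcare loss multiplier (Appendix A3, p. 34)

print(f"Sector-adjusted median loss:  ${median_loss * loss_multiplier:,.0f}")   # ~$555,373
print(f"Sector-adjusted extreme loss: ${extreme_loss * loss_multiplier:,.0f}")  # ~$14,637,000
```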
🔍 GenAI Prompt:
Using the IRIS 2025 report, help me calculate incident-specific loss estimate ranges for my organization (Healthcare sector, $800M annual revenue).
From Table 1 on p. 16, extract the median (50th percentile) and extreme (95th percentile) loss values for organizations in the $100M–$1B revenue range.
From Appendix A3 on p. 34, find the loss multiplier (not the likelihood multiplier) for the Healthcare sector.
Apply that multiplier to both loss values, show each calculation step using ranges, and clarify that these are for significant cyber incidents requiring disclosure.
Use Case 4: Factor in Sector Risk Trajectory
Now let’s assess whether your sector’s risk is increasing or stabilizing - an important signal for forward-looking planning.
Step-by-Step Exercise
Find your sector trend: Go to Page 13, Figure 8 (Healthcare panel)
Assess the slope: From 2008 to 2022, Healthcare rose from ~2% to nearly 9%
Note recent change: From 2022 to 2024, the trend flattens and shows a slight decline
Interpret the momentum: The long-term trajectory has been upward, but recent years suggest a plateau or even a downturn
🧠 Insight
The long-term momentum in healthcare risk has been strong (a ~7-point increase from 2008 to 2022), but short-term signals show stabilization or a slight decline.
2008 to 2022: Rising risk
2022 to 2024: Flattening or soft decline
2025+ outlook: Uncertain; may indicate a peak or temporary lull
Don’t assume continued acceleration. Monitor your sector's data annually to adjust for real changes in trajectory.
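To make "assess the slope" concrete, here's a minimal sketch. The data points are approximate readings off the chart, so treat the slopes as rough directional signals, not precise rates:

```python
# Approximate values read off Figure 8 (Healthcare panel); illustrative only
trend = {2008: 0.02, 2022: 0.09, 2024: 0.091}

long_term = (trend[2022] - trend[2008]) / (2022 - 2008)   # percentage points/year
short_term = (trend[2024] - trend[2022]) / (2024 - 2022)

print(f"2008-2022 slope: {long_term * 100:+.2f} pts/year")   # +0.50 (rising)
print(f"2022-2024 slope: {short_term * 100:+.2f} pts/year")  # +0.05 (near flat)
```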
🔍 GenAI Prompt:
Using IRIS 2025 Page 13, Figure 8, help me analyze the risk trajectory for healthcare organizations from 2008–2024. I need to determine whether the trend is rising, falling, or stabilizing and estimate the long-term rate of change as well as the short-term directional signal. Use this to inform 2025+ planning assumptions.
Final Output: Your Sector Risk Profile
You now have a quantified risk profile for a healthcare organization with approximately $800M in annual revenue. Here’s how it breaks down:
Base incident probability: Based on your revenue tier ($100M to $1B), your starting point is an 8–9% annual likelihood of experiencing a significant cyber incident (source: Page 12, Figure 7).
Sector adjustment: Healthcare organizations experience incidents more frequently than the average. Applying the 1.34x multiplier for the healthcare sector (source: Page 34), your adjusted probability becomes approximately 11–12% annually.
Loss magnitude (typical): The median loss for organizations in your revenue tier is $466.7K. After applying the 1.19x healthcare multiplier (Page 34), the sector-adjusted median loss is about $555K.
Loss magnitude (extreme): The 95th percentile loss for your tier is $12.3M. With the same 1.19x adjustment, your high-end loss estimate is roughly $14.6M.
Risk trajectory: From 2008 to 2022, incident probability in the healthcare sector rose sharply, from about 2% to nearly 9%. However, from 2022 to 2024, the trend has flattened or slightly declined, ending at around 9.1% in 2024. This suggests the long-term trend is upward, but recent years may indicate a plateau.
What This Means for You
You can now model risk using realistic inputs for:
Frequency: 11–12% chance per year of a significant, reportable incident
Impact: Losses between $555K (typical) and $14.6M (extreme)
Trajectory: Risk increased over the long term but may be leveling off; important for forward-looking budgeting and strategic planning
GenAI: Using GenAI cuts the time required for this kind of analysis from an afternoon to a few minutes.
This model is dynamic, based on both your organization's firmographics and sector trends, not just a static benchmark or color on a heat map.
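As a capstone, here's one way to turn this profile into a simple annual-loss simulation. One loud caveat: the lognormal severity shape is my assumption for illustration (fit to the median and 95th percentile above); IRIS doesn't prescribe a distribution, so treat this as a sketch rather than the method:

```python
import numpy as np

rng = np.random.default_rng(42)
n_years = 100_000            # simulated years

p_incident = 0.115           # midpoint of the 10.7%-12.1% frequency range

# Lognormal severity fit to the sector-adjusted median (~$555K) and
# 95th percentile (~$14.6M); 1.645 is the z-score of the 95th percentile
mu = np.log(555_000)
sigma = (np.log(14_600_000) - mu) / 1.645

# At most one significant incident per simulated year, per the annual framing
losses = rng.lognormal(mu, sigma, n_years)
annual_loss = np.where(rng.random(n_years) < p_incident, losses, 0.0)

print(f"Expected annual loss:        ${annual_loss.mean():,.0f}")
print(f"95th percentile annual loss: ${np.percentile(annual_loss, 95):,.0f}")
```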
💡The Bayesian Unlock
Notice what's happening here: you're not using IRIS data as the end goal, something you skim once and never open again. You're using it as your informed starting point, then systematically updating with organization-specific evidence.
This prevents the classic risk analysis traps:
Starting with gut feelings instead of data
Ignoring industry baselines
Treating your organization as completely unique
Getting paralyzed by the need for "perfect" data
Your risk analysis becomes a conversation between industry research and organizational reality - exactly what Bayesian thinking is designed to handle.
🏠 Homework: Put This Into Practice
Your Assignment: Download IRIS 2025 and build your organization's risk priors using the GenAI prompts above.
Use the four exercises to calculate your organization's base rates
Try all four GenAI prompts - see how they help extract and organize the data
Document your priors - you'll need them as your Bayesian starting point
Compare your current risk assessments with research-backed priors - how do they differ from your existing estimates?
Bonus Challenge: Take your calculated ranges and ask yourself: "What organizational-specific evidence could move me up or down within these ranges?" This is exactly how Bayesian updating should work.
Remember: These aren't your final risk assessments - they're your informed starting points. The real power comes when you start updating these priors with your organization's specific threat intelligence, control effectiveness, and incident history.
📗 Book Excerpt: My Experiment with a Prediction Market
One of the cooler experiences in my career happened when I worked with Richard Seiersen, co-author of How to Measure Anything in Cybersecurity Risk and author of The Metrics Manifesto. Together, we set up a functioning prediction market at a large financial services company in San Francisco.
The Setup
We created an open-to-the-public event that was part Shark Tank, part American Idol: cybersecurity startups pitched to a panel of CISO judges in our arena-style auditorium. But here's where it got interesting: we set up a prediction market so attendees could "trade" on which startup they thought would win at the end of the night.
A prediction market is essentially a crowdsourced forecasting tool where participants buy and sell contracts based on event outcomes. The trading price reflects the collective probability that participants assign to each outcome. Think of it as real-time betting on future events; the more demand for a particular outcome, the higher its implied probability.
To make it more engaging and test the concept, we added other prediction questions: Will it rain tomorrow? Who will win the Giants game? What's the probability of a major data breach in healthcare this quarter?
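For intuition on how prices encode probabilities, here's a toy sketch with made-up closing prices for the pitch contest. Normalizing corrects for the fact that raw prices rarely sum to exactly 1:

```python
# Hypothetical closing prices for $1-payout contracts, one per startup
prices = {"Startup A": 0.52, "Startup B": 0.31, "Startup C": 0.12, "Startup D": 0.09}

total = sum(prices.values())                     # 1.04 here: a slight overround
implied = {name: p / total for name, p in prices.items()}

for name, prob in sorted(implied.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {prob:.0%} implied chance of winning")
```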
The Key: Proper Incentives
Prediction markets work best when participants have skin in the game: something to gain or lose. We offered prizes and leaderboard clout to the best forecasters, which incentivized people to use their genuine knowledge and research rather than just guessing randomly.
The Results
The accuracy was remarkable. The market consistently predicted not just the startup pitch winners (judged at the end of the evening), but also sports games, weather patterns, traffic conditions, and short-term cyber events that we could verify within a month.
When we talked to the most accurate forecasters, we found something interesting: many were fantasy football players, poker enthusiasts, or people with backgrounds in meteorology, statistics, or microeconomics. What they had in common was probabilistic thinking: the ability to reason about uncertainty and update beliefs based on new information.
Why This Matters for CRQ
I've often thought about extending this to regular cybersecurity use cases. Imagine a public prediction market where the entire cybersecurity community collectively forecasts cyber events. Done right, this could provide base rates (or better) for our quantitative risk assessments.
The challenge? Confidentiality. Most companies can't share the detailed information needed to make such markets work effectively. It becomes a tragedy of the commons; everyone wants to use the data, but few can contribute without violating privacy or competitive concerns.
Still, the concept shows the power of structured, incentivized expert judgment. Even if we can't build industry-wide prediction markets, we can apply the same principles in our SME workshops: create the right incentives, structure the process, and tap into the collective wisdom of people who think probabilistically.
Want to Try It? Check out Good Judgment Open, a publicly accessible prediction market where you can practice forecasting on everything from politics to technology trends.
📚 What I'm Reading
Each issue, I'll share a few blog posts, research articles, or tools worth your time. Here are three standouts I've been diving into:
The Role of the Risk Sin Eater by Jack Freund: This one is a little older, but I still share it often with friends and coworkers. Jack is one of my favorite writers on risk. In this piece he draws a sharp parallel between modern cyber risk approval practices and the centuries-old tradition of sin eaters (figures in rural Europe who symbolically absorbed the sins of the dead), urging us to abandon this outdated folklore and let business leaders own their risks.
The GRC Engineer Newsletter by Ayoub Fandi: I really enjoy the GRC Engineer newsletter by Ayoub Fandi, and this issue is one of my favorites. It offers a clear, practical framework for designing security-first controls that reduce real business risk, rather than just satisfying compliance requirements, and shows how this approach can make compliance easier and more meaningful.
Bringing Financial Discipline to Cyber-Risk Decisions by Laura Voicu: This is one of the most practical and accessible deep dives into aligning cyber risk decisions with financial rigor. Laura walks through a ransomware scenario using FAIR, NPV, IRR, and the Gordon-Loeb model to show how to move beyond ROSI and make smarter, defensible investments in security. If you're trying to speak the language of finance while making risk-informed decisions, this is a must-read.
Got recommendations? Send them my way, and I'll feature reader picks too.
🗂 From the Archives
Here are a few blog posts of mine you might enjoy:
“That Feels Too High”: A Risk Analyst's Survival Guide: This post explores why risk estimates often get challenged with comments like “that feels too high,” despite being backed by solid analysis. It outlines three common reasons for pushback (missing information, cognitive bias, and communication gaps) and offers a practical framework to diagnose and respond constructively, turning discomfort into a valuable part of the decision-making process.
Using Risk Assessment to Support Decision Making: This post reframes risk assessments as tools that support decision-making, emphasizing that without a clear decision to inform (one defined by choice, preference, and information), a risk assessment often falls short. It offers a practical framework to help analysts focus their efforts, using real-world examples to show how clarifying the underlying decision can make assessments more useful and aligned with business goals.
🙋 Questions
So far in the industry, I've realized that companies still have the appetite to do cyber risk quantification (in monetary terms) for a one-off event like a ransomware attack or a data breach. However, doing it for each and every cyber risk in their register seems difficult.
For example, I worked with an organization that had almost 1,500 risks in their cyber risk register, which they assessed using qualitative analysis. How do you propose an org use quantitative analysis for a number as big as 1,500 risks?
- From Varun W.
Great question! You've hit on one of the biggest practical barriers to adopting decision-based risk analysis. The answer isn't to quantify all 1,500 risks - it's to fundamentally rethink what belongs in a "risk register."
The problem: That 1,500-item list likely contains a mix of vulnerabilities, controls, scenarios, and compliance items that aren't actually decision-relevant risks. Most organizations treat their risk register like a comprehensive inventory rather than a decision-support tool.
The solution: Start with portfolio thinking, not individual risk scoring:
Aggregate first - Group those 1,500 items into major loss scenarios (ransomware, data breach, system outage, etc.). You probably have 8-12 actual business-impacting scenarios.
Quantify the scenarios that matter - Focus on the loss events that could actually influence executive decisions about budget allocation, insurance, or strategic direction.
Use the 80/20 rule - A small number of scenarios likely drive most of your actual risk exposure. Quantify those first.
The mindset shift is moving from "we have 1,500 risks to manage" to "we have $X million in annual loss exposure across Y major scenarios - where should we invest to reduce it?"
Most of those 1,500 items are probably controls or vulnerabilities that feed into the major scenarios, not separate risks requiring individual quantification.
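To make the aggregation step concrete, here's a toy sketch in Python; the register entries and scenario tags are entirely hypothetical:

```python
from collections import defaultdict

# Hypothetical register items tagged with the loss scenario they feed into
register = [
    {"item": "Unpatched VPN appliance",   "scenario": "Ransomware"},
    {"item": "Overly permissive S3 ACLs", "scenario": "Data breach"},
    {"item": "Single-region database",    "scenario": "System outage"},
    {"item": "Phishing-prone helpdesk",   "scenario": "Ransomware"},
]

# Roll 1,500 line items up into a handful of quantifiable scenarios
scenarios = defaultdict(list)
for entry in register:
    scenarios[entry["scenario"]].append(entry["item"])

for name, items in scenarios.items():
    print(f"{name}: {len(items)} contributing register items")
```

Quantify at the scenario level; the register items then become evidence about each scenario's frequency and severity rather than risks in their own right.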
What decision was that 1,500-item register actually helping executives make?
✉️ Contact
Have a question about risk analysis or have a general inquiry? Here’s how to contact me:
Reach me at:
Reply to this newsletter if you received it via email
Comment below
Connect on LinkedIn
What specific risk analysis challenges are you facing? Hit reply and let me know - your question might become the focus of a future deep dive.
❤️ How You Can Help
✅ Tell me what topics you want covered: beginner, advanced, tools, AI use, anything
✅ Forward this to a colleague who's curious about CRQ
✅ Click the ❤️ or comment if you found this useful
If someone forwarded this to you, please subscribe to get future issues.
- Tony