Data: The World’s Most Valuable (and Vulnerable) Resource and What It Means for Cybersecurity

Kara Nortman
Published in Venture Inside
Mar 12, 2021 · 6 min read

This piece originally ran on TechCrunch.

2020 was a hell of a year. When future generations learn about 2020, the pandemic, social tension, and political unrest will take up most of the oxygen. But for those learning about the history of cyber security, the year 2020 and SolarWinds, a mid-size company from Austin, Texas, will take center stage. Malicious code in one update from a trusted software provider was the Trojan horse used to access petabytes of private data across 18k organizations (and counting), including Fortune 500s and government entities.

Why will SolarWinds be so generationally important, and why am I talking about it? Because the impact of the hack is so large (and growing) and the mounting losses so substantial that every business leader must acknowledge what many in cyber security have been saying: moving forward, cyber strategy IS company strategy. It is not an audit but an integral part of C-suite strategy and of best practices ranging from employee onboarding to day-to-day, mundane coding.

I believe generational startups will be created from this reckoning with cyber security, just as they’ve been created coming out of market disruptions in the past. I’ve been thinking about this for a while, but it is clearer than ever that cyber will go on a tear this next decade. Forecasts suggest $100B of new market value by 2025 alone, putting total market size at close to $280B, but I think this figure is conservative. Cyber is, and will be, a massive business.

One key driver of growth in the cyber market is really easy to understand, but really hard to solve for: data. Cyber is often a second-order value proposition, after speed of development, managing IT assets, or data. We’re familiar with the idea that “data is the new oil.” Since that phrase was coined by mathematician Clive Humby 15 years ago, the total amount of data in the world has increased 74x. By 2025, IDC forecasts the data universe will consist of 175 zettabytes. In case you don’t know, one zettabyte is one trillion gigabytes. If you were to download 175 zettabytes of data on your computer, it would take you 1.8 billion years. Mind-boggling![1]
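For readers who want to sanity-check those numbers, here is a rough back-of-the-envelope sketch. The 25 Mb/s connection speed is my assumption for illustration (the figure above does not state a speed), and the function names are made up for this example.

```python
# Back-of-the-envelope check of the zettabyte figures.
# ASSUMPTION: a 25 Mb/s connection; the article does not state a speed.
ZETTABYTE_IN_BYTES = 10**21   # SI definition of a zettabyte
GIGABYTE_IN_BYTES = 10**9     # SI definition of a gigabyte
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def gigabytes_per_zettabyte() -> int:
    """How many gigabytes fit in one zettabyte."""
    return ZETTABYTE_IN_BYTES // GIGABYTE_IN_BYTES


def download_years(zettabytes: float, megabits_per_second: float) -> float:
    """Years needed to download the given volume at a constant speed."""
    total_bits = zettabytes * ZETTABYTE_IN_BYTES * 8
    seconds = total_bits / (megabits_per_second * 10**6)
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{gigabytes_per_zettabyte():,} GB per ZB")
    print(f"{download_years(175, 25) / 1e9:.2f} billion years")
```

Under that assumed speed, the result lands close to the 1.8 billion years quoted above; a faster or slower connection scales the answer proportionally.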

And it only increases exponentially from here. From likes, posts, profile views, follows, and RTs for end consumers, to time on site, conversion rate, and bounce rate on websites, to events, errors, and anomaly tracking in IoT, all of this data is logged and tracked. We’ve seen billion-dollar companies built, taken public, and acquired that ingest and visualize all of the data we capture. The next generation of API startups are valuable in proportion to their ability to “talk” with apps in the ecosystem by sharing and ingesting data.

Of course, data makes us smarter. But with this proliferation, we’re seeing the downside of all of this data, or what Nick Halstead, founder of InfoSum (an Upfront portfolio company), observed about “data as oil”: “it’s sticky and gets all over the place.” As I’ve written before, improperly storing data isn’t new. What is new(ish) is how security is made harder by the immense quantity of data that exists and all of the different places you can put it. A developer with a credit card can spin up an instance in a cloud service provider (CSP) with petabytes of data in an afternoon. This is made harder still with over 40% of the global workforce working from home on insecure networks, on devices that are part of “bring your own device” (BYOD) programs, and creating, accessing, and storing data in multi-cloud environments. Gone are the days of ring-fencing the perimeter, securing the devices used to access the network, and relying on database administrators to act as data protectors and gatekeepers.

If you don’t know what data you have, let alone where it is, you can’t protect it. Even if you have tools alerting you to vulnerabilities, you can’t trust that these alerts really reflect your top priorities if your tools don’t have a complete view of your universe of data.

Governments are getting involved to help enforce better behavior. GDPR in Europe in 2018 and CCPA in California in 2020 are the first of what’s coming to the rest of the US and the world. While each privacy act will have nuances, the general purpose is the same: give consumers greater ability to opt in/out of what is shared and captured by companies with which they interact, and fine organizations that don’t comply.

All of these factors cause every organization to ask some questions: Where is all of our data, actually? How do we make sure it’s secure? How long should we hold on to it? Do we really need all of the data we have? At what point should we delete it? Legacy tools assess compliance and security periodically, like a financial audit, but only for data in known locations (it is, after all, very challenging to find something you don’t know you are looking for), and they are typically set up for structured data rather than unstructured data (data sitting in lakes). Consequently, here are some quotes we have heard in the industry about current data loss prevention (DLP) tools:

● “DLP is the biggest unsolved problem in security”

● “Nothing out there does data system discovery”

● “Data discovery…I get asked about it…there’s nothing”

The massive amount of data that enterprises sit on today requires a new approach. This is not lost on founders or investors. It’s getting A LOT of attention in startup land. Wiz, a cyber security company out of Israel, raised a $100M Series A from Index, Sequoia, Insight and others in December 2020. Their platform provides a visual representation of your cloud deployment across cloud service providers (CSPs) and levels of the tech stack (e.g., infrastructure, platform, containers, workloads) and generates a risk-weighted view of vulnerabilities. Open Raven (an Upfront portfolio company) raised a $15M Series A led by Kleiner Perkins in June 2020. They are building a data wrangling solution with the belief that no organization actually knows where all of its data sits. First, they inventory all of your data and then classify it, so you can determine what data you actually have and what is high risk. They believe you can’t afford to care about each and every one of the hundreds of millions of objects you store, just as you can’t care about the hundreds of alerts you get in security automation centers. You have to winnow it down using heuristics and rules to isolate what you really care about. They also set up companies to scale across different cloud environments cost effectively for both structured and unstructured data. These two companies are coming at the data visibility challenge from different angles, and there are a lot of other solutions needed in this space.
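To make the inventory, classify, and winnow idea concrete, here is a minimal sketch. It is purely illustrative and is not how Open Raven or any other vendor actually works: the `DataObject` type, the regex rules, the high-risk labels, and the bucket paths are all hypothetical, and real classifiers are far more careful than a few regular expressions.

```python
import re
from dataclasses import dataclass


@dataclass
class DataObject:
    location: str  # hypothetical path to a stored object
    sample: str    # small text sample read from that object


# Hypothetical classification rules: label -> pattern to look for.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

# Hypothetical policy: which labels count as high risk.
HIGH_RISK = {"us_ssn", "card_number"}


def classify(obj: DataObject) -> set:
    """Return every label whose pattern appears in the object's sample."""
    return {label for label, pattern in RULES.items() if pattern.search(obj.sample)}


def winnow(inventory: list) -> list:
    """Keep only objects carrying at least one high-risk label."""
    flagged = []
    for obj in inventory:
        labels = classify(obj)
        if labels & HIGH_RISK:
            flagged.append((obj.location, sorted(labels)))
    return flagged


if __name__ == "__main__":
    inventory = [
        DataObject("s3://example-bucket/logs/app.log", "GET /health 200"),
        DataObject("s3://example-bucket/exports/customers.csv", "jane@example.com,123-45-6789"),
        DataObject("gs://example-bucket/newsletter.txt", "contact: bob@example.org"),
    ]
    for location, labels in winnow(inventory):
        print(location, labels)
```

The point of the sketch is the shape of the workflow: out of everything in the inventory, only the objects that match a high-risk rule make it onto the list a human actually has to look at.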

Just as many of us will use Covid as a reference point in time, I believe we will come to view cyber security before SolarWinds (BSW) and after SolarWinds (ASW). Cyber security has been consuming more and more of my passion and interest because it encompasses and impacts so much business strategy. It is often hidden in what we view as device management companies or data companies, but it is becoming front and center as key to unlocking the power of the great migration to the cloud. I’m betting that the ASW period will be the most fertile in the history of the cyber security market — and I’m excited to be part of it.

Huge thanks to my colleague Spencer Calvert, who has been an invaluable partner in this project.

Partner @ Upfront, Formerly Founder @ Moonfrye, IAC (Urbanspoon, Citysearch, M&A, Tinder), Battery Ventures