
Unstructured Data: There by the window; with the trunk

Author: Tim Steele
Unstructured data – the elephant in the room

We See It

Gartner unstructured data

Everyone knows unstructured data is hanging around somewhere. Almost everyone knows it makes up well over half of all data (80%). And many likely know that it grows exponentially.

And yet, this six-ton object sits in the room with little mention. We see it out of the corner of our eye. But the agenda is full and we have little time to discuss elephants. We’re trying to keep people out of our jungle.

“They” See It

Sean Scott notes in Information & Data Manager that cybercriminals know critical unstructured data is a much easier target for theft than structured data, which is protected by corporate firewalls, identity and access controls, encryption, database activity monitoring and more.

Elephants Are Big & Bulky

"Junk" data

According to Robert Smallwood, Managing Director at the Institute for Information Governance:

Good business managers prefer “efficient” data. They would prefer to reduce costs and legal liabilities by getting rid of information that no longer has business value. 69% of stored data is junk.

Let’s Talk About the Elephant in the Room

The ROI of data privacy compliance can be expressed as 270% (Cisco). And as John Stufflebeam notes:

It’s hard to secure something if you don’t know you have it. That’s why the risks of unstructured data are so great. Employees create documents for many different reasons as part of their everyday work. It’s what those documents have in them, and where they’re kept, that must be considered.

Q: If we admit it’s harmful to have the elephant in the room, how do we get a six-ton animal through a door 32” wide by 80” tall?

A: A byte at a time

Heureka puts data governance policies into practice: inventory unstructured data, classify what requires disposition, eliminate ROT, and align unstructured data with compliance and governance initiatives. Getting that elephant through the door a byte at a time involves:

  1. Define a data classification policy – objectives:
    ‐ Workflows: how data classification impacts employees who use sensitive data
    ‐ Data classification categories
    ‐ Data owners: business unit responsibilities, sensitive data classifications, permissions
    ‐ Data category handling: security standards, data storage, access/sharing rights, encryption & retention policies
  2. Discover & Index all unstructured data and its categories
  3. Classify & Tag files based on sensitive content (PII/PHI) classification categories
  4. Data hygiene – delete Redundant, Obsolete, Trivial (ROT) data; move sensitive data to secure repositories
  5. Audit – review security policies and procedures to ensure all data is protected by risk-appropriate measures
  6. Repeat
Heureka process to disposition unstructured data
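
To make steps 2 through 4 concrete, here is a minimal Python sketch of a discover, classify, and flag pass over a folder. It is an illustration only and not Heureka's implementation: the names FileRecord, PATTERNS, discover, classify, flag_rot and inventory are hypothetical, the two regular expressions are toy stand-ins for real PII/PHI detection, and the seven-year staleness cutoff is an assumed retention period that an actual classification policy would set.

```python
import hashlib
import re
import time
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical, deliberately simple patterns; real PII/PHI detection
# needs far more than two regular expressions.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


@dataclass
class FileRecord:
    path: Path
    size: int
    mtime: float
    sha256: str
    tags: set = field(default_factory=set)


def discover(root):
    """Step 2: walk a directory tree and index every regular file."""
    records = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            stat = path.stat()
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            records.append(FileRecord(path, stat.st_size, stat.st_mtime, digest))
    return records


def classify(record):
    """Step 3: tag a file with each sensitive-content category it matches."""
    text = record.path.read_bytes().decode("utf-8", errors="ignore")
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            record.tags.add(label)


def flag_rot(records, max_age_days=7 * 365, now=None):
    """Step 4: flag redundant (duplicate content), obsolete (stale)
    and trivial (empty) files. Nothing is deleted here."""
    now = time.time() if now is None else now
    seen = set()
    for rec in records:
        if rec.sha256 in seen:
            rec.tags.add("redundant")
        seen.add(rec.sha256)
        if now - rec.mtime > max_age_days * 86400:
            rec.tags.add("obsolete")
        if rec.size == 0:
            rec.tags.add("trivial")


def inventory(root):
    """Run steps 2-4 in order and return the tagged index."""
    records = discover(root)
    for rec in records:
        classify(rec)
    flag_rot(records)
    return records
```

Calling inventory on a folder returns one record per file with its tags attached. The sketch only flags files and leaves deletion or relocation to a person, since disposition decisions belong to the data owners named in the classification policy.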

We would appreciate the opportunity to give you a tour