SiliconANGLE: How companies are scrambling to keep control of their private data from AI models

This week I got to write a very long piece on the current state of data leak protection (DLP) products designed specifically to protect enterprise AI usage.

Ever since artificial intelligence and large language models became popular earlier this year, organizations have struggled to keep their data from being exposed, whether accidentally or deliberately, when it is used as model input. They aren’t always succeeding.

Two notable cases have splashed into the news this year that illustrate each of those types of exposure: A huge cache of 38 terabytes’ worth of private data was accidentally made public by Microsoft via an open GitHub repository, and several Samsung engineers purposely put proprietary code into their ChatGPT queries.

I surveyed the numerous vendors that have added this feature to their security products, along with several startups focused on this area.

What differentiates DLP in the pre-AI era from DLP today is a fundamental shift in how this protection works. The DLP of yore involved checking network packets for patterns that matched high-risk elements, such as Social Security numbers, just as they were about to leave a secure part of your data infrastructure. But in the new world order of AI, you have to feed the beast up front, and if that diet includes all sorts of private information, you need to be more proactive in your DLP.
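To make the older approach concrete, here is a minimal sketch in Python of the kind of pattern matching described above, applied to text before it is sent anywhere. It is an illustration only, not how any particular vendor's product works. The names `SSN_PATTERN`, `find_ssns` and `redact_ssns` are hypothetical, and the pattern assumes Social Security numbers written in the common dashed form (123-45-6789).

```python
import re

# Assumption: SSNs appear in the dashed form NNN-NN-NNNN.
# Real DLP tools use far broader rules and validation than this.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every substring that looks like a dashed SSN."""
    return SSN_PATTERN.findall(text)


def redact_ssns(text: str) -> str:
    """Replace anything that looks like a dashed SSN with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED-SSN]", text)


if __name__ == "__main__":
    prompt = "Summarize the file for the employee with SSN 123-45-6789."
    if find_ssns(prompt):
        # Scrub the text before it goes into a model query.
        prompt = redact_ssns(prompt)
    print(prompt)
```

The same check could run at the network edge, as in the old days, or on a prompt before it reaches a model, which is closer to the proactive posture that AI usage calls for.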

I mentioned this to one of my readers, who had this to say about how our infrastructures have evolved over the years since we both began working in IT:

“In the late 90’s we had mostly dedicated business networks and systems, a fairly simple infrastructure setup. Then we went through web hosting and the needs to build DMZ networks. Then came shared web hosting facilities and shared cloud service offerings. Over the years cloud services have built massive API service offerings. Each step introduced an order of magnitude of complexity. Now with AI we’re dealing with massive amounts of data.”

If you are interested in how security vendors are approaching this issue, I would love to hear your comments after reading my post in SiliconANGLE. You can post them here.

Web Informant

David Strom's musings on technology
