Next week I am giving a speech at the Inside AI/LIVE event in San Francisco. I have been working for Inside.com for nearly three years, producing a daily email newsletter on infosec topics. The speech will cover the current trends in how AI is both the bane and the boon of IT security. In my talk, I will point to some of the innovators in this space that I have found in my travels. I thought I would touch on what I will be talking about here.
Usually, when we first hear about AI, we tend to go towards what I call the “Skynet scenario.” For those of you who haven’t seen any of the Terminator movies, this is that point in the future where the machines take over and kill all of the humans, and we are left with Arnold-as-robot and Kyle Reese to save us all from extinction. That isn’t a great place to start thinking about the relationship between AI and security to be sure.
Certainly, we have heard about many of the more notable recent AI fails, such as the gender bias of Amazon’s AI-based HR recruiting tool, the self-driving Uber car that killed a pedestrian, and the time Google Photos confused a skier with a mountain peak. But we need to get beyond these scenarios.
Perhaps a better place to start is to understand the workflow of machine learning (ML). Here we see that AI isn’t all that well suited to infosec. Why? Because the typical ML process tries to collect data, build an algorithm to model something that we think we know, and then use the model to predict some outcomes. That might work well for certain situations, but the infosec world is far too chaotic and too reliant on human interpretation of the data to work well with AI techniques.
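To make those three steps concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two features (failed logins per hour and outbound megabytes), the handful of labeled samples, and the nearest-centroid model standing in for whatever algorithm a real product would use.

```python
from statistics import mean

# Step 1: collect data. These labeled samples are invented for illustration:
# (failed_logins_per_hour, outbound_megabytes) -> label
TRAINING_DATA = [
    ((1.0, 5.0), "benign"),
    ((2.0, 8.0), "benign"),
    ((0.0, 3.0), "benign"),
    ((40.0, 90.0), "malicious"),
    ((55.0, 120.0), "malicious"),
    ((35.0, 70.0), "malicious"),
]


def train(samples):
    """Step 2: build the model, here one centroid (mean point) per label."""
    rows_by_label = {}
    for features, label in samples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def predict(model, features):
    """Step 3: predict by choosing the label with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(TRAINING_DATA)
print(predict(model, (50.0, 100.0)))  # noisy activity, near the malicious samples
print(predict(model, (1.0, 4.0)))     # quiet activity, near the benign samples
```

The second prediction hints at the weakness: an intruder who deliberately keeps both numbers low is scored as benign, because the model only knows what its training data happened to contain.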
On top of this, the world of malware is undergoing a major transformation these days. Hackers are moving from being mere nuisances, like script kiddies, to professional criminals who are interested in making money from their exploits. Malware is getting more complex, and the hackers are getting better at hiding their craft so that they can live longer inside our corporate networks and do more targeted damage. Adversaries are moving away from “spray and pray,” where they just blanket the globe with malware, and towards “target and stay,” where they are more selective and parsimonious with their attacks. Being selective also helps them hide from detection.
One issue for using AI techniques is that malware attribution is hard, something that I wrote about in a blog post for IBM’s Security Intelligence last year. For example, the infamous WannaCry ransomware was eventually attributed to the North Koreans, although at first it seemed to come from Chinese agents. It took a lot of research to figure this out, and one tell was the metadata in the code which showed the Korean time zone. AI can be more of a hindrance than help sometimes.
Another problem for security-related AI is that developers often don’t think about security until they have written their code and are in their testing phase. Certainly, security needs to be top of mind. This post offers some solid reasons why this needs to change.
In the past several years, Amazon, Google, and (most recently) Microsoft, along with many other IaaS players, have come out with ML toolkits that are pretty impressive. For a few bucks a month, you can rent a very capable server and build your own ML models for a wide variety of circumstances. That assumes that a) you know what you are doing and b) you have a solid enough dataset to use for creating your model. Neither of those assumptions may match your mix of skills or your situation.
Still, there is some hope in the AI/security space. Here are a few links to vendors that are trying to make better products using AI techniques.
First is a group that is using what is called homomorphic encryption, which allows computation on data while it stays encrypted. This solves the problem where you want to share different pieces of the same database with different data owners yet keep all of the data encrypted so that no one can inadvertently compromise it. This technology has been the darling of academia for many years, but there are now several startups bringing it to market, including ICE Cybersecurity, Duality Technologies’ SecurePlus, Enveil’s ZeroReveal, Capnion’s Ghost PII, and Preveil’s email and file security solutions. A good example of this is the San Diego-based Community Information Exchange, where multiple social service agencies can share data on their clients without revealing personal information.
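To give a feel for what “computing on encrypted data” means, here is a toy sketch in Python of the Paillier scheme, which is only additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is my own illustration, not the method any of the vendors above uses, and the primes, the function names, and the two-agency scenario are all assumptions. The key is far too small to be secure.

```python
import math
import secrets

# Toy key material: these primes are far too small for real use.
P, Q = 1009, 1013
N = P * Q                      # public modulus
N_SQUARED = N * N
G = N + 1                      # standard simple choice of generator
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)        # private: modular inverse (Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N using the public values N and G."""
    if not 0 <= message < N:
        raise ValueError("message out of range for this toy key")
    while True:
        r = secrets.randbelow(N - 1) + 1   # random value in 1..N-1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


def decrypt(ciphertext):
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(ciphertext, LAMBDA, N_SQUARED)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum."""
    return (c1 * c2) % N_SQUARED


# Two hypothetical agencies each encrypt a client count.
agency_a = encrypt(1200)
agency_b = encrypt(345)

# A third party adds them without ever seeing 1200 or 345.
encrypted_total = add_encrypted(agency_a, agency_b)
print(decrypt(encrypted_total))  # 1545
```

Only the holder of the private values can read the total, and whoever does the adding learns nothing about the individual counts. Fully homomorphic schemes extend this idea to more general computations, at a much higher computational cost.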
Google’s Chronicle business has a new security tool it calls Backstory. While still in limited release, it can ingest a great deal of data from your security logs and find patterns of compromise. In several cases, it identified intrusions at its clients that had happened years earlier and had not been detected by other means. That shows the power of AI for good!
Coinbase is using ML techniques to detect fraudulent users, such as those who upload fake IDs to try to open accounts. It matches patterns in these uploads, such as when someone uses a fake photo or makes a copy of someone else’s ID. And Cybraics has developed an AI engine that can be used to scan for vulnerabilities across your network.
Probably one of the more interesting AI/security applications is being developed by ZeroEyes. While not quite in production, it will detect weapons in near real time, hopefully identifying someone before they commit a crime. This isn’t too far afield from the pre-crime premise of Minority Report. We have certainly come a long way from those early Skynet days.
You can view the slide deck for my presentation at the conference below: