Is artificial intelligence vulnerable to hacking?



The paper The Limitations of Deep Learning in Adversarial Settings explores how neural networks can be fooled by an attacker who crafts small perturbations to the inputs the network classifies. The authors experiment with a neural network meant to read handwritten digits, undermining its reading ability by distorting the digit images it is shown so that they are misclassified.

I'm concerned that malicious actors might try hacking AI. For example:

  • Fooling autonomous vehicles into misinterpreting stop signs as speed limit signs.
  • Bypassing facial recognition, such as the systems used for ATM authentication.
  • Bypassing spam filters.
  • Fooling sentiment analysis of reviews of movies, hotels, etc.
  • Bypassing anomaly detection engines.
  • Faking voice commands.
  • Causing machine-learning-based medical predictions to be misclassified.

What adversarial effects could disrupt the world? How can we prevent them?

Surya Sg

Posted 2018-06-19T11:53:14.547

Reputation: 555

Consider that human intelligence is vulnerable to hacking – Gaius – 2018-06-20T12:10:29.887

Interesting. Are you interested in "adversarial settings risk models" or something closer to a traditional cyber-security answer but still squarely about A.I.? Best wishes. – Tautological Revelations – 2019-10-13T01:12:12.963



The way I see it, AI is vulnerable from two security perspectives:

  1. The classic method of exploiting outright programmatic errors to achieve some sort of code execution on the machine that is running the AI or to extract data.

  2. Trickery through the equivalent of AI optical illusions for the particular form of data that the system is designed to deal with.

The first has to be mitigated in the same way as any other software. I'm uncertain whether AI is any more vulnerable on this front than other software, though I'd be inclined to think the added complexity slightly heightens the risk.

The second is probably best mitigated both by the careful refinement of the system, as noted in some of the other answers, and by making the system more context-sensitive; many adversarial techniques rely on the input being assessed in a vacuum.
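These "optical illusion" attacks can be surprisingly cheap to construct. As a minimal sketch (assuming a model that exposes the gradient of its loss with respect to the input; the `loss_gradient` function here is a hypothetical stand-in), a fast-gradient-sign-style attack nudges every input feature slightly in the direction that increases the loss:

```python
import numpy as np

def fgsm_perturb(x, loss_gradient, epsilon=0.05):
    """Fast-gradient-sign-style perturbation: shift every input
    feature by +/- epsilon in the direction that increases the
    model's loss, producing an input that looks nearly unchanged
    to a human but may be misclassified by the model."""
    grad = loss_gradient(x)            # dLoss/dInput, same shape as x
    return x + epsilon * np.sign(grad)

# Toy demonstration with a made-up gradient function:
x = np.array([0.2, 0.5, 0.9])
adversarial = fgsm_perturb(x, lambda v: np.array([1.0, -2.0, 0.5]))
print(adversarial)  # [0.25 0.45 0.95] -- each feature moved by exactly 0.05
```

Making the system context-sensitive, as suggested above, raises the cost of such attacks, because the perturbation then has to fool several correlated inputs at once rather than a single image assessed in isolation.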

Christopher Griffith

Posted 2018-06-19T11:53:14.547

Reputation: 351

The split between code vulnerabilities and usage vulnerabilities is good. However, the code vulnerabilities typically are minuscule in AI. The complexity of AI lies in the data, whether that's node weights in a neural network or trees in a random forest. There's just a small bit of code to feed the AI, and the chief risk there is not overfeeding it - a classic buffer overflow risk, easily mitigated by late 20th century techniques. – MSalters – 2018-06-20T09:10:59.067

@MSalters I think it is hard to draw a general conclusion because the code complexity can vary a lot between different types of AI agents (I think your comment is largely accurate for neural networks). Furthermore, although the data and manipulation thereof is probably the larger attack surface it would be unwise to discount the same sort of attacks that have allowed remote code execution via compromised image files in the past that exploited flaws in image viewing applications. The vector is the data being passed in, but the behavior still falls under the code vulnerability header, I think. – Christopher Griffith – 2018-06-22T16:47:44.133


Programmer vs Programmer

It's an "infinity war": programmers vs. programmers. Everything can be hacked. Prevention depends on the level of knowledge of the professionals in charge of security and of the programmers working on application security.

E.g., there are several ways to identify a user trying to skew the metrics generated by sentiment analysis, but there are ways to circumvent those countermeasures as well. It's a pretty tedious fight.

Agent vs Agent

An interesting point that @DukeZhou raised is the evolution of this war to involve two artificial intelligences (agents). In that case, the battle goes to the more knowledgeable side: which model is better trained?

However, to achieve perfection on the vulnerability question, an artificial intelligence or artificial superintelligence would have to surpass the human capacity for circumvention. It is as if knowledge of every hack to date already existed in the mind of this agent, and it began to develop new ways of circumventing its own system and building protections. Complex, right?

I believe it's hard to have an AI that thinks: "Is the human going to use a photo instead of presenting his face to be identified?"

How can we prevent it?

By always having a human supervise the machine, and even then it will not be 100% effective. That is disregarding the possibility that an agent can improve its own model on its own.


So I think the scenario works this way: a programmer tries to circumvent the validations of an AI, and the AI's developer, acquiring knowledge through logs and tests, tries to build a smarter and safer model, reducing the chances of failure.

Guilherme IA

Posted 2018-06-19T11:53:14.547

Reputation: 691


How can we prevent it?

There is a good deal of work on AI verification. Automatic verifiers can prove robustness properties of neural networks: if the input X of the NN is perturbed by no more than a given limit ε (in some metric, e.g. L2), then the NN gives the same answer on it.

Several research groups have built such verifiers.

This approach may help to check robustness properties of neural networks. The next step is to construct a neural network that has the required robustness. Some of the above papers also contain methods for doing that.

There are different techniques to improve the robustness of neural networks: adversarial training (see e.g. A. Kurakin et al., ICLR 2017), defensive distillation (see e.g. N. Papernot et al., SSP 2016), and the Madry et al. defence (ICLR 2018).

At least the last one can provably make a NN more robust.
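The robustness property above can be illustrated (though not proved) by random sampling: formal verifiers establish it for all perturbations in the ε-ball, but a sampling sketch conveys the idea. Here `classify` is a stand-in for any trained model; everything else is a toy assumption:

```python
import numpy as np

def empirically_robust(x, classify, epsilon=0.1, trials=1000, seed=0):
    """Approximate check of the robustness property: does the
    classifier give the same label for every *sampled* perturbation
    of x with L2 norm at most epsilon?  A formal verifier would
    prove this for all perturbations, not just random samples."""
    rng = np.random.default_rng(seed)
    base_label = classify(x)
    for _ in range(trials):
        delta = rng.normal(size=x.shape)
        # Rescale the random direction to a radius inside the L2 ball.
        delta *= epsilon * rng.uniform() / np.linalg.norm(delta)
        if classify(x + delta) != base_label:
            return False  # found a label flip inside the epsilon-ball
    return True

# Toy classifier: label is whether the features sum to a positive value.
classify = lambda v: int(np.sum(v) > 0)
print(empirically_robust(np.array([1.0, 1.0]), classify, epsilon=0.5))   # True: far from the border
print(empirically_robust(np.array([0.01, 0.0]), classify, epsilon=0.5))  # False: sits on a class border
```

As MSalters notes in the comments, a result like this is inherently probabilistic: it can only establish that most of the sampled neighbourhood is label-stable, while the verifiers cited above prove it exhaustively.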

Ilya Palachev

Posted 2018-06-19T11:53:14.547

Reputation: 287

This sounds like an impossible claim... unless it's about some particular inputs X, rather than general inputs X? In which case, it seems to say next to nothing about hackability, since inputs need not be limited to perturbations of those in the training? – user541686 – 2018-06-19T21:54:00.973

@Mehrdad: It's probably achievable in a probabilistic sense if the input space is sufficiently structured that you can randomly sample it. That is to say, you can probably establish that for 95% of possible inputs, 95% of disturbances smaller than ε do not affect the class label. This is equivalent to establishing that the border between output classes in input space is smooth, or that the largest part of the input space does not lie near a class border. Obviously some part of the input space has to lie near a class border. – MSalters – 2018-06-20T09:15:09.513

I'm not sure this would apply in the "adversarial" case described in the paper: There, (IIRC) a back-propagated gradient is added to the whole picture, so the change to the complete input can be quite large - even if the change for each individual pixel is barely noticeable. – Niki – 2018-06-20T10:35:44.453

@MSalters: I guess, yeah. But then that seems to devalue it a fair bit unless you can actually show the pictures that are on a class border should actually be on a class border... – user541686 – 2018-06-20T16:17:36.013

The sentence "The next step is to construct such a neural network, that has required robustness" is an open research question. In general it's very hard to get rid of the NN non-robustness problem. But it's possible to enhance robustness by adversarial training (see e.g. A. Kurakin et al., ICLR 2017), defensive distillation (see e.g. N. Papernot et al., SSP 2016), or the MMSTV defence (Madry et al., ICLR 2018). At least the last one can provably make a NN more robust.

– Ilya Palachev – 2018-06-21T17:29:16.020

@Mehrdad Yes, inputs need to be limited to those in the training set. If we don't provide robustness on the training set, it is meaningless to say anything about robustness in the general case. So those works start from the trivial goal: provide robustness at least for the train/test set. Even this trivial goal is not trivial to reach. – Ilya Palachev – 2018-06-22T09:18:06.403


I believe it is; no system is safe. However, I am not sure if I can still say this after 20-30 years of AI development and evolution. Anyway, there are articles showing humans fooling AI (computer vision).


Posted 2018-06-19T11:53:14.547

Reputation: 41


Is Artificial Intelligence Vulnerable to Hacking?

Invert your question for a moment and think:

What would make AI less at risk of hacking than any other kind of software?

At the end of the day, software is software and there will always be bugs and security issues. AIs are at risk of all the problems non-AI software is at risk of; being AI doesn't grant them some kind of immunity.

As for AI-specific tampering, AI is at risk of being fed false information. Unlike most programs, an AI's functionality is determined by the data it consumes.

For a real-world example, a few years ago Microsoft created an AI chatbot called Tay. It took the people of Twitter less than 24 hours to teach it to say "We're going to build a wall, and mexico is going to pay for it" (screenshot from the Verge article linked below; I claim no credit for it).

And that's just the tip of the iceberg.

Some articles about Tay:

Now imagine that wasn't a chat bot; imagine it was an important piece of AI from a future where AIs are in charge of things like not killing the occupants of a car (i.e. a self-driving car) or not killing a patient on the operating table (i.e. some kind of medical-assistance equipment).

Granted, one would hope such AIs would be better secured against such threats, but supposing someone did find a way to feed such an AI masses of false information without being noticed (after all, the best hackers leave no trace), that genuinely could mean the difference between life and death.

Using the example of a self-driving car, imagine if false data could make the car think it needed to do an emergency stop when on a motorway. One of the applications for medical AI is life-or-death decisions in the ER, imagine if a hacker could tip the scales in favour of the wrong decision.

How can we prevent it?

Ultimately the scale of the risk depends on how reliant humans become on AI. For example, if humans took the judgement of an AI and never questioned it, they'd be opening themselves up to all sorts of manipulation. However, if they use the AI's analysis as just one part of the puzzle, it would become easier to spot when an AI is wrong, be it through accidental or malicious means.

In the case of a medical decision maker, don't just believe the AI, carry out physical tests and get some human opinions too. If two doctors disagree with the AI, throw out the AI's diagnosis.

In the case of a car, one possibility is to have several redundant systems that must essentially 'vote' about what to do. If a car had multiple AIs on separate systems that must vote about which action to take, a hacker would have to take out more than just one AI to get control or cause a stalemate. Importantly, if the AIs ran on different systems, the same exploitation used on one couldn't be done on another, further increasing the hacker's workload.
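The redundancy scheme in the last paragraph amounts to a majority vote with a safe fallback. A toy illustration of that idea (the lambda "models" are hypothetical stand-ins for independently developed systems, not a real automotive architecture):

```python
from collections import Counter

def vote(models, sensor_input):
    """Majority vote across redundant decision systems.  An attacker
    who compromises only a minority of models cannot change the
    outcome; if no strict majority exists, fall back to a safe
    default action instead of trusting any single system."""
    decisions = [model(sensor_input) for model in models]
    action, count = Counter(decisions).most_common(1)[0]
    return action if count > len(models) // 2 else "emergency_stop"

# Three independent (toy) models; one has been compromised.
honest = lambda s: "continue"
hacked = lambda s: "emergency_stop"
print(vote([honest, honest, hacked], sensor_input=None))  # "continue"
```

The safe-default branch matters as much as the vote itself: a hacker who merely causes a stalemate should degrade the system into its least dangerous behaviour, not an arbitrary one.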


Posted 2018-06-19T11:53:14.547

Reputation: 210

I like the idea of having several separate AI systems that have to reach agreement as a mitigation technique. Although then you'd have to be assured whatever voting mechanism they used couldn't be compromised to fake a decision. – Christopher Griffith – 2018-06-22T17:37:40.237

@ChristopherGriffith True, that is a risk. In the case of the car, the best way to mitigate that is to design the system so that an attacker would need physical access to manipulate it and make it hard to reach, so the person would have to break into the car to access it. Keeping a system offline is generally a good hacking countermeasure, though not always ideal. – Pharap – 2018-06-22T22:26:57.043


I concur with Akio that no system is completely safe, but the takeaway is that AI systems are less prone to attack than older systems because of their ability to constantly improve.

As time passes, more people will enter the field bringing new ideas, and hardware will improve, moving us toward "strong AI."

Simbarashe Timothy Motsi

Posted 2018-06-19T11:53:14.547

Reputation: 370


Is artificial intelligence vulnerable to hacking?

Hint: if you say that AI is vulnerable, then I disagree with you on such a statement. Artificial intelligence is divided into three categories, or phases, that we are supposed to go through, i.e.:

  • artificial narrow intelligence

  • artificial general intelligence

  • artificial super intelligence

Therefore, according to your statement, "I'm concerned that malicious actors might try hacking AI....."

Given the examples in your message body, we are at the level of artificial narrow intelligence, where a human hacker can twist his malicious code to invade such applications. However, if we jump straight to the final level of artificial intelligence, then by all means a human being cannot invade or hack a superintelligent software program or a high-tech superintelligent agent. For instance, a human hacker does one thing at a time, whereas there is nothing to stop an artificial intelligence from dividing its focus and doing many things simultaneously; it is hard to second-guess a mind that works like that.

For your information:

Do not be taken in by what the media says about AI generally, because they do not realize that the big thing to come is a new species that out-competes humans.

Just imagine living in a new, high-tech society. Check out the Cyber Grand Challenge; if you missed that event, then I'm sorry.


Posted 2018-06-19T11:53:14.547

Reputation: 1 349

I would imagine that even in a world with artificially super intelligent creations, there will still be ways to hack these systems using highly specialized tools that can simply outperform generalized AI systems at specific tasks. – krowe2 – 2018-06-20T15:24:42.717


Intelligence of any type is vulnerable to hacking, whether DNA-based or artificial. First, let's define hacking. In this context, hacking is the exploitation of weaknesses to gain specific ends, which may include status, financial gain, disruption of business or government, information that can be used for extortion, the upper hand in a business deal or election, or some other form of control or manipulation.

Here are examples of brain hacking strategies and their common objectives. Each of these has a digital system equivalent.

  • Government propaganda — predictable compliance
  • Scams — money
  • Spoofing — humorous public reaction
  • Role playing — gain trust to acquire access or manipulate
  • Pain centers — exploit addiction for increased income

Some are concerned about what has been called The Singularity, where intelligent software entities may be able to hack humans and their social structures to gain their own ends. That humans could hack the intelligent agents of other humans is another obvious possibility. I don't think training data is the only point of attack.

  • Parameter matrices can be overwritten in a way that is difficult to detect.
  • Reinforcement signals can be tampered with.
  • Known pockets of error in input permutations can be exploited.
  • The deterministic nature of digital systems can be exploited by other deep learners by duplicating the trained system and seeking points of vulnerability offline before exploiting them over a network.

The possibilities listed in the question deserve consideration, but this is my version of the list.

  • Murder by AV malfunction or spoofing identification systems in pharmacies or hospitals
  • Diversion of large quantities of shipped product to a recipient that did not pay for them
  • Social genocide by marginalizing specific groups of individuals

The only way to prevent it is to wait for a global extinction event, but there may be ways to mitigate it. Just as the program SATAN was written to find vulnerabilities in UNIX systems, intelligent systems may be devised to find vulnerabilities in other intelligent systems. Of course, just as programming models and conventional information systems can be designed with security in mind, reducing vulnerabilities to the degree reasonably possible from day one, AI systems can be designed with that objective in mind.

If you follow the information path of any system and consider the ways to read or write the signal at any point along the path, you can preemptively guard against those points of access. Clearly, taking care when acquiring data to use for training is key in the case mentioned in this question, and proper encryption along information pathways is needed, along with ensuring that no physical access is granted to unauthorized personnel, but I foresee battles between measures and countermeasures arising out of these concerns and opportunities.
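One concrete guard along that information path is an integrity check on the trained parameters at load time, so that an overwritten parameter matrix (the first attack point listed above) is detected rather than silently used. A minimal sketch using an HMAC over the serialized weights (the key value and its storage location are assumptions; real key management is out of scope):

```python
import hashlib
import hmac
import pickle

# Assumption: in practice this key lives in a hardware security module
# or secrets manager, never on the same host as the model file.
SECRET_KEY = b"example-key-kept-off-the-model-host"

def sign_parameters(params):
    """Serialize the parameters and compute an HMAC tag over the bytes."""
    blob = pickle.dumps(params)
    tag = hmac.new(SECRET_KEY, blob, hashlib.sha256).hexdigest()
    return blob, tag

def load_parameters(blob, tag):
    """Refuse to deserialize parameters whose tag does not verify."""
    expected = hmac.new(SECRET_KEY, blob, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("parameter file failed integrity check")
    return pickle.loads(blob)  # only unpickled after verification

blob, tag = sign_parameters({"layer1": [0.1, 0.2]})
print(load_parameters(blob, tag))              # loads fine
tampered = blob[:-1] + bytes([blob[-1] ^ 1])   # attacker flips one bit
# load_parameters(tampered, tag) -> ValueError
```

This only covers the storage leg of the path; the reinforcement signals and network inputs listed above each need their own authentication, which is where the measure/countermeasure battles mentioned here are likely to play out.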

Douglas Daseeco

Posted 2018-06-19T11:53:14.547

Reputation: 7 174


There are many ways to hack an AI. When I was a kid, I figured out how to beat a chess computer: it always followed the same pattern, and once you learn the pattern you can exploit it. The world's best hacker is a four-year-old who wants something; he will try different things until he establishes a pattern in his parents. Anyway, get an AI to learn the patterns of another AI, and given a certain combination you can figure out the outcome. There are also plain flaws or back doors in code, whether on purpose or by chance. There is also the possibility that the AI will hack itself; it's called misbehaving. Remember the small child again...

BTW, a simple measure is to make the AI always fail safe... something people forget.

Tony Lester

Posted 2018-06-19T11:53:14.547

Reputation: 1