What are causative and exploratory attacks in Adversarial Machine Learning?


I've been researching Adversarial Machine Learning, and I understand that a causative attack is one in which an attacker manipulates the training data, while an exploratory attack is one in which the attacker tries to find out about the machine learning model. However, there is not a lot of information on how an attacker can manipulate only the training data and not the test data set.

I have read about scenarios where the attacker performs an exploratory attack to find out about the ML model and then submits malicious input in order to tamper with the training data so that the model gives the wrong output. However, shouldn't such input manipulation affect both the test and training data sets? How does such tampering affect only the training data set and not the test data set?


Posted 2019-11-13T22:33:44.617

Reputation: 105



When someone is able to carry out a causative attack, it means there is a mechanism by which they can feed data into the network. Consider a website where people upload their images, the model outputs a guess at what is in the picture, and the user then clicks whether it got it right or not. If you keep uploading images and lying about the result, the model will get worse and worse, provided the operators add that user input to the training set. The test set is a separate matter: most people are careful not to mix new data into the testing sample. If they pooled the user input with the existing training and test data and then resampled the split, the test set could be contaminated as well, but most people don't do that. It's bad practice, and even worse than leaving your NN open to tampering from malicious user input. Information isn't really added to the model's knowledge until it is fed into the model and backpropagation occurs.
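To make the feedback-loop scenario concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the 1-D nearest-centroid "model", the toy numbers, and the helper names (`train_centroids`, `feedback_to_example`, and so on) are hypothetical and stand in for a real network and a real website. It only shows the data flow: lying feedback lands in the training set, while the held-out test set is split off once and never touched.

```python
from statistics import mean


def train_centroids(samples):
    """Fit a 1-D nearest-centroid classifier: one mean feature value per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, samples):
    """Fraction of (x, label) pairs the classifier gets right."""
    return sum(predict(centroids, x) == y for x, y in samples) / len(samples)


def feedback_to_example(model, x, user_says_correct):
    """Turn a user's right/wrong click into a labeled example (binary labels 0/1)."""
    guess = predict(model, x)
    label = guess if user_says_correct else 1 - guess
    return (x, label)


# Clean data, split once. The test set is frozen and never receives user input.
train_set = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
test_set = [(0.5, 0), (1.5, 0), (3.0, 0), (7.0, 1), (8.5, 1), (9.5, 1)]
test_set_before = list(test_set)

clean_model = train_centroids(train_set)

# The attacker submits inputs that really belong to class 1 and lies,
# clicking "wrong" on every correct guess.
poison = [
    feedback_to_example(clean_model, 9.0, user_says_correct=False)
    for _ in range(12)
]

# Only the training set absorbs the feedback; the model is then retrained.
poisoned_model = train_centroids(train_set + poison)

clean_acc = accuracy(clean_model, test_set)
poisoned_acc = accuracy(poisoned_model, test_set)
print("accuracy on the untouched test set:", clean_acc, "->", poisoned_acc)
```

The test set is identical before and after the attack; only the model evaluated on it has changed, which is why the damage shows up as lower accuracy on clean held-out data.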

An exploratory attack consists of sending a large number of queries to the model to gain information about the data set that was built into it, even to the point of extracting information about individual pieces of data in that set. With this information, the attacker could try to reconstruct the data set. They could also attempt to trick the network by sending strange generated inputs.
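As a rough illustration of probing, the sketch below treats the model as a black box that only answers queries and recovers its decision boundary by binary search. The threshold classifier, the search range, and the names `make_black_box` and `probe_boundary` are all hypothetical; a real model has a far more complex boundary, and real attacks use more elaborate query strategies. Note that nothing here alters the training process.

```python
def make_black_box(threshold):
    """Return a query-only classifier; the threshold is hidden from the caller."""
    def classify(x):
        return 1 if x >= threshold else 0
    return classify


def probe_boundary(classify, low, high, tolerance=1e-3):
    """Estimate the decision boundary using nothing but queries to the model."""
    if classify(low) != 0 or classify(high) != 1:
        raise ValueError("low and high must bracket the decision boundary")
    queries = 0
    while high - low > tolerance:
        mid = (low + high) / 2
        queries += 1
        if classify(mid) == 1:
            high = mid  # boundary is at or below mid
        else:
            low = mid   # boundary is above mid
    return (low + high) / 2, queries


# The defender's model; the attacker never sees the value 4.2 directly.
model = make_black_box(threshold=4.2)

estimate, queries = probe_boundary(model, low=0.0, high=10.0)
print("estimated boundary:", round(estimate, 3), "after", queries, "probes")

# With the boundary known, the attacker can craft an input that sits just
# on the side of it that they want.
evasive_input = estimate - 0.01
print("evasive input classified as:", model(evasive_input))
```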

In the paper Adversarial Machine Learning (2011), by Ling Huang et al., in section 2, the authors define these terms, under the category influence.


Causative - Causative attacks alter the training process through influence over the training data.

Exploratory - Exploratory attacks do not alter the training process but use other techniques, such as probing the detector, to discover information about it or its training data.

They also provide other related definitions.

Security violation

Integrity - Integrity attacks result in intrusion points being classified as normal (false negatives).

Availability - Availability attacks cause so many classification errors, both false negatives and false positives, that the system becomes effectively unusable.

Privacy - In a privacy violation, the adversary obtains information from the learner, compromising the secrecy or privacy of the system’s users.

Specificity (a continuous spectrum)

Targeted - In a targeted attack, the focus is on a single or small set of target points.

Indiscriminate - An indiscriminate adversary has a more flexible goal that involves a very general class of points, such as “any false negative.”
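The three axes above can be sketched as a small data structure, which makes it easy to see that only the influence axis decides whether training data is touched. This is an illustrative encoding, not something from the paper: the class and function names are made up, the two example profiles are my own classifications, and treating specificity as a two-valued enum is a simplification of what the authors describe as a continuous spectrum.

```python
from dataclasses import dataclass
from enum import Enum


class Influence(Enum):
    CAUSATIVE = "causative"      # alters training via the training data
    EXPLORATORY = "exploratory"  # probes the model, training is untouched


class SecurityViolation(Enum):
    INTEGRITY = "integrity"        # false negatives slip through
    AVAILABILITY = "availability"  # so many errors the system is unusable
    PRIVACY = "privacy"            # information leaks from the learner


class Specificity(Enum):
    # Simplification: the paper treats this axis as a continuous spectrum.
    TARGETED = "targeted"
    INDISCRIMINATE = "indiscriminate"


@dataclass(frozen=True)
class AttackProfile:
    influence: Influence
    violation: SecurityViolation
    specificity: Specificity


def touches_training_data(profile):
    """Only causative attacks influence the training data."""
    return profile.influence is Influence.CAUSATIVE


# Two hypothetical examples, classified by hand for illustration.
lying_feedback = AttackProfile(
    Influence.CAUSATIVE, SecurityViolation.AVAILABILITY, Specificity.INDISCRIMINATE
)
data_extraction = AttackProfile(
    Influence.EXPLORATORY, SecurityViolation.PRIVACY, Specificity.TARGETED
)

print(touches_training_data(lying_feedback), touches_training_data(data_extraction))
```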

Michael Hearn

Posted 2019-11-13T22:33:44.617

Reputation: 522