## Methods for learning with noisy labels


I am looking for a specific deep learning method that can train a neural network model with both clean and noisy labels.

More precisely, I would like this method to be able to leverage the noisy data as well, for instance by not fully "trusting" noisy samples, by weighting samples, or by deciding whether to use a specific sample for learning at all (a rough sketch of what I mean follows after the details below). But primarily, I am looking for inspiration.

Details:

• My task is sequence-to-sequence NLP,
• I have both clean pairs of sequences (clean input, clean output) and noisy pairs (noisy input, noisy output),
• I know for certain which samples in my data are noisy, and if possible, I would like the desired method to make use of this information.
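
To make the kind of method I am imagining concrete, here is a minimal sketch in PyTorch of the simplest form of "not fully trusting" noisy pairs: down-weighting their contribution to the loss using the known noisy/clean flag. The function name, the fixed `noisy_weight`, and the padding id are placeholders for illustration, not an existing method.

```python
import torch
import torch.nn.functional as F

def weighted_seq2seq_loss(logits, targets, is_noisy, noisy_weight=0.3, pad_id=0):
    """Token-level cross-entropy where pairs flagged as noisy are down-weighted.

    logits:   (batch, seq_len, vocab_size) decoder outputs
    targets:  (batch, seq_len) gold output token ids
    is_noisy: (batch,) bool tensor, True for pairs from the noisy portion of the data
    """
    batch, seq_len, vocab = logits.shape

    # Per-token loss, ignoring padding positions
    token_loss = F.cross_entropy(
        logits.reshape(-1, vocab),
        targets.reshape(-1),
        ignore_index=pad_id,
        reduction="none",
    ).reshape(batch, seq_len)

    # Average over non-pad tokens to get one loss value per pair
    mask = (targets != pad_id).float()
    sample_loss = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

    # Trust clean pairs fully, noisy pairs only partially
    weights = torch.where(
        is_noisy,
        torch.full_like(sample_loss, noisy_weight),
        torch.ones_like(sample_loss),
    )
    return (weights * sample_loss).mean()
```

A fixed weight is of course the crudest version; the weight could just as well be predicted per sample by a model, which is the kind of method I am hoping to find references for.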

Edit: Noisy vs. negative examples

The answer below suggests:

> First, I wouldn't use the word "noisy" here because if you know which instances are "wrong" then these are not noise, they are negative examples.

My view is that the data I have are noisy examples, but not "negative". Using an example from machine translation from German to English:

clean (equivalent meaning)

DE Wenn es um die Medien geht, lebt Amerika in einem Paralleluniversum.
EN Regarding media, the US are living in a parallel universe.


noisy (meaning overlap)

DE Wenn es um die Medien geht, lebt Amerika in einem Paralleluniversum.
EN Regarding media, the US are weird.


negative (unrelated)

DE Wenn es um die Medien geht, lebt Amerika in einem Paralleluniversum.
EN Is Math related to science?


There's a nice recent paper, Self Training With Noisy Labels, it might help. (It's for images, but the idea is general.) – Aditya – 2020-02-17T03:51:10.793

@Aditya Thanks, will have a look! – Mathias Müller – 2020-02-17T07:36:08.850


First, I wouldn't use the word "noisy" here because if you know which instances are "wrong" then these are not noise, they are negative examples. In my opinion "noisy" is when positive and negative cases are mixed together in a way that makes it difficult (or impossible) to distinguish between them. I think this matters because you're more likely to find similar use cases and relevant methods using this terminology.

I don't have a precise method to suggest, but I would check the state of the art in machine translation: it is also a sequence-to-sequence task in which there are potential positive/negative cases. In particular, there has been some work on MT quality estimation (QE), where the goal is to predict the quality of a translation of a sentence. This might be relevant because it is about labelling or quantifying how good a translation is, and I would assume there are works which re-use labelled/scored translations (including potentially wrong ones) in order to obtain a better model. Unfortunately I don't have any pointers since I haven't followed the field recently.
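
I don't have a concrete reference for such a pipeline, but the general pattern I have in mind would look something like the sketch below. The function name, the threshold, and the idea of re-using the QE score directly as a per-pair training weight are my own illustrative assumptions, not a published method.

```python
from typing import List, Tuple

def weight_pairs_by_quality(
    pairs: List[Tuple[str, str]],      # (source, target) translation pairs
    quality_scores: List[float],       # QE scores in [0, 1], one per pair
    min_quality: float = 0.3,          # drop pairs estimated as very bad
) -> List[Tuple[str, str, float]]:
    """Turn QE scores into per-pair training weights.

    Pairs below the threshold are discarded; the rest keep their score as a
    weight, so a training loop can scale each pair's loss by it.
    """
    weighted = []
    for (src, tgt), score in zip(pairs, quality_scores):
        if score >= min_quality:
            weighted.append((src, tgt, score))
    return weighted


# Example: the clean pair keeps a high weight, the noisy one a lower weight,
# and a clearly unrelated pair is filtered out entirely.
data = [
    ("Wenn es um die Medien geht, ...", "Regarding media, the US are living in a parallel universe."),
    ("Wenn es um die Medien geht, ...", "Regarding media, the US are weird."),
    ("Wenn es um die Medien geht, ...", "Is Math related to science?"),
]
scores = [0.95, 0.6, 0.05]
print(weight_pairs_by_quality(data, scores))
```

Whether the scores come from an existing QE model or from the MT model's own confidence is exactly the kind of design choice the QE literature might help with.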

Thanks for your answer! I do not think "negative examples" is a good term, since the ones I have are not entirely wrong, only partially wrong. The labels can be partially wrong since they are sequences of words. Will edit my question to clarify. My apologies if I'm misinterpreting the meaning of "negative example". – Mathias Müller – 2020-02-16T19:29:46.000

@MathiasMüller ok I see it's not binary, so maybe you're right that the positive/negative terminology doesn't work. I'm still a bit skeptical about "noisy", but I don't have any better word. On the technical side this shares a lot of similarities with MT Quality Estimation: the "quality" can either be represented as a numerical score or with ordered classes. I vaguely remember works where the predicted "quality" was used to re-train a model somehow. – Erwan – 2020-02-16T23:36:34.383

Thanks Erwan! I'm doing research in MT at the moment, so I'm aware of quality estimation methods in general, but I will have another look. What I have in mind, though, is a generic method to train a neural network that takes label quality into account. – Mathias Müller – 2020-02-17T07:42:36.417