What is the defining set in NLP?



I am reading the paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings".

On page 6, we read:

Step 1: Identify gender subspace. Inputs: word sets W, defining sets D_1, ..., D_m.

However, the paper does not say, either before or after this statement, what these defining sets are. Can anyone give me a definition or description of these sets?

Thank you.


Posted 2019-10-09T15:46:11.343

Reputation: 143



See the first sentence of Section 6:

The debiasing algorithms are defined in terms of sets of words rather than just pairs, for generality, so that we can consider other biases such as racial or religious biases.

$D_1, D_2, \ldots, D_m$ are the sets of words that the de-biasing algorithm operates on, in place of a single pair of words such as Computer Programmer and Houseworker for de-biasing along the man/woman gender axis. Hence, as an example, one of the $D_i$ could be {'Computer Programmer', 'Houseworker'}.
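To make the role of the sets concrete, here is a minimal sketch of Step 1 as I read it in the paper: each defining set is centered on its own mean, the centered vectors are accumulated into a covariance-like matrix, and the top $k$ singular vectors of that matrix span the bias subspace. The function name `bias_subspace`, the toy 3-dimensional vectors, and the choice of {she, he} and {woman, man} as defining sets are assumptions made for this illustration; none of it is taken from the authors' code.

```python
import numpy as np

def bias_subspace(embedding, defining_sets, k=1):
    """Return k orthonormal directions (as rows) spanning the bias subspace.

    embedding: dict mapping word -> vector (hypothetical toy format)
    defining_sets: list of word lists D_1, ..., D_m
    """
    dim = len(next(iter(embedding.values())))
    C = np.zeros((dim, dim))
    for words in defining_sets:
        vecs = np.array([embedding[w] for w in words], dtype=float)
        # Center each set on its own mean so only within-set differences remain.
        centered = vecs - vecs.mean(axis=0)
        C += centered.T @ centered / len(words)
    # C is symmetric positive semi-definite; the rows of vt are its principal directions.
    _, _, vt = np.linalg.svd(C)
    return vt[:k]

# Toy embedding (made-up numbers): the first coordinate plays the role of "gender".
toy = {
    "she": [1.0, 0.2, 0.0], "he": [-1.0, 0.2, 0.0],
    "woman": [0.9, -0.3, 0.1], "man": [-1.1, -0.3, 0.1],
}
D = [["she", "he"], ["woman", "man"]]
B = bias_subspace(toy, D, k=1)
print(B)  # a unit vector along the first axis, up to sign
```

Nothing in this sketch requires the sets to be pairs: a set with three or more words is centered and accumulated in exactly the same way.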


Posted 2019-10-09T15:46:11.343

Reputation: 1 001

OK! In your last sentence you have {'Computer Programmer', 'Houseworker'}. Can you give me an example of a set with more than 2 elements? They have also posted their code on GitHub: https://colab.research.google.com/github/tolga-b/debiaswe/

– chikitin – 2019-10-09T18:00:50.473

@chikitin That happens when the attribute has more than two values; it depends on the application. For example, if you have three gender types such as man, woman, and neutral, you will have {'Computer Programmer', 'Houseworker', 'Helper'}. It is just an artificial example to illustrate the case! – OmG – 2019-10-09T18:06:52.073

OK! Now, suppose we want to de-bias against race and nationality in addition to gender. Do we need to apply the algorithm two more times in cascade? I thought this could be done in one shot by defining D1 for gender, D2 for race, D3 for nationality, and setting k = 3 in Step 1. – chikitin – 2019-10-09T18:42:00.277

@chikitin I don't think so. At least, the example does not show the case. – OmG – 2019-10-10T10:04:16.763

No worries. I contacted all the authors; one of them said to look at the GitHub and figure it out myself. In the code there is no mention of 'defining', and I have not received a response from the other authors. I will wait a few more days for the others to respond! – chikitin – 2019-10-11T11:56:32.313