How should ethics be applied in data science?



There was a recent furore over Facebook experimenting on its users to see if it could alter users' emotions, and now OkCupid has done something similar.

Whilst I am not a professional data scientist, I read about data science ethics in Cathy O'Neil's book 'Doing Data Science' and would like to know whether this is something that professionals are taught at an academic level (I would expect so), or something that is ignored or only lightly applied in the professional world, particularly by those who ended up doing data science accidentally.

Whilst the linked article touched on data integrity, the book also discussed the ethics of understanding the impact of the data models that are created. Those models can have adverse effects when they are used inappropriately (sometimes unwittingly) or when they are inaccurate.

The article discusses a code of practice and mentions the Data Science Association's Code of Conduct. Is this something that is in use? Rule 7 is of particular interest (quoted from their website):

(a) A person who consults with a data scientist about the possibility of forming a client-data scientist relationship with respect to a matter is a prospective client.

(b) Even when no client-data scientist relationship ensues, a data scientist who has learned information from a prospective client shall not use or reveal that information.

(c) A data scientist subject to paragraph (b) shall not provide professional data science services for a client with interests materially adverse to those of a prospective client in the same or a substantially related industry if the data scientist received information from the prospective client that could be significantly harmful to that person in the matter

Is this something that is practiced professionally? Many users blindly accept that they get some free service (mail, social network, image hosting, blog platform, etc.) and agree to an EULA in exchange for having ads pushed at them.

Finally, how is this regulated? I often read about users being up in arms when the terms of a service change, but it seems to require some liberty organisation, a class action or a senator to react before anything happens.

By the way, I am not making any judgements here or saying that all data scientists behave like this; I'm interested in what is taught academically and practiced professionally.


I don't have enough rep to add an ethics tag or perhaps social experimentation and facebook, if someone could oblige – EdChum – 2014-07-23T14:06:20.877

I'm a bit confused as to what you're actually asking here. The title of this question seems inconsistent with the body and how ethics "should" be applied is way beyond the scope of this site. Academic programs are pretty diverse, but a question about professional standards accountability seems like it would be both answerable and reasonably scoped. Can you clarify or refocus your question? – Air – 2014-07-23T19:40:09.743


You may also find this meta discussion interesting, if you haven't read it already.

@AirThomas Thanks for that meta link, I wasn't sure if this would be on topic or not. I guess what I'd like to know is whether ethics are taught or imbued academically and in the professional workplace, and whether a moral code of practice exists for data science. This is a naive and broad question, I admit – EdChum – 2014-07-23T20:00:22.007

Please, re-edit your question. I feel you have more than one question in your mind. Can you appropriately list them (first paragraph for introducing what you have read, second paragraph for your opinion about it, and third paragraph for listing your questions)? – None – 2014-07-28T17:12:41.130

I'd suggest removing the word "moral" from the title of the question. Ethics, by definition, implies a moral aspect as foundational. – Aleksandr Blekh – 2014-07-29T12:38:08.847

@AleksandrBlekh Sure, will do, thanks for the feedback. I noted today that OkCupid have just admitted experimenting on users. – EdChum – 2014-07-29T12:39:48.540



I think ethics in Data Science is important. There is a fundamental difference between using user data to better their experience and show relevant ads, and using user data to trick people into clicking on ads for the sake of monetary profit. Personally, I like ads that give me relevant information, like deals on things I would buy anyway. However, showing me weight loss ads because I got dumped is creepy and unethical. As my friend Peter always says, "don't be creepy with data".


There don't seem to be any special ethics in data science - exploiting someone else's pain to make a buck is plain wrong. So is stealing their info to make a buck. It's not like being a doctor and deciding whether to push the patient towards surgery that has only an outside chance of working, when that's what they want but you know their spouse and kids just want to make the most of the time that they could have together. So far these things seem little taught - but if anything is required, it is assistance in recognising when people are being unfairly exploited.

