Are there any ongoing projects which use the Stack Exchange for machine learning?



Are there any ongoing AI projects which use the Stack Exchange for machine learning?


Posted 2016-09-14T10:16:51.237

Reputation: 341



There certainly appear to have been research projects applying some form of text mining, information retrieval, etc. to Stack Exchange sites.

Some examples I was able to find through Google and Google Scholar (unlikely to be anywhere near an exhaustive list):

More generally, automated question answering still appears to be a rather active area of research, not a trivial or "solved" problem. Stack Exchange can be one source of data for such systems, but there are plenty of other sources of data too (Wikipedia, Quora, etc.).

Dennis Soemers

Posted 2016-09-14T10:16:51.237

Reputation: 7 644


DuckDuckGo learns answers to technical questions from Stack Exchange. Type a technical question like "ongoing projects use stackexchange" into DuckDuckGo and it will show a highlighted summary of the answer on the right-hand side. DuckDuckGo also has an open API covering many more (hundreds of) question-answering data sources. Or you can go directly to the Stack Exchange API.
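As a rough sketch of going directly to the Stack Exchange API, the snippet below builds a search request and pulls titles and links out of a response. It assumes API version 2.3 and the `/search/advanced` route with the `q`, `site`, `order`, `sort` and `pagesize` parameters; the helper names and the sample payload are made up for illustration, so check the current API documentation before relying on any of it.

```python
import json
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version


def build_search_url(query, site="ai", page_size=5):
    """Build a /search/advanced request URL (hypothetical helper)."""
    params = {
        "q": query,            # free-text search terms
        "site": site,          # which Stack Exchange site to search
        "order": "desc",
        "sort": "relevance",
        "pagesize": page_size,
    }
    return f"{API_ROOT}/search/advanced?{urlencode(params)}"


def extract_questions(payload):
    """Return (title, link) pairs from a decoded response wrapper."""
    return [(item["title"], item["link"]) for item in payload.get("items", [])]


# Illustrative payload shaped like the API's response wrapper; not real data.
sample = json.loads(
    '{"items": [{"title": "Example question",'
    ' "link": "https://ai.stackexchange.com/q/1"}], "has_more": false}'
)

print(build_search_url("machine learning datasets"))
print(extract_questions(sample))
```

The sketch stops short of a network call on purpose. As far as I know the API compresses its responses, so a real client has to decompress the body before decoding the JSON; an HTTP library such as `requests` does that transparently.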

Projects can use the data from the SE open API as long as they comply with Stack Exchange's terms of use (TOU). Basically, just make sure your users can tell that the data came from Stack Exchange. The copyright license may also limit your ability to alter the text, say with a learned abstractive summarizer. Perhaps that is why DuckDuckGo just highlights keywords.
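One way to keep the source visible to users is to carry an attribution string alongside every piece of text you display. The sketch below is only an illustration: `format_attribution` is a hypothetical helper, the `title`, `link` and `owner.display_name` fields are assumed to be present on the item, and the exact wording the terms require should be taken from Stack Exchange's own attribution guidance rather than from this example.

```python
def format_attribution(item, site_name="Stack Exchange"):
    """Build a human-readable credit line for one question item.

    Hypothetical helper: the field names are assumptions about the item shape.
    """
    owner = item.get("owner", {})
    author = owner.get("display_name", "unknown author")
    return f'"{item["title"]}" by {author}, from {site_name}: {item["link"]}'


# Illustrative item; not real data.
example_item = {
    "title": "Example question",
    "link": "https://ai.stackexchange.com/q/1",
    "owner": {"display_name": "example_user"},
}

print(format_attribution(example_item))
```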

Data rights law is in flux, especially when it comes to the data you submit to a site and the machine learning models derived from that data. New European data and privacy rules empower you to download or delete all data you submit to a site like Stack Exchange.


Posted 2016-09-14T10:16:51.237

Reputation: 151