Neo4j vs OrientDB vs Titan

13

2

I am working on a data-science project on social relationship mining and need to store the data in a graph database. Initially I chose Neo4j, but it seems Neo4j doesn't scale well. The alternatives I found are Titan and OrientDB. I have gone through this comparison of the three databases, but I would like more details. Could someone help me choose the best one? Mainly I would like to compare performance, scaling, available online documentation/tutorials, Python library support, query language complexity and graph algorithm support. Also, are there any other good database options?

Sreejithc321

Posted 2014-12-18T04:36:06.107

Reputation: 1 810

1

Also consider GraphLab (Python-based): http://graphlab.com/products/create/overview.html Here's a good blog post about it as well: http://bugra.github.io/work/notes/2014-04-06/graphs-databases-and-graphlab/ I can't help you with the Titan vs OrientDB discussion, though. Hopefully someone will chime in with that.

– nfmcclure – 2014-12-18T18:03:46.183

Also possible to use Spark and GraphX – sheldonkreger – 2014-12-20T21:00:53.057

2

No, it's not; GraphX is not a database. – Emre – 2015-03-14T04:59:30.633

Since this was a couple of months back, I assume you have made some progress. Why not add your own answer (here or elsewhere)? – Jayan – 2015-06-02T16:59:31.707

Hi @Jayan, for our use case we were initially thinking of storing the entire data set in Neo4j, but we finally chose MongoDB as the central database and stuck with Neo4j for analyzing relationships only (not as the central DB). We are also exploring Spark and GraphLab. – Sreejithc321 – 2015-06-07T11:39:32.940

This article gives some details on scalability, where Titan has a particular advantage. https://groups.google.com/forum/#%21topic/orient-database/CpPh42ukfH4

– Henry H. – 2015-01-12T10:31:34.573

Answers

1

I think you have to keep the overall data pipelines and machine learning pipelines in mind. For that you need a robust framework for moving data between table-like and graph-like storage, in addition to powerful distributed processing. From my understanding, Spark GraphX is a promising way to build these pipelines. The talk on GraphX by Joseph Gonzalez (one of the creators of GraphLab at CMU) on YouTube is worth watching.
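To make "moving data between table-like and graph-like storage" a bit more concrete, here is a minimal sketch in plain Python. It is not GraphX or any particular database API; the row data and the function names rows_to_adjacency and adjacency_to_rows are made up for illustration, and the graph is assumed to be undirected.

```python
from collections import defaultdict

# Hypothetical table-like data: one row per relationship (source, target).
rows = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
]


def rows_to_adjacency(rows):
    """Build an undirected adjacency map (node -> set of neighbours) from edge rows."""
    adjacency = defaultdict(set)
    for source, target in rows:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


def adjacency_to_rows(adjacency):
    """Flatten an undirected adjacency map back into sorted, de-duplicated edge rows."""
    edges = {
        tuple(sorted((node, neighbour)))
        for node, neighbours in adjacency.items()
        for neighbour in neighbours
    }
    return sorted(edges)


# Graph-like view: useful for relationship queries such as node degree.
graph = rows_to_adjacency(rows)
degree = {node: len(neighbours) for node, neighbours in graph.items()}

# Table-like view again: useful for feeding tabular ML or a document store.
edge_table = adjacency_to_rows(graph)
```

In a real pipeline the same round trip would happen between the central store and the graph engine, just at a scale where a distributed framework does the conversion.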

Srini Vemula

Posted 2014-12-18T04:36:06.107

Reputation: 39