Wednesday, 7 December 2016

Graph Database OrientDB

Neo4j's book 'Graph Databases' provides an easy introduction to the topic (search for its PDF version). Concentrate on the schema design principles, which are supported by practical examples, and skip learning the Cypher query language unless you plan to use Neo4j in your projects.

On Dec 7th, 2016 there was a good webinar about migrating data from Neo4j to OrientDB. It was surprising to see what OrientDB has under the hood: nice SQL, indexing, clustering, sharding, a security model and a visual representation of the actual graphs stored in the database. OrientDB can also be used as a schema-less or schema-based graph-oriented database, as a document-oriented database, or with both models (graph and document) simultaneously. Unfortunately it doesn't run on Java ME yet (it's on the roadmap though). A shortened earlier version of this webinar is available on YouTube. Below is a copy of the questions and answers that came up during the presentation:

Q: Is OrientDB enterprise edition available under an open source license as well as commercial?
A: OrientDB Enterprise Edition is commercial only, but the Community Edition is licensed as open source under Apache 2. You can try Enterprise Edition for 45 days before deciding.

Q: Like a viral license (AGPL) that would allow us to use Enterprise Edition if we open source the code we use with Orient?
A: OrientDB Community Edition is licensed under Apache 2, so it's not viral. You can use it for any purpose, even embedding it, at no cost.

Q: Which version of OrientDB are you using?
A: OrientDB v2.2.13

Q: Do you offer a startup program? So that small companies can use enterprise edition at no/low cost?
A: Absolutely. We provide a 50% discount for startups. No hidden costs, it's all on our web site.

Q: When using the object model, are there constraints to consider? Can I mix and match models?
A: The object model (JPA-like) doesn't work very well with the graph one, so we suggest picking one of them. The Graph model is more powerful and better supported.

Q: Does the choice of schema mode influence the performance of standard or index-based lookups?
A: Yes, using the schema makes the database much smaller (property names are not saved in each record), and a smaller database is faster.
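
To make the size argument concrete, here is a minimal sketch of why leaving property names out of each record shrinks it. JSON is used only as a stand-in encoding; it is not OrientDB's record format, and the record, field names and schema list are made up for illustration.

```python
import json

# Hypothetical record and schema; the field names are illustrative only.
record = {"first_name": "Ada", "last_name": "Lovelace", "year_of_birth": 1815}
schema = ["first_name", "last_name", "year_of_birth"]  # declared once per class

# Schema-less: every record carries its own property names next to the values.
schemaless_bytes = json.dumps(record).encode("utf-8")

# Schema-based: names live in the schema, so a record stores values in schema order.
schemafull_bytes = json.dumps([record[name] for name in schema]).encode("utf-8")

print(len(schemaless_bytes), len(schemafull_bytes))
```

The saving per record is roughly the length of the property names, so it grows with the number of records and with how verbose the names are.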

Q: It is inaccurate to say that Neo4j does not support inheritance or polymorphic queries: it does with labels.
A: Neo4j labels are not polymorphic and there is no such concept in Neo4j. For more information look at http://stackoverflow.com/questions/24873067/how-to-work-with-type-hierarchies-in-neo4j, especially at the last answer/comment.
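
For readers unfamiliar with the term, the sketch below shows what a "polymorphic query" means in general: selecting from a class also returns records of its subclasses. It is plain Python over an in-memory list, not OrientDB or Neo4j code, and the class names, records and the `select_from` helper are hypothetical.

```python
# Hypothetical class hierarchy: child class -> parent class.
PARENT = {"Car": "Vehicle", "Truck": "Vehicle", "Vehicle": None}

RECORDS = [
    {"class": "Car", "name": "hatchback"},
    {"class": "Truck", "name": "pickup"},
    {"class": "Vehicle", "name": "generic"},
]

def is_a(cls, ancestor):
    """Walk up the hierarchy to check whether cls is ancestor or one of its subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

def select_from(cls, polymorphic=True):
    """Polymorphic: include subclasses. Non-polymorphic: exact class match only."""
    if polymorphic:
        return [r for r in RECORDS if is_a(r["class"], cls)]
    return [r for r in RECORDS if r["class"] == cls]

print([r["name"] for r in select_from("Vehicle")])
print([r["name"] for r in select_from("Vehicle", polymorphic=False)])
```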

Q: How would you compare Orient's extended SQL to Gremlin?
A: SQL and Gremlin are quite different in many senses: SQL is declarative, while Gremlin is more oriented to step-by-step traversal/filtering. Please consider that both OrientDB and Neo4j support Gremlin, which is a standard, so you can easily migrate from one to the other.

Q: How do you do variable depth queries in OrientDB? e.g. min depth 1 to max depth 5
A: You can use a mix of TRAVERSE and SELECT, like "SELECT FROM (TRAVERSE ... WHILE $depth < 10) WHERE $depth > 3". Or you can use the MATCH syntax, where you have distinct WHILE and WHERE conditions, one for the traversal and the other for filtering.
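
The two-condition idea in this answer (one bound that stops the traversal, one filter applied to what was visited) can be sketched outside the database. The Python below is an illustration only, not OrientDB code: it assumes a breadth-first walk over a made-up adjacency dict, so that depth equals hop distance, and the function name is hypothetical.

```python
from collections import deque

def nodes_in_depth_range(graph, start, min_depth, max_depth):
    """Return {node: depth} for nodes whose hop distance from start is in [min_depth, max_depth]."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # traversal condition: do not expand past max_depth
        for neighbour in graph.get(node, []):
            if neighbour not in depths:  # visit each node once
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    # filter condition: keep only nodes at or beyond min_depth
    return {n: d for n, d in depths.items() if d >= min_depth}

# Hypothetical chain a -> b -> c -> d
graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
result = nodes_in_depth_range(graph, "a", 1, 2)
print(result)
```

The upper bound prunes work during the walk, while the lower bound can only be applied afterwards, which is why the answer uses two separate conditions.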

Q: Can you query across clusters? For example if the graph is fully connected?
A: Yes, OrientDB will manage the query for you. Of course the query performance depends on how many hops you do between clusters.

Q: So a cluster is a cut of the graph essentially?
A: Yes, exactly

Q: If you can write to multiple masters, how does Orient handle consistency / transactions?
A: OrientDB supports distributed transactions that assure the consistency of the database by using a 2-phase locking protocol across the servers.

Q: If you can write to multiple masters, how does Orient handle consistency / transactions?
A: Consistency is based on MVCC and quorum based consensus.
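
As a rough illustration of the two terms in this answer, the sketch below shows an optimistic version check (the core of MVCC-style conflict detection) and a simple majority quorum test. It is not OrientDB's implementation: the `Record` class, the `VersionConflict` exception and the majority rule are assumptions chosen to keep the example small.

```python
class VersionConflict(Exception):
    """Raised when a writer's view of the record is stale."""

class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def update(self, new_value, expected_version):
        # Optimistic check: reject writers that read an older version.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{self.version}"
            )
        self.value = new_value
        self.version += 1

def quorum_reached(acks, replicas):
    # Assumed majority quorum: strictly more than half of the replicas acknowledged.
    return acks > replicas // 2

# Two writers read the same version; only the first update succeeds.
rec = Record("draft")
seen = rec.version
rec.update("first writer", seen)
try:
    rec.update("second writer", seen)
    conflict = False
except VersionConflict:
    conflict = True

print(rec.value, rec.version, conflict, quorum_reached(2, 3))
```

The losing writer would normally re-read the record and retry, which is the usual cost of optimistic schemes under contention.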

Q: What requirements does OrientDB have for Java runtime? Will it run on Java ME?
A: In terms of runtime, it only needs Java SE (Java ME is not supported for now).

Q: What version of Neo4j are you comparing OrientDB to? And what version of OrientDB are you talking about?
A: We compared the latest GA versions of both products, so Neo4j 3.0.6 and OrientDB 2.2.13.

Q: Does Orient work with LDAP / Active Directory?
A: OrientDB supports Kerberos and you can import LDAP users.

Q: Neo4j seems to be a much larger company (in fact I think they just raised a bunch of money). Why do you think it is possible for Orient to have so many more features than Neo4j while being a much smaller company?
A: Receiving funding is not a guarantee the company will be there tomorrow. Look at what happened to RethinkDB and other companies that received funding but weren't focused on building a sustainable business. The OrientDB company has been profitable for 3 years and its investors are its clients. We believe this is the only healthy business :-)

Q: Does the key/value model (like Redis) stay in memory? We use Redis for the speed (memory residence). How will the speed be affected if we drop Redis and move to Orient for Key/Value?
A: As "just" a Key/Value DBMS, Redis is faster, so if you just need a K/V store we suggest using Redis. But if your domain is more complex and requires documents, graphs, etc., then the Multi-Model approach is the best in terms of global performance and complexity.

Q: What is the theoretical/practical limit to the number of classes in ODB?
A: In the current release you can have up to 32,000 data files, so if you have one per class you can have 32,000 classes.

Q: What are the most common use cases for your customers in production now? How do these differ from the use-cases of customers using Neo4j?
A: While Neo4j is "only" a Graph Database, OrientDB can be used for a wider range of use cases, especially when an Operational database is required. By Operational I mean a primary database, while Neo4j in 99% of the cases is used as a secondary database, mostly for analytics with data loaded from an RDBMS (the primary one).

Q: Does the graph have size limits? Please give an example of a BIG graph already deployed in OrientDB.
A: These are the limitations: http://orientdb.com/docs/2.2/Limits.html. You can create up to 302,231,454,903 trillion vertices and edges, which should be enough :-) The biggest installation is for an energy company with 100+ servers.
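
The quoted figure lines up with 2^78. The short check below reproduces the arithmetic; the split into a 15-bit cluster id and a 63-bit record position is an assumption about how the number arises, not something stated in the answer.

```python
CLUSTER_ID_BITS = 15  # assumption: cluster ids fit a signed 16-bit short
POSITION_BITS = 63    # assumption: record positions fit a signed 64-bit long

max_records = 2 ** (CLUSTER_ID_BITS + POSITION_BITS)  # 2^78
trillions = max_records // 10 ** 12

print(f"{max_records:,}")
print(f"{trillions:,} trillion")
```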
