OrientDB vs Neo4j
Comparisons (queries and functionality)
Curtis Mosters
@02.12.2014
Content
• Schema
• Indexes
• Comparison
• Query/Speed
• Functionality
• Results
OrientDB vs Neo4j - Comparison
Prototype Comparison
Schema
• Person (ID:INTEGER, name:String)
• Appln (ID:INTEGER, title:String)
• Abstract (ID:INTEGER, abstract:String)
• (Person)-[WROTE]->(Appln)
• (Appln)-[HAS_ABSTRACT]->(Abstract)
Indexes
• Appln.title: LUCENE FULLTEXT
• Appln.ID: SBTREE UNIQUE (in Neo4j the usual INDEX)
• Person.name: LUCENE FULLTEXT
• Person.ID: SBTREE UNIQUE (in Neo4j the usual INDEX)
Queries and systems used
• Comparing the speed of both databases on typical requests
• Linux 64-bit (same instance type on AWS)
• OrientDB v.2.0M2
• Neo4j v.2.1.5
• Speed tests were run in the same order as the slides/rows
• One database per instance, so 2 instances
• Servers were idle, with only OrientDB/Neo4j running
• Queries were run by hand on the command line (not in the studio)
• Queries always return the same results on both databases
• Times are given in milliseconds (ms) unless specified otherwise
• Both databases use the StandardAnalyzer from Lucene
• Cache cleared after each query
System cache notes
• OrientDB always clears its cache when restarted
• Neo4j does not clear its cache
• So the Neo4j timings were in some cases measured with the system cache cleared and in others without
• If there is just one Neo4j column, it is "No system cache cleared"
Comparison (Query/Speed)
Import
OrientDB
• Officially supported methods
  • OrientDB-ETL/JDBC
  • Java API
• Clean Java code
• The ETL tool is performant, but in the latest tests it had issues with edge creation
• Not using multi-threading
• Not using mapping
Neo4j
• Officially supported methods
  • LOAD CSV command
  • Java API
  • Groovy
  • Batch-Importer
  • Talend
• No really "easy" way, but Java is the fastest and most reliable way
• Using multi-threading and mapping
Import of ~300 million lines {APPLNs, TITLEs, PERSONs} with edges and indexes
• OrientDB: 25 hours
• Neo4j: 19 hours
Startup/Shutdown speed
OrientDB
• Nearly always takes the same time to start or shut down the server
• 2-10 sec
Neo4j
• Times vary when starting, and especially when shutting down the server while a task is still running
• 3 sec to 3 min (no info given)
Good for testing and later reliability
Query #1
Checking single ID lookup
• OrientDB: SELECT FROM Appln WHERE ID=?
• Neo4j: MATCH (a:Appln) WHERE a.ID=? RETURN a
Times in ms:
ID (?) | OrientDB | Neo4j (no system cache cleared) | Neo4j (system cache cleared)
1412 | 27 | 71 | 939
763773 | 9 | 30 | 44
234526 | 15 | 26 | 43
858584 | 10 | 25 | 44
536367 | 11 | 25 | 43
2323 | 17 | 18 | 31
5267 | 1 | 15 | 24
73573 | 14 | 29 | 35
585985 | 10 | 25 | 34
797977 | 10 | 26 | 35
Average | 12.4 (faster in 10 of 10) | 29 (faster in 0 of 10) |
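The averages and the "n of 10" counts in these tables can be re-derived from the per-row timings. The following is a minimal sketch in Python, assuming that "n of 10" counts the rows in which that database was faster; the function name `summarize` is hypothetical and not part of the original slides.

```python
# Timings in ms copied from the Query #1 table
# (Neo4j column: "No system cache cleared").
orientdb_ms = [27, 9, 15, 10, 11, 17, 1, 14, 10, 10]
neo4j_ms = [71, 30, 26, 25, 25, 18, 15, 29, 25, 26]


def summarize(a, b):
    """Return (mean of a, mean of b, rows where a is faster, rows where b is faster).

    A tie counts for neither side.
    """
    wins_a = sum(1 for x, y in zip(a, b) if x < y)
    wins_b = sum(1 for x, y in zip(a, b) if y < x)
    return sum(a) / len(a), sum(b) / len(b), wins_a, wins_b


mean_o, mean_n, wins_o, wins_n = summarize(orientdb_ms, neo4j_ms)
print(f"OrientDB: {mean_o:.1f} ms ({wins_o} of {len(orientdb_ms)})")
print(f"Neo4j:    {mean_n:.1f} ms ({wins_n} of {len(neo4j_ms)})")
```

For the Query #1 data this reproduces the averages of 12.4 ms and 29 ms and the 10-to-0 split shown in the table.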
Query #2
Checking fulltext Lucene lookup
• OrientDB: SELECT FROM (SELECT title,ID FROM Appln WHERE title LUCENE "?" ORDER BY ID) LIMIT 10
• Neo4j: START n=node:titles('title:?') RETURN n.title,n.ID ORDER BY n.ID LIMIT 10
Note on Neo4j: each additional word needs its own property statement, e.g. instead of 'title:super efficient' we use 'title:super OR title:efficient'
Times in ms:
Term (?) | OrientDB | Neo4j (no system cache cleared) | Neo4j (system cache cleared)
solar | 10172 | 801 | 137088
panel | 263698 | 121494 | 161215
druck | 25582 | 9679 | 11290
machine | 1146339 | 297645 | 357818
cell | 253565 | 55397 | 26298
automatic vehicle | 961054 | 131772 | 163794
super efficient | 53380 | 8432 | 8707
motor | 398803 | 79527 | 46687
airplane | 14066 | 892 | 390
windshield | 8969 | 1004 | 536
Average | 313 sec (5.2 min) (faster in 0 of 10) | 70 sec (faster in 10 of 10) |
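The rewriting described in the note on Neo4j (one property statement per word, joined with OR) can be sketched as a small helper. This is an illustration only: the function name `lucene_or_query` is hypothetical, and the sketch does not escape Lucene special characters in the search term.

```python
def lucene_or_query(field, term):
    """Build 'field:w1 OR field:w2 ...' from a whitespace-separated search term."""
    words = term.split()
    if not words:
        raise ValueError("search term must contain at least one word")
    # One 'field:word' statement per word, joined with OR.
    return " OR ".join(f"{field}:{word}" for word in words)


print(lucene_or_query("title", "super efficient"))
# prints: title:super OR title:efficient
```

A single-word term such as "solar" passes through unchanged as 'title:solar', so the same helper could be used for every row of the table.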
Query #3.1
Checking fulltext Lucene lookup, overall count on 1 index
• OrientDB: SELECT $totalHits FROM Appln WHERE title LUCENE "?" LIMIT 1
• Neo4j: START n=node:titles("title:?") RETURN count(*)
Note on Neo4j: each additional word needs its own property statement, e.g. instead of 'title:super efficient' we use 'title:super OR title:efficient'
Times in ms:
Term (?) | OrientDB | Neo4j
solar | 4611 | 215263
panel | 3318 | 77442
druck | 2890 | 12503
machine | 1846 | 198479
cell | 2351 | 34685
automatic vehicle | 1063 | 49283
super efficient | 984 | 4054
motor | 465 | 47085
airplane | 1172 | 429
windshield | 62 | 585
Faster in | 9 of 10 | 1 of 10
Query #3.2
Checking fulltext Lucene lookup, overall count on 2 indices
• OrientDB: SELECT $totalHits FROM Appln WHERE [title,abstract] LUCENE "?" LIMIT 1
• Neo4j: START n=node:titles('title:?') MATCH (n)-[:HAS_ABSTRACT]->(a) WHERE a.abstract =~ ".*?.*" RETURN count(*)
Note on Neo4j: each additional word needs its own property statement, e.g. instead of 'title:super efficient' we use 'title:super OR title:efficient'
Times in ms (only one value recorded):
solar 227234
No values recorded for panel, druck, machine, cell, automatic vehicle, super efficient, motor, airplane, windshield, or the average.
Query #4
Internal ID function node lookup
• OrientDB: SELECT title FROM #11:? / SELECT name FROM #12:?
• Neo4j: START n=node(?) RETURN n.title / START n=node(?) RETURN n.name
Times in ms:
OrientDB ? | Neo4j ? | OrientDB | Neo4j (no system cache cleared) | Neo4j (system cache cleared)
11:0 | 0 | 1 | 10 | 816
11:141 | 141 | 1 | 13 | 27
11:26526 | 26526 | 3 | 13 | 28
11:2526 | 2526 | 2 | 12 | 27
11:6262 | 6262 | 1 | 12 | 28
12:0 | 76594275 | 1 | 11 | 25
12:515 | 76594790 | 2 | 14 | 23
12:4115 | 76598390 | 3 | 14 | 25
12:52627 | 76646902 | 2 | 13 | 26
12:47484 | 76641759 | 1 | 13 | 25
Average | | 2 (faster in 10 of 10) | 13 (faster in 0 of 10) |
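The two "?" columns of Query #4 pair each OrientDB record ID with the Neo4j node ID of the same record. In this data set the pairing follows a fixed offset per cluster, which the sketch below reproduces. The offsets are read off the table rows (11:0 pairs with 0, 12:0 pairs with 76594275) and apply only to this particular import; the names `NEO4J_OFFSET` and `rid_to_neo4j_id` are hypothetical.

```python
# Assumption: offsets inferred from the Query #4 table, valid only for this import.
# Cluster 11 holds Appln records, cluster 12 holds Person records.
NEO4J_OFFSET = {11: 0, 12: 76594275}


def rid_to_neo4j_id(rid):
    """Map an OrientDB record ID such as '12:515' or '#12:515' to the Neo4j node ID."""
    cluster, position = (int(part) for part in rid.lstrip("#").split(":"))
    return NEO4J_OFFSET[cluster] + position


print(rid_to_neo4j_id("12:515"))    # prints: 76594790
print(rid_to_neo4j_id("11:26526"))  # prints: 26526
```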
Query #5
Count Applns of a specific Person
• OrientDB: SELECT out(WROTE).size() FROM #?
• Neo4j: START p=node(?) MATCH (p)-[:WROTE]->(a) RETURN count(*)
Times in ms:
OrientDB ? | Neo4j ? | OrientDB | Neo4j (no system cache cleared) | Neo4j (system cache cleared)
12:0 | 76594275 | 8 | 81 | 980
12:1 | 76594276 | 1 | 18 | 42
12:2 | 76594277 | 1 | 20 | 41
12:3 | 76594278 | 1 | 18 | 38
12:4 | 76594279 | 1 | 17 | 39
12:5 | 76594280 | 1 | 23 | 41
12:6 | 76594281 | 1 | 21 | 37
12:7 | 76594282 | 1 | 17 | 43
12:8 | 76594283 | 1 | 18 | 45
12:9 | 76594284 | 1 | 17 | 41
Average | | 1 (faster in 10 of 10) | 25 (faster in 0 of 10) |
Query #6
Searching for 3 Applns of one specific Person
• OrientDB: select out.@class as sourceClass, out.@rid as source, out.name as sourceName, in.@class as targetClass, in.@rid as target, in.ID as targetID, in.nrEpodoc as targetName from (select expand(outE('WROTE')) from #?) order by targetID ASC limit 3
• Neo4j: START p=node(?) MATCH (p)-[:WROTE]->(a) RETURN labels(p) as sourceClass, id(p) as source, p.name as sourceName, labels(a) as targetClass, id(a) as target, a.nrEpodoc as targetName ORDER BY a.ID ASC LIMIT 3
Times in ms:
OrientDB ? | Neo4j ? | OrientDB | Neo4j (no system cache cleared) | Neo4j (system cache cleared)
12:0 | 76594275 | 1051 | 107 | 212
12:1 | 76594276 | 3 | 39 | 77
12:2 | 76594277 | 2 | 40 | 68
12:3 | 76594278 | 2 | 38 | 60
12:4 | 76594279 | 3 | 41 | 58
12:5 | 76594280 | 53 | 59 | 55
12:6 | 76594281 | 56 | 53 | 59
12:7 | 76594282 | 7 | 38 | 56
12:8 | 76594283 | 5 | 38 | 62
12:9 | 76594284 | 2 | 33 | 66
Average | | 118 (faster in 8 of 10) | 49 (faster in 2 of 10) |
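In Query #6 OrientDB is faster in 8 of 10 rows but has the worse average, because the first row (1051 ms) dominates the mean. Comparing mean and median makes this visible. The sketch below uses Python's standard `statistics` module; using the median here is an added illustration, not something the original measurements report.

```python
from statistics import mean, median

# Timings in ms copied from the Query #6 table
# (Neo4j column: "No system cache cleared").
orientdb_ms = [1051, 3, 2, 2, 3, 53, 56, 7, 5, 2]
neo4j_ms = [107, 39, 40, 38, 41, 59, 53, 38, 38, 33]

for name, times in (("OrientDB", orientdb_ms), ("Neo4j", neo4j_ms)):
    # The mean is sensitive to a single slow run; the median is not.
    print(f"{name}: mean {mean(times):.1f} ms, median {median(times):.1f} ms")
# prints:
# OrientDB: mean 118.4 ms, median 4.0 ms
# Neo4j: mean 48.6 ms, median 39.5 ms
```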
Query #7
Searching for Appln.title and Appln.abstract, return Person.name matching both
• OrientDB: SELECT FROM (SELECT title,abstract,ID from Appln where [title,abstract] LUCENE "?" ORDER BY ID) LIMIT 3
• Neo4j: START p=node:titles('title:?') MATCH (p)-[:HAS_ABSTRACT]->(a) WHERE a.abstract =~ ".*?.*" RETURN p.title,a.abstract,a.ID ORDER BY a.ID LIMIT 3
Times in ms:
Title (?) | OrientDB | Neo4j
panel | 1733261 | 424789
Average | |
Query #7
Searching a Person.name + searching on Appln.title for Appln of that specific Person, return Person.name matching both
• Neo4j: START p=node:people('name:?') MATCH (p)-[:WROTE]->(a) WHERE a.title =~ ".*?.*" RETURN p.name,a.title,a.ID ORDER BY a.ID LIMIT 3
Times in ms (only one value recorded):
Title (?)
machine 99538
Average: not recorded
Query #8
Searching for an Abstract of an Appln
• OrientDB: select @rid, abstract, ID as titleID, in(HAS_ABSTRACT).title as title, in(HAS_ABSTRACT).ID as AbstrID from Abstract where abstract LUCENE "method" LIMIT 3
• Neo4j: START n=node:abstracts("abstract:method") WITH n limit 3 MATCH (x:Appln)-[:HAS_ABSTRACT]->(n) RETURN n.ID,x.ID
Note on Neo4j: each additional word needs its own property statement, e.g. instead of 'title:super efficient' we use 'title:super OR title:efficient'
No times recorded for solar, panel, druck, machine, cell, automatic vehicle, super efficient, motor, airplane, windshield, or the average.
Query #9
Counting the Applns of Person.names containing a specific name
• OrientDB: SELECT sum(out(WROTE).size()) FROM Person WHERE name LUCENE "?" LIMIT -1
• Neo4j: START p=node:people('name:?') MATCH (p)-[:WROTE]->(a) RETURN count(a)
Times in ms:
Name (?) | OrientDB | Neo4j
bosch | 7475 | 3771
intel | 13261 | 7461
siemens | 19302 | 16297
audi | 3888 | 1844
volkswagen | 2872 | 1298
toyota | 23223 | 13561
sony | 16520 | 11449
panasonic | 6314 | 2287
microsoft | 2849 | 1313
apple | 3127 | 1088
Faster in | 0 of 10 | 10 of 10
Comparison (Functionality)
Database Overview
OrientDB
• Schema, naming policies, overall record counts, cluster info and much more
• Whole page loads in 0.1 sec
Neo4j
• No schema info except naming policies
• Counting the nodes of a single label takes ~10 min
• Neo4j's supported way to get info on all labels in one query just gives a heap error (maybe too much data?)
Easy and fast way to check the state of the database
Graph Explorer
OrientDB
• Good overview, straightforward and fast
• Nodes can be edited, edges added
• Behaves like a never-ending graph
Neo4j
• Shows nodes/edges and, when they are clicked, some info about them
• No other features, not even zooming or dragging all elements
Good for checking graph issues as close as possible to the database
v.2 only!
Result view
OrientDB
• Great overview; paging is possible, which reduces display and query time
• If you forget to set a "LIMIT", it is set for you!
• Uses the new GraphTab for visual things (v.2!)
Neo4j
• Graph and Table view
• Forgot to set a LIMIT? Go smoking
• The graph view can only show up to 10 nodes
• The table view scrolls endlessly
Getting an overview is quite important to check specific query issues
Function integration
OrientDB
• Good overview and management
• Integrated in the Studio
• No restart needed
• Functions can even be copied to another db
Neo4j
• Server plugins [1]
• Need to be written in Java and must extend the ServerPlugin class
• No overview
• Not fail-safe
• No easy change/access
• Requires a server restart
• Many lines for simple things
Needed for exchanging information with the prototype
Query style
OrientDB
• Simple queries are really short
• Queries are hard to write once they get complex
• Poor overview, and using variable names is not intuitive
Neo4j
• Simple queries are really long due to the required Cypher statements
• Complex queries are also easy to write
• Using variable names is very intuitive and always keeps the overview
Useful for result checking and testing
Lucene Index
OrientDB
• Still a "new" add-on
• Before v.2 a plugin was needed
• With v.2 it is integrated in OrientDB
• Used as if you were setting a usual index
• The index can easily be changed at any time
• The analyzer can easily be changed
Neo4j
• Neo4j does not always use Lucene as indexer
• Needs to be set before importing data
• Works via the node_auto_index configuration
• Changing the index, or setting the index to Lucene, after the import is not viable because of the time it takes
• The analyzer is not easy to change
Important for the full-text search that the new graph tab builds on
Security
OrientDB
• Different security levels (like in MySQL)
Neo4j
• None
Good for integrating more databases and setting access levels
Disc usage
OrientDB
• Db size = 120 GB
• Classes are stored in different files
• Classes can also easily be removed by deleting their files externally
Neo4j
• Db size = 40 GB
• Nodes, properties and relations are stored in separate files
• Specific data can only be deleted with Neo4j commands
Good for testing and later reliability
Future Perspective
OrientDB
• OrientDB is still "new" on the market, with many features still to come
• Still a lot of room for improvement
• Offers the possibility of replacing MySQL
Neo4j
• Neo4j is the "oldest" graph database and has nearly every feature
• Algorithms are already optimized as far as possible
• No possibility of replacing a current system, just an extension for using graphs
To see ahead of the current state
Costs
OrientDB
• Good support available for free
• Commercial support much cheaper than Neo4j's
• Enterprise version available with good monitoring features
Neo4j
• Commercial support needed to set up a well-defined database
• Features like clustering are only available when paying (e.g. important for our where clause)
Important for startups
Support / Production speed / Own ideas
OrientDB
• Good support via
  • E-Mail
  • Google Group (anyone from the team helps)
  • Gitter
  • Github
• A new release every 2-3 weeks
• Own issues answered in 1-2 days
• Own ideas are discussed; 30-40 comments per day on Github
Neo4j
• Poor support for the most popular graph db
• Google Group is only a semi-active community
• Just one member from Neo4j helps there
• A new release every 1-2 months
• Own issues answered in ~1 week
• Own ideas are mainly ignored; 20-30 comments per day on Github
Important for solving later issues
Results (Speed)
Measure | OrientDB | Neo4j
Import | no use of MT/mapping | full use of MT/mapping
Startup/Shutdown speed | x | -
Query #1 Checking single ID lookup | x | -
Query #2 Checking fulltext Lucene lookup | - | x
Query #3.1 Checking fulltext Lucene lookup, overall count on 1 index | x | -
Query #3.2 Checking fulltext Lucene lookup, overall count on 2 indices | - | -
Query #4 Internal ID function node lookup | x | -
Query #5 Count Applns of a specific Person | x | -
Query #6 Searching for 3 Applns of one specific Person | a single outlier causes the poor average | always about the same speed
Query #7 Searching a Person.name + searching on Appln.title for Appln | - | -
Query #8 Searching for an Abstract of an Appln | - | -
Query #9 Counting the Applns of Person.names containing a specific name | - | x
Results | 4 | 3
Results (Misc)
Measure OrientDB Neo4j
Database Overview x
Graph Explorer x
Result View x
Function Integration x
Query style x
Lucene Index x
Security x
Disc Usage: every class in a single file (OrientDB), using less disk space (Neo4j)
Future Perspective x
Costs x
Support / Production speed / Own ideas x
Results: OrientDB 9, Neo4j 1
Results
• OrientDB is working on fixing the very slow queries
• OrientDB sometimes has inconsistent query speed (very fast and very slow)
• OrientDB Studio is on a whole other level
• Neo4j Studio is nearly useless compared to OrientDB's
Supporters
• I want to give special thanks to Michael Hunger; without him the Neo4j import would still have trouble
• I also want to thank Enrico Risa for his help and fast implementation of Lucene improvements
• Keep up the great work!
Links
• [1] http://docs.neo4j.org/chunked/stable/server-plugins.html
• [2] http://docs.neo4j.org/refcard/2.0/
