FOSDEM 2012
                                                         Graph Processing Room
                                                         5 Feb 2012




         Cypher Query Language
                              Andrés Taylor and Alistair Jones




Wednesday, February 8, 2012
What is Cypher?


                    • Graph Query Language for Neo4j
                    • Aims to make querying simple

Motivation
       Something new?


                    • Existing Neo4j query mechanisms were not
                          simple enough

                         • Too verbose (Java API)
                         • Too prescriptive (Gremlin)

Motivation
       SQL?


                    • Unable to express paths
                      • these are crucial for graph-based
                              reasoning

                    • Neo4j is schema-free and table-free

Motivation
       SPARQL?


                    • SPARQL was designed for a different data model
                      • namespaces
                      • properties as nodes


Design




Design Decisions
             Declarative
      Imperative                      Declarative
      follow relationship             specify starting point
      breadth-first vs depth-first    specify desired outcome
      explicit algorithm              algorithm adaptable based on query




Design Decisions
             Pattern matching


                                  A


                              B       C


Design Decisions
             ASCII-art patterns




                              () --> ()

Design Decisions
             ASCII-art patterns


                               A      B

                              (A) --> (B)

Design Decisions
             ASCII-art patterns

                                   LOVES
                               A           B

                              A -[:LOVES]-> B



Design Decisions
             ASCII-art patterns


                    A         B   C

                   A --> B --> C

Design Decisions
             ASCII-art patterns

                                  A


                              B       C

                 A --> B --> C, A --> C
                  A --> B --> C <-- A
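To make the triangle pattern concrete, here is a minimal Python sketch of what matching it against a graph could mean. It is not how Neo4j evaluates Cypher: the toy graph `EDGES`, the function name `match_triangle` and the brute-force search are all assumptions made for illustration, and the sketch simplifies by requiring A, B and C to be distinct nodes.

```python
from itertools import permutations

# Hypothetical toy graph: a set of directed (source, target) relationships.
EDGES = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}


def match_triangle(edges):
    """Return every (A, B, C) binding that satisfies A --> B --> C, A --> C."""
    nodes = sorted({node for edge in edges for node in edge})
    return [
        (a, b, c)
        for a, b, c in permutations(nodes, 3)  # try every ordered triple
        if (a, b) in edges and (b, c) in edges and (a, c) in edges
    ]


print(match_triangle(EDGES))  # [('a', 'b', 'c')]
```

The pattern only states which relationships must exist; the order in which candidates are tried is left to the matcher.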
Design Decisions
             Variable length paths

                                      A         B

                                  A                 B

                              A                         B
                                          ...

                              A -[*]-> B
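A variable length pattern can be pictured as "every path from A to B, whatever its length". The Python sketch below illustrates that idea under stated assumptions: `*` is read as "one or more relationships", no relationship is used twice within one path, and the toy graph `GRAPH` and the function `paths` are hypothetical names, not part of Cypher or Neo4j.

```python
# Hypothetical toy graph as an adjacency mapping: node -> list of targets.
# Assumes at most one relationship between any ordered pair of nodes.
GRAPH = {"a": ["x", "b"], "x": ["y"], "y": ["b"], "b": []}


def paths(graph, start, end, used=()):
    """Yield the node list of every path from start to end of length >= 1.

    `used` holds the relationships already traversed on the current path,
    so that no relationship appears twice in a single path.
    """
    for target in graph.get(start, []):
        edge = (start, target)
        if edge in used:
            continue
        if target == end:
            yield [start, target]
        # Keep walking: longer paths may also end at `end`.
        for rest in paths(graph, target, end, used + (edge,)):
            yield [start] + rest


print(sorted(paths(GRAPH, "a", "b"), key=len))
# [['a', 'b'], ['a', 'x', 'y', 'b']]
```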
Design Decisions
             Optional relationships


                               A      B

                              A -[?]-> B
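An optional relationship behaves much like an outer join: A is kept even when no matching B exists. The following Python sketch shows that behaviour on a hypothetical dictionary `RELATIONSHIPS`; the function name `optional_match` and the use of `None` for an unbound B are assumptions of this example.

```python
# Hypothetical data: the nodes each node has an outgoing relationship to.
RELATIONSHIPS = {"alice": ["bob"], "bob": [], "carol": ["alice", "bob"]}


def optional_match(relationships, a):
    """Return (A, B) rows for A -[?]-> B, keeping A when no B exists."""
    targets = relationships.get(a, [])
    if not targets:
        return [(a, None)]  # no relationship found: B stays unbound
    return [(a, b) for b in targets]


print(optional_match(RELATIONSHIPS, "carol"))  # [('carol', 'alice'), ('carol', 'bob')]
print(optional_match(RELATIONSHIPS, "bob"))    # [('bob', None)]
```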

Design Decisions
             Closures

      start london = node(1), moscow = node(2)
      match path = london -[*]-> moscow
      where all(city in nodes(path) where city.capital = true)
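The `all(... where ...)` predicate applies a small function to every node on the path. A rough Python equivalent, using hypothetical data (`CAPITAL`, `PATHS`) and a hypothetical helper `only_capitals`, could look like this:

```python
# Hypothetical node property: whether each city is a capital.
CAPITAL = {"london": True, "paris": True, "lyon": False, "moscow": True}

# Hypothetical candidate paths from london to moscow, as lists of node names.
PATHS = [
    ["london", "paris", "moscow"],
    ["london", "lyon", "moscow"],
]


def only_capitals(path):
    """Mirror all(city in nodes(path) where city.capital = true)."""
    return all(CAPITAL[city] for city in path)


matching = [path for path in PATHS if only_capitals(path)]
print(matching)  # [['london', 'paris', 'moscow']]
```

The path through lyon is rejected because one of its nodes fails the predicate.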




Design Decisions
             Parsed, not an internal DSL



              Execution Semantics      Serialisation

                         Type System   Portability




Design Decisions
             Familiar for SQL users


                 SQL            Cypher
                 select         start
                 from           match
                 where          where
                 group by       return
                 order by




Implementation




Implementation

             Execution Plan


  start n=node(0)
  return n

  Parameters()
  Nodes(n)
  Extract([n])
  ColumnFilter([n])
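One way to read the plan above is as a chain of pipes, each consuming the rows produced by the previous one. The Python sketch below models that reading with generators. The behaviour given to each pipe is guessed from its name on the slide, and `NODES` and `run_plan` are hypothetical; this is an interpretation, not the actual Neo4j implementation.

```python
# Hypothetical in-memory "database": node id -> properties.
NODES = {0: {"name": "root"}, 1: {"name": "other"}}


def parameters():
    """Parameters(): start the pipeline with a single empty row."""
    yield {}


def nodes(rows, identifier, node_id):
    """Nodes(n): bind the identifier to the looked-up node in each row."""
    for row in rows:
        yield {**row, identifier: NODES[node_id]}


def extract(rows, expressions):
    """Extract([...]): add one column per named expression."""
    for row in rows:
        yield {**row, **{col: fn(row) for col, fn in expressions.items()}}


def column_filter(rows, columns):
    """ColumnFilter([...]): keep only the returned columns."""
    for row in rows:
        yield {col: row[col] for col in columns}


def run_plan():
    """Chain the pipes for: start n=node(0) return n."""
    rows = parameters()
    rows = nodes(rows, "n", 0)
    rows = extract(rows, {"n": lambda row: row["n"]})
    rows = column_filter(rows, ["n"])
    return list(rows)


print(run_plan())  # [{'n': {'name': 'root'}}]
```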




Implementation

             Execution Plan
  start n=node(0)
  match n-[*]-> b
  return n.name, n, count(*)
  order by n.age

  Parameters()
  Nodes(n)
  PatternMatch(n-[*]->b)
  Extract([n.name, n])
  EagerAggregation( keys: [n.name, n], aggregates: [count(*)])
  Extract([n.age])
  Sort(n.age ASC)
  ColumnFilter([n.name,n,count(*)])


Implementation

             Execution Plan
  start n=node(0)
  match n-[*]-> b
  return n.name, n, count(*)
  order by n.name

  Parameters()
  Nodes(n)
  PatternMatch(n-[*]->b)
  Extract([n.name, n])
  Sort(n.name ASC,n ASC)
  EagerAggregation( keys: [n.name, n], aggregates: [count(*)])
  ColumnFilter([n.name,n,count(*)])
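In this plan Sort runs before EagerAggregation, whereas in the previous plan it ran after. The slides do not state why. The Python sketch below only illustrates one property that makes the reordering safe: when the sort key is also a grouping key, sorting then aggregating gives the same rows as aggregating then sorting. `ROWS` and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical matched rows: the value of n.name for each match.
ROWS = ["bob", "alice", "bob", "carol", "alice", "bob"]


def aggregate_then_sort(rows):
    """Count per name first, then sort the aggregated rows by name."""
    counts = Counter(rows)
    return sorted(counts.items())


def sort_then_aggregate(rows):
    """Sort by name first, then count runs of equal names in one pass."""
    result = []
    for name in sorted(rows):
        if result and result[-1][0] == name:
            result[-1] = (name, result[-1][1] + 1)  # extend the current run
        else:
            result.append((name, 1))  # a new name starts a new group
    return result


print(aggregate_then_sort(ROWS))  # [('alice', 2), ('bob', 3), ('carol', 1)]
print(sort_then_aggregate(ROWS))  # [('alice', 2), ('bob', 3), ('carol', 1)]
```

Sorting by `n.age`, as in the earlier plan, is a different case: `n.age` is not among the grouping keys there, and that plan extracts it and sorts after aggregating.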


Thanks for Listening!
                                Questions?
   Andrés Taylor andres.taylor@neotechnology.com @andres_taylor
       Alistair Jones alistair.jones@neotechnology.com @apcj



