The Briefing Room
Connecting the Dots: How a Graph Database Enables Discovery
Twitter Tag: #briefr The Briefing Room
Welcome
Host: Eric Kavanagh
eric.kavanagh@bloorgroup.com
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today's innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
JUNE: Database
July: CLOUD
August: HIGH PERFORMANCE ANALYTICS
September: ANALYTICS
Database
SQL
NoSQL
Graph
Object
Grid
NewSQL
Cloud
Document
And lots more…
Analyst: Robin Bloor
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com
Objectivity
• Objectivity develops NoSQL platforms
• Its flagship product is Objectivity/DB, a distributed database management system
• Objectivity also offers InfiniteGraph, a graph database built on top of Objectivity/DB, and GraphMyLife, a mobile application that provides search capabilities across social networks
Leon Guzenda
Leon Guzenda is one of the founding members of Objectivity and one of the original architects of Objectivity/DB. He currently works with Objectivity’s major customers to help them develop and deploy complex applications and systems that use Objectivity/DB. He also liaises with technology partners and industry groups to help ensure that Objectivity/DB remains at the forefront of database and distributed computing technology.
Leon has more than 35 years of experience in the software industry. At Automation Technology Products he managed the development of the ODBMS for the Cimplex solid modeling and numerical control system. Before that, he was Principal Project Director for International Computers Ltd. in the United Kingdom. He was also design and development manager for ICL’s 2900 IDMS product. Leon has a B.S. degree in Electronic Engineering from the University of Wales.
Connecting The Dots
How A Graph Database Enables The Discovery Of Extra Value In An
Existing Enterprise Or Big Data Repository
Leon Guzenda
Bloor Briefing Room
June 25, 2013
• Current Big Data Analytics
• Graph Analytics
• InfiniteGraph
• The Big Data Connection Platform
Objectivity Inc.
• Objectivity, Inc. is headquartered in San Jose, CA.
• Objectivity has over two decades of Big Data and NoSQL experience.
• We develop NoSQL platforms for managing and discovering relationships and patterns in complex data:
  – Objectivity/DB - an object database that manages localized, centralized or distributed databases
  – InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations to find, store and exploit the relationships in their data
  – GraphMyLife - a mobile app that allows users to combine multiple social networks to search, discover and share information
• Millions of deployments - our technology is embedded in hundreds of enterprise and government systems and commercial products
Copyright © Objectivity, Inc. 2013
A Typical Deployment – HUMInt
A Typical “Big Data” Analytics Setup
[Diagram: Data Aggregation and Analytics Applications, running on Commodity Linux Platforms and/or High Performance Computing Clusters, over Structured, Semi-Structured and Unstructured data held in: Graph DB, Object DB, Doc DB, K-V Store, Hadoop, Column Store, Data W/H, RDBMS]
Not Only SQL – A group of 4 primary technologies
[Figure: the technologies placed on a scale running from "Simple" to "Highly Interconnected"]
Graph Analytics
Incremental Analytics Improvements Aren’t Enough
All current solutions use the same basic architectural model
• None of the current solutions has a way to store connections between entities in different silos.
• Most analytic technology focuses on the content of the data nodes, rather than the many kinds of connections between the nodes and the data in those connections.
• Why? Because traditional and earlier NoSQL solutions are bad at handling relationships.
• Graph databases can efficiently store, manage and query the many kinds of relationships hidden in the data.
Graph (Relationship) Analytics...
A SQL Shortcoming
Think about the SQL query for finding all links between the two “blue” rows... it's hard!
[Figure: seven tables, Table_A through Table_G, with two rows highlighted in blue]
There are some kinds of complex relationship handling problems that SQL wasn't designed for.
...Graph Analytics
A SQL Shortcoming
InfiniteGraph - The solution can be found with a few lines of code
[Figure: the same seven tables, Table_A through Table_G, with rows A3 and G4 labeled]
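As a rough illustration of why this kind of question is short to express as a graph search, here is a minimal sketch in plain Python. It is not the InfiniteGraph API. Only the row names A3 and G4 come from the slide; every other row name and every link in `LINKS` is hypothetical, invented for the example. The search itself is an ordinary breadth-first search.

```python
from collections import deque

# Hypothetical links between rows of Table_A..Table_G (illustrative only;
# the slide names just the two end rows, A3 and G4).
LINKS = {
    "A3": ["B1", "C2"],
    "B1": ["D5"],
    "C2": ["D5", "E1"],
    "D5": ["F2"],
    "E1": ["F2"],
    "F2": ["G4"],
    "G4": [],
}


def shortest_path(links, start, goal):
    """Breadth-first search: return one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(LINKS, "A3", "G4")
print(" -> ".join(path))
```

In SQL the same question needs a join (or a recursive query) for every table that might sit between the two rows, and the number of hops has to be anticipated in advance.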
Applications for Graph Analytics
LOGISTICS HEALTHCARE INFORMATICS
MARKET ANALYSIS SOCIAL NETWORK ANALYSIS
Representing the Graph...
The existing intelligence data might look like this:
Events/Places and People/Orgs: Combatant A, Civilian P, Civilian Q, Civilian R, Civilian S, Bank X, Situation X, Situation Y, Target T, Cafe C
Facts:
• A Called P
• A Seen At Y
• A Seen Near X
• A Banks at X
• A Eats At
• P Called Q
• P Called R
• P Emailed S
• Q Seen Near T
• R Seen Near T
• S Seen Near T
• X Paid S
Representing the Graph...
We start by identifying the nodes (Vertices) and the connections (Edges)
NODES: Combatant A, Civilian P, Civilian Q, Civilian R, Civilian S, Bank X, Situation X, Situation Y, Target T, Cafe C
CONNECTIONS:
• A Called P
• A Seen At Y
• A Seen Near X
• A Banks at X
• A Eats At
• P Called Q
• P Called R
• P Emailed S
• Q Seen Near T
• R Seen Near T
• S Seen Near T
• X Paid S
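The step of turning facts into vertices and edges can be sketched in a few lines of plain Python (again, not the InfiniteGraph API). The triples below are the facts from the slide. Two readings are assumptions: that "A Seen Near X" refers to Situation X while "A Banks at X" and "X Paid S" refer to Bank X, and that the truncated "A Eats At" points at Cafe C.

```python
from collections import defaultdict

# Facts from the slide as (source, relationship, target) triples.
# Assumed readings: "Seen Near X" -> Situation X, "Eats At" -> Cafe C.
FACTS = [
    ("Combatant A", "Called", "Civilian P"),
    ("Combatant A", "Seen At", "Situation Y"),
    ("Combatant A", "Seen Near", "Situation X"),
    ("Combatant A", "Banks At", "Bank X"),
    ("Combatant A", "Eats At", "Cafe C"),
    ("Civilian P", "Called", "Civilian Q"),
    ("Civilian P", "Called", "Civilian R"),
    ("Civilian P", "Emailed", "Civilian S"),
    ("Civilian Q", "Seen Near", "Target T"),
    ("Civilian R", "Seen Near", "Target T"),
    ("Civilian S", "Seen Near", "Target T"),
    ("Bank X", "Paid", "Civilian S"),
]


def build_graph(facts):
    """Return (vertices, edges); edges maps source -> [(relationship, target)]."""
    vertices = set()
    edges = defaultdict(list)
    for source, relationship, target in facts:
        vertices.update((source, target))
        edges[source].append((relationship, target))
    return vertices, dict(edges)


vertices, edges = build_graph(FACTS)
print(len(vertices), "vertices,", sum(len(e) for e in edges.values()), "edges")
```

Keeping the relationship label on each edge is the point of the exercise: the connection itself carries data, not just the nodes at either end.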
...Representing the Graph..
[Figure: a VERTEX (“Nodes”) box and an EDGE (“Connections”) box, with the link between them marked 2 and N]
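...Representing the Graph..
[Figure: the facts drawn as a graph of VERTEX (“Nodes”) and EDGE (“Connections”) elements, with these labeled connections]
• Combatant A, Seen Near, Situation X
• Combatant A, Seen At, Situation Y
• Combatant A, Called, Civilian P
• Combatant A, Banks At, Bank X
• Combatant A, Eats At, Cafe C
• Civilian P, Called, Civilian Q
• Civilian P, Called, Civilian R
• Civilian P, Emailed, Civilian S
• Civilian Q, Seen Near, Target T
• Civilian R, Seen Near, Target T
• Civilian S, Seen Near, Target T
• Bank X, Paid, Civilian S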
...Analyzing the Graph...
[Figure: the same graph of Combatant A, Civilians P, Q, R and S, Bank X, Cafe C, Situations X and Y and Target T, joined by the Called, Emailed, Seen At, Seen Near, Banks At, Paid and Eats At connections]
...Threat Analysis
[Figure: the same graph without Cafe C, with groups of nodes marked SUSPECTS and NEEDS PROTECTION]
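One way to approximate this kind of analysis is to enumerate every chain of connections from Combatant A to Target T and look at who sits on those chains. The sketch below is plain Python, not the InfiniteGraph API, and it drops the relationship labels for brevity. It assumes "Seen Near X" means Situation X and "Banks at X" means Bank X. It does not decide who is a suspect; it only lists the intermediaries an analyst might look at next.

```python
# Directed connections taken from the facts on the earlier slides.
EDGES = {
    "Combatant A": ["Civilian P", "Situation X", "Situation Y", "Bank X"],
    "Civilian P": ["Civilian Q", "Civilian R", "Civilian S"],
    "Civilian Q": ["Target T"],
    "Civilian R": ["Target T"],
    "Civilian S": ["Target T"],
    "Bank X": ["Civilian S"],
}


def all_paths(edges, start, goal, path=None):
    """Depth-first enumeration of every simple path from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in edges.get(start, []):
        if nxt not in path:  # never revisit a node already on this path
            found.extend(all_paths(edges, nxt, goal, path))
    return found


paths = all_paths(EDGES, "Combatant A", "Target T")
# Everyone (or everything) that lies strictly between A and T on some path.
on_a_path = sorted({node for p in paths for node in p[1:-1]})

for p in paths:
    print(" -> ".join(p))
print("Intermediaries:", on_a_path)
```

With this data the search finds four chains: three through Civilian P and one through Bank X and Civilian S.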
Graph Databases Can Connect The Dots
DATABASE(S)
GRAPH DATABASE
Visual Analytics
Graphs Can Scale Very Quickly
We often hear about the “trillion row” database. Amazon S3 has reached 2 trillion objects, but one Objectivity site:
• Processes tens of trillions of objects per day
• Supports over 1,000 analysts around the clock
Consider a graph where each node has 10 connections:
• At 6 degrees of separation, finding a path between two nodes may require traversing a million links
• 9 degrees of separation may require a billion traversals
• 12 degrees of separation may require a trillion traversals
• 15 degrees of separation may require a quadrillion traversals...
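The figures above follow from simple exponentiation: with 10 connections per node, an exhaustive search to depth d may touch up to 10^d links. A minimal sketch of the arithmetic (the function name is made up for this example):

```python
FANOUT = 10  # connections per node, as on the slide


def worst_case_traversals(degrees, fanout=FANOUT):
    """Upper bound on links followed when exploring `degrees` hops out."""
    return fanout ** degrees


for degrees in (6, 9, 12, 15):
    print(f"{degrees:>2} degrees: {worst_case_traversals(degrees):,} traversals")
```

This is a worst-case bound. Real graphs revisit nodes and a search can stop early, but the exponential growth is why traversal speed matters more than row count.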
THE BIG DATA CONNECTION PLATFORM
InfiniteGraph - The Enterprise Graph Database
• A high performance distributed database engine that supports analyst-time decision support and actionable intelligence
• Cost effective link analysis – flexible deployment on commodity resources (hardware and OS)
• Efficient, scalable, risk averse technology – enterprise proven
• High speed parallel ingest to load graph data quickly
• Parallel, distributed queries
• Flexible plugin architecture
• Complementary technology
• Fast proof of concept – easy to use Graph API
InfiniteGraph Capabilities
• Parallel Graph Traversal
• Inclusive or Exclusive Selection
• Shortest or All Paths Between Objects
• Computational & Visualization Plug-Ins (for example, Compute Cost To Date and Visualize)
[Figure: small Start/Finish graph diagrams illustrating each capability]
Commonly Used Graph Algorithms...
• Connectedness
• Node degree
• Shortest Path
• Average path length
• Transitive Closure
• Graph diameter (or Span)
• Centrality (Betweenness, Degree and Closeness)
• In the graph below, node D has the highest betweenness centrality
[Figure: example graph]
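To make two of these measures concrete, here is a small sketch in plain Python that computes node degree and betweenness centrality. The slide's example graph is not reproduced in this text, so the six-node graph below is hypothetical, chosen so that node D is the bridge between the other nodes. Betweenness is computed by brute force over all node pairs, which is fine for a toy graph but is not how a production engine would do it.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected graph (not the one on the slide): A-B-D-E-F, with C attached to D.
EDGE_LIST = [("A", "B"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
GRAPH = {}
for a, b in EDGE_LIST:
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)


def bfs(graph, source):
    """Return (distance, number of shortest paths) from source to each node."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]
    return dist, count


def betweenness(graph):
    """For each node, sum over pairs (s, t) the share of shortest s-t paths through it."""
    info = {n: bfs(graph, n) for n in graph}
    score = {n: 0.0 for n in graph}
    for s, t in combinations(graph, 2):
        dist_s, count_s = info[s]
        dist_t, count_t = info[t]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in graph:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:  # v lies on a shortest path
                score[v] += count_s[v] * count_t[v] / count_s[t]
    return score


degree = {n: len(neighbours) for n, neighbours in GRAPH.items()}
score = betweenness(GRAPH)
print("degree:", degree)
print("betweenness:", score)
```

In this toy graph D has both the highest degree (3) and the highest betweenness, because every route between the left-hand and right-hand nodes passes through it.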
A Typical Deployment Supplements Traditional or Big Data Systems With Graph Analytics
[Diagram labels: Data Visualization & Analytics; Conventional & Relationship Analytics; Big Data Solutions (vendor logos including ORACLE, two of them footnoted “*Now HP” and “*Now IBM”) + Big Data Connection Platform]
Online Demo - Call Detail Record Analysis
Used in Law Enforcement, Counter-Terrorism and Customer Resource Management
GraphMyLife™ Demo/Overview
Thank You!
InfiniteGraph – For highly interconnected data that has data in the connections
Please take a look at objectivity.com for InfiniteGraph Online Demos, White Papers, Free Downloads, Samples, Tutorials and the GraphMyLife App
Perceptions & Questions
Analyst: Robin Bloor, The Bloor Group
NoSQL Confusion
As the graph indicates, NoSQL is a very confusing descriptor.
The important question is: WHAT CAN A GIVEN DATABASE ACTUALLY DO?
Database Types
There are 4 types of database (engine):
1. Big Table: No JOIN DBMS
2. Relational Database: Optimized for data stored in sets
3. Document Database: Optimized for data stored in hierarchical structures
4. Graph Database: Optimized for data stored in graphs or directed graphs (networks)
Workload Types
There are 4 types of associated workload:
1. Big Table: Select, Project
2. Relational Database: Select, Project, Join
3. Document Database: Search and Join
4. Graph Database: Graph walking
A Database For All Seasons?
• It is feasible to build an engine that handles all these workloads, but not feasible to have one that handles them all well
• This is because performance is highly dependent on how you physically store the data
• Different workloads mean different physical data storage
• Currently, the unexploited engines are the graph database and the document database
And In The Future?
• We are beginning to see the emergence of the triple store
• Graph databases are suited to being triple stores
• We do not yet know if a new database type will emerge
Questions
• We tend not to think much of graph database applications, partly because we have not had an easy way to search graphs of data.
• In my view we have reached a situation where there will always be multiple “data engines.” Is that Objectivity’s view?
• While we have focused on Objectivity as a Graph Database, what other workloads can it be applied to?
• Which sectors/businesses are currently in Objectivity’s “sweet spot”?
Questions
• Data analytics is currently the motivation for a good deal of database purchases. What kind of data analytics does one carry out on a graph database?
• Have you encountered significant interest in triple stores and semantic searches?
• Which companies/products do you regard as competitors/partners?
Upcoming Topics
July: CLOUD
August: HIGH PERFORMANCE ANALYTICS
September: ANALYTICS
www.insideanalysis.com
Thank You for Your Attention
