NoSQL for SQL Server Developers using Couchbase
TriNUG Data SIG
3/7/2018
Who is this guy?
• Brant Burnett - @btburnett3
• Systems Architect at CenterEdge Software
• .NET since 1.0, SQL Server since 7.0
• MCSD, MCDBA
• Experience from desktop apps to large-scale cloud services
NoSQL Credentials
• Couchbase user since 2012 (v1.8)
• Couchbase Expert
• Open source contributions:
• Couchbase .NET SDK
• Couchbase.Extensions for .NET Core
• Couchbase LINQ provider (Linq2Couchbase)
• CouchbaseFakeIt
• couchbase-index-manager
Content Attributions
• Matthew Groves, Developer Advocate at Couchbase (@mgroves on Twitter)
• Raju Suravarjjala, Couchbase R&D
• Keshav Murthy, Couchbase R&D
Agenda
What’s NoSQL?
Why NoSQL?
Some popular NoSQL options
SQL Server to Couchbase Mindmap
Live Demo: ASP.NET Core Microservice
Querying and Indexing Couchbase
Let’s get some questions… first???
What’s NoSQL?
It’s just the opposite of SQL, right?
Document
• Couchbase
• MongoDB
• DynamoDB
• CosmosDB
Graph
• OrientDB
• Neo4j
• GraphBase
• CosmosDB
Key-Value
• Couchbase
• Riak
• BerkeleyDB
• Redis
• CosmosDB
Wide Column
• HBase
• Cassandra
• Hypertable
• CosmosDB
NoSQL Document Databases
• Get and set documents by key
• Imagine a giant folder full of JSON files
• If you know the filename, you can get or update the content
• Often offer map/reduce or query functionality
Why NoSQL?
SQL Server can do it all, right?
Why NoSQL?
Scalability
• Horizontal scaling
• Scaling SQL horizontally is difficult
• NoSQL is designed for horizontal scaling
• Couchbase shards horizontally, automatically and transparently
• Scaling vertically has limits
• Scaling SQL usually has downtime
• Big Data!
Why NoSQL?
Availability
• Multi-node design protects against failure
• COUCH = Cluster of Unreliable Commodity Hardware
• Zero-downtime upgrades
• Zero-downtime scaling
Why NoSQL?
Performance
• Designed for very high throughput
• Multi-node architecture shares the load
• Can be more cost effective than read replicas
• Couchbase has a memory-first architecture
• No more separate memory cache!
Why NoSQL? Agility
• Most NoSQL systems are “schema-less”
• Schema changes can slow the team
• Schema changes can cause downtime for SQL
• Updating large tables can lock the whole table until complete
• Many modern architectures don’t really need database-enforced schema anymore
Okay, so why SQL?
• Small scale
• On-premise, single server
• ACID transactions are required
• a.k.a. two-phase commit
• Transaction logs
• High-consistency DB backups
• Keeping it old school
Which NoSQL system is right for me?
I prefer Couchbase, but here’s some (hopefully) unbiased info
Popular NoSQL Choices
Selection criteria:
• Document database
• Runs on Microsoft / Microsoft-friendly
• Popular – db-engines.com
Disclaimer
• I am primarily a Couchbase user
• Please let me know if I’ve made any mistakes
• Most of the information here comes from Matthew Groves (@mgroves)
Evaluation Criteria
Querying
• Quality of query support
• Ease of query language
Scaling
• Ease of scaling
• Cost effectiveness of scaling
Usability
• Ease of installation
• Ease of maintenance
• Ease of development
Speed
• How fast is it?
Deployment
• OS choices
• Cloud provider choices
Support
• Support options
• Licensing
• Community
• Longevity
MongoDB: Querying
• Proprietary JSON-based query
• Text search
• .NET and .NET Core SDK
• LINQ provider in .NET
MongoDB: Scaling
• Multiple node types
• Sharding options
• Master-slave / Primary-secondary
MongoDB: Usability
• MongoDB Compass
• Scaling / Replication decisions
MongoDB: Speed
• Indexing for queries
• In-memory options
MongoDB: Deployment
• Windows
• MacOS
• Linux
• Azure
• AWS
• Docker
MongoDB: Support
• MongoDB, Inc.
• Open-source (AGPL & Apache)
• Huge community / popularity
Couchbase: Querying
• N1QL (SQL for JSON)
• MapReduce
• Full Text Search (FTS)
• .NET and .NET Core SDK
• Linq2Couchbase
Couchbase: Scaling
• Single node type
• Multi-master / "masterless"
• Auto-sharding
• Replication
• XDCR
Couchbase: Usability
• Built in web console
• .NET SDK idioms
• N1QL is just SQL
• Scaling
• mongoose and ottoman
Couchbase: Speed
• Memory-first architecture
• Indexing for N1QL
• YCSB benchmarks
• Memory-optimized indexes
Couchbase: Deployment
• Windows
• MacOS
• Linux
• Azure
• AWS
• Docker
Couchbase: Support
• Couchbase, Inc.
• Open-source (Apache 2)
• Active community
CouchDB: Querying
• REST focused
• MapReduce views
• Mango query language (2.x)
CouchDB: Scaling
• Single node type
• Multi-master
• HAProxy
• Sharding configuration
• Replication
CouchDB: Usability
• Built in Futon / Fauxton
• Scaling / Replication work
• Cluster Setup Wizard (2.x)
• There is no "official" .NET SDK
• .NET SDKs are REST wrappers
CouchDB: Speed
• Indexing for Mango queries
• MapReduce
• Caching is external
CouchDB: Deployment
• Windows
• MacOS
• Linux
• Azure
• AWS
• Docker
CouchDB: Support
• Open-source (Apache)
• Cloudant (IBM)
CosmosDB: Querying
• SQL (limited)
• Stored Procedures (JS)
• Triggers (JS)
• UDFs (JS)
• .NET and .NET Core SDK
CosmosDB: Scaling
• Handled by Azure
• Geographic distribution / affinity
• Consistency options
• Guaranteed zero data loss during failovers
CosmosDB: Usability
• Azure doing work for you
• .NET and .NET Core SDK
• Local emulator
• Transactions (with sprocs)
• Mongo compatible API
CosmosDB: Speed
• Guaranteed ~10ms latency reads
• Guaranteed ~15ms latency writes
CosmosDB: Deployment
• Windows (emulator)
• Azure
CosmosDB: Support
• Microsoft
• Popularity is increasing, so more community growth should be expected
So which do I choose?
I don't have your answer. I just have more questions.
• Mobile?
• Querying?
• Cost / licensing?
• Ops / DevOps?
• Hobby / Side?
• Resume?
• Speed?
• Transactions?
• Security?
• Integrations?
SQL Server to Couchbase Mindmap
SQL is English for Relational Database
SQL was invented by Don Chamberlin & Raymond Boyce at IBM
N1QL is English for JSON
N1QL was invented by Gerald Sangudi at Couchbase
SQL → N1QL
• Instance → Cluster
• Database → Bucket
• Table → Bucket, Keyspace
• Row → Document
• Column → Attribute
• Index → Index
• Datatypes → JSON Datatypes
SQL Input and Output: Set(s) of Tuples
N1QL Input and Output: Set(s) of JSON
N1QL STMT (listed in the same order as SQL STMT)
CREATE BUCKET
None (no CREATE TABLE equivalent)
CREATE INDEX
None (no ALTER TABLE equivalent)
SELECT
INSERT
UPDATE
DELETE
MERGE
Subqueries
JOIN
GROUP BY
ORDER BY
OFFSET, LIMIT
EXPLAIN
PREPARE
EXECUTE
GRANT ROLE
REVOKE ROLE
FLUSH
SQL Model
Set of Tuples
N1QL Model
Set of JSON
N1QL Tooling
Web Console
Monitoring
Profiling
Dev workbench
SDK
Simba, CData
SQL Tooling
ODBC, JDBC, .NET
Hibernate, Entity Framework
N1QL Resources
http://query.couchbase.com
SQL Indexes
Primary Key
Secondary Key
Composite
Range Partitioned
Expression (Functional)
Spatial
Search
N1QL Indexes
Primary
Secondary
Composite
Partial
Expression (Functional)
Array Index
Replica (HA)
Adaptive
Spatial
SQL Logic
3 valued logic
TRUE, FALSE, NULL
N1QL Logic
4 valued logic
TRUE, FALSE, NULL, MISSING
SQL Transactions
ACID
Multi-Statement
Savepoints
Commit/Rollback
N1QL Transactions
Single Document atomicity
SQL Datatypes
Numeric
Decimal
Character
Date/Time
BLOB
Spatial
JSON
N1QL Datatypes
Numeric
Boolean
Character
Array
Object
Null
Conversion Functions
JSON
SQL Optimizer
Rule Based
Cost Based
Index Selection
Query Rewrites
NL, Hash, Merge join
N1QL Optimizer
Rule based
Index Selection
NL Join
SQL ACID
Atomic
Consistent
Isolated
Durable
N1QL BASE
Single doc Atomic
Consistent Data*
Optimistic Concurrency
Tunable Durability
N1QL Index Scan Consistency*
NOT_BOUNDED
AT_PLUS
REQUEST_PLUS
SQL Engine (SMP, Scale UP)
N1QL Engine (MPP, Cluster Scale OUT)
Additional SQL Features
Triggers
Stored Procedures
XML
Constraints
RAC
SQL STMT
CREATE DATABASE
CREATE TABLE
CREATE INDEX
ALTER TABLE
SELECT
INSERT
UPDATE
DELETE
MERGE
Subqueries
JOIN
GROUP BY
ORDER BY
OFFSET, FETCH, TOP
EXPLAIN
sp_prepare
sp_execute
GRANT
REVOKE
TRUNCATE
Couchbase Architecture
[Diagram: five Couchbase Server nodes, each with a Cluster Manager, a Managed Cache, and Storage. Server 1 runs the Data Service, Server 2 the Query Service, Server 3 the Index Service, Server 4 the Search Service, and Server 5 the Analytics Service*. Shards (such as 5, 7, and 9) are drawn on the nodes, and SDK clients connect to the cluster.]
Architecture (SQL Server vs. Couchbase)
• System Architecture
  SQL Server: SMP; Always-On readable secondary replicas
  Couchbase: MPP; MDS (Multi Dimensional Scaling); Data Service: MPP, Hash Partitioning; Indexing, FTS: MDS Scale-out; Query: MDS Scale-out
• Query
  SQL Server: SQL, search, JSON extensions
  Couchbase: N1QL, Key-value, Full text search, Analytics (preview)
• High Availability
  SQL Server: Always-On, Log Shipping, DB mirroring (deprecated)
  Couchbase: Built-in intranode replication (up to 3 copies); built-in XDCR (cross data center replication)
• Transactions
  SQL Server: ACID, Multi-statement
  Couchbase: Single document atomicity; Data Service consistency, Index eventual consistency; optimistic locking (CAS); additional confirmation for durability
• Drivers
  SQL Server: JDBC, ODBC, .NET, LINQ
  Couchbase: Couchbase SDK (Java, .NET, LINQ, PHP, Python, Go), Simba JDBC/ODBC
• Data Model
  SQL Server: Normalized, Denormalized
  Couchbase: Denormalized JSON model
Database Objects (SQL Server → Couchbase)
• Database → Bucket
• Table → Bucket, Keyspace
• Row → Document
• Column → Field/Attribute
• Partition (manual) → Partition (hash, automatic)
Data Types (SQL Server vs. Couchbase JSON)
• Numbers
  SQL Server: int, bigint, smallint, tinyint, float, real, decimal, numeric, money, smallmoney
  Couchbase: JSON Number, e.g. { "id": 5, "balance": 2942.59 }
• String
  SQL Server: char, varchar, nchar, nvarchar, text, ntext
  Couchbase: JSON String, e.g. { "name": "Joe", "city": "Morrisville" }
• Boolean
  SQL Server: bit
  Couchbase: JSON Boolean, e.g. { "premium": true, "pending": false }
• Date/Time
  SQL Server: datetime, smalldatetime, datetime2, date, time, datetimeoffset
  Couchbase: JSON ISO 8601 string with extract, convert and arithmetic functions, e.g. { "soldat": "2017-10-12T13:47:41.068-07:00" }
• Spatial data
  SQL Server: geometry, geography
  Couchbase: Supports nearest neighbor and spatial distance, e.g. "geometry": {"type": "Point", "coordinates": [-104.99404, 39.75621]}
• MISSING
  SQL Server: Not applicable, fixed schema
  Couchbase: MISSING, e.g. { }
• NULL
  SQL Server: NULL
  Couchbase: JSON Null, e.g. { "last_address": null }
• Objects
  Couchbase: Flexible JSON Objects, e.g. { "address": {"street": "1 Main Street", "city": "Morrisville", "zip": "94824"} }
• Arrays
  Couchbase: Flexible JSON Arrays, e.g. { "hobbies": ["tennis", "skiing", "lego"] }
Understanding MISSING (Couchbase)
• MISSING: the value of a field that is absent from the JSON document or literal
  {"name": "joe"}: everything but the field "name" is missing from the document
• IS MISSING: returns true if the document does not have a status field
  SELECT * FROM CUSTOMER WHERE status IS MISSING;
• IS NOT MISSING: returns true if the document has a status field (even if null)
  SELECT * FROM CUSTOMER WHERE status IS NOT MISSING;
• MISSING vs NULL: MISSING is a known missing quantity; NULL is a known UNKNOWN
  Valid JSON: {"status": null}
• MISSING value: make a field of any type disappear by setting it to MISSING
  UPDATE CUSTOMER SET status = MISSING WHERE cxid = "xyz232"
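From the .NET side, the same MISSING versus NULL distinction shows up when you inspect raw JSON. Here is a minimal sketch, assuming System.Text.Json is available (the POCOs in this deck use Newtonsoft.Json attributes instead). `FieldState` and `MissingVsNull.Classify` are hypothetical names for illustration, not part of any Couchbase SDK.

```csharp
using System;
using System.Text.Json;

// Hypothetical names for illustration; not part of any Couchbase SDK.
public enum FieldState { Missing, Null, Present }

public static class MissingVsNull
{
    // Classifies a top-level field of a JSON object the way N1QL
    // distinguishes MISSING (no such field) from NULL (field set to null).
    public static FieldState Classify(string json, string field)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        if (!doc.RootElement.TryGetProperty(field, out JsonElement value))
        {
            return FieldState.Missing;
        }

        return value.ValueKind == JsonValueKind.Null
            ? FieldState.Null
            : FieldState.Present;
    }
}
```

Calling `Classify` on `{"name": "joe"}` for the field `status` reports Missing, while `{"status": null}` reports Null.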
Couchbase 4-valued Boolean Logic
A        B        A OR B   A AND B
TRUE     NULL     TRUE     NULL
FALSE    NULL     NULL     FALSE
TRUE     MISSING  TRUE     MISSING
FALSE    MISSING  MISSING  FALSE
NULL     MISSING  NULL     MISSING
NULL     NULL     NULL     NULL
MISSING  MISSING  MISSING  MISSING
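The four-valued rules can be written down as two small functions. This is a sketch, assuming the usual rule that FALSE dominates AND and TRUE dominates OR, with MISSING winning over NULL in AND and NULL winning over MISSING in OR. `Logic4` and `FourValuedLogic` are hypothetical names, not SDK types.

```csharp
using System;

// Hypothetical enum for illustration; the member order carries no meaning.
public enum Logic4 { False, True, Null, Missing }

public static class FourValuedLogic
{
    public static Logic4 And(Logic4 a, Logic4 b)
    {
        // FALSE dominates AND regardless of the other operand.
        if (a == Logic4.False || b == Logic4.False) return Logic4.False;
        // Otherwise MISSING wins over NULL, and NULL wins over TRUE.
        if (a == Logic4.Missing || b == Logic4.Missing) return Logic4.Missing;
        if (a == Logic4.Null || b == Logic4.Null) return Logic4.Null;
        return Logic4.True;
    }

    public static Logic4 Or(Logic4 a, Logic4 b)
    {
        // TRUE dominates OR regardless of the other operand.
        if (a == Logic4.True || b == Logic4.True) return Logic4.True;
        // Otherwise NULL wins over MISSING, and MISSING wins over FALSE.
        if (a == Logic4.Null || b == Logic4.Null) return Logic4.Null;
        if (a == Logic4.Missing || b == Logic4.Missing) return Logic4.Missing;
        return Logic4.False;
    }
}
```

For example, `And(Logic4.False, Logic4.Null)` yields False and `Or(Logic4.False, Logic4.Missing)` yields Missing.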
Statements (SQL Server → Couchbase)
• CREATE TABLE → couchbase-cli bucket-create
• ALTER TABLE → UPDATE customer SET, UNSET
• CREATE INDEX i1 on t(a, b, c DESC); → CREATE INDEX i1 on t(a, b, c DESC);
• INSERT INTO → INSERT INTO
• SELECT → SELECT
• JOINS → JOIN (INNER JOIN, LEFT OUTER JOIN)
• GROUP BY, HAVING → GROUP BY, HAVING
• ORDER BY a ASC, b DESC → ORDER BY a ASC, b DESC
• OFFSET, FETCH, TOP → OFFSET, LIMIT
• Subqueries → Subqueries
Data Modeling (SQL Server → Couchbase)
• 1:1: Foreign Key, Denormalize → Embedded Object (implicit), Document Key reference
• 1:N: Foreign Key → Embedded Array of Objects (implicit), Document Key reference
• N:M: Foreign Key → Embedded Array of Objects, Arrays of objects with references
Joins (SQL Server vs. Couchbase)
• INNER JOIN
  SQL Server: Full ANSI join
  Couchbase: ON clause requires document key reference. Equi-join only
• LEFT OUTER JOIN
  SQL Server: Full ANSI join
  Couchbase: ON clause requires document key reference. Equi-join only
• RIGHT OUTER JOIN
  SQL Server: Full ANSI join
  Couchbase: Unsupported
• FULL OUTER JOIN
  SQL Server: Full ANSI join
  Couchbase: Unsupported
Transactions (SQL Server vs. Couchbase)
• Index updates
  SQL Server: Index is synchronously maintained
  Couchbase: Index is asynchronously maintained
• Multi Statement Transaction
  SQL Server: Yes (BEGIN, COMMIT, ROLLBACK, SAVE)
  Couchbase: No
• Atomicity
  SQL Server: Multi update, Multi statement
  Couchbase: Single Document
• Consistency
  SQL Server: Consistent; includes Dirty read support
  Couchbase: Data access is always consistent; Index has multiple consistency levels (NOT_BOUNDED, AT_PLUS, REQUEST_PLUS)
• Isolation
  SQL Server: Pessimistic locking with distributed lock manager
  Couchbase: Optimistic locking with CAS checking
• Durability
  SQL Server: Durable
  Couchbase: Durable with confirmation after replication
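Optimistic locking with CAS checking boils down to a read, modify, conditional-write loop. The sketch below uses a hypothetical in-memory `CasStore` as a stand-in for a key-value store; it is not the Couchbase SDK and is not thread-safe. It only illustrates the retry pattern.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a key-value store with CAS values.
// Not the Couchbase SDK and not thread-safe; for illustration only.
public sealed class CasStore
{
    private readonly Dictionary<string, (string Value, ulong Cas)> _docs =
        new Dictionary<string, (string Value, ulong Cas)>();
    private ulong _nextCas = 1;

    // Unconditional write; returns the new CAS value.
    public ulong Upsert(string key, string value)
    {
        ulong cas = _nextCas++;
        _docs[key] = (value, cas);
        return cas;
    }

    public (string Value, ulong Cas) Get(string key) => _docs[key];

    // Succeeds only if the document still carries the CAS the caller read.
    public bool TryReplace(string key, string value, ulong expectedCas)
    {
        if (!_docs.TryGetValue(key, out var current) || current.Cas != expectedCas)
        {
            return false; // someone else changed (or removed) the document
        }

        _docs[key] = (value, _nextCas++);
        return true;
    }
}

public static class OptimisticUpdate
{
    // Read-modify-write loop: start over from a fresh read when the CAS check fails.
    public static bool Update(CasStore store, string key,
        Func<string, string> mutate, int maxAttempts = 5)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var (value, cas) = store.Get(key);
            if (store.TryReplace(key, mutate(value), cas))
            {
                return true;
            }
        }

        return false;
    }
}
```

No lock is held between the read and the write; a conflicting writer simply causes the loser to retry.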
Do you really need transactions?
• Oftentimes, you don’t
• Good design can usually reduce the need
• If you’re using a microservice architecture, distributed transactions have major downsides
  • Bye-bye high availability
  • Difficult
• Consider eventual consistency using saga patterns
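As a rough illustration of the saga idea: each step comes with a compensating action, and a failure triggers the compensations of the completed steps in reverse order. `SagaStep` and `Saga.Run` are hypothetical names; a real saga would typically be asynchronous and persist its progress, which this sketch leaves out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal saga runner for illustration only.
public sealed class SagaStep
{
    public SagaStep(string name, Action execute, Action compensate)
    {
        Name = name;
        Execute = execute;
        Compensate = compensate;
    }

    public string Name { get; }
    public Action Execute { get; }
    public Action Compensate { get; }
}

public static class Saga
{
    // Runs the steps in order. If one throws, the steps that already
    // completed are compensated in reverse order and false is returned.
    public static bool Run(IEnumerable<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (SagaStep step in steps)
        {
            try
            {
                step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                while (completed.Count > 0)
                {
                    completed.Pop().Compensate();
                }

                return false;
            }
        }

        return true;
    }
}
```

Each step would normally be a call to a different microservice, so the system is only eventually consistent while a saga is in flight.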
Making JSON Document POCOs
Basic POCO
using Newtonsoft.Json; // provides JsonProperty

public class Airline
{
    public string Callsign { get; set; }
    public string Country { get; set; }

    [JsonProperty("iata")]
    public string IATA { get; set; }

    // Must match the "icao" attribute in the JSON document
    [JsonProperty("icao")]
    public string ICAO { get; set; }

    public int Id { get; set; }
    public string Name { get; set; }
    public string Type { get; set; }
}
Key: airline_10
{
  "callsign": "MILE-AIR",
  "country": "United States",
  "iata": "Q5",
  "icao": "MLA",
  "id": 10,
  "name": "40-Mile Air",
  "type": "airline"
}
Nesting Subdocuments
Key: airport_1254
{
  "airportname": "Calais Dunkerque",
  "city": "Calais",
  "country": "France",
  "faa": "CQF",
  "geo": {
    "alt": 12,
    "lat": 50.962097,
    "lon": 1.954764
  },
  "icao": "LFAC",
  "id": 1254,
  "type": "airport",
  "tz": "Europe/Paris"
}
// JSON decorators not shown for simplicity
public class Airport
{
    public string AirportName { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
    public string FAA { get; set; }
    public Coordinate Geo { get; set; }
    public string ICAO { get; set; }
    public int Id { get; set; }
    public string Timezone { get; set; }
    public string Type { get; set; }
}

public class Coordinate
{
    public int Altitude { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}
Nesting Arrays as Lists
Key: route_10000
{
  "airline": "AF",
  "airlineid": "airline_137",
  "destinationairport": "MRS",
  "distance": 2881.617376098415,
  "equipment": "320",
  "id": 10000,
  "schedule": [
    {
      "day": 0,
      "flight": "AF198",
      "utc": "10:13:00"
    },
    {
      "day": 0,
      "flight": "AF547",
      "utc": "19:14:00"
    }
  ],
  "sourceairport": "TLV",
  "stops": 0,
  "type": "route"
}
// JSON decorators not shown for simplicity
public class Route
{
    public string Airline { get; set; }
    public string AirlineId { get; set; }
    public string DestinationAirport { get; set; }
    public double Distance { get; set; }
    public string Equipment { get; set; }
    public int Id { get; set; }
    public List<Schedule> Schedule { get; set; }
    public string SourceAirport { get; set; }
    public int Stops { get; set; }
    public string Type { get; set; }
}

public class Schedule
{
    public int Day { get; set; }
    public string Flight { get; set; }
    public TimeSpan UTC { get; set; }
}
Modeling for Joins
Key: route_10000
{
  "airline": "AF",
  "airlineid": "airline_137",
  "destinationairport": "MRS",
  "distance": 2881.617376098415,
  "equipment": "320",
  "id": 10000,
  "schedule": [
    {
      "day": 0,
      "flight": "AF198",
      "utc": "10:13:00"
    },
    {
      "day": 0,
      "flight": "AF547",
      "utc": "19:14:00"
    }
  ],
  "sourceairport": "TLV",
  "stops": 0,
  "type": "route"
}
Key: airline_137
{
  "callsign": "AIRFRANS",
  "country": "France",
  "iata": "AF",
  "icao": "AFR",
  "id": 137,
  "name": "Air France",
  "type": "airline"
}
Note: Instead of the full key, you may just store the dynamic parts, such as "137", and rebuild the key as needed
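Rebuilding a key from its dynamic part can be a one-line helper. This is a sketch; `DocumentKeys` is a hypothetical class, and the "type_id" convention is inferred from the sample keys above (airline_137, route_10000).

```csharp
using System;
using System.Globalization;

// Hypothetical helper; the "type_id" key convention is taken from the
// sample documents (airline_137, route_10000).
public static class DocumentKeys
{
    // Builds the full document key from the type prefix and the numeric id.
    public static string Build(string type, int id) =>
        type + "_" + id.ToString(CultureInfo.InvariantCulture);

    // Recovers the numeric id from a full key such as "airline_137".
    public static int ParseId(string key)
    {
        int separator = key.LastIndexOf('_');
        if (separator < 0)
        {
            throw new FormatException($"Key '{key}' has no '_' separator.");
        }

        return int.Parse(key.Substring(separator + 1), CultureInfo.InvariantCulture);
    }
}
```

With this in place a route could store only 137 and call `DocumentKeys.Build("airline", 137)` when it needs the airline's key. Keep in mind that the N1QL `ON KEYS` join shown later expects the full key, so storing only the id means rebuilding the key inside the query as well.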
Good Practices
[DocumentTypeFilter(TypeString)]
public class Airline
{
    public const string TypeString = "airline";

    public string Callsign { get; set; }
    public string Country { get; set; }

    [JsonProperty("iata")]
    public string IATA { get; set; }

    // Must match the "icao" attribute in the JSON document
    [JsonProperty("icao")]
    public string ICAO { get; set; }

    public int Id { get; set; }
    public string Name { get; set; }

    // Type is now read only to maintain consistency
    public string Type => TypeString;
}
Key: airline_10
{
  "callsign": "MILE-AIR",
  "country": "United States",
  "iata": "Q5",
  "icao": "MLA",
  "id": 10,
  "name": "40-Mile Air",
  "type": "airline"
}
Live Demo!
This should be interesting…
https://github.com/brantburnett/Couchbase.Net.StepByStep
Querying Couchbase Using N1QL
(Pronounced “nickel”)
Start off Simple
SELECT *
FROM `travel-sample` as airline
WHERE airline.type = 'airline'
ORDER BY airline.name
LIMIT 10
O backtick, backtick! Wherefore art thou a backtick?
• ANSI SQL delimits identifiers with double quotes
  • SELECT * FROM "table-name"
• T-SQL also delimits identifiers with square brackets
  • SELECT * FROM [table-name]
• Both of these are used in JSON!
  • {"array": ["string1", "string2"]}
• So, N1QL uses the backtick instead
  • SELECT * FROM `bucket-name`
JOIN
SELECT airline.name, route.stops, route.schedule
FROM `travel-sample` AS route
INNER JOIN `travel-sample` AS airline
ON KEYS route.airlineid
WHERE route.type = 'route'
AND route.sourceairport = 'ATL'
AND route.destinationairport = 'ABE'
ORDER BY airline.name, route.stops
JOIN with LINQ
from route in db.Query<Route>()
join airline in db.Query<Airline>()
    on route.AirlineId equals N1QlFunctions.Key(airline)
where route.SourceAirport == "ATL"
    && route.DestinationAirport == "ABE"
orderby airline.Name, route.Stops
select new
{
    AirlineName = airline.Name,
    route.Stops,
    route.Schedule
};
What about nested arrays?
Key: route_10000
{
  "airline": "AF",
  "airlineid": "airline_137",
  "destinationairport": "MRS",
  "distance": 2881.617376098415,
  "equipment": "320",
  "id": 10000,
  "schedule": [
    {
      "day": 0,
      "flight": "AF198",
      "utc": "10:13:00"
    },
    {
      "day": 0,
      "flight": "AF547",
      "utc": "19:14:00"
    }
  ],
  "sourceairport": "TLV",
  "stops": 0,
  "type": "route"
}
UNNEST
SELECT route.airline, schedule.*
FROM `travel-sample` as route
UNNEST route.schedule as schedule
WHERE route.type = 'route'
AND route.sourceairport = 'MRS'
AND route.destinationairport = 'TLV'
LIMIT 10
UNNEST with LINQ
from route in context.Query<Route>()
from schedule in route.Schedule // matches the Schedule list on the Route POCO
where route.SourceAirport == "MRS"
    && route.DestinationAirport == "TLV"
select new
{
    route.Airline,
    schedule
};
Other N1QL Specific Features
• Turn an array of document keys into an array of documents (NEST)
• Subqueries on nested arrays (ARRAY..END)
• Test nested arrays for matching values (ANY..END)
• Most are implemented transparently by LINQ
Indexing Couchbase
SQL Server vs Couchbase Indexes
• SQL Server: the Airline table has Airline indexes; the Airport table has Airport indexes
• Couchbase: the travel-sample bucket has bucket indexes
Index Predicates
• Predicates only include certain documents in the index
• Reduces index size
• Queries must include the predicate to use the index
CREATE INDEX airlinesByName
ON `travel-sample` (name)
WHERE type = 'airline'
Indexing Expressions
• Any deterministic function or operation may be used
• Queries must use the same expression to use the index
CREATE INDEX airlinesByNameInsensitive
ON `travel-sample` (LOWER(name))
WHERE type = 'airline'
Indexing Date/Times
• LINQ assumes DateTime props are stored as ISO 8601
• LINQ wraps DateTimes in STR_TO_MILLIS()
• Be sure any indexes also use STR_TO_MILLIS()
CREATE INDEX beersByUpdated
ON `beer-sample` (
STR_TO_MILLIS(updated))
WHERE type = 'beer'
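One way to see why comparing milliseconds is safer than comparing the raw strings: two ISO 8601 strings with different offsets can describe the same instant. The sketch below mimics the idea in plain C#. `Iso8601Millis` is a hypothetical helper, and it is only assumed to be roughly equivalent to what STR_TO_MILLIS() computes.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; only approximates the idea behind STR_TO_MILLIS().
public static class Iso8601Millis
{
    // Converts an ISO 8601 string to Unix epoch milliseconds, so values
    // compare by instant rather than by their textual form.
    public static long ToMillis(string iso8601) =>
        DateTimeOffset.Parse(iso8601, CultureInfo.InvariantCulture)
            .ToUnixTimeMilliseconds();
}
```

For instance, "2017-10-12T13:47:41.068-07:00" and "2017-10-12T20:47:41.068Z" differ as strings but convert to the same number of milliseconds.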
Indexing Arrays
• A nested array may be indexed
• Limit one array per index
• Queries must use array subqueries to use the index
CREATE INDEX routesByFlight
ON `travel-sample` (airline,
DISTINCT ARRAY p.flight
FOR p IN schedule END)
WHERE type = 'route'
SQL Server Index Consistency
• SQL Server updates indexes as part of the transaction
• Follow-on queries immediately include any mutations from the previous query
• Locks are used to manage isolation between simultaneous transactions
Couchbase Index Consistency
• Couchbase indexes are updated asynchronously
• Follow-on queries may or may not include the preceding mutations
• Promotes high mutation performance
Querying With Consistency – not_bounded
• Returns results based on what has been indexed at the time of the query
• Fastest, and the default
• Most likely to return out-of-date results
Querying With Consistency – at_plus
• Supply a mutation token, which is returned from previous mutations
• Ensures that the mutations have been indexed
• Slower than not_bounded
• More up-to-date results
• Also known as Read Your Own Write (RYOW)
Querying With Consistency – request_plus
• Ensures that indexes are fully updated with all mutations prior to the time when the query is submitted
• Slowest, only use if necessary
• Most up-to-date results
Resources
https://developer.couchbase.com/
https://forums.couchbase.com/
https://github.com/couchbaselabs/Linq2Couchbase
http://centeredgesoftware.com/
@btburnett3 on Twitter
Questions, Take Two
Thanks for Coming!

NoSQL for SQL Server Developers using Couchbase

  • 1.
    NoSQL for SQL Serverdevelopers using Couchbase TriNUG Data SIG 3/7/2018
  • 2.
    Who is thisguy? • Brant Burnett - @btburnett3 • Systems Architect at CenterEdge Software • .NET since 1.0, SQL Server since 7.0 • MCSD, MCDBA • Experience from desktop apps to large scale cloud services
  • 3.
    NoSQL Credentials • Couchbaseuser since 2012 (v1.8) • Couchbase Expert • Open source contributions: • Couchbase .NET SDK • Couchbase.Extensions for .NET Core • Couchbase LINQ provider (Linq2Couchbase) • CouchbaseFakeIt • couchbase-index-manager
  • 4.
    Content Attributions • Matthew Groves DeveloperAdvocate at Couchbase @mgroves on Twitter • Raju Suravarjjala Couchbase R&D • Keshav Murthy Couchbase R&D
  • 5.
    Agenda What’s NoSQL? Why NoSQL? Somepopular NoSQL options SQL Server to Couchbase Mindmap Live Demo: ASP.NET Core Microservice Querying and Indexing Couchbase
  • 6.
    Let’s get some questions… first??? ThisPhoto by Unknown Author is licensed under CC BY-NC-ND
  • 7.
    What’s NoSQL? It’s just theopposite of SQL, right? Document • Couchbase • MongoDB • DynamoDB • CosmosDB Graph • OrientDB • Neo4J • GraphBase • CosmosDB Key-Value • Couchbase • Riak • BerkeleyDB • Redis • CosmosDB Wide Column • Hbase • Cassandra • Hypertable • CosmosDB
  • 8.
    NoSQL Document Databases • Getand set documents by key • Imagine a giant folder full of JSON files • If you know the filename, you can get or update the content • Often offer map/reduce or query functionality
  • 9.
    Why NoSQL? SQL Servercan do it all, right?
  • 10.
    Why NoSQL? Scalability • Horizontalscaling • Scaling SQL horizontally is difficult • NoSQL is designed for horizontal scaling • Couchbase shards horizontally automatically and transparently • Scaling vertically has limits • Scaling SQL usually has downtime • Big Data! This Photo by Unknown Author is licensed under CC BY
  • 11.
    Why NoSQL? Availability • Multi-nodedesign protects against failure • COUCH = Cluster of Unreliable Commodity Hardware • Zero down time upgrades • Zero down time scaling
  • 12.
    Why NoSQL? Performance • Designedfor very high throughput • Multi-node architecture shares the load • Can be more cost effective than read- replicas • Couchbase has a memory-first architecture • No more separate memory cache!
  • 13.
    Why NoSQL? Agility •Most NoSQL systems are “schema-less” • Schema changes can slow the team • Schema changes can cause downtime for SQL • Updating large tables can lock the whole table until complete • Many modern architectures don’t really need database-enforced schema anymore This Photo by Unknown Author is licensed under CC BY-SA
  • 14.
    Okay, so whySQL? • Small scale • On-premise, single server • ACID transactions are required • a.k.a. two-phase commit • Transaction logs • High-consistency DB backups • Keeping it old school
  • 15.
    Which NoSQL system isright for me? I prefer Couchbase, but here’s some (hopefully) unbiased info
  • 16.
    Popular NoSQL Choices Selection criteria: •Document database • Runs on Microsoft / Microsoft-friendly • Popular – db-engines.com
  • 17.
    Disclaimer • I amprimarily a Couchbase user • Please let me know if I’ve made any mistakes • Most of the information here comes from Matthew Groves (@mgroves) This Guy
  • 18.
    Evaluation Criteria Querying •Quality of querysupport •Ease of query language Scaling •Ease of scaling •Cost effectiveness of scaling Usability •Ease of installation •Ease of maintenance •Ease of development Speed •How fast is it? Deployment •OS choices •Cloud provider choices Support •Support options •Licensing •Community •Longevity
  • 19.
    Querying • Proprietary JSON-basedquery • Text search • .NET and .NET Core SDK • LINQ provider in .NET This Photo by Unknown Author is licensed under CC BY-SA
  • 20.
    Scaling • Multiple nodetypes • Sharding options • Master-slave / Primary-secondary
  • 21.
    Usability • MongoDB Compass •Scaling / Replication decisions This Photo by Unknown Author is licensed under CC BY-SA
  • 22.
    Speed • Indexing forqueries • In-memory options
  • 23.
    Deployment • Windows • MacOS •Linux • Azure • AWS • Docker This Photo by Unknown Author is licensed under CC BY-SA
  • 24.
    Support • MongoDB, Inc. •Open-source (AGPL & Apache) • Huge community / popularity This Photo by Unknown Author is licensed under CC BY
  • 25.
    Querying • N1QL (SQLfor JSON) • MapReduce • Full Text Search (FTS) • .NET and .NET Core SDK • Linq2Couchbase This Photo by Unknown Author is licensed under CC BY-SA
  • 26.
    Scaling • Single nodetype • Multi-master / "masterless" • Auto-sharding • Replication • XDCR
  • 27.
    Usability • Built inweb console • .NET SDK idioms • N1QL is just SQL • Scaling • mongoose and ottoman This Photo by Unknown Author is licensed under CC BY-SA
  • 28.
    Speed • Memory-first architecture •Indexing for N1QL • YCSB benchmarks • Memory-optimized indexes
  • 29.
    Deployment • Windows • MacOS •Linux • Azure • AWS • Docker This Photo by Unknown Author is licensed under CC BY-SA
  • 30.
    Support • Couchbase, Inc. •Open-source (Apache 2) • Active community This Photo by Unknown Author is licensed under CC BY
  • 31.
    Querying • REST focused •MapReduce views • Mango query language (2.x) This Photo by Unknown Author is licensed under CC BY-SA
  • 32.
    Scaling • Single nodetype • Multi-master • HAProxy • Sharding configuration • Replication
  • 33.
    Usability • Built inFuton / Fauxton • Scaling / Replication work • Cluster Setup Wizard (2.x) • There is no "official" .NET SDK • .NET SDKs are REST wrappers This Photo by Unknown Author is licensed under CC BY-SA
  • 34.
    Speed • Indexing forMango queries • MapReduce • Caching is external
  • 35.
    Deployment • Windows • MacOS •Linux • Azure • AWS • Docker This Photo by Unknown Author is licensed under CC BY-SA
  • 36.
    Support • Open-source (Apache) •Cloudant (IBM) This Photo by Unknown Author is licensed under CC BY
  • 37.
    Querying • SQL (limited) •Stored Procedures (JS) • Triggers (JS) • UDFs (JS) • .NET and .NET Core SDK This Photo by Unknown Author is licensed under CC BY-SA
  • 38.
    Scaling • Handled byAzure • Geographic distribution / affinity • Consistency options • Guaranteed zero data loss during failovers
  • 39.
    Usability • Azure doingwork for you • .NET and .NET Core SDK • Local emulator • Transactions (with sprocs) • Mongo compatible API This Photo by Unknown Author is licensed under CC BY-SA
  • 40.
    Speed • Guaranteed ~10mslatency reads • Guaranteed ~15ms latency writes
  • 41.
    Deployment • Windows (emulator) •Azure This Photo by Unknown Author is licensed under CC BY-SA
  • 42.
    Support • Microsoft • Popularityis increasing, so more community growth should be expected This Photo by Unknown Author is licensed under CC BY
  • 43.
    So which doI choose? This Photo by Unknown Author is licensed under CC BY-NC-ND
  • 44.
    I don't have youranswer. I just have more questions. • Mobile? • Querying? • Cost / licensing? • Ops / DevOps? • Hobby / Side? • Resume? • Speed? • Transactions? • Security? • Integrations?
  • 45.
  • 46.
    SQL is Englishfor Relational Database SQL Invented by Don Chamberlin & Raymond Boyce at IBM N1QL is English for JSON N1QL was invented by Gerald Sangudi at Couchbase SQL Instance Database Table Row Column Index Datatypes N1QL Cluster Bucket Bucket, Keyspace Document Attribute Index JSON Datatypes SQL Input and Output: Set(s) of Tuples N1QL Input and Output: Set(s) of JSON N1QL STMT CREATE BUCKET None CREATE INDEX None SELECT INSERT UPDATE DELETE MERGE Subqueries JOIN GROUP BY ORDER BY OFFSET, LIMIT EXPLAIN PREPARE EXECUTE GRANT ROLE REVOKE ROLE FLUSH Tuples SQL Model Set of JSON N1QL Model Set of Tuples Set of JSON N1QL Tooling Web Console Monitoring Profiling Dev workbench SDK Simba, CdataSQL Tooling ODBC, JDBC, .NET Hibernate, Entity Framework N1QLResources http://query.couchbase.com SQL Indexes Primary Key Secondary Key Composite Range Partitioned Expression (Functional) Spatial Search N1QL Indexes Primary Secondary Composite Partial Expression (Functional) Array Index Replica(HA) Adaptive Spatial SQL Logic 3 valued logic TRUE, FALSE, NULL N1QL Logic 4 valued logic TRUE, FALSE, NULL, MISSING SQL Transactions ACID Multi-Statement Savepoints Commit/Rollback N1QL Transactions Single Document atomicity SQL Datatypes Numeric Decimal Character Date/Time BLOB Spatial JSON N1QL Datatypes Numeric Boolean Character Array Object Null Conversion Functions JSON SQL Optimizer Rule Based Cost Based Index Selection Query Rewrites NL, Hash, Merge join N1QL Optimizer Rule based Index Selection NL Join SQL ACID Atomic Consistent Isolated Durable N1QL BASE Single doc Atomic Consistent Data* Optimistic Concurrency Tunable Durability N1QL Index Scan Consistency* NOT_BOUNDED AT_PLUS REQUEST_PLUS SQL Engine (SMP Scale UP) N1QL Engine (MPP Cluste Scale OUT) Additional SQL Features Triggers Stored Procedures XML Constraints RAC SQL STMT CREATE DATABASE CREATE TABLE CREATE INDEX ALTER TABLE SELECT INSERT UPDATE DELETE MERGE Subqueries JOIN GROUP BY ORDER BY OFFSET, FETCH, TOP EXPLAIN 
sp_prepare sp_execute GRANT REVOKE TRUNCATE
  • 47.
    Couchbase Architecture STORAGE Couchbase Server1 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Managed Cache Storage Data Service STORAGE Couchbase Server 2 Managed Cache Cluster ManagerCluster Manager Query Service STORAGE Couchbase Server 3 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Index Service STORAGE Couchbase Server 4 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Search Service STORAGE Couchbase Server 5 SHARD 7 SHARD 9 SHARD 5 SHARDSHARDSHARD Managed Cache Cluster ManagerCluster Manager Analytics Service* Managed Cache Storage SDK SDK
  • 48.
    Architecture
    Product | SQL Server | Couchbase
    System Architecture | SMP; Always-On readable secondary replicas | MPP; MDS: Multi Dimensional Scaling; Data Service: MPP, Hash Partitioning; Indexing, FTS: MDS Scale-out; Query: MDS Scale-out
    Query | SQL, search, JSON extensions | N1QL, Key-value, Full text search, Analytics (preview)
    High Availability | Always-On, Log Shipping, DB mirroring (deprecated) | Built-in intranode replication (up to 3 copies); Built-in XDCR (cross data center replication)
    Transactions | ACID; Multi-statement | Single document atomicity; Data Service consistency, Index: eventual consistency; optimistic locking (CAS); additional confirmation for durability
    Drivers | JDBC, ODBC, .NET, LINQ | Couchbase SDK (Java, .NET, LINQ, PHP, Python, Go), Simba JDBC/ODBC
    Data Model | Normalized, Denormalized | Denormalized JSON model
  • 49.
    Database Objects
    Data Feature/Type | SQL Server | Couchbase
    Database | Database | Bucket
    Table | Table | Bucket, Keyspace
    Row | Row | Document
    Column | Column | Field/Attribute
    Partition | Partition (manual) | Partition (hash automatic)
  • 50.
    Data Types
    Data Type | SQL Server | Couchbase JSON | Example
    Numbers | int, bigint, smallint, tinyint, float, real, decimal, numeric, money, smallmoney | JSON Number | { "id": 5, "balance": 2942.59 }
    String | char, varchar, nchar, nvarchar, text, ntext | JSON String | { "name": "Joe", "city": "Morrisville" }
    Boolean | bit | JSON Boolean | { "premium": true, "pending": false }
    Date/Time | datetime, smalldatetime, datetime2, date, time, datetimeoffset | JSON ISO 8601 string with extract, convert and arithmetic functions | { "soldat": "2017-10-12T13:47:41.068-07:00" }
    Spatial data | geometry, geography | Supports nearest neighbor and spatial distance | "geometry": {"type": "Point", "coordinates": [-104.99404, 39.75621]}
    MISSING | Not applicable, fixed schema | MISSING | { }
    NULL | NULL | JSON Null | { "last_address": null }
    Objects | | Flexible JSON Objects | { "address": {"street": "1 Main Street", "city": "Morrisville", "zip": "94824"} }
    Arrays | | Flexible JSON Arrays | { "hobbies": ["tennis", "skiing", "lego"] }
  • 51.
    Understanding MISSING
    Couchbase MISSING: the value of a field that is absent from the JSON document or literal.
        {"name": "joe"}
        Everything but the field "name" is missing from the document.
    IS MISSING: returns true if the document does not have a status field.
        FROM CUSTOMER WHERE status IS MISSING;
    IS NOT MISSING: returns true if the document has a status field (even if null).
        FROM CUSTOMER WHERE status IS NOT MISSING;
    MISSING vs NULL: MISSING is a known missing quantity. NULL is a known UNKNOWN. Valid JSON: {"status": null}
    MISSING value: simply make a field of any type disappear by setting it to MISSING.
        UPDATE CUSTOMER SET status = MISSING WHERE cxid = "xyz232"
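The difference between MISSING and NULL is easy to try outside the database. The following is a minimal sketch in plain Python (not the Couchbase SDK); the `MISSING` sentinel and the helper names are assumptions made for illustration, and `is_valued` mirrors the IS VALUED test mentioned in the editor's notes.

```python
# Sentinel standing in for N1QL's MISSING (a hypothetical name for this sketch).
MISSING = object()

def field(doc, name):
    """Return the field value, None for JSON null, or MISSING if the field is absent."""
    return doc.get(name, MISSING)

def is_missing(doc, name):
    return field(doc, name) is MISSING

def is_null(doc, name):
    return field(doc, name) is None

def is_valued(doc, name):
    # Present in the document and not null.
    return not is_missing(doc, name) and not is_null(doc, name)

customers = [
    {"cxid": "a1", "status": "active"},
    {"cxid": "b2", "status": None},  # {"status": null} in JSON
    {"cxid": "c3"},                  # status is MISSING
]

missing_ids = [c["cxid"] for c in customers if is_missing(c, "status")]
not_missing_ids = [c["cxid"] for c in customers if not is_missing(c, "status")]
valued_ids = [c["cxid"] for c in customers if is_valued(c, "status")]
```

Only the document without a `status` field counts as MISSING; the document with an explicit null still passes the IS NOT MISSING style test.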
  • 52.
    Couchbase 4-valued Boolean Logic
    A | B | A OR B | A AND B
    TRUE | NULL | TRUE | NULL
    FALSE | NULL | NULL | FALSE
    TRUE | MISSING | TRUE | MISSING
    FALSE | MISSING | MISSING | FALSE
    NULL | MISSING | NULL | MISSING
    NULL | NULL | NULL | NULL
    MISSING | MISSING | MISSING | MISSING
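The four-valued truth table can be written as two small functions. This is a sketch in plain Python, based on N1QL's documented AND/OR behavior as I understand it; the enum and function names are assumptions for illustration.

```python
from enum import Enum

class V(Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    NULL = "NULL"
    MISSING = "MISSING"

def n1ql_and(a, b):
    # FALSE dominates AND; otherwise MISSING wins over NULL.
    if V.FALSE in (a, b):
        return V.FALSE
    if V.MISSING in (a, b):
        return V.MISSING
    if V.NULL in (a, b):
        return V.NULL
    return V.TRUE

def n1ql_or(a, b):
    # TRUE dominates OR; otherwise NULL wins over MISSING.
    if V.TRUE in (a, b):
        return V.TRUE
    if V.NULL in (a, b):
        return V.NULL
    if V.MISSING in (a, b):
        return V.MISSING
    return V.FALSE

pairs = [
    (V.TRUE, V.NULL), (V.FALSE, V.NULL), (V.TRUE, V.MISSING), (V.FALSE, V.MISSING),
    (V.NULL, V.MISSING), (V.NULL, V.NULL), (V.MISSING, V.MISSING),
]
table = [(a.value, b.value, n1ql_or(a, b).value, n1ql_and(a, b).value) for a, b in pairs]
```

Each entry in `table` has the form (A, B, A OR B, A AND B), matching the column order on the slide.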
  • 53.
    Statements
    Feature | SQL Server | Couchbase
    CREATE TABLE | CREATE TABLE | couchbase-cli bucket-create
    ALTER TABLE | ALTER TABLE | UPDATE customer SET, UNSET
    CREATE INDEX i1 on t(a, b, c DESC); | CREATE INDEX i1 on t(a, b, c DESC); | CREATE INDEX i1 on t(a, b, c DESC);
    INSERT INTO | INSERT INTO | INSERT INTO
    SELECT | SELECT | SELECT
    JOINS | JOINS | JOIN: INNER JOIN, LEFT OUTER JOIN
    GROUP BY, HAVING | GROUP BY, HAVING | GROUP BY, HAVING
    ORDER BY a ASC, b DESC | ORDER BY a ASC, b DESC | ORDER BY a ASC, b DESC
    OFFSET, LIMIT | OFFSET, FETCH, TOP | OFFSET, LIMIT
    Subqueries | Subqueries | Subqueries
  • 54.
    Data Modeling
    Relationship | SQL Server | Couchbase
    1:1 | Foreign Key; Denormalize | Embedded Object (implicit); Document Key reference
    1:N | Foreign Key | Embedded Array of Objects (implicit); Document Key reference
    N:M | Foreign Key | Embedded Array of Objects; Arrays of objects with references
  • 55.
    Joins
    JOIN Type | SQL Server | Couchbase
    INNER JOIN | Full ANSI join | ON clause requires document key reference. Equi-join only
    LEFT OUTER JOIN | Full ANSI join | ON clause requires document key reference. Equi-join only
    RIGHT OUTER JOIN | Full ANSI join | Unsupported
    FULL OUTER JOIN | Full ANSI join | Unsupported
  • 56.
    Transactions
    Feature | SQL Server | Couchbase
    Index updates | Index is synchronously maintained | Index is asynchronously maintained
    Multi Statement Transaction | Yes: BEGIN, COMMIT, ROLLBACK, SAVE | No
    Atomicity | Multi update, Multi statement | Single Document
    Consistency | Consistent; includes dirty read support | Data access is always consistent; index has multiple consistency levels (NOT_BOUNDED, AT_PLUS, REQUEST_PLUS)
    Isolation | Pessimistic locking with distributed lock manager | Optimistic locking with CAS checking
    Durability | Durable | Durable with confirmation after replication
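Optimistic locking with CAS (compare-and-swap) checking can be illustrated without a cluster. The sketch below is plain Python with a toy in-memory store; `InMemoryBucket`, `CasMismatch`, `update_with_retry`, and the `fleet` field are all hypothetical names for illustration and are not the Couchbase SDK API.

```python
import copy

class CasMismatch(Exception):
    """Raised when a document changed since it was read."""

class InMemoryBucket:
    """Toy key-value store in which every document carries a CAS counter."""

    def __init__(self):
        self._docs = {}  # key -> (cas, document)

    def upsert(self, key, doc):
        cas = self._docs.get(key, (0, None))[0] + 1
        self._docs[key] = (cas, copy.deepcopy(doc))
        return cas

    def get(self, key):
        cas, doc = self._docs[key]
        return copy.deepcopy(doc), cas

    def replace(self, key, doc, cas):
        current_cas, _ = self._docs[key]
        if current_cas != cas:
            raise CasMismatch(key)  # someone else wrote first
        self._docs[key] = (current_cas + 1, copy.deepcopy(doc))
        return current_cas + 1

def update_with_retry(bucket, key, mutate, attempts=5):
    """Read, mutate, and write back; re-read and retry if the CAS is stale."""
    for _ in range(attempts):
        doc, cas = bucket.get(key)
        mutate(doc)
        try:
            return bucket.replace(key, doc, cas)
        except CasMismatch:
            continue
    raise CasMismatch(f"gave up on {key} after {attempts} attempts")

bucket = InMemoryBucket()
bucket.upsert("airline_10", {"name": "40-Mile Air", "fleet": 1})

# Two writers read the same version of the document.
doc_a, cas_a = bucket.get("airline_10")
doc_b, cas_b = bucket.get("airline_10")
doc_a["fleet"] += 1
bucket.replace("airline_10", doc_a, cas_a)  # first writer wins

stale_write_rejected = False
try:
    doc_b["fleet"] += 10
    bucket.replace("airline_10", doc_b, cas_b)  # stale CAS is rejected
except CasMismatch:
    stale_write_rejected = True

def add_ten(doc):
    doc["fleet"] += 10

update_with_retry(bucket, "airline_10", add_ten)
final_doc, _ = bucket.get("airline_10")
```

No lock is held between the read and the write. A conflicting writer is detected only at write time, and the loser re-reads and retries, which is the trade-off against pessimistic locking.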
  • 57.
    Do you really need transactions?
    • Oftentimes, you don’t
    • Good design can usually reduce the need
    • If you’re using a microservice architecture, distributed transactions have major downsides
    • Bye-bye high availability
    • Difficult
    • Consider eventual consistency using saga patterns
  • 58.
  • 59.
    Basic POCO
    public class Airline
    {
        public string Callsign { get; set; }
        public string Country { get; set; }
        [JsonProperty("iata")]
        public string IATA { get; set; }
        [JsonProperty("icao")]
        public string ICAO { get; set; }
        public int Id { get; set; }
        public string Name { get; set; }
        public string Type { get; set; }
    }
    Key: airline_10
    {
      "callsign": "MILE-AIR",
      "country": "United States",
      "iata": "Q5",
      "icao": "MLA",
      "id": 10,
      "name": "40-Mile Air",
      "type": "airline"
    }
  • 60.
    Nesting Subdocuments
    Key: airport_1254
    {
      "airportname": "Calais Dunkerque",
      "city": "Calais",
      "country": "France",
      "faa": "CQF",
      "geo": { "alt": 12, "lat": 50.962097, "lon": 1.954764 },
      "icao": "LFAC",
      "id": 1254,
      "type": "airport",
      "tz": "Europe/Paris"
    }
    // JSON decorators not shown for simplicity
    public class Airport
    {
        public string AirportName { get; set; }
        public string City { get; set; }
        public string Country { get; set; }
        public string FAA { get; set; }
        public Coordinate Geo { get; set; }
        public string ICAO { get; set; }
        public int Id { get; set; }
        public string Timezone { get; set; }
        public string Type { get; set; }
    }
    public class Coordinate
    {
        public int Altitude { get; set; }
        public double Latitude { get; set; }
        public double Longitude { get; set; }
    }
  • 61.
    Nesting Arrays as Lists
    Key: route_10000
    {
      "airline": "AF",
      "airlineid": "airline_137",
      "destinationairport": "MRS",
      "distance": 2881.617376098415,
      "equipment": "320",
      "id": 10000,
      "schedule": [
        { "day": 0, "flight": "AF198", "utc": "10:13:00" },
        { "day": 0, "flight": "AF547", "utc": "19:14:00" }
      ],
      "sourceairport": "TLV",
      "stops": 0,
      "type": "route"
    }
    // JSON decorators not shown for simplicity
    public class Route
    {
        public string Airline { get; set; }
        public string AirlineId { get; set; }
        public string DestinationAirport { get; set; }
        public double Distance { get; set; }
        public string Equipment { get; set; }
        public int Id { get; set; }
        public List<Schedule> Schedule { get; set; }
        public string SourceAirport { get; set; }
        public int Stops { get; set; }
        public string Type { get; set; }
    }
    public class Schedule
    {
        public int Day { get; set; }
        public string Flight { get; set; }
        public TimeSpan UTC { get; set; }
    }
  • 62.
    Modeling for Joins
    Key: route_10000
    {
      "airline": "AF",
      "airlineid": "airline_137",
      "destinationairport": "MRS",
      "distance": 2881.617376098415,
      "equipment": "320",
      "id": 10000,
      "schedule": [
        { "day": 0, "flight": "AF198", "utc": "10:13:00" },
        { "day": 0, "flight": "AF547", "utc": "19:14:00" }
      ],
      "sourceairport": "TLV",
      "stops": 0,
      "type": "route"
    }
    Key: airline_137
    {
      "callsign": "AIRFRANS",
      "country": "France",
      "iata": "AF",
      "icao": "AFR",
      "id": 137,
      "name": "Air France",
      "type": "airline"
    }
    Note: Instead of the full key, you may just store the dynamic parts, such as “137”, and rebuild the key as needed
  • 63.
    Good Practices
    [DocumentTypeFilter(TypeString)]
    public class Airline
    {
        public const string TypeString = "airline";
        public string Callsign { get; set; }
        public string Country { get; set; }
        [JsonProperty("iata")]
        public string IATA { get; set; }
        [JsonProperty("icao")]
        public string ICAO { get; set; }
        public int Id { get; set; }
        public string Name { get; set; }
        // Type is now read only to maintain consistency
        public string Type => TypeString;
    }
    Key: airline_10
    {
      "callsign": "MILE-AIR",
      "country": "United States",
      "iata": "Q5",
      "icao": "MLA",
      "id": 10,
      "name": "40-Mile Air",
      "type": "airline"
    }
  • 64.
    Live Demo!
    This should be interesting…
    https://github.com/brantburnett/Couchbase.Net.StepByStep
  • 65.
    Querying Couchbase Using N1QL (Pronounced “nickel”)
  • 66.
    Start off Simple
    SELECT *
    FROM `travel-sample` AS airline
    WHERE airline.type = 'airline'
    ORDER BY airline.name
    LIMIT 10
  • 67.
    O backtick, backtick! Wherefore art thou a backtick?
    • ANSI SQL delimits identifiers with double quotes
    • SELECT * FROM "table-name"
    • T-SQL also delimits identifiers with square brackets
    • SELECT * FROM [table-name]
    • Both of these are used in JSON!
    • {"array": ["string1", "string2"]}
    • So, N1QL uses the backtick instead
    • SELECT * FROM `bucket-name`
  • 68.
    JOIN
    SELECT airline.name, route.stops, route.schedule
    FROM `travel-sample` AS route
    INNER JOIN `travel-sample` AS airline ON KEYS route.airlineid
    WHERE route.type = 'route'
      AND route.sourceairport = 'ATL'
      AND route.destinationairport = 'ABE'
    ORDER BY airline.name, route.stops
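An ON KEYS join is, conceptually, a key lookup for each left-hand document. The sketch below emulates that in plain Python over a dictionary standing in for the bucket. `join_on_keys`, the `route_orphan` document, and the `airline_num` field are hypothetical, added only to show an unmatched row and a rebuilt key.

```python
# A dictionary standing in for the bucket: document key -> document.
bucket = {
    "route_10000": {"type": "route", "airlineid": "airline_137",
                    "sourceairport": "TLV", "destinationairport": "MRS", "stops": 0},
    # Hypothetical route whose airline document does not exist.
    "route_orphan": {"type": "route", "airlineid": "airline_999",
                     "sourceairport": "TLV", "destinationairport": "MRS", "stops": 1},
    "airline_137": {"type": "airline", "name": "Air France", "id": 137},
}

def join_on_keys(bucket, left_docs, key_expr):
    """Emulate INNER JOIN ... ON KEYS: look up one right-hand document per left document."""
    for left in left_docs:
        right = bucket.get(key_expr(left))
        if right is not None:  # an inner join drops unmatched left documents
            yield left, right

routes = [d for d in bucket.values()
          if d.get("type") == "route"
          and d["sourceairport"] == "TLV"
          and d["destinationairport"] == "MRS"]

rows = [{"name": airline["name"], "stops": route["stops"]}
        for route, airline in join_on_keys(bucket, routes, lambda r: r["airlineid"])]

# Variant: store only the dynamic part of the key and rebuild it in the key expression.
compact_routes = [{"type": "route", "airline_num": 137, "stops": 0}]
compact_rows = [{"name": airline["name"], "stops": route["stops"]}
                for route, airline in join_on_keys(
                    bucket, compact_routes, lambda r: f"airline_{r['airline_num']}")]
```

The key expression plays the role of the ON KEYS clause. Because it is an arbitrary expression over the left document, storing just `137` and rebuilding `airline_137` works equally well.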
  • 69.
    JOIN with LINQ
    from route in db.Query<Route>()
    join airline in db.Query<Airline>()
        on route.AirlineId equals N1QlFunctions.Key(airline)
    where route.SourceAirport == "ATL"
        && route.DestinationAirport == "ABE"
    orderby airline.Name, route.Stops
    select new { AirlineName = airline.Name, route.Stops, route.Schedule };
  • 70.
    What about nested arrays?
    Key: route_10000
    {
      "airline": "AF",
      "airlineid": "airline_137",
      "destinationairport": "MRS",
      "distance": 2881.617376098415,
      "equipment": "320",
      "id": 10000,
      "schedule": [
        { "day": 0, "flight": "AF198", "utc": "10:13:00" },
        { "day": 0, "flight": "AF547", "utc": "19:14:00" }
      ],
      "sourceairport": "TLV",
      "stops": 0,
      "type": "route"
    }
  • 71.
    UNNEST
    SELECT route.airline, schedule.*
    FROM `travel-sample` AS route
    UNNEST route.schedule AS schedule
    WHERE route.type = 'route'
      AND route.sourceairport = 'MRS'
      AND route.destinationairport = 'TLV'
    LIMIT 10
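UNNEST produces one output row per array element, each paired with its parent document. Here is a minimal sketch in plain Python; the `unnest` helper and the `route_empty` document are hypothetical and exist only for this illustration.

```python
routes = {
    "route_10000": {
        "type": "route", "airline": "AF",
        "schedule": [
            {"day": 0, "flight": "AF198", "utc": "10:13:00"},
            {"day": 0, "flight": "AF547", "utc": "19:14:00"},
        ],
    },
    # Hypothetical route with an empty schedule: it contributes no rows.
    "route_empty": {"type": "route", "airline": "XX", "schedule": []},
}

def unnest(docs, array_field):
    """Yield (parent document, array element) pairs, one per element."""
    for doc in docs:
        for element in doc.get(array_field) or []:
            yield doc, element

# Equivalent in spirit to: SELECT route.airline, schedule.*
rows = [{"airline": route["airline"], **schedule}
        for route, schedule in unnest(routes.values(), "schedule")]
```

The nested array is flattened into rows much like a join between a parent table and a child table, except that the "child rows" live inside the parent document.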
  • 72.
    UNNEST with LINQ
    from route in context.Query<Route>()
    from schedule in route.Schedule
    where route.SourceAirport == "MRS"
        && route.DestinationAirport == "TLV"
    select new { route.Airline, schedule };
  • 73.
    Other N1QL Specific Features
    • Turn an array of document keys into an array of documents (NEST)
    • Subqueries on nested arrays (ARRAY..END)
    • Test nested arrays for matching values (ANY..END)
    • Most are implemented transparently by LINQ
  • 74.
  • 75.
    SQL Server vs Couchbase Indexes
    [Diagram: in SQL Server, the Airline SQL table has Airline indexes and the Airport SQL table has Airport indexes; in Couchbase, the travel-sample bucket has bucket indexes.]
  • 76.
    Index Predicates
    • Predicates only include certain documents in the index
    • Reduces index size
    • Queries must include the predicate to use the index
    CREATE INDEX airlinesByName ON `travel-sample` (name) WHERE type = 'airline'
  • 77.
    Indexing Expressions
    • Any deterministic function or operation may be used
    • Queries must use the same expression to use the index
    CREATE INDEX airlinesByNameInsensitive ON `travel-sample` (LOWER(name)) WHERE type = 'airline'
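A predicate plus an indexed expression can be pictured as a sorted list of (expression value, document key) pairs built only from matching documents. The sketch below is plain Python with hypothetical helper names (`build_index`, `lookup`); it says nothing about how Couchbase stores indexes internally.

```python
import bisect

docs = {
    "airline_10": {"type": "airline", "name": "40-Mile Air"},
    "airline_137": {"type": "airline", "name": "Air France"},
    "airport_1254": {"type": "airport", "airportname": "Calais Dunkerque"},
}

def build_index(docs, predicate, expression):
    """Sorted (indexed value, document key) pairs for documents matching the predicate."""
    return sorted((expression(doc), key) for key, doc in docs.items() if predicate(doc))

# Mirrors: CREATE INDEX ... (LOWER(name)) WHERE type = 'airline'
index = build_index(
    docs,
    predicate=lambda d: d.get("type") == "airline",
    expression=lambda d: d["name"].lower(),
)

def lookup(index, value):
    """Binary-search the index for an exact value and return the matching document keys."""
    start = bisect.bisect_left(index, (value,))
    keys = []
    for indexed_value, key in index[start:]:
        if indexed_value != value:
            break
        keys.append(key)
    return keys

hit = lookup(index, "air france")   # the query applies the same expression: LOWER(name)
miss = lookup(index, "Air France")  # raw value: not what the index stores
```

The airport document never enters the index, which is why the index stays small, and a lookup finds a match only when the query value has been passed through the same expression.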
  • 78.
    Indexing Date/Times
    • LINQ assumes DateTime props are stored as ISO8601
    • LINQ wraps DateTimes in STR_TO_MILLIS()
    • Be sure any indexes also use STR_TO_MILLIS()
    CREATE INDEX beersByUpdated ON `beer-sample` (STR_TO_MILLIS(updated)) WHERE type = 'beer'
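Why convert to milliseconds instead of comparing the ISO 8601 strings directly? Strings with different time zone offsets do not sort in chronological order. The sketch below shows this in plain Python; `str_to_millis` here is a local stand-in written for this example (it assumes offset-aware input), not the N1QL function itself.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def str_to_millis(iso):
    """Convert an offset-aware ISO 8601 string to milliseconds since the Unix epoch."""
    return (datetime.fromisoformat(iso) - EPOCH) // timedelta(milliseconds=1)

earlier = "2017-10-12T15:00:00.000-04:00"  # 19:00:00 UTC
later = "2017-10-12T13:47:41.068-07:00"    # 20:47:41.068 UTC

# Lexicographic comparison gets the order wrong because it ignores the offset.
string_order_says_later_first = later < earlier

# Comparing epoch milliseconds gives the chronological order.
millis_order_correct = str_to_millis(later) > str_to_millis(earlier)

# The same instant written with two offsets maps to the same number.
same_instant = (str_to_millis("2017-10-12T13:47:41.068-07:00")
                == str_to_millis("2017-10-12T20:47:41.068+00:00"))
```

An index on the raw string would sort and range-scan in string order, so both the index and the query need to use the converted value.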
  • 79.
    Indexing Arrays
    • A nested array may be indexed
    • Limit one array per index
    • Queries must use array subqueries to use the index
    CREATE INDEX routesByFlight ON `travel-sample` (airline, DISTINCT ARRAY p.flight FOR p IN schedule END) WHERE type = 'route'
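An array index can be pictured as one index entry per distinct array element, each pointing back to the parent document. The following sketch is plain Python; `build_array_index` is a hypothetical helper and the duplicated schedule entry is invented to show the effect of DISTINCT.

```python
from collections import defaultdict

docs = {
    "route_10000": {
        "type": "route", "airline": "AF",
        "schedule": [
            {"day": 0, "flight": "AF198", "utc": "10:13:00"},
            {"day": 0, "flight": "AF547", "utc": "19:14:00"},
            {"day": 1, "flight": "AF198", "utc": "10:13:00"},  # invented repeat of a flight number
        ],
    },
    "airline_137": {"type": "airline", "name": "Air France"},
}

def build_array_index(docs):
    """Map (airline, flight) -> set of route keys, one entry per distinct flight."""
    index = defaultdict(set)
    for key, doc in docs.items():
        if doc.get("type") != "route":  # the index predicate: WHERE type = 'route'
            continue
        distinct_flights = {p["flight"] for p in doc.get("schedule", [])}  # DISTINCT ARRAY ... END
        for flight in distinct_flights:
            index[(doc["airline"], flight)].add(key)
    return index

index = build_array_index(docs)
routes_with_af198 = sorted(index.get(("AF", "AF198"), set()))
entry_count = len(index)
```

Finding every route that contains a given flight number becomes a single index lookup rather than a scan through each document's schedule array.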
  • 80.
    SQL Server Index Consistency
    SQL Server updates indexes as part of the transaction
    Follow-on queries immediately include any mutations from the previous query
    Locks are used to manage isolation between simultaneous transactions
  • 81.
    Couchbase Index Consistency
    Couchbase indexes are updated asynchronously
    Follow-on queries may or may not include the preceding mutations
    Promotes high mutation performance
  • 82.
    Querying With Consistency – not_bounded
    Returns results based on what has been indexed at the time of the query
    Fastest, and the default
    Most likely to return out-of-date results
  • 83.
    Querying With Consistency – at_plus
    Supply a mutation token, which is returned from previous mutations
    Ensures that the mutations have been indexed
    Slower than not_bounded
    More up-to-date results
    Also known as Read Your Own Write (RYOW)
  • 84.
    Querying With Consistency – request_plus
    Ensures that indexes are fully updated with all mutations prior to the time when the query is submitted
    Slowest, only use if necessary
    Most up-to-date results
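The three scan consistency levels can be compared with a toy model in which the index lags behind the data. This is a plain Python sketch, not the Couchbase SDK: `ToyCluster` is hypothetical, the mutation token is simplified to a sequence number, and "waiting for the indexer" is modeled by running it synchronously.

```python
class ToyCluster:
    """Data is written immediately; the index only advances when asked to."""

    def __init__(self):
        self.log = []           # ordered mutations: (key, document)
        self.index = {}         # key -> indexed name
        self.indexed_seqno = 0  # number of mutations the indexer has applied

    def upsert(self, key, doc):
        self.log.append((key, doc))
        return len(self.log)    # sequence number, used here as the mutation token

    def _index_up_to(self, seqno):
        # Stand-in for waiting until the asynchronous indexer reaches seqno.
        while self.indexed_seqno < min(seqno, len(self.log)):
            key, doc = self.log[self.indexed_seqno]
            self.index[key] = doc["name"]
            self.indexed_seqno += 1

    def query_names(self, scan_consistency="not_bounded", token=None):
        if scan_consistency == "request_plus":
            self._index_up_to(len(self.log))  # everything written before the query
        elif scan_consistency == "at_plus":
            self._index_up_to(token)          # at least the caller's own mutation
        return sorted(self.index.values())    # not_bounded: whatever is indexed now

cluster = ToyCluster()
token_1 = cluster.upsert("airline_10", {"name": "40-Mile Air"})
token_2 = cluster.upsert("airline_137", {"name": "Air France"})

stale = cluster.query_names()                         # not_bounded
own_write = cluster.query_names("at_plus", token_1)   # read your own write
fresh = cluster.query_names("request_plus")
```

In this model not_bounded never waits, at_plus waits only as far as the supplied token, and request_plus waits for every mutation made before the query, which matches the fastest-to-slowest ordering on the slides.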
  • 85.
  • 86.
    Questions, Take Two
  • 87.

Editor's Notes

  • #7 What are some of the questions you have that you’d like to see answered in this presentation? Who is a SQL Server user? Who has used NoSQL? Couchbase? MongoDB?
  • #8 Key-Value = Have a key, set a value. Document = Similar to key-value, but with more complex data documents for each key, and often includes querying capabilities. Wide Column = Very large rows and columns but no relationships. Graph = Store relationships between objects; easy to answer questions like “who likes bananas and wears hoodies?”
  • #11 Why design a service on SQL today that you think will need NoSQL scale next year?
  • #12 CouchDB is not Couchbase, two completely different systems (though they are both NoSQL document databases)
  • #13 Read replicas only help scale out read load, not read and write
  • #14 Schema-less is something of a misnomer, it’s really just schema that’s not enforced at the database level Database schema is more of a throwback to the days when multiple applications were reading and writing from the same database, so it was more difficult to maintain consistency
  • #15 Transactional backups = backup that always captures state between ACID transactions
  • #17 MongoDB - #1 Couchbase - #3 CouchDB - #4 Azure CosmosDB - #5 Notable exclusions: Amazon DynamoDB #2 – Limited to AWS (though it can be used from .NET) Cassandra, Redis – Not document DBs RavenDB – Has been very popular for .NET historically, but is currently #69 on db-engines.com
  • #20 Query syntax is limited in terms of joins, unions, etc Text search is present, but it's limited compared to elasticsearch, etc .NET SDK follows .NET idioms pretty well, there's a bit of weirdness having to convert objects to BsonDocuments, but it's not that bad LINQ provider built into Mongo .NET SDK, but limited due to the underlying query capabilities EF Core Provider???
  • #21 Scaling is possible, you set up multiple types of nodes, configure sharding Replication is a master/slave setup, meaning that a single member of the cluster is the master and is the only one allowed to modify data And that includes between data centers
  • #22 Compass = GUI Decision fatigue Insecure by default: anonymous admin access
  • #23 Indexing is important for querying Mongo has an in-memory option and an on-disk option So that's a basic tradeoff you can make
  • #24 Windows / Mac / Linux AWS / Azure / Google support (it is VMs) Docker DBaaS – lot of managed partners like Mongolab, MongoSoup, etc MongoDB Atlas is mongo's own managed DBaaS
  • #25 Licensing! It's AGPL, and some enterprises don't like that. In fact, they list it as a possible weakness/threat in their S-1
  • #26 I really like N1QL; it’s incredibly easy to learn for a SQL user. I already know how to write SQL, so I can apply that to a NoSQL database. Linq2Couchbase is a LINQ provider that generates N1QL (it's not officially supported yet).
  • #27 Multi-master
  • #28 Security: with 4.x it was *mostly* secure by default. You need to set up a password, but you can create buckets without passwords. With 5.x it is completely secure by default. Mongoose and ottoman allow a switch from MongoDB to Couchbase in NodeJS.
  • #29 Explain memcached and couchdb
  • #30 Windows / Mac / Linux AWS / Azure / Google support (it is VMs) Docker no DBaaS, managed database (yet)
  • #31 Gitter.im and forums
  • #32 Mongo-inspired query syntax, based on Cloudant Query this is in version 2.x though MapReduce which is great for performance, but it is javascript and doesn't accommodate adhoc queries Mango is "mongo inspired" I don't know if it's mongo compatible or meant to be mongo compatible
  • #33 You can set up a cluster relatively easily, but you need to run a proxy in front of it, like HAProxy (which CouchDB recommends). For sharding, you need to configure the number of shards per database. Replication and conflict management is something that CouchDB is good at. When removing a node from a cluster, you have to make sure to move shards away, so there is some manual work involved in managing scale.
  • #34 Futon is web console in 1.x Fauxton is web console in 2.x Insecure by default: anonymous admin access
  • #35 CouchDb's design assumes that caching will be handled by the operating system, by the browser, by a proxy you setup Not by couchdb itself
  • #36 Windows / Mac / Linux AWS / Azure / Google support (VMs) Docker (there isn't an official 2.x on docker hub yet) DBaaS – Cloudant
  • #37 Cloudant is compatible Couchbase Lite and Sync Gateway are "compatible" with couchdb (version 2 of Couchbase Lite will probably change that) But interop between them is not supported
  • #38 There is a SQL language for CosmosDB but it is very limited No intra-document joins, no GROUP BY, no insert/update/delete Cosmos has sprocs, triggers, udfs, but you have to write them in ECMAScript 2015 (JavaScript)
  • #39 Azure handles the scaling for you You set request units per second or per minute and cosmos db will scale to handle that Cosmosdb has 5 consistency options, so you can explicitly trade off between strong consistency and eventual consistency, "guarantee" is Microsoft's wording
  • #41 "guarantee" is Microsoft's wording
  • #42 This is azure only You can run the emulator on windows, the emulator is not meant for production But I'm sure some joker is going to try it The emulator is also available in docker so hypothetically you could deploy that anywhere
  • #43 Microsoft support only, of course This is not open source (which used to be implied with Microsoft, but I feel like I have to say that now)
  • #45 Mobile? look at Couchbase, Sync Gateway, Couchbase Mobile, and maybe CouchDb Querying? Look at Couchbase, CosmosDb Cost? Mongo, CouchDb, Couchbase CE Ops/DevOps proficient? Mongo, CouchDb Ops/DevOps deficient? CosmosDb, Couchbase, MongoDb Atlas Hobby? Mongo, CouchDb, Couchbase Resume? Go with the popular one. Speed? All pretty good Transactions? SQL Security? Be careful with Mongo, CouchDb, not because they are insecure, but because they are insecure by default Integrations? This is huge. Databases generally don't just sit behind one app. They need to integrate with other software. Maybe stick to relational, unless a connector exists.
  • #47 Point out Don Chamberlin consults with Couchbase now
  • #48 Multiple services can be placed on the same node (required for Community Edition) Single dev node, up to lots and lots of nodes (definitely clusters of 70-80 nodes, I’ve heard rumors of over 200 nodes)
  • #49 SMP: symmetric multiprocessing, scale up vertically MPP: massively parallel processing XDCR and MDS = Enterprise only Note: Always-On is EXPENSIVE ODBC = Only for integrations with apps that require flat data
  • #51 Date/times are sometimes also stored as Unix timestamps instead of ISO8601
  • #52 Also note IS VALUED and IS NOT VALUED
  • #54 Note similarities!!!
  • #60 Note camel casing Note use of JsonProperty to give more precise names
  • #63 Key doesn’t need to be stored in its complete form. To save space, you just need enough data to recreate the key in your query. Storing the whole key is usually simpler.
  • #64 You can create custom DocumentFilter attributes to filter in other ways They can also be applied programmatically instead of declaratively
  • #66 Non-first normal form query language
  • #67 Note how incredibly similar to an ANSI SQL query this is Difference from T-SQL is primarily LIMIT instead of TOP
  • #69 Note the ON KEYS clause, which identifies what document keys on the left to select This clause can be built using an expression from attributes in the route, if needed
  • #73 Second from statement is equivalent to .SelectMany() functional syntax
  • #77 Filtering on the “type” attribute makes an index more like a SQL table index
  • #78 This index is a case insensitive version of the previous index
  • #79 Using STR_TO_MILLIS accounts for time zone offsets in ISO 8601, which simple string comparisons do not
  • #80 This index allows you to find routes that contain a specific flight number, even though the flight number is nested within an array