SQL
START!
2022
Data Integrity with
SQL Database Ledger
Gianluca Hotz
ghotz@ugiss.org | @glhotz
Who am I?
• Gianluca Hotz | @glhotz | ghotz@ugiss.org
• Independent Consultant
• 25+ years on SQL Server (from 4.21 back in 1996)
• Database modeling & development, sizing & administration, modernization
(upgrades & migrations), performance tuning, security
• Community
• 24 years Microsoft MVP SQL Server/Data Platform (from 1998)
• VMware Experts SQL Server
• Founder and president UGISS (ex «PASS Chapter»)
Partners
Ledger technologies enable digital trust
Companies are moving from
intermediaries and manual auditing that
are slow and costly…
…to ledger technologies that reduce
cost, saves time, and lowers risk
Blockchain market growth predictions are
growing
https://www.forbes.com/sites/robertanzalone/2020/05/13/will-enterprise-
blockchain-survive-a-new-report-says-that-the-blockchain-technology-market-
will-reach-21-billion-by-2025/#7a5f793954b8
Gartner Predicts 2019: Blockchain Technologies
Transaction
Node(s)
Member 2
Validator
Node(s)
Transaction
Node(s)
Member 3
Validator
Node(s)
Transaction
Node(s)
Member 1
Validator
Node(s)
No data integrity
off-chain
Blockchains overkill for centralized scenarios
• Decentralization requires all parties to host
nodes on the network to participate in
consensus
• Governance rules must be established by the
consortium and deployed/managed
• Latency associated with network consensus
can impact transaction throughput (<1000 TPS
for Ethereum)
• Off-chain storage patterns for querying data
are a typical pattern, but data integrity is lost
in the process
• Bespoke development with immature tooling
makes development and management
challenging
Makes data in SQL
tamper-evident
through cryptography
Provides a historical
record of all changes,
verified through
cryptographic proofs
The same SQL you
already know across
Azure and on-premises
Azure SQL Database Ledger
Append-Only
Ledger Table
Trusted
Storage
Database Ledger
Block N Block N-1 Block N-2
Updatable
Ledger Table
History Table
Ledger View
Ledger Tables
• Updatable allow insert/update/delete
• History of updated/deleted rows preserved in
history table and easy-to-query Ledger View
• Integrity of updatable/history tables maintained
through cryptographic links of the Database Ledger
• System can periodically upload digital receipts to a
customer-configured trusted storage service
• Customer can use digital receipts to verify the
integrity of the data
• Append-Only allow only insert
• no need for a history table
Database Ledger
• Incrementally capture database state
• At logical level
• blockchain and Merkle Tree data structures
• Captures also transaction metadata
• e.g., timestamp, user
• Blocks and transaction information in
system tables
• sys.database_ledger_transactions
• sys.database_ledger_blocks
• Block are closed
• every 30 seconds
• manually executing
sys.sp_generate_database_ledger_digest
Database Digest
• Hash of last block in the Ledger
• Represents state of all Ledger tables
• Must be kept in a reliable and immutable storage
• To prevent information tampering
• Can be generated manually or automatically
• Automatically generated, can also be saved automatically
• Immutable Blob Storage
• Azure Confidential Ledger (ACL)
Reliable (trusted) Storage
«Immutable Blob Storage»
• Storage “Write Once, Read Many”
based on policies
• BLOBs can be set as read-only for
specified range
• Data blocked only functionally based
on policy
• Support for audit logging but log
creator must be trusted
• Microsoft is the TCB
• Trusted Computing Base
«Azure Confidential Ledger» (ACL)
• Storage "Write Once, Read Many"
permanently
• BLOBs written in the Ledger cannot be
edited
• Uses tamper-proof “Confidential
Enclaves”
• Create serialized Ledger files and
transaction receipts that can be
verified by customers
• Microsoft is outside the TCB
• source code is open source (Confidential
Consortium Framework)
Ledger verification
• Tampering possible even if changes forbidden, especially on-premises
• e.g. direct modification of data files, DBCC WRITEPAGE, SQL Server process
"hijacking", etc.
• Verify recomputes all hashes and compares them with digest
• When to verify
• when necessary (e.g. suspected tampering, formal audit, litigation)
• on a recurring basis (e.g. daily, hourly)
• Verification via system procedure depends on save mode
• automatic: passing a BLOB Storage address
• manual: passing JSON document containig the digest
Ledger auditing
• New SQL Audit events
• ENABLE LEDGER
• ALTER LEDGER
• GENERATE LEDGER DIGEST
• VERIFY LEDGER
• LEDGER_OPERATION_GROUP
Demo
Enabling SQL Database Ledger
Ledger Tables
Demo
Enabling SQL Database Ledger
Select SQL deployment option
Create SQL Database
Ledger configuration
Digest Storage
Demo
Ledger Tables
Updatable Ledger Table
CREATE SCHEMA [Account];
GO
CREATE TABLE [Account].[Balance]
(
[CustomerID] int NOT NULL PRIMARY KEY CLUSTERED
, [LastName] varchar(50) NOT NULL
, [FirstName] varchar(50) NOT NULL
, [Balance] decimal(10,2) NOT NULL
)
WITH (
SYSTEM_VERSIONING = ON --(HISTORY_TABLE = [Account].[BalanceHistory])
, LEDGER = ON --(LEDGER_VIEW = [Account].[BalanceLedgerView])
);
GO
INSERT transactions
-- First transaction
INSERT INTO [Account].[Balance]
VALUES
(1, 'Jones', 'Nick', 50);
GO
-- Second transaction
INSERT INTO [Account].[Balance]
VALUES
(2, 'Smith', 'John', 500)
, (3, 'Smith', 'Joe', 30)
, (4, 'Michaels', 'Mary', 200);
GO
Selecting data
-- By default, columns with information relating to transactions are
-- not returned (provides transparency to applications)
SELECT *
FROM [Account].[Balance];
GO
Selecting additional metadata
-- Metadata columns must be explicitly selected
SELECT *
, [ledger_start_transaction_id]
, [ledger_end_transaction_id]
, [ledger_start_sequence_number]
, [ledger_end_sequence_number]
FROM [Account].[Balance];
GO
Updating data
UPDATE[Account].[Balance]
SET[Balance] = 100
WHERE[CustomerID] = 1;
GO
Query data & ledger metadata after updates
-- We query the updateable table, the history table and the ledger view
SELECT *
,[ledger_start_transaction_id]
,[ledger_end_transaction_id]
,[ledger_start_sequence_number]
,[ledger_end_sequence_number]
FROM [Account].[Balance];
SELECT * FROM [Account].[MSSQL_LedgerHistoryFor_1525580473];
SELECT * FROM [Account].[Balance_Ledger] ORDER BY [ledger_transaction_id];
GO
Data & ledger metadata after updates
Append-only Ledger Table
CREATE SCHEMA [AccessControl];
GO
CREATE TABLE [AccessControl].[KeyCardEvents]
(
[EmployeeID] int NOT NULL PRIMARY KEY CLUSTERED
, [AccessOperationDescription] nvarchar(MAX) NOT NULL
, [Timestamp] datetime2 NOT NULL
)
WITH (
LEDGER = ON (APPEND_ONLY = ON)
);
GO
Insert data and trying to modify it…
-- Insert a row
INSERT INTO [AccessControl].[KeyCardEvents]
VALUES ('43869', 'Building42', '2020-05-02T19:58:47.1234567');
GO
-- If we try to update, it gives an error
UPDATE [AccessControl].[KeyCardEvents]
SET[EmployeeID] = 34184
WHERE[EmployeeID] = 43869;
GO
Demo
Verify database
Verify Database
General limitations
• Database level option forcing Ledger Tables cannot be disabled
• No conversion of existing tables (both ways)
• Migrate with sys.sp_copy_data_in_batches system procedure
• No dropping of tables/columns
• renamed/hidden but remain available for database verification
• Deleting older data from history tables forbidden
• Transaction can only update (!) 200 tables
• Updatable ledger tables are based on the technology of temporal
tables and inherits most of the limitations
Interoperability limitations
• In-memory tables not supported
• Partitions SWITCH IN/OUT operations not supported
• No Full-Text indexes
• Tables can’t be of type graph/filetable
• No non-clustered rowstore indexe with clustered columnstore index
• No Change Data Capture support
• Change Tracking not allowed on history tables
Other schema limitations
• Maximum number of columns always 1024 but
• updatable tables requires 4 columns
• Append-only tables requires 2 columns
• Adding nullable columns only (without WITH VALUES)
• Altering columns limited to
• NULL/NOT NULL, length of variable length columns
• Collation for Unicode strings
• XML, FILESTREAM, SqlVariant & user-defined types not supported
• Sparse Column Set not supported
Use cases for Ledger Tables
• In general: those who only need Forward Integrity
• System reliable processing transactions and protected against future tampering
• Some examples
• Streamlining audits
• Provides data integrity cryptographic proof to auditors (internal or external)
• Multiple-party business processes
• Alternative to Blockchain for intrinsically centralized systems, "trust, but verify" perspective
• Trusted off-chain storage for Blockchain
• Choosing an Azure ledger technology
• https://techcommunity.microsoft.com/t5/azure-sql/choosing-an-azure-ledger-
technology/ba-p/2450502
Resources
• Announcement blog
• https://aka.ms/sql-ledger-blog
• Azure SQL Database ledger Documentation
• https://aka.ms/sql-ledger-docs
• Whitepaper
• https://aka.ms/sql-ledger-whitepaper
Thanks!
Q&A

Data Integrity with SQL Database Ledger

  • 1.
    SQL START! 2022 Data Integrity with SQLDatabase Ledger Gianluca Hotz ghotz@ugiss.org | @glhotz
  • 2.
    Who am I? •Gianluca Hotz | @glhotz | ghotz@ugiss.org • Independent Consultant • 25+ years on SQL Server (from 4.21 back in 1996) • Database modeling & development, sizing & administration, modernization (upgrades & migrations), performance tuning, security • Community • 24 years Microsoft MVP SQL Server/Data Platform (from 1998) • VMware Experts SQL Server • Founder and president UGISS (ex «PASS Chapter»)
  • 3.
  • 4.
    Ledger technologies enabledigital trust Companies are moving from intermediaries and manual auditing that are slow and costly… …to ledger technologies that reduce cost, saves time, and lowers risk
  • 5.
    Blockchain market growthpredictions are growing https://www.forbes.com/sites/robertanzalone/2020/05/13/will-enterprise- blockchain-survive-a-new-report-says-that-the-blockchain-technology-market- will-reach-21-billion-by-2025/#7a5f793954b8 Gartner Predicts 2019: Blockchain Technologies
  • 6.
    Transaction Node(s) Member 2 Validator Node(s) Transaction Node(s) Member 3 Validator Node(s) Transaction Node(s) Member1 Validator Node(s) No data integrity off-chain Blockchains overkill for centralized scenarios • Decentralization requires all parties to host nodes on the network to participate in consensus • Governance rules must be established by the consortium and deployed/managed • Latency associated with network consensus can impact transaction throughput (<1000 TPS for Ethereum) • Off-chain storage patterns for querying data are a typical pattern, but data integrity is lost in the process • Bespoke development with immature tooling makes development and management challenging
  • 7.
    Makes data inSQL tamper-evident through cryptography Provides a historical record of all changes, verified through cryptographic proofs The same SQL you already know across Azure and on-premises Azure SQL Database Ledger
  • 8.
    Append-Only Ledger Table Trusted Storage Database Ledger BlockN Block N-1 Block N-2 Updatable Ledger Table History Table Ledger View Ledger Tables • Updatable allow insert/update/delete • History of updated/deleted rows preserved in history table and easy-to-query Ledger View • Integrity of updatable/history tables maintained through cryptographic links of the Database Ledger • System can periodically upload digital receipts to a customer-configured trusted storage service • Customer can use digital receipts to verify the integrity of the data • Append-Only allow only insert • no need for a history table
  • 9.
    Database Ledger • Incrementallycapture database state • At logical level • blockchain and Merkle Tree data structures • Captures also transaction metadata • e.g., timestamp, user • Blocks and transaction information in system tables • sys.database_ledger_transactions • sys.database_ledger_blocks • Block are closed • every 30 seconds • manually executing sys.sp_generate_database_ledger_digest
  • 10.
    Database Digest • Hashof last block in the Ledger • Represents state of all Ledger tables • Must be kept in a reliable and immutable storage • To prevent information tampering • Can be generated manually or automatically • Automatically generated, can also be saved automatically • Immutable Blob Storage • Azure Confidential Ledger (ACL)
  • 11.
    Reliable (trusted) Storage «ImmutableBlob Storage» • Storage “Write Once, Read Many” based on policies • BLOBs can be set as read-only for specified range • Data blocked only functionally based on policy • Support for audit logging but log creator must be trusted • Microsoft is the TCB • Trusted Computing Base «Azure Confidential Ledger» (ACL) • Storage "Write Once, Read Many" permanently • BLOBs written in the Ledger cannot be edited • Uses tamper-proof “Confidential Enclaves” • Create serialized Ledger files and transaction receipts that can be verified by customers • Microsoft is outside the TCB • source code is open source (Confidential Consortium Framework)
  • 12.
    Ledger verification • Tamperingpossible even if changes forbidden, especially on-premises • e.g. direct modification of data files, DBCC WRITEPAGE, SQL Server process "hijacking", etc. • Verify recomputes all hashes and compares them with digest • When to verify • when necessary (e.g. suspected tampering, formal audit, litigation) • on a recurring basis (e.g. daily, hourly) • Verification via system procedure depends on save mode • automatic: passing a BLOB Storage address • manual: passing JSON document containig the digest
  • 13.
    Ledger auditing • NewSQL Audit events • ENABLE LEDGER • ALTER LEDGER • GENERATE LEDGER DIGEST • VERIFY LEDGER • LEDGER_OPERATION_GROUP
  • 14.
    Demo Enabling SQL DatabaseLedger Ledger Tables
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
    Updatable Ledger Table CREATESCHEMA [Account]; GO CREATE TABLE [Account].[Balance] ( [CustomerID] int NOT NULL PRIMARY KEY CLUSTERED , [LastName] varchar(50) NOT NULL , [FirstName] varchar(50) NOT NULL , [Balance] decimal(10,2) NOT NULL ) WITH ( SYSTEM_VERSIONING = ON --(HISTORY_TABLE = [Account].[BalanceHistory]) , LEDGER = ON --(LEDGER_VIEW = [Account].[BalanceLedgerView]) ); GO
  • 22.
    INSERT transactions -- Firsttransaction INSERT INTO [Account].[Balance] VALUES (1, 'Jones', 'Nick', 50); GO -- Second transaction INSERT INTO [Account].[Balance] VALUES (2, 'Smith', 'John', 500) , (3, 'Smith', 'Joe', 30) , (4, 'Michaels', 'Mary', 200); GO
  • 23.
    Selecting data -- Bydefault, columns with information relating to transactions are -- not returned (provides transparency to applications) SELECT * FROM [Account].[Balance]; GO
  • 24.
    Selecting additional metadata --Metadata columns must be explicitly selected SELECT * , [ledger_start_transaction_id] , [ledger_end_transaction_id] , [ledger_start_sequence_number] , [ledger_end_sequence_number] FROM [Account].[Balance]; GO
  • 25.
  • 26.
    Query data &ledger metadata after updates -- We query the updateable table, the history table and the ledger view SELECT * ,[ledger_start_transaction_id] ,[ledger_end_transaction_id] ,[ledger_start_sequence_number] ,[ledger_end_sequence_number] FROM [Account].[Balance]; SELECT * FROM [Account].[MSSQL_LedgerHistoryFor_1525580473]; SELECT * FROM [Account].[Balance_Ledger] ORDER BY [ledger_transaction_id]; GO
  • 27.
    Data & ledgermetadata after updates
  • 28.
    Append-only Ledger Table CREATESCHEMA [AccessControl]; GO CREATE TABLE [AccessControl].[KeyCardEvents] ( [EmployeeID] int NOT NULL PRIMARY KEY CLUSTERED , [AccessOperationDescription] nvarchar(MAX) NOT NULL , [Timestamp] datetime2 NOT NULL ) WITH ( LEDGER = ON (APPEND_ONLY = ON) ); GO
  • 29.
    Insert data andtrying to modify it… -- Insert a row INSERT INTO [AccessControl].[KeyCardEvents] VALUES ('43869', 'Building42', '2020-05-02T19:58:47.1234567'); GO -- If we try to update, it gives an error UPDATE [AccessControl].[KeyCardEvents] SET[EmployeeID] = 34184 WHERE[EmployeeID] = 43869; GO
  • 30.
  • 31.
  • 32.
    General limitations • Databaselevel option forcing Ledger Tables cannot be disabled • No conversion of existing tables (both ways) • Migrate with sys.sp_copy_data_in_batches system procedure • No dropping of tables/columns • renamed/hidden but remain available for database verification • Deleting older data from history tables forbidden • Transaction can only update (!) 200 tables • Updatable ledger tables are based on the technology of temporal tables and inherits most of the limitations
  • 33.
    Interoperability limitations • In-memorytables not supported • Partitions SWITCH IN/OUT operations not supported • No Full-Text indexes • Tables can’t be of type graph/filetable • No non-clustered rowstore indexe with clustered columnstore index • No Change Data Capture support • Change Tracking not allowed on history tables
  • 34.
    Other schema limitations •Maximum number of columns always 1024 but • updatable tables requires 4 columns • Append-only tables requires 2 columns • Adding nullable columns only (without WITH VALUES) • Altering columns limited to • NULL/NOT NULL, length of variable length columns • Collation for Unicode strings • XML, FILESTREAM, SqlVariant & user-defined types not supported • Sparse Column Set not supported
  • 35.
    Use cases forLedger Tables • In general: those who only need Forward Integrity • System reliable processing transactions and protected against future tampering • Some examples • Streamlining audits • Provides data integrity cryptographic proof to auditors (internal or external) • Multiple-party business processes • Alternative to Blockchain for intrinsically centralized systems, "trust, but verify" perspective • Trusted off-chain storage for Blockchain • Choosing an Azure ledger technology • https://techcommunity.microsoft.com/t5/azure-sql/choosing-an-azure-ledger- technology/ba-p/2450502
  • 36.
    Resources • Announcement blog •https://aka.ms/sql-ledger-blog • Azure SQL Database ledger Documentation • https://aka.ms/sql-ledger-docs • Whitepaper • https://aka.ms/sql-ledger-whitepaper
  • 37.