Microsoft SQL Server
Filtered Indexes and Sparse Columns:
            Together, Separately
            Speaker: Don Vilen
          Chief Scientist, Buysight




                     February 2011

           Mark Ginnebaugh, User Group Leader
                  www.bayareasql.org
15 Feb 2011




Filtered Indexes and
Sparse Columns:
     Together, Separately

Don Vilen, Chief Scientist, Buysight
DVilen@buysight.com
Agenda
 ◦   Filtered Indexes
 ◦   Filtered Statistics
 ◦   Wide Tables
 ◦   Sparse Columns

 ◦ Together …
 ◦ … and Separately

 ◦ Everything is SQL Server 2008 (and later), in
   all editions
The Scenario
 ◦ 100,000 rows in the table
    99,500 rows are historical, remaining 500 rows are current
    Indicated by NULL EndDate column or IsActive bit, etc.
 ◦ All queries on current data use index
 ◦ But why index all the historical 99.5% of the table?

 ◦ 1,000 columns in a table
 ◦ BikeColor column is relevant only if ItemType is
   ‘Bicycle’
    For 0.5% of the rows; remainder are NULL
 ◦ But why index all the rows regardless of ItemType
   value?
Filtered Indexes
 ◦ Indexes only rows with values that match WHERE clause
    CREATE INDEX xyz ON table(columns, …) followed by one filter, for example:
       WHERE EndDate IS NULL
       WHERE IsActive = 1
       WHERE ItemType = 'Bicycle'
 ◦ Uses:
    Ranges of values for smaller portion of large table
       Avoid the common 80-90% of data where the index wouldn’t be helpful
    For categories of row data
       Index on Column120 and Column121 only useful when C1 = 37
    Table partitions, where index is needed only on the ‘current’ partition(s)
       Each partition will have the index structure, but only ‘current’ partitions will have any
        rows in the index
 ◦ Benefits
    Better query performance
    Reduction in storage costs
    Reduction in maintenance cost/time
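
A minimal sketch of the first scenario, assuming a hypothetical dbo.Orders table (the table, column, and index names are illustrative and not from the talk). Only rows with a NULL EndDate enter the index, and the query repeats that predicate so the optimizer can consider the filtered index:

```sql
-- Hypothetical table: most rows are historical, a few are current.
CREATE TABLE dbo.Orders
(
    OrderID    int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerID int      NOT NULL,
    StartDate  datetime NOT NULL,
    EndDate    datetime NULL      -- NULL marks a current row
);

-- Index only the current rows; historical rows never enter the index.
CREATE NONCLUSTERED INDEX IX_Orders_Current
    ON dbo.Orders (CustomerID)
    INCLUDE (StartDate)
    WHERE EndDate IS NULL;

-- The query states the same predicate as the index filter.
SELECT CustomerID, StartDate
FROM dbo.Orders
WHERE CustomerID = 42
  AND EndDate IS NULL;
```

Whether the optimizer actually picks the index still depends on data volume and statistics; checking the actual execution plan is the reliable test.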
Filtered Index – Allowed Syntax
◦ WHERE <filter_predicate> [from BOL: CREATE INDEX]
    <filter_predicate> ::= <conjunct> [ AND <conjunct> ]
    <conjunct> ::= <disjunct> | <comparison>
    <disjunct> ::= column_name IN (constant ,…)
    <comparison> ::= column_name <comparison_op> constant
  <comparison_op> ::= { IS | IS NOT | = | <> | != | > | >= | !> | < | <= | !< }



◦ No BETWEEN, no LIKE, no subquery, no variables

◦ So must be simple and deterministic
Filtered Indexes – Requirements
 ◦ A comparison is always involved, so every session must
   agree on how operations work; this requires the standard
   SET options
    ON for ANSI_NULLS, ANSI_PADDING,
     ANSI_WARNINGS, ARITHABORT,
     CONCAT_NULL_YIELDS_NULL,
     QUOTED_IDENTIFIER
    OFF for NUMERIC_ROUNDABORT
 ◦ Else:
    If not set when index is created, won’t create the index
    If not set when INSERT, UPDATE, DELETE, MERGE
     affects the data, gives error and rolls back
    If not set when the index might be used to optimize the
     query, it will not be considered
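
As a sketch, a session can set the required options explicitly before creating or modifying data covered by a filtered index. The option names come from the list above; running DBCC USEROPTIONS afterwards is just one assumed way to confirm what the session is using:

```sql
-- Options that must be ON for filtered indexes.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;

-- Option that must be OFF.
SET NUMERIC_ROUNDABORT OFF;

-- Show the SET options currently active for this session.
DBCC USEROPTIONS;
```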
Filtered Indexes – Applicability
 ◦ Non-clustered indexes only (rather obviously)
 ◦ For UNIQUE indexes, only the indexed rows
   must have unique index values
    Duplicates in the non-indexed rows are not checked, but
     be careful that an update to a qualifying column doesn’t
     cause a duplicate to occur
      CREATE UNIQUE INDEX ix1 ON xyz (c3)
         WHERE c2 = 10
    So now there is a way to create a unique index on a
     column with multiple NULL values: create the index WHERE
     ColY IS NOT NULL
 ◦ Filtered indexes do not apply to:
    XML indexes
    Full-text indexes
    Spatial indexes
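
The unique-index-with-multiple-NULLs point can be sketched as follows. The dbo.Employee table and BadgeNo column are hypothetical names chosen for illustration:

```sql
-- Hypothetical table: BadgeNo is optional but must be unique when present.
CREATE TABLE dbo.Employee
(
    EmployeeID int NOT NULL PRIMARY KEY,
    BadgeNo    int NULL
);

-- Uniqueness is enforced only for rows that pass the filter.
CREATE UNIQUE NONCLUSTERED INDEX UQ_Employee_BadgeNo
    ON dbo.Employee (BadgeNo)
    WHERE BadgeNo IS NOT NULL;

INSERT INTO dbo.Employee (EmployeeID, BadgeNo) VALUES (1, NULL);
INSERT INTO dbo.Employee (EmployeeID, BadgeNo) VALUES (2, NULL);  -- allowed: NULL rows are not indexed
INSERT INTO dbo.Employee (EmployeeID, BadgeNo) VALUES (3, 1001);

-- Uncommenting the next statement is expected to raise a duplicate-key error,
-- because 1001 already exists among the indexed (non-NULL) rows.
-- INSERT INTO dbo.Employee (EmployeeID, BadgeNo) VALUES (4, 1001);
```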
Filtered Indexes – Getting Them Used 1
  ◦ QO can only use the index when it knows the index will
    match the conditions in the query’s WHERE clause
  ◦ Assume Column120 and Column121 useful only when
    C1 = 37
     So CREATE INDEX i1 on dbo.t1 (Column120, Column121)
        WHERE C1 = 37
     SELECT Column121
        FROM dbo.t1
        WHERE Column120 = 13
      Cannot use the index even if Column120 and Column121 only
      appear for C1 = 37
       As far as the QO knows, there may be other Column120 or Column121
        values that are not in the index
  ◦ Help the QO by adding more limiting predicates to
    WHERE clause
     Make it WHERE Column120 = 13 AND C1 = 37
Filtered Indexes – Getting Them Used 2
  ◦ WHERE with a variable rather than a literal
  ◦ Assume index is on WHERE IsActive > 0
      DECLARE @IsActive int; SET @IsActive = 1;
      SELECT xyz FROM table WHERE IsActive = @IsActive
  ◦ QO doesn’t know value of variable, so doesn’t
    know if index fits
    So shouldn’t use variables as if they were constants
  ◦ Again, help the QO by adding more limiting
    predicates to WHERE clause
    Make it WHERE IsActive = @IsActive AND IsActive > 0

             But perhaps that doesn’t really make sense here
Filtered Indexes – Getting Them Used 3
    ◦ WHERE with a function or conversion on the filter
      predicate
      Obvious: WHERE ABS(C1) = 37
         Cannot use index on WHERE C1 = 37
         Could change it to WHERE C1 = ABS(37) if same meaning .. but not in
          this case
      Implicit conversions:
         Assume index is WHERE c3 > 100
         DECLARE @varR real; SET @varR = 1000.5;
         SELECT * FROM tv2 WHERE c3 = @varR
           Requires conversion of c3 to real before comparison, so can’t use
             index
         SELECT * FROM tv2 WHERE c3 = cast(@varR as int)
           At least it requires no conversion of c3, but is unknown value at
             optimization time, so can’t use index
         So add a limiting predicate … assuming you know it will always be
          right
         SELECT * FROM tv2 WHERE c3 = cast(@varR as int) AND c3 > 100
A Mis-Application of Filtered Indexes
   ◦ Create a filtered index on c and b with
     WHERE on c

   ◦ Attempt to use the index as a validation table

   ◦ In code, use the index in a hint and expect to
     get no row back for a b where c is a match,
     but it gets an error instead, because the hint
     prevents a plan from being created
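
A sketch of how this can go wrong, using a hypothetical dbo.Codes table (names and values are illustrative, not from the talk). The hint forces the filtered index, but the query's predicate does not guarantee that every qualifying row is inside the filter, so the optimizer has no valid plan that honors the hint:

```sql
-- Hypothetical validation table.
CREATE TABLE dbo.Codes
(
    b int NOT NULL,
    c int NOT NULL
);

-- Filtered index on c and b, with the filter on c.
CREATE NONCLUSTERED INDEX IX_Codes_c_b
    ON dbo.Codes (c, b)
    WHERE c = 1;

-- The WHERE clause says nothing about c, so rows outside the filter could
-- qualify. With the index forced by the hint, this statement is expected to
-- fail with a query-processor error about hints rather than return zero rows.
SELECT b
FROM dbo.Codes WITH (INDEX (IX_Codes_c_b))
WHERE b = 5;
```

Dropping the hint, or adding AND c = 1 to the WHERE clause, lets a plan be produced.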
Filtered Indexes – And Views
 ◦ Cannot create a Filtered index on a view, not
   even a non-clustered index on an indexed view
   But a filtered index can be chosen by the QO for the
    query formed from a view .. or function
Filtered Indexes – Considerations 1
 ◦ Storage size differences
    Fewer index rows take less space
    Less IO, more information fits in memory
    4,000 pages vs. 1 page
 ◦ Limits auto-parameterization
    QO will not auto-parameterize if predicate is used in a
     filtered index (“in most cases”, per BOL)
    Otherwise would inhibit use of filtered index
    So can affect plan reuse
 ◦ Index maintenance – same rebuild and reorganize
   as regular index
    But hopefully much less work to do
Filtered Indexes – Considerations 2
 ◦ Covering index
    Consider INCLUDEing other columns so more
     likely to be selected by QO
 ◦ DTA can suggest a filtered index
    ColX IS NOT NULL – only of this form
    But the missing-indexes functionality does not flag
     them as missing
 ◦ When not to use:
    When non-filtered index already exists, or another
     access path is likely better or adequate
      Avoid the extra index maintenance
Filtered Statistics
  ◦ CREATE STATISTICS stats1 ON table (cols)
       WHERE <condition>
  ◦ Uses:
     Can create filtered statistics on skewed data to assist QO
     Filtered Statistics will likely be more precise because they cover only the
      data in the filtered subset (or filtered index)
     Table partitions, where statistics are needed only on ‘current’ partition(s)
  ◦ Cannot reference a computed column, a UDT column, a spatial
    data type column, or a hierarchyID data type column

  ◦ AutoCreateStats will create statistics on Filtered Index key
    columns
  ◦ AutoCreateStats will not create filtered statistics on other
    columns
     You have to create them yourself
  ◦ AutoUpdateStats will keep them updated once they are created
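
A minimal sketch of creating filtered statistics by hand, assuming a hypothetical dbo.Tasks table where only a small share of rows is active. WITH FULLSCAN is an optional choice made here for illustration:

```sql
-- Hypothetical table with a skewed IsActive flag.
CREATE TABLE dbo.Tasks
(
    TaskID   int NOT NULL PRIMARY KEY,
    OwnerID  int NOT NULL,
    IsActive bit NOT NULL
);

-- Statistics on OwnerID computed only over the active rows.
CREATE STATISTICS st_Tasks_Active_OwnerID
    ON dbo.Tasks (OwnerID)
    WHERE IsActive = 1
    WITH FULLSCAN;

-- Inspect the header, density vector, and histogram of the filtered statistics.
DBCC SHOW_STATISTICS ('dbo.Tasks', st_Tasks_Active_OwnerID);
```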
Metadata for Indexes, Statistics
 ◦ sys.indexes
    has_filter, filter_definition
 ◦ sys.stats
    has_filter, filter_definition


 ◦ SSMS
    Indexes and Statistics Properties have a Filter tab
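
The catalog columns above can be queried directly. A sketch that lists every filtered index and every filtered statistics object in the current database:

```sql
-- Filtered indexes and their filter text.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.filter_definition
FROM sys.indexes AS i
WHERE i.has_filter = 1;

-- Filtered statistics and their filter text.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       s.name                   AS stats_name,
       s.filter_definition
FROM sys.stats AS s
WHERE s.has_filter = 1;
```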
Questions on Filtered Indexes, Statistics
   Any questions?

   Now we’ll move on to Wide Tables,
    Sparse Columns
Wide Tables
 ◦ Up to 30,000 Columns
   Great for SharePoint-like “a row is an object, some
    attributes depend on other attributes”
 ◦ Some limits:
     Columns per non-wide table: 1,024
     Columns per wide table: 30,000
     Columns per SELECT statement: 4,096
     Columns per INSERT statement: 4,096
     Indexes per table: 1,000
     Statistics per table: 30,000
       BOL: Maximum Capacity Specifications for SQL Server
Wide Table
◦ A wide table is a table that has a column set defined, using sparse
  columns
  New row structure for sparse columns
    {column, value}, {column, value} …
  Can create flexible schemas within an application
  Can add or drop columns whenever you want without
   having to touch each row
◦ The maximum size of a wide table row is 8,018
  bytes, so most of the data in a row has to be NULL
  Or has to be varchar-type columns so it can overflow to
   another page
◦ Limit is still 1,024 for number of non-sparse
  columns plus computed columns, even in a wide
  table
Wide Tables – Performance Impact
 ◦ Performance considerations:
    Increased run-time and compile-time memory
     requirements
    Wide tables can have up to 30,000 columns defined;
     this can increase compile time
    There can be up to 1,000 indexes on a wide table,
     which increases the index maintenance time
      Nonclustered indexes should be filtered indexes to
       minimize their impact

      For more information, see BOL: Performance Considerations
       for Wide Tables
Sparse Columns
◦ CREATE TABLE … (…, c1 int SPARSE NULL,
  …)
◦ New row format for sparse columns

◦ Column:
    Must be NULLable
    Cannot be part of a clustered index
    Cannot be part of a primary key index
    Cannot have a DEFAULT
    Cannot be a computed column
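
A short sketch of the second scenario with sparse columns, assuming a hypothetical dbo.Items table (all names are illustrative). The sparse columns are nullable, have no DEFAULT, and are not part of the primary key:

```sql
-- BikeColor and FrameSize apply only to bicycles, so they are declared SPARSE.
CREATE TABLE dbo.Items
(
    ItemID    int         NOT NULL PRIMARY KEY,
    ItemType  varchar(20) NOT NULL,
    BikeColor varchar(20) SPARSE NULL,
    FrameSize int         SPARSE NULL
);

-- Rows that omit the sparse columns store NULL for them.
INSERT INTO dbo.Items (ItemID, ItemType)
VALUES (1, 'Helmet');

INSERT INTO dbo.Items (ItemID, ItemType, BikeColor, FrameSize)
VALUES (2, 'Bicycle', 'Red', 54);

-- Sparse columns are queried by name like any other column.
SELECT ItemID, ItemType, BikeColor, FrameSize
FROM dbo.Items;
```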
Sparse Columns – Some More Cannots
  ◦ Some types cannot be sparse:
    geography   • ntext    • User-defined data types
    geometry    • text
    image       • timestamp


  ◦ Some attributes cannot be on sparse columns
    No Filestream
    Not Identity
    Not RowGuidCol
Sparse Columns – Types and Size
 ◦ Size impact
   An important consideration but not the only one


 ◦ At what percentage of NULLs does a sparse
   column take less space than a non-sparse
   column?
          Non-Sparse     Sparse           Null Estimate
   BIT    1/8th byte     4 1/8th bytes    -> 98%
   BIGINT 8 bytes        12 bytes         -> 52%


     See BOL: Using Sparse Columns for a complete table of types
Column Sets
◦ How do you know which columns ‘exist’ for a row?
◦ You could just SELECT them; those that don’t exist are NULL
◦ Can define a “Column set”
   Optional, only one per table
◦ Include a column:
   MyColSet      XML      COLUMN_SET FOR ALL_SPARSE_COLUMNS
◦ Selecting from MyColSet returns an XML description of the sparse
  columns in that row
   <c25>ABC</c25><c34>599</c34>
◦ Can INSERT / UPDATE sparse columns by
   Referring to them by name as usual, or
   Specifying the XML for the Column_Set column

     See BOL: Using Column Sets for more details
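
A sketch of a column set in use, assuming a hypothetical dbo.Products table; the c25 and c34 column names echo the XML fragment above, and their data types are guesses made for illustration:

```sql
CREATE TABLE dbo.Products
(
    ProductID int         NOT NULL PRIMARY KEY,
    c25       varchar(10) SPARSE NULL,
    c34       int         SPARSE NULL,
    MyColSet  xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
);

-- Insert by naming the sparse columns as usual ...
INSERT INTO dbo.Products (ProductID, c25, c34)
VALUES (1, 'ABC', 599);

-- ... or by supplying XML for the column set (c34 stays NULL here).
INSERT INTO dbo.Products (ProductID, MyColSet)
VALUES (2, '<c25>XYZ</c25>');

-- The column set returns only the sparse columns that have a value in each row.
SELECT ProductID, MyColSet
FROM dbo.Products;

-- Individual sparse columns remain addressable by name.
SELECT ProductID, c25, c34
FROM dbo.Products;
```

A single INSERT or UPDATE should use one style or the other, not both at once.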
Feature / Technology Support
 ◦ Sparse columns and column sets are not fully
   supported by some SQL Server technologies

 ◦ Sparse Columns not supported by:
   Merge Replication

 ◦ Column Sets not supported by:
   Replication, Distributed Query, Change Data
    Capture

     See BOL: Using Column Sets for more details
Meta Data for Sparse Columns
 ◦ sys.columns – is_sparse, is_column_set
   And in:
       sys.system_columns
       sys.all_columns
       sys.computed_columns
       sys.identity_columns


 ◦ Do not confuse with sparse files as used for
   Database Snapshots
   The is_sparse in sys.database_files, sys.master_files
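
A sketch of reading this metadata: the query lists every sparse column and column set in the current database.

```sql
SELECT OBJECT_NAME(c.object_id) AS table_name,
       c.name                   AS column_name,
       c.is_sparse,
       c.is_column_set
FROM sys.columns AS c
WHERE c.is_sparse = 1
   OR c.is_column_set = 1
ORDER BY table_name, c.column_id;
```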
Together
 ◦ Sparse Columns together with Filtered Index
 ◦ On Sparse column, filtered index with
           xx IS NOT NULL
   avoids indexing all the rows with no value

 ◦ Makes a lot of sense, and likely the driving
   force behind filtered indexes
 ◦ But not needed on every sparse column
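
Putting the two features together might look like the sketch below. The dbo.CatalogItems table and its names are hypothetical; the index holds only the rows where the sparse column actually has a value:

```sql
CREATE TABLE dbo.CatalogItems
(
    ItemID    int         NOT NULL PRIMARY KEY,
    ItemType  varchar(20) NOT NULL,
    BikeColor varchar(20) SPARSE NULL
);

-- Index the sparse column, skipping all the rows with no value.
CREATE NONCLUSTERED INDEX IX_CatalogItems_BikeColor
    ON dbo.CatalogItems (BikeColor)
    WHERE BikeColor IS NOT NULL;

-- An equality test on BikeColor can only match non-NULL rows, so this query
-- is a candidate for the filtered index.
SELECT ItemID
FROM dbo.CatalogItems
WHERE BikeColor = 'Red';
```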
Separately
 ◦ Filtered Index without Sparse Column
   Filtered indexes on skewed data
   Filtered statistics on skewed data


 ◦ Sparse Column without Filtered Index
   Sparse columns on sparse data, perhaps no index to
    go with it
Summary
 ◦   Filtered Indexes
 ◦   Filtered Statistics
 ◦   Wide Tables
 ◦   Sparse Columns

 ◦ Together …
 ◦ … and Separately

 ◦ Don Vilen
      Chief Scientist, Buysight
      DVilen@buysight.com
To learn more or inquire about speaking opportunities, please contact:

Mark Ginnebaugh, User Group Leader mark@designmind.com
