MySQL Query Optimization


      2010.07.09 Cai Baohua
Agenda
• What the query optimizer is
• The principles of the optimization
• Explain and Profiling
• Use Index
• JOIN Optimization
• ORDER BY, GROUP BY Optimization
MySQL Query Optimizer
MySQL Query Optimizer

[Diagram: the Parser, followed by the Optimizer, the Table Modification Module, the Table Maintenance Module and other modules, followed by the Access Control Module]
MySQL Query Optimizer


• Not purely a CBO: it combines CBO + RBO
 • CBO: Cost-Based Optimizer
 • RBO: Rule-Based Optimizer
The Principles of the Optimization
•   Optimize the queries that need it most
•   Identify the performance bottleneck
•   Set clear optimization goals
•   Start with EXPLAIN, and use profiling often
•   Always let the small result set drive the large result set
•   Do the sorting in the index as far as possible
•   Fetch only the columns that you need
•   Filter with only the most effective conditions
•   Avoid complex joins and subqueries as far as possible
Explain and Profiling
Use Explain and Profiling
Explain tells you:
•   In which order the tables are read
•   What types of read operations are made
•   Which indexes could have been used
•   Which indexes are used
•   How the tables refer to each other
•   How many rows the optimizer estimates to retrieve
    from each table
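A minimal sketch of reading EXPLAIN output. The table orders and its index idx_customer are hypothetical and not part of the original slides; the plan you actually get depends on your data and MySQL version.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  status      TINYINT      NOT NULL,
  created_at  DATETIME     NOT NULL,
  PRIMARY KEY (id),
  KEY idx_customer (customer_id)
) ENGINE=InnoDB;

-- Prefix a SELECT with EXPLAIN to see the chosen plan instead of the rows.
EXPLAIN
SELECT id, created_at
FROM orders
WHERE customer_id = 42;

-- Columns worth reading first in the output:
--   table          which table this row of the plan is about (rows appear in read order)
--   type           the join (access) type, for example ref or ALL
--   possible_keys  indexes that could have been used
--   key            the index that is actually used
--   ref            what the index is compared against
--   rows           estimated number of rows to examine
--   Extra          additional notes, for example "Using filesort"
```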
Use Explain and Profiling
Explain Types
Different join types.

system                  The table has only one row                      good
const                   At the most one matching row, treated as a
                        constant
eq_ref                  One row per row from previous tables
ref                     Several rows with matching index value
ref_or_null             Like ref, plus NULL values
index_merge             Several index searches are merged
unique_subquery         Same as ref for some subqueries
index_subquery          As above for non-unique indexes
range                   A range index scan
index                   The whole index is scanned
ALL                     A full table scan                               bad
Use Explain and Profiling
Explain Extra
This column contains additional information about how MySQL resolves the query.


Using index                      The result is created straight from the index

Using where                      Not all rows are used in the result

Distinct                         Only a single row is read per row combination
Not exists                       A LEFT JOIN missing rows optimization is used

Using filesort                   An extra row sorting step is done

Using temporary                  A temporary table is used

Range checked                    The read type is optimized individually for each
for each record                  combination of rows from the previous tables
Use Explain and Profiling


• Turn the query profiler on / off
     mysql> SET profiling = 1;    (off: SET profiling = 0;)
Use Explain and Profiling
Show profiles
Use Explain and Profiling
 SHOW PROFILE
• ALL - displays all information
• BLOCK IO - displays counts for block input and output operations
• CONTEXT SWITCHES - displays counts for voluntary and involuntary context switches
• IPC - displays counts for messages sent and received
• MEMORY - is not currently implemented
• PAGE FAULTS - displays counts for major and minor page faults
• SOURCE - displays the names of functions from the source code, together with the name and line
  number of the file in which the function occurs
• SWAPS - displays swap count
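A short sketch of a complete profiling session. The measured query runs against information_schema only so that the example needs no tables of its own; the Query_ID value 1 is an assumption and should be replaced with the ID that SHOW PROFILES reports. In newer MySQL releases these statements are deprecated in favour of the Performance Schema.

```sql
-- Enable profiling for the current session.
SET profiling = 1;

-- Run the statement to be measured.
SELECT COUNT(*) FROM information_schema.tables;

-- List the recently profiled statements with their Query_ID and duration.
SHOW PROFILES;

-- Break one statement down by execution stage, adding CPU and block I/O
-- columns. Replace 1 with the Query_ID reported by SHOW PROFILES.
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;

-- Turn profiling off again.
SET profiling = 0;
```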
Use Explain and Profiling
Show more info
Use Index
Index Types
•   B-Tree (balanced tree)
    •   Primary Key
    •   Secondary Index
    •   The usual index type in InnoDB and MyISAM
•   Hash
    •   Memory, NDB Cluster
    •   Works for “=”, “IN”, “<=>”; not for >, <, BETWEEN, !=, LIKE
    •   Does not work for ORDER BY
•   Fulltext
    •   CHAR, VARCHAR and TEXT columns
    •   Use it instead of LIKE ‘%*****%’; it is more efficient
•   R-Tree
    •   Solves the problem of spatial data retrieval
    •   Only for GEOMETRY data types
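The four index types could be declared roughly as follows. All table and column names here are hypothetical. MyISAM is used for the FULLTEXT and SPATIAL examples because that is the engine supporting them in MySQL 5.1, the version the slides refer to.

```sql
-- B-Tree indexes (the default) plus a FULLTEXT index.
CREATE TABLE article (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  author VARCHAR(64)  NOT NULL,
  title  VARCHAR(200) NOT NULL,
  body   TEXT         NOT NULL,
  PRIMARY KEY (id),                     -- B-Tree primary key
  KEY idx_author (author),              -- B-Tree secondary index
  FULLTEXT KEY ft_title_body (title, body)
) ENGINE=MyISAM;

-- Full-text search instead of LIKE '%optimizer%'.
SELECT id, title
FROM article
WHERE MATCH(title, body) AGAINST('optimizer');

-- Hash index on a MEMORY table: useful for equality lookups only.
CREATE TABLE session_cache (
  token   CHAR(32)     NOT NULL,
  user_id INT UNSIGNED NOT NULL,
  KEY idx_token USING HASH (token)
) ENGINE=MEMORY;

SELECT user_id FROM session_cache WHERE token = 'abc123';

-- R-Tree (SPATIAL) index on a geometry column.
CREATE TABLE place (
  id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
  location GEOMETRY     NOT NULL,
  PRIMARY KEY (id),
  SPATIAL KEY sp_location (location)
) ENGINE=MyISAM;
```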
Pros and Cons of Index
•   Pros
    •   Improve the efficiency of data retrieval
    •   Reduce the cost of database I/O
    •   Reduce the cost of data sorting
•   Cons
    •   Indexes take extra disk space
    •   Indexes slow down table writes (INSERT, UPDATE,
        DELETE)
When to Use an Index?
•   Column used frequently in WHERE: use an
    index
•   Column like status or type: no index
    •   Each value matches too many records, which
        causes too much random I/O and too much
        repeated I/O
•   Column updated too often: no index
•   Column not used in WHERE: no index
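One way to check whether a column is worth indexing is to measure its selectivity, that is, the share of distinct values. The sketch below assumes a hypothetical orders table; a ratio close to 1 suggests a good index candidate, while a ratio close to 0 (as is typical for a status or type column) suggests an index would help little.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders_sel (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  status      TINYINT      NOT NULL,
  PRIMARY KEY (id)
);

-- Selectivity = distinct values / total rows.
-- On an empty table the divisions yield NULL.
SELECT COUNT(*)                               AS total_rows,
       COUNT(DISTINCT status)                 AS distinct_status,
       COUNT(DISTINCT status) / COUNT(*)      AS status_selectivity,
       COUNT(DISTINCT customer_id) / COUNT(*) AS customer_selectivity
FROM orders_sel;
```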
1 or N-Column Index
•   No absolute conclusion
•   When one filter column can filter out more than
    90% of the data and the other filter columns are updated
    often, we can try to use a composite index
•   Reduce the cost of index updates and the disk
    space used by indexes
•   Let one index serve several different queries
•   Don’t over-index
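A sketch of how one composite index can serve several queries through its leftmost prefix. The table and index names are hypothetical, and the comments describe what is typically seen; the optimizer's actual choice depends on data and version.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders_ci (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  created_at  DATETIME     NOT NULL,
  PRIMARY KEY (id)
);

-- One composite index instead of two single-column indexes.
CREATE INDEX idx_customer_created ON orders_ci (customer_id, created_at);

-- Uses the leftmost column of the index.
EXPLAIN SELECT id FROM orders_ci WHERE customer_id = 42;

-- Uses both columns of the index.
EXPLAIN SELECT id FROM orders_ci
WHERE customer_id = 42 AND created_at >= '2010-07-01';

-- The leftmost column is missing, so the index typically cannot be used
-- for a direct lookup; at best it is scanned in full.
EXPLAIN SELECT id FROM orders_ci WHERE created_at >= '2010-07-01';
```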
Index Prefixes
•   Index prefixes of CHAR, VARCHAR, BINARY,
    VARBINARY, BLOB, and TEXT columns
•   name CHAR(200)
    •   Most values are unique within the first 10-20 characters
    •   CREATE INDEX part_of_name ON
        customer (name(10));
•   Faster queries and less disk I/O
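To pick a prefix length, one can compare the selectivity of several prefixes with that of the full column. This is a sketch: the customer table comes from the slide, but its column definitions and the candidate lengths 5, 10 and 20 are assumptions.

```sql
-- Assumed definition of the customer table named on the slide.
CREATE TABLE customer (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name CHAR(200)    NOT NULL,
  PRIMARY KEY (id)
);

-- Compare how well each prefix length distinguishes rows.
-- The shortest prefix whose ratio is close to full_selectivity is a
-- reasonable choice.
SELECT COUNT(DISTINCT name) / COUNT(*)           AS full_selectivity,
       COUNT(DISTINCT LEFT(name, 5)) / COUNT(*)  AS prefix_5,
       COUNT(DISTINCT LEFT(name, 10)) / COUNT(*) AS prefix_10,
       COUNT(DISTINCT LEFT(name, 20)) / COUNT(*) AS prefix_20
FROM customer;

-- Index only the first 10 characters of name.
CREATE INDEX part_of_name ON customer (name(10));
```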
Limitations of MySQL Indexes
•   MyISAM: the total length of an index key is <= 1000 bytes
•   BLOB and TEXT columns can only have prefix indexes
•   MySQL does not support function-based indexes
•   “!=” or “<>” won’t use an index
•   abs(column) etc. won’t use an index
•   Join (a.city = b.city): if the types of the join columns are not the
    same, MySQL won’t use an index
•   LIKE ‘%abc’ won’t use an index
•   A hash index can only be used with “=”, “<=>”, “IN”
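Several of these limitations can often be worked around by rewriting the condition so that the bare column is compared against constants. A sketch with a hypothetical account table:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE account (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name       VARCHAR(50)  NOT NULL,
  balance    INT          NOT NULL,
  created_at DATETIME     NOT NULL,
  PRIMARY KEY (id),
  KEY idx_name (name),
  KEY idx_balance (balance),
  KEY idx_created (created_at)
);

-- A function wrapped around the column hides it from the index ...
EXPLAIN SELECT id FROM account WHERE ABS(balance) < 100;
-- ... an equivalent range on the bare column can use idx_balance.
EXPLAIN SELECT id FROM account WHERE balance > -100 AND balance < 100;

-- Same idea for dates: avoid YEAR(created_at) = 2010.
EXPLAIN SELECT id FROM account
WHERE created_at >= '2010-01-01' AND created_at < '2011-01-01';

-- A leading wildcard cannot use idx_name for a lookup ...
EXPLAIN SELECT id FROM account WHERE name LIKE '%abc';
-- ... a pattern anchored at the start can.
EXPLAIN SELECT id FROM account WHERE name LIKE 'abc%';
```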
Join
Principle


• Nested Loop Join
Example
[Diagram: a nested loop join over three tables, built up step by step]
users_group (g), index ref scan
   Nested Loop (ref): g.group_id = m.group_id
group_message (m), index ref scan
   Nested Loop (ref): m.id = c.group_msg_id
group_message_content (c), index ref scan
Result Set Output
Ideas for optimization
• Minimize the number of Nested Loop iterations
• Give priority to optimizing the inner loop
• Index the columns used in the join condition
 • ... FROM A, B WHERE B.group_id =
     A.group_id
• Tune the join buffer size; the join buffer is used when the
  join type is ALL, index, range, or index_merge
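A sketch of the three-table join from the example slide. The table names and join columns come from the slides; every other column, the index names, the WHERE condition and the buffer size are assumptions made for illustration.

```sql
-- Assumed schemas; only the table names and join columns are from the slides.
CREATE TABLE users_group (
  user_id  INT UNSIGNED NOT NULL,
  group_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (user_id, group_id),
  KEY idx_group (group_id)
);

CREATE TABLE group_message (
  id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
  group_id INT UNSIGNED NOT NULL,
  subject  VARCHAR(100) NOT NULL,
  PRIMARY KEY (id),
  KEY idx_group (group_id)          -- index on the join column
);

CREATE TABLE group_message_content (
  group_msg_id INT UNSIGNED NOT NULL,
  content      TEXT         NOT NULL,
  KEY idx_msg (group_msg_id)        -- index on the join column
);

-- The filter on g keeps the driving result set small; each inner table
-- is then reached through an index on its join column.
EXPLAIN
SELECT m.subject, c.content
FROM users_group AS g
JOIN group_message AS m         ON g.group_id = m.group_id
JOIN group_message_content AS c ON m.id = c.group_msg_id
WHERE g.user_id = 1;

-- When a join column cannot be indexed, the join buffer matters.
SHOW VARIABLES LIKE 'join_buffer_size';
SET SESSION join_buffer_size = 4194304;   -- 4 MB, an arbitrary example value
```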
Order By, Group By
How to Satisfy ORDER BY


• Use an index, without doing any extra sorting
• Use the filesort algorithm
Use Index
Query                                     Index

SELECT col1, col2 FROM a                  (sort)
ORDER BY [sort]

SELECT col1, col2 FROM a                  (colX, sort)
WHERE colX=value ORDER BY [sort]

SELECT * FROM a                           (uid, x, y)
WHERE uid=1 ORDER BY x, y

SELECT * FROM a                           won’t use index
ORDER BY YEAR(date)

......                                    ......
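The rows of this table can be checked with EXPLAIN. In this sketch the table a and the columns uid, x, y and date come from the slide, while the column types and the index name are assumptions. The comments describe what is typically seen in the Extra column, not guaranteed output.

```sql
-- Assumed definition of table a.
CREATE TABLE a (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  uid    INT UNSIGNED NOT NULL,
  x      INT          NOT NULL,
  y      INT          NOT NULL,
  `date` DATE         NOT NULL,
  PRIMARY KEY (id),
  KEY idx_uid_x_y (uid, x, y)
);

-- uid is fixed by the WHERE clause, so rows come out of the index
-- already ordered by (x, y): typically no "Using filesort".
EXPLAIN SELECT * FROM a WHERE uid = 1 ORDER BY x, y;

-- The sort order does not match the index order: typically "Using filesort".
EXPLAIN SELECT * FROM a WHERE uid = 1 ORDER BY y, x;

-- Sorting on an expression cannot use an index: "Using filesort".
EXPLAIN SELECT * FROM a ORDER BY YEAR(`date`);
```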
Use Filesort

• Increase max_length_for_sort_data
• Remove unnecessary columns from the
  select list
• Increase sort_buffer_size
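A sketch of inspecting and adjusting these settings for one session. The values are arbitrary examples, not recommendations, and on recent MySQL 8.0 releases max_length_for_sort_data is deprecated, so treat that part as version-dependent.

```sql
-- Current values.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'max_length_for_sort_data';

-- Raise them for the current session only (example values).
SET SESSION sort_buffer_size = 4194304;          -- 4 MB
SET SESSION max_length_for_sort_data = 4096;     -- bytes

-- A growing Sort_merge_passes counter suggests that sorts do not fit
-- in the sort buffer.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```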
How to Satisfy GROUP BY


• Loose Index Scan
• Tight Index Scan
Loose Index Scan
Conditions                             Example

The query is over a single table       SELECT c1, c2 FROM t1
                                       GROUP BY c1, c2;

The GROUP BY names only columns        index on (c1,c2,c3)
that form a leftmost prefix of the       •    GROUP BY c1, c2   (applicable)
index, and no other columns              •    GROUP BY c2, c3   (not applicable)

The only aggregate functions used      SELECT c1, MIN(c2) FROM
are MAX and MIN                        t1 GROUP BY c1;
Loose Index Scan
Conditions                             Example

Any parts of the index other than      SELECT c1, c2 FROM t1
those from the GROUP BY that are       WHERE c1 < const
referenced in the query must be        GROUP BY c1, c2;
constants
                                       SELECT MAX(c3), MIN(c3), c1, c2 FROM t1
                                       WHERE c2 > const
                                       GROUP BY c1, c2;

A prefix index cannot be used for      col VARCHAR(20),
loose index scan                       INDEX (col(10))
Loose Index Scan


If loose index scan is applicable to a query,
the EXPLAIN output shows Using index for
group-by in the Extra column.
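A sketch for checking this with EXPLAIN. The table t1 and the index on (c1,c2,c3) follow the slides; the column types and the index name idx are assumptions. The optimizer decides by cost, so on a small or empty table the Extra column may differ from what the comments suggest.

```sql
-- Assumed definition of t1 with the index from the slides.
CREATE TABLE t1 (
  c1 INT NOT NULL,
  c2 INT NOT NULL,
  c3 INT NOT NULL,
  c4 INT NOT NULL,
  KEY idx (c1, c2, c3)
);

-- Candidates for loose index scan: look for
-- "Using index for group-by" in the Extra column.
EXPLAIN SELECT c1, c2 FROM t1 GROUP BY c1, c2;
EXPLAIN SELECT c1, MIN(c2) FROM t1 GROUP BY c1;

-- Not a candidate: (c2, c3) is not a leftmost prefix of the index.
EXPLAIN SELECT c2, c3 FROM t1 GROUP BY c2, c3;
```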
Tight Index Scan
•   MySQL Query Optimizer
    •   If the conditions for loose index scan are not met,
        it tries tight index scan
•   How tight differs from loose
    •   MySQL first finds all index keys that satisfy the
        WHERE conditions, then does the grouping
        operation
Tight Index Scan
  idx(c1,c2,c3) on table t1(c1,c2,c3,c4)
• A gap in the GROUP BY, covered by the condition c2 = 'a'
   • SELECT c1, c2, c3 FROM t1 WHERE c2 =
     'a' GROUP BY c1, c3;
• The GROUP BY does not start with the first part of the key,
  but a condition supplies a constant for that part
   • SELECT c1, c2, c3 FROM t1 WHERE c1 =
     'a' GROUP BY c2, c3;
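Both queries can be run through EXPLAIN to see that the index is still used for grouping. In this sketch the column types are assumptions (CHAR columns, to match the string constant 'a' on the slide). One would typically expect the Extra column to mention the index but not "Using index for group-by", though the exact output depends on data and version.

```sql
-- Assumed definition of t1 with string columns.
CREATE TABLE t1_tight (
  c1 CHAR(1) NOT NULL,
  c2 CHAR(1) NOT NULL,
  c3 CHAR(1) NOT NULL,
  c4 CHAR(1) NOT NULL,
  KEY idx (c1, c2, c3)
);

-- Gap in the GROUP BY (c2 is skipped), covered by c2 = 'a'.
EXPLAIN SELECT c1, c2, c3 FROM t1_tight
WHERE c2 = 'a' GROUP BY c1, c3;

-- GROUP BY starts at the second key part; c1 = 'a' supplies the first.
EXPLAIN SELECT c1, c2, c3 FROM t1_tight
WHERE c1 = 'a' GROUP BY c2, c3;
```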
More...
• Books
 • <<MySQL                        >>,
    Author:

• Web Sites
 • http://dev.mysql.com/doc/refman/5.1/en/optimization.html
 • http://www.slideshare.net/
Q and A
