Indexing Documents in MongoDB
Alberto Lerner
Software Engineer – 10Gen
alerner@10gen.com

Indexing Basics
MongoDB can use separate tree structures to index a collection
When processing search criteria, MongoDB will try to avoid going through the whole collection, taking advantage of existing indices

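To make the difference between the two access paths concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not MongoDB's implementation: a sorted array with binary search stands in for the tree structure, keys are assumed to be numeric, and the names buildIndex, scanFind and indexFind are hypothetical.

```javascript
// Toy collection: an array of documents in insertion order.
const collection = [
  { _id: 1, sku: 5678 },
  { _id: 2, sku: 1234 },
  { _id: 3, sku: 9012 },
];

// Build a separate structure over one numeric field:
// entries of (key, position in collection), sorted by key.
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => ({ key: doc[field], pos }))
    .sort((a, b) => a.key - b.key);
}

// Collection scan: every document is examined.
function scanFind(docs, field, value) {
  let examined = 0;
  const hits = [];
  for (const doc of docs) {
    examined++;
    if (doc[field] === value) hits.push(doc);
  }
  return { hits, examined };
}

// Index lookup: binary search for the first key >= value,
// then collect the run of equal keys.
function indexFind(docs, index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < value) lo = mid + 1;
    else hi = mid;
  }
  const hits = [];
  for (let i = lo; i < index.length && index[i].key === value; i++) {
    hits.push(docs[index[i].pos]);
  }
  return hits;
}

const skuIndex = buildIndex(collection, "sku");
```

With these definitions, indexFind(collection, skuIndex, 1234) reaches the document after a logarithmic number of comparisons, while scanFind(collection, "sku", 1234) always examines all three documents.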
Team Work
MongoDB’s job: use an index, if possible
[Diagram: search criteria answered either using an index or by scanning the collection]

Your Job
To provide indices for important queries
Important queries?
  Very frequently used
  Especially low response time required

Creating an Index
You have an automatic one over _id
Others can be created with ‘ensureIndex’
// index over attribute 'name'
db.<collection>.ensureIndex({name:1})
// compound keys, ascending/descending
db.<collection>.ensureIndex({name:1, date:-1})
// unique keys
db.<collection>.ensureIndex({sku:1}, {unique:true})
// building in background
db.<collection>.ensureIndex( …, {background:true})

Simple Search Criteria
The search criteria match the index key or a prefix thereof
db.<collection>.find({sku:1234}) // index over sku
db.<collection>.find({sku:1234}) // index over sku, <xxx>

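The prefix rule can be sketched as a small check in plain JavaScript. This is a simplified model of the rule stated on the slide, not the query planner itself (which accepts more cases); the function name isIndexPrefix and the field name date are assumptions made for illustration.

```javascript
// indexKeys: ordered field names of a (hypothetical) compound index.
// queryFields: fields that carry equality conditions in the search criteria.
// Returns true when the queried fields are exactly the first k index fields,
// in any order within the query.
function isIndexPrefix(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  if (wanted.size === 0 || wanted.size > indexKeys.length) return false;
  return indexKeys.slice(0, wanted.size).every((k) => wanted.has(k));
}

const onSku = isIndexPrefix(["sku", "date"], ["sku"]);   // sku is a prefix
const onDate = isIndexPrefix(["sku", "date"], ["date"]); // date alone is not
```

Under this model a query on sku alone can use an index over (sku, date), while a query on date alone cannot.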
More Exact Matching
// index over sku
….find({sku: {$in:[1234,5678]}})
// index over 'product.sku'
….find({"product.sku":1234})
// a tricky query, would need index on 'product' instead
….find({product: {sku:1234}})
// { _id:1, product: {sku:1234} } matches

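Why the last query is tricky: matching on product compares the whole embedded document, whereas matching on "product.sku" only looks at one field inside it. A rough sketch of the two semantics in plain JavaScript follows; matchesPath and matchesSubdoc are hypothetical helpers, arrays are ignored, and JSON.stringify is used as a stand-in for an exact, order-sensitive comparison.

```javascript
// Dotted-path match: follow the path into the document and compare the leaf.
function matchesPath(doc, path, value) {
  let cur = doc;
  for (const part of path.split(".")) {
    if (cur === null || typeof cur !== "object") return false;
    cur = cur[part];
  }
  return cur === value;
}

// Whole-subdocument match: the embedded document must equal the criteria
// exactly (same fields, same order, nothing extra).
function matchesSubdoc(doc, field, sub) {
  return JSON.stringify(doc[field]) === JSON.stringify(sub);
}

const plain = { _id: 1, product: { sku: 1234 } };
const richer = { _id: 2, product: { sku: 1234, color: "blue" } };
```

Both documents satisfy the dotted-path criteria, but only the first one equals the subdocument {sku:1234}, which is why the two queries need different indexes.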
Range Criteria
Search criteria may return several results
db.<collection>.findOne({sku: {$gt:1234}})
db.<collection>.find({sku: {$gt:5678,$lt:5699}})

Range Criteria (cont)
// index over sku; an anchored prefix regex is evaluated as a range
….find({sku:/^12/})

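An anchored prefix such as /^12/ belongs under range criteria because it selects a contiguous run of string keys. The sketch below shows the idea in plain JavaScript; prefixToRange and inRange are hypothetical names, the prefix is assumed to be a non-empty literal string, and the last character is assumed not to be the highest code unit.

```javascript
// Turn a literal prefix into the half-open key range [lower, upper).
function prefixToRange(prefix) {
  if (prefix.length === 0) throw new Error("prefix must not be empty");
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// String comparison in code-unit order, matching the range above.
function inRange(key, range) {
  return key >= range.lower && key < range.upper;
}

const range = prefixToRange("12"); // lower "12", upper "13"
```

Every string key that starts with "12" falls inside this range and nothing else does, so an index over string sku values can be walked between the two bounds.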
Other Operations
// index over sku
….update({sku:1234},{$inc:{sold:1}})
….remove({sku:1234})

Index Covering
Sometimes, all the needed information is in the index itself
….count({color:"blue"}) // index over color
….find({sku:1234},{color:1, _id:0}) // index over sku, color; _id excluded since it is not in the index

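A rough way to reason about covering is to check that every field the query filters on and every field it returns is part of the index. The sketch below does that in plain JavaScript; isCovered is a hypothetical helper, and it models the fact that _id is returned by default unless the projection excludes it.

```javascript
// indexKeys: field names in the index.
// queryFields: fields used in the search criteria.
// projection: inclusion-style projection object, e.g. {color: 1, _id: 0}.
function isCovered(indexKeys, queryFields, projection) {
  const inIndex = (field) => indexKeys.includes(field);
  // Fields explicitly included by the projection.
  const returned = Object.keys(projection).filter((f) => projection[f]);
  // _id comes back by default unless it is set to 0.
  if (projection._id !== 0 && !returned.includes("_id")) returned.push("_id");
  return queryFields.every(inIndex) && returned.every(inIndex);
}

const withoutId = isCovered(["sku", "color"], ["sku"], { color: 1, _id: 0 });
const withId = isCovered(["sku", "color"], ["sku"], { color: 1 });
```

Here withoutId is true and withId is false: the second projection also returns _id, which an index over sku, color does not contain.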
Missing fields
All documents have an entry in an index
A missing field is indexed as null
// matches all documents without sku
// if index over sku is unique, there could be only one
….find({sku:null})
// will be using a sku index, but not there yet
….find({sku:{$exists:true}})

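The remark about unique indexes follows directly from indexing a missing field as null: two documents without the field would produce the same key. A toy model in plain JavaScript, where the class UniqueIndex and its insert method are hypothetical and a Map stands in for the index:

```javascript
// Toy unique index over a single field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Map(); // key -> _id of the owning document
  }

  insert(doc) {
    const raw = doc[this.field];
    // A missing field is indexed as null.
    const key = raw === undefined ? null : raw;
    if (this.keys.has(key)) {
      throw new Error("duplicate key: " + String(key));
    }
    this.keys.set(key, doc._id);
  }
}

const uniqueSku = new UniqueIndex("sku");
uniqueSku.insert({ _id: 1, sku: 1234 });
uniqueSku.insert({ _id: 2 }); // no sku: stored under the null key
```

A further insert of a document without sku, such as uniqueSku.insert({ _id: 3 }), throws in this model because the null key is already taken.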
Array Matching
A field that contains an array will have one entry in the index per element in the array
{ _id: "abcd", x:[2,10] } will appear in the results of all the following queries using an index over x
….find({x:2})
….find({x:10})
….find({x:[2,10]})
….find({x:{$gt:5}}) // because of 10

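The one-entry-per-element rule can be illustrated with a small plain JavaScript model. The helpers indexEntries and lookup are hypothetical, the second document is made up for contrast, and the whole-array match {x:[2,10]} is not modeled here.

```javascript
// One index entry per array element; a scalar value yields a single entry.
function indexEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => ({ key, id: doc._id }));
}

// Ids of documents that have at least one entry accepted by the predicate.
function lookup(entries, accept) {
  return [...new Set(entries.filter((e) => accept(e.key)).map((e) => e.id))];
}

const docs = [
  { _id: "abcd", x: [2, 10] },
  { _id: "efgh", x: 3 },
];
const xIndex = docs.flatMap((d) => indexEntries(d, "x"));
```

In this model lookup(xIndex, (k) => k === 2) and lookup(xIndex, (k) => k > 5) both return only "abcd", the latter because of the entry for 10.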
Indexes and Ordering
Sort elimination is also accomplished through using indexes
….find({sku:{$gt:56678}}).sort({sku:1})
….find().sort({sku:-1}) // can traverse backwards

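Sort elimination works because index entries are already kept in key order, so walking them forwards or backwards yields sorted output without a separate sort step. A minimal sketch in plain JavaScript, where the sample entries and the function scanInOrder are assumptions made for illustration:

```javascript
// Hypothetical index over sku: entries kept sorted ascending by key.
const skuEntries = [
  { key: 1234, id: "a" },
  { key: 5678, id: "b" },
  { key: 9012, id: "c" },
];

// Walk the entries in index order (direction 1) or backwards (direction -1),
// keeping only keys accepted by the predicate. The output order comes from
// the traversal itself; nothing is sorted afterwards.
function scanInOrder(entries, direction, accept = () => true) {
  const out = [];
  const start = direction === 1 ? 0 : entries.length - 1;
  for (let i = start; i >= 0 && i < entries.length; i += direction) {
    if (accept(entries[i].key)) out.push(entries[i].key);
  }
  return out;
}

const ascending = scanInOrder(skuEntries, 1, (k) => k > 5000);
const descending = scanInOrder(skuEntries, -1);
```

The first call corresponds to a range query sorted ascending, the second to a full traversal backwards for a descending sort.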
Is It Using the Index?
The explain() tool allows you to see whether an index is being chosen
db.<collection>.find({sku:{$gt:5}}).explain()
{"cursor" : "BtreeCursor sku_1", …}

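When checking many queries, the inspection of the explain() output can be scripted. The helper below is a hypothetical sketch in plain JavaScript that relies only on the "cursor" field in the format shown on the slide; it assumes a cursor name starting with "BtreeCursor" means an index was chosen and treats anything else (for example "BasicCursor") as a collection scan. Other server versions may report this differently.

```javascript
// explainOutput: an object shaped like the result shown on the slide,
// e.g. { cursor: "BtreeCursor sku_1" }.
function usesIndex(explainOutput) {
  return (
    typeof explainOutput.cursor === "string" &&
    explainOutput.cursor.startsWith("BtreeCursor")
  );
}

const indexed = usesIndex({ cursor: "BtreeCursor sku_1" });
const scanned = usesIndex({ cursor: "BasicCursor" });
```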
Hinting
Sometimes we may force or avoid the use of an index
Usually, it should not be necessary to intervene
// forces use of index over sku
….find({sku:1, …}).hint({sku:1})
// prevents any index from being used
….find({sku:1,…}).hint({$natural:1})

When Indexes Don’t Help
// negation
….find({sku:{$ne:9876}})
// generic regex; the index still helps to filter string sku’s, though
….find({sku:/88/})
// $where may contain very expressive searches
// we don’t even try to use an index
….find({$where:"this.sku==1234"})

Many indices?
Evaluating search criteria currently uses just one index, even if more than one would be possible
The choice is based on previous executions; if an index “worked well” for a query before, it will likely be chosen again
Exception: $or can use more than one index

So When to Index?
There’s a trade-off between search criteria efficiency and the cost of insertion/update/deletion of keys
Also, there is a (quite high) limit on the number of indexes per collection (that we keep bumping up)

Indexes Resources
Indexes are memory mapped as well; be mindful of the number of indexes and the choice of keys
// in 'indexSizes', individual indexes in collection
db.<collection>.stats()
// total size of all indexes in collection
db.<collection>.totalIndexSize()

Take away
The picture to keep in mind
[Diagram: SearchCriteria]

Questions?
www.mongodb.org
