I'm trying to run a range aggregation on the following data set:

{
    "ProductType": 1,
    "ProductDefinition": "fc588f8e-14f2-4871-891f-c73a4e3d17ca",
    "ParentProduct": null,
    "Sku": "074617",
    "VariantSku": null,
    "Name": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
    "AllowOrdering": true,
    "Rating": null,
    "ThumbnailImageUrl": "/media/1106/074617.jpg",
    "PrimaryImageUrl": "/media/1106/074617.jpg",
    "Categories": [
        "399d7b20-18cc-46c0-b63e-79eadb9390c7"
    ],
    "RelatedProducts": [],
    "Variants": [
        "84a7ff9f-edf0-4aab-87f9-ba4efd44db74",
        "e2eb2c50-6abc-4fbe-8fc8-89e6644b23ef",
        "a7e16ccc-c14f-42f5-afb2-9b7d9aefbc5c"
    ],
    "PriceGroups": [
        "86182755-519f-4e05-96ef-5f93a59bbaec"
    ],
    "DisplayName": "Paraboot Avoriaz/Jannu Marron Brut Marron Brown Hiking Boot Shoes",
    "ShortDescription": "",
    "LongDescription": "<ul><li>Paraboot Avoriaz Mountaineering Boots</li><li>Marron Brut Marron (Brown)</li><li>Full leather inners and uppers</li><li>Norwegien Welted Commando Sole</li><li>Hand made in France</li><li>Style number : 074617</li></ul><p>As featured on <a href=\"http://www.pritchards.co.uk/shoes-trainers-11/paraboot-avoriaz-jannu-marron-brut-brown-20879.htm\">Pritchards.co.uk</a></p>",
    "UnitPrices": {
        "EUR 15 pct": 343.85
    },
    "Taxes": {
        "EUR 15 pct": 51.5775
    },
    "PricesInclTax": {
        "EUR 15 pct": 395.4275
    },
    "Slug": "paraboot-avoriazjannu-marron-brut-marron-brown-hiking-boot-shoes",
    "VariantsProperties": [
        {
            "Key": "ShoeSize",
            "Value": "8"
        },
        {
            "Key": "ShoeSize",
            "Value": "10"
        },
        {
            "Key": "ShoeSize",
            "Value": "6"
        }
    ],
    "Guid": "0d4f6899-c66a-4416-8f5d-26822c3b57ae",
    "Id": 178,
    "ShowOnHomepage": true
}

I'm aggregating on VariantsProperties, which has the following mapping:

"VariantsProperties": {
    "type": "nested",
    "properties": {
        "Key": {
            "type": "keyword"
        },
        "Value": {
            "type": "keyword"
        }
    }
}

Terms aggregations work fine with the following code:

{
    "aggs": {
        "Nest": {
            "nested": {
                "path": "VariantsProperties"
            },
            "aggs": {
                "fieldIds": {
                    "terms": {
                        "field": "VariantsProperties.Key"
                    },
                    "aggs": {
                        "values": {
                            "terms": {
                                "field": "VariantsProperties.Value"
                            }
                        }
                    }
                }
            }
        }
    }
}

However, when I try to run a range aggregation to get shoes in sizes between 8 and 12, such as:

{
    "aggs": {
        "Nest": {
            "nested": {
                "path": "VariantsProperties"
            },
            "aggs": {
                "fieldIds": {
                    "range": {
                        "field": "VariantsProperties.Value",
                        "ranges": [ { "from": 8, "to": 12 }]
                    }
                }
            }
        }
    }
}

I get the following error:

{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
            {
                "shard": 0,
                "index": "product-avenueproductindexdefinition-24476f82-en-us",
                "node": "ejgN4XecT1SUfgrhzP8uZg",
                "reason": {
                    "type": "illegal_argument_exception",
                    "reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
                }
            }
        ],
        "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]",
            "caused_by": {
                "type": "illegal_argument_exception",
                "reason": "Field [VariantsProperties.Value] of type [keyword] is not supported for aggregation [range]"
            }
        }
    },
    "status": 400
}

Is there a way to "transform" the terms aggregation into a range aggregation without changing the schema? I know I could build the ranges myself by extracting the data from the terms aggregation, but I would prefer a solution within Elasticsearch itself.

1 Answer

There are two ways to solve this:

Option A: Use a script instead of a field. This works without reindexing your data, but the script is evaluated for every nested document at query time, so performance might suffer depending on your volume of data.

POST my-index/_search
{
  "aggs": {
    "Nest": {
      "nested": {
        "path": "VariantsProperties"
      },
      "aggs": {
        "fieldIds": {
          "range": {
            "script": "Integer.parseInt(doc['VariantsProperties.Value'].value)",
            "ranges": [
              {
                "from": 8,
                "to": 12
              }
            ]
          }
        }
      }
    }
  }
}
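
One caveat with the script: it runs against every nested VariantsProperties entry, so any Value that is not a whole number makes Integer.parseInt throw. A minimal sketch of one way around that, assuming all sizes are stored under the Key "ShoeSize" as in the sample document, is to wrap the range in a filter aggregation. The request body is built here as a plain Python dict; the aggregation names shoe_sizes and size_ranges and the helper build_scripted_range_body are made up for illustration.

```python
import json


def build_scripted_range_body(key, ranges):
    """Build a search body that range-aggregates only nested entries with the given Key."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "Nest": {
                "nested": {"path": "VariantsProperties"},
                "aggs": {
                    # Hypothetical name: keeps only entries whose Key matches.
                    "shoe_sizes": {
                        "filter": {"term": {"VariantsProperties.Key": key}},
                        "aggs": {
                            # Hypothetical name: the scripted range from Option A.
                            "size_ranges": {
                                "range": {
                                    "script": {
                                        "lang": "painless",
                                        "source": "Integer.parseInt(doc['VariantsProperties.Value'].value)",
                                    },
                                    "ranges": ranges,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


body = build_scripted_range_body("ShoeSize", [{"from": 8, "to": 12}])
print(json.dumps(body, indent=2))  # use the output as the body of a _search request
```

This still fails if a ShoeSize value itself is not an integer (for example "8.5"), so it only narrows the problem rather than removing it.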

Option B: Add an integer sub-field in your mapping.

PUT my-index/_mapping
{
  "properties": {
    "VariantsProperties": {
      "type": "nested",
      "properties": {
        "Key": {
          "type": "keyword"
        },
        "Value": {
          "type": "keyword",
          "fields": {
            "numeric": {
              "type": "integer",
              "ignore_malformed": true
            }
          }
        }
      }
    }
  }
}

Once your mapping is modified, you can run _update_by_query on your index so that the existing documents are reindexed and the new VariantsProperties.Value.numeric sub-field gets populated:

POST my-index/_update_by_query

Finally, when this last command is done, you can run the range aggregation on the VariantsProperties.Value.numeric field.
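
As a rough sketch, the range aggregation from the question then only needs its field changed. The body is written as a Python dict so it can be checked and serialized; the aggregation names Nest and fieldIds are taken from the question, and "size": 0 is my own addition to suppress the hits.

```python
import json

# Same nested range aggregation as in the question, pointed at the integer sub-field.
body = {
    "size": 0,  # assumption: only the aggregation results are of interest
    "aggs": {
        "Nest": {
            "nested": {"path": "VariantsProperties"},
            "aggs": {
                "fieldIds": {
                    "range": {
                        "field": "VariantsProperties.Value.numeric",
                        # "from" is inclusive and "to" is exclusive
                        "ranges": [{"from": 8, "to": 12}],
                    }
                }
            },
        }
    },
}

print(json.dumps(body, indent=2))  # use the output as the body of a _search request
```

Because "to" is exclusive, size 12 itself falls outside this range; use "to": 13 if 12 should be counted.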

Also note that this second option will be more performant in the long term.
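
For comparison, the client-side fallback mentioned in the question (building the ranges from the terms aggregation result) could look roughly like the sketch below. It assumes the buckets come from the inner "values" terms aggregation and have the usual key/doc_count shape; the function name bucket_terms_into_ranges is hypothetical, and the ranges follow the same from-inclusive, to-exclusive convention as the range aggregation.

```python
def bucket_terms_into_ranges(term_buckets, ranges):
    """Sum terms-aggregation doc counts into [from, to) ranges."""
    results = [
        {"from": r.get("from"), "to": r.get("to"), "doc_count": 0} for r in ranges
    ]
    for bucket in term_buckets:
        try:
            value = int(bucket["key"])
        except (TypeError, ValueError):
            continue  # skip keys that are not whole numbers
        for result in results:
            lower_ok = result["from"] is None or value >= result["from"]
            upper_ok = result["to"] is None or value < result["to"]
            if lower_ok and upper_ok:
                result["doc_count"] += bucket["doc_count"]
    return results


# Buckets shaped like the terms aggregation output for the sample document.
sample_buckets = [
    {"key": "10", "doc_count": 1},
    {"key": "6", "doc_count": 1},
    {"key": "8", "doc_count": 1},
]

print(bucket_terms_into_ranges(sample_buckets, [{"from": 8, "to": 12}]))
# prints [{'from': 8, 'to': 12, 'doc_count': 2}]
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so its size would have to be large enough to cover every distinct value for the counts to be complete.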

5 Comments

Hi, thanks for the answer. I really like option B; however, I'm not sure what the "numeric" field would contain. Would it contain the actual values of the shoe sizes? Something like "VariantsProperties": [{"Key": "ShoeSize","Value": {"numeric": 8}}]?
You'll have a new field available for your queries and aggregations which is called VariantsProperties.Value.numeric, but you don't have to change anything in the way you index the data. You can simply modify the mapping using PUT and then call _update_by_query on your index. I've updated my answer to show you how to do it.
Alright, thanks, it works. I've marked your answer as accepted. One more question: would it be possible to do this conditionally? I mean, what if some of the variant properties are not parsable into integers, like plain strings? Would that need a separate field, or can they be mixed and then parsed afterwards?
Thanks again, everything works. Does this pattern/approach have a name so I could read up on it? I'd like to know what exactly the numeric sub-field is doing and how it works even though no value is supplied for it.
Yes, sure, it's called multi-fields and it allows you to provide the value only once and index it in different ways for different needs.