3

I have a MySQL table with the following definition:

mysql> desc person;
+--------+---------+------+-----+---------+-------+
| Field  | Type    | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+-------+
| id     | int(11) | NO   | PRI | NULL    |       |
| name   | text    | YES  |     | NULL    |       |
| fruits | json    | YES  |     | NULL    |       |
+--------+---------+------+-----+---------+-------+

The table has some sample data as follows:

mysql> select * from person;
+----+------+----------------------------------+
| id | name | fruits                           |
+----+------+----------------------------------+
|  1 | Tom  | ["apple", "orange"]              |
|  2 | John | ["apple", "mango"]               |
|  3 | Tony | ["apple", "mango", "strawberry"] |
+----+------+----------------------------------+

How can I calculate the total number of occurrences for each fruit? For example:

+------------+-------+
| fruit      | count |    
+------------+-------+
| apple      | 3     |
| orange     | 1     |
| mango      | 2     | 
| strawberry | 1     |
+------------+-------+

Some research shows that the JSON_LENGTH function can be used but I cannot find an example similar to my scenario.

2
  • which mysql version are you using`? Commented Dec 11, 2019 at 19:59
  • @nbk MySQL version 8.0.18 Commented Dec 11, 2019 at 20:01

3 Answers 3

8

You can use JSON_EXTRACT() function to extract each value ("apple", "mango", "strawberry" and "orange") of all three components of the arrays, and then then apply UNION ALL to combine all such queries:

SELECT comp, count(*)
FROM
(
 SELECT JSON_EXTRACT(fruit, '$[0]') as comp FROM person UNION ALL
 SELECT JSON_EXTRACT(fruit, '$[1]') as comp FROM person UNION ALL
 SELECT JSON_EXTRACT(fruit, '$[2]') as comp FROM person 
) q
WHERE comp is not null
GROUP BY comp

Indeed If your DB's version is 8, then you can also use JSON_TABLE() function :

SELECT j.fruit, count(*)
  FROM person p
  JOIN JSON_TABLE(
                 p.fruits,
                '$[*]' columns (fruit varchar(50) path '$')
       ) j
GROUP BY j.fruit;

Demo

Sign up to request clarification or add additional context in comments.

2 Comments

Thank. I have a question - how can I use the JSON_EXTRACT function if the maximum number of row elements are unknown?
@Arjun you're welcome. Indeed I advised JSON_TABLE because of this problem :) I don't know and don't think whether a dynamic method exists for JSON_EXTRACT. But luckily you have version 8.
1

You can't do it without first creating a table with one row per fruit.

CREATE TABLE allfruits (fruit VARCHAR(10) PRIMARY KEY);
INSERT INTO allfruits VALUES ('apple'), ('orange'), ('mango'), ('strawberry');

There is not a good way to generate this from the JSON.

Once you have that table, you can join it to the JSON and then use GROUP BY to count the occurrences.

SELECT fruit, COUNT(*) AS count
FROM allfruits
JOIN person ON JSON_SEARCH(person.fruits, 'one', fruit) IS NOT NULL
GROUP BY fruit;

Output:

+------------+-------+
| fruit      | count |
+------------+-------+
| apple      |     3 |
| mango      |     2 |
| orange     |     1 |
| strawberry |     1 |
+------------+-------+

Note that it will do a table-scan on the person table to find each fruit. This is pretty inefficient, and as your person table gets larger, it will become a performance problem.

If you want to optimize for this type of query, then you shouldn't use JSON to store an array of fruits. You should store data in a normalized way, representing the many-to-many relationship between persons and fruits with another table.

This is related to my answer to Is storing a delimited list in a database column really that bad?

1 Comment

Thanks for your answer. I understand your point about normalization and avoiding JSON for such a use-case. However, I am working with data where the majority can be well-fed into a relational database but there are some crucial fields in JSON.
1

I think the simplest solution would be to use JSON_TABLE function.

The query you need is


select ft.fruit, count(ft.fruit) from person,
json_table(
  fruits,
  '$[*]' columns(
     fruit varchar(128) path '$'
    ) 
  ) as ft
  group by ft.fruit
  ;

You can find working example in this dbfiddle Fruit demo

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.