Presto Updates to 0.178
Kai Sasaki
Treasure Data Inc
Bio
• Kai Sasaki (@Lewuathe)
• Software Engineer at Treasure Data
• Presto Team
• Hadoop/Spark/Hivemall Contributor
Presto In Treasure Data
• Use Presto for query processing
• 4.3+ million queries per month
• 400 trillion records per month
• 6+ PB per month
Presto In Treasure Data
[Architecture diagram: presto-client-ruby, Presto Coordinator, Presto Workers (x3), PostgreSQL, S3]
0.152 -> 0.178
New Features
• Lambda Expression
• Filtered Aggregation
• VALIDATE mode in EXPLAIN
• Compressed Exchange
• Complex Grouping Operation
Lambda Expression
• Lambda functions are written with the -> operator
https://prestodb.io/docs/current/functions/lambda.html
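As a small illustration of the -> syntax, the queries below pass lambdas to the built-in array functions transform, filter and reduce covered in the linked documentation. The literal arrays are invented for the example.

```sql
-- Apply a lambda to every element: returns [2, 4, 6]
SELECT transform(ARRAY[1, 2, 3], x -> x * 2);

-- Keep only the elements for which the lambda returns true: returns [5, 7]
SELECT filter(ARRAY[5, -6, 7], x -> x > 0);

-- Fold an array into a single value; the last lambda converts the
-- final state into the result: returns 75
SELECT reduce(ARRAY[5, 20, 50], 0, (s, x) -> s + x, s -> s);
```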
Filtered Aggregation
• Filtering inside aggregation function
SELECT
sum(a) FILTER (WHERE a > 0)
FROM
…
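A self-contained sketch of the same idea, using an inline VALUES table (the values and aliases are made up for illustration). It also shows the CASE expression that FILTER typically replaces.

```sql
SELECT
  sum(a) FILTER (WHERE a > 0)     AS positive_sum,    -- 5
  sum(CASE WHEN a > 0 THEN a END) AS same_with_case,  -- 5
  count(*) FILTER (WHERE a < 0)   AS negative_rows    -- 1
FROM (VALUES -1, 2, 3) AS t(a);
```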
VALIDATE mode in EXPLAIN
• Validate a query with EXPLAIN without executing it
presto> EXPLAIN (TYPE VALIDATE) SELECT …
 Valid
-------
 true
(1 row)
Compressed Exchange
• Blocks exchanged between workers are compressed with LZ4
• Enabled by exchange.compression-enabled=true
Complex Grouping Operation
• GROUPING SETS replaces UNION ALL + GROUP BY
SELECT host, path, code, AVG(size)
FROM www_access
GROUP BY GROUPING SETS (
(host),
(path),
(host,code)
);
Complex Grouping Operation
• The same result written with UNION ALL + GROUP BY
SELECT host, NULL, NULL, AVG(size)
FROM www_access GROUP BY host
UNION ALL
SELECT NULL, path, NULL, AVG(size)
FROM www_access GROUP BY path
UNION ALL
SELECT host, NULL, code, AVG(size)
FROM www_access GROUP BY host, code
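Complex grouping operations also include ROLLUP and CUBE, which are shorthand for common grouping sets. The sketch below assumes the www_access table from the slides and shows ROLLUP next to its GROUPING SETS expansion.

```sql
-- Per (host, code), per host, and a grand total in one query.
SELECT host, code, AVG(size)
FROM www_access
GROUP BY ROLLUP (host, code);

-- The same grouping written out explicitly.
SELECT host, code, AVG(size)
FROM www_access
GROUP BY GROUPING SETS (
  (host, code),
  (host),
  ()
);
```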
New Functions
• xxhash64(binary), to_big_endian_64(bigint)
• levenshtein_distance(string1,string2)
• arrays_overlap(x, y), array_except(x, y)
• to_ieee754_32(real), to_ieee754_64(double)
• codepoint()
• skewness(x), kurtosis(x)
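A quick way to try a few of these functions is a query over literals. The inputs below are arbitrary examples, and to_hex is used only to make the binary result readable.

```sql
SELECT
  levenshtein_distance('kitten', 'sitting') AS edit_distance,  -- 3
  codepoint('A')                            AS code_point,     -- 65
  to_hex(to_big_endian_64(1))               AS big_endian,     -- 0000000000000001
  array_except(ARRAY[1, 2, 3], ARRAY[2])    AS not_in_second;  -- contains 1 and 3
```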
Misc
• INT as an alias for INTEGER
• Deprecated the sample column for approximate queries (it was experimental anyway)
• Allow specifying column comments for CREATE TABLE
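A minimal sketch combining the INT alias and column comments. The table and column names are hypothetical, and it assumes the target connector supports CREATE TABLE and column comments.

```sql
CREATE TABLE access_summary (
  host VARCHAR COMMENT 'Requesting host',
  hits INT     COMMENT 'Number of requests'  -- INT is an alias for INTEGER
);

-- Comments show up in the table description.
DESCRIBE access_summary;
```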
Future Works
• Presto Meetup - May 10th, 2017 @ Facebook HQ
• Members
• Facebook, Teradata, Netflix, Uber, etc.
Future Works
• Disk Spill (ongoing)
  https://github.com/prestodb/presto/issues/5144
• Warning Framework
  Notify users with warnings and allow a grace period so that they can migrate queries to the new style
• Cost-based optimizer

CAUTION!
• deprecated.legacy-order-by
  Provided because of an incompatible change in ORDER BY column resolution
• deprecated.legacy-map-subscript
  Provided because of an incompatible change in map subscript operator behavior when the key is not present
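A hedged sketch of how a query might be adapted rather than relying on the legacy flags. It assumes that the new subscript behavior raises an error for a missing key, and that element_at is the function to use when NULL is wanted instead. The session property name is the one quoted in the 0.179 release note on the next slide.

```sql
-- Restore the old ORDER BY resolution for the current session only
-- (assumed session-level counterpart of deprecated.legacy-order-by).
SET SESSION legacy_order_by = true;

-- element_at returns NULL when the key is not in the map,
-- so it does not depend on the subscript operator's behavior.
SELECT element_at(MAP(ARRAY['a'], ARRAY[1]), 'b');
```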
CAUTION!!!
• In 0.179
• “Fix planning failure when GROUPING() is used with the legacy_order_by session property set to true”
• https://prestodb.io/docs/current/release/release-0.179.html
Thank you!
