SConvTransform: A New Compiler Optimization for Convolution Operations


We are pleased to announce the public release of our latest research: the SConvTransform code optimization algorithm. This compiler optimization, implemented using the MLIR compiler infrastructure, converts convolution operations from the Linalg dialect into an efficient loop nest that performs tiling and packing and invokes a highly optimized microkernel. SConvTransform is exposed as an operation in the Transform dialect, allowing both the code being transformed (the Payload IR) and the transformation itself (the Transform IR) to be represented in MLIR.

SConvTransform makes efficient use of the cache hierarchy by running Convolution Slicing Analysis (CSA), an analysis that determines how many tiles of each tensor fit in each cache level. The user provides parameters such as cache sizes, cache latencies, and microkernel dimensions, from which CSA derives the scheduling and partitioning of the tensors. With this information, Convolution Slicing Optimization (CSO) tiles the convolution into a loop nest, adds packing operations, and invokes a microkernel from the OpenBLAS library.

The code for SConvTransform is publicly available on the Celera AI GitHub: https://lnkd.in/dVW8tmwp. If you have questions or suggestions, reach out via email at contact@celera.ai or send us a DM!

#compilers #mlir #llvm #convolution #deeplearning #optimization
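As a rough illustration of the cache-fitting question CSA answers (this is a minimal sketch, not the actual CSA implementation, whose details are in the repository; the function name, tile shape, and cache sizes below are hypothetical), one can ask how many packed tiles of a given footprint fit at each cache level:

```python
# Hypothetical sketch: given per-level cache capacities and the memory
# footprint of one packed tile, compute how many tiles fit at each level.
# The real CSA also accounts for latencies and tensor scheduling.
def tiles_per_cache_level(cache_sizes_bytes, tile_shape, elem_bytes=4):
    """cache_sizes_bytes: capacities for L1, L2, ... in bytes.
    tile_shape: dimensions of one packed tile, e.g. (mr, kc).
    elem_bytes: size of one tensor element (4 for float32)."""
    tile_bytes = elem_bytes
    for dim in tile_shape:
        tile_bytes *= dim
    # Integer count of whole tiles that fit in each cache level.
    return [size // tile_bytes for size in cache_sizes_bytes]

# Example: a 32 KiB L1 and a 1 MiB L2, with a 6x256 float32 tile
# (6144 bytes per tile).
counts = tiles_per_cache_level([32 * 1024, 1024 * 1024], (6, 256))
```

A driver like CSO would then use such per-level counts to choose loop tile sizes so that the working set at each loop depth stays resident in the corresponding cache.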
