Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Learning
Programs to Graph Execution
Raffi Khatchadourian1,2
Tatiana Castro Vélez2
Mehdi Bagherzadeh3
Nan Jia2
Anita Raja1,2
1 CUNY Hunter College, USA (ponder@hunter.cuny.edu)
2 CUNY Graduate Center, USA
3 Oakland University, USA
Introduction
As Deep Learning (DL) datasets grow,
efficiency becomes essential to support
responsiveness [16].
Traditionally, DL frameworks embraced
deferred execution-style DL code for fast
execution.
Hybrid approaches [2, 8, 13] execute
imperative DL programs quickly.
Hybridization
Figure: Screenshot of the Hybridize Functions refactoring
preview wizard.
In TensorFlow [1], AutoGraph [13] can
enhance run-time performance by decorating
(annotating) appropriate Python function(s)
with @tf.function (Fig. 1).
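To make the transformation concrete, the following is a minimal sketch of adding the decorator to a chosen function, written with Python's standard `ast` module. It is not the tool's implementation (the tool is an Eclipse plug-in); the `hybridize` helper and the `train_step` sample are hypothetical names used only for illustration, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

# Hypothetical imperative DL snippet; it is only parsed here, never executed.
SOURCE = '''
import tensorflow as tf

def train_step(x, y):
    return tf.reduce_sum(x * y)
'''


def hybridize(source: str, function_name: str) -> str:
    """Return `source` with @tf.function added to the named function."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            # Skip functions that already carry the decorator.
            already = any(
                ast.unparse(d).startswith("tf.function")
                for d in node.decorator_list
            )
            if not already:
                decorator = ast.parse("tf.function", mode="eval").body
                node.decorator_list.insert(0, decorator)
    return ast.unparse(tree)


print(hybridize(SOURCE, "train_step"))
```

The sketch only rewrites syntax. Deciding which functions are safe and worthwhile to decorate is the part that requires the static analyses described in the rest of the poster.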
Problems with Hybrid Approaches
Require non-trivial metadata [12].
Exhibit limitations and known issues with
native program constructs [9].
Are difficult to use correctly and efficiently
(e.g., avoiding side-effects) [4].
Require developers to manually specify
which functions are converted.
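The side-effect problem can be illustrated without TensorFlow. The sketch below is a toy stand-in for graph tracing, not TensorFlow's mechanism: the hypothetical `toy_graph_function` decorator runs the Python body once with symbolic placeholders, records the arithmetic, and replays only the recorded operations on later calls. Python side effects in the body therefore happen at trace time only.

```python
def _value(operand, args):
    """Evaluate a recorded operand against concrete arguments."""
    return operand.evaluate(args) if isinstance(operand, Sym) else operand


class Sym:
    """Symbolic placeholder that records arithmetic instead of computing it."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # callable: concrete args -> value

    def __add__(self, other):
        return Sym(lambda args: self.evaluate(args) + _value(other, args))

    def __mul__(self, other):
        return Sym(lambda args: self.evaluate(args) * _value(other, args))


def toy_graph_function(fn):
    """Trace `fn` on the first call, then replay the recorded operations."""
    graph = None

    def wrapper(*args):
        nonlocal graph
        if graph is None:  # first call: run the Python body once
            placeholders = [Sym(lambda a, i=i: a[i]) for i in range(len(args))]
            graph = fn(*placeholders)
        return graph.evaluate(args)  # later calls never re-run the body

    return wrapper


calls = []


@toy_graph_function
def scaled_sum(x, y):
    calls.append("python body ran")  # Python side effect, not a recorded op
    return (x + y) * 2


results = [scaled_sum(1, 2), scaled_sum(3, 4)]
print(results, len(calls))
```

Both calls compute the right value, yet the list append happens once. A developer who expects it to run on every call would see different behavior after hybridization, which is why functions with such side effects need care.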
Insight
Although imperative DL code typically
executes sequentially, hybridization resembles
parallelizing traditional sequential code.
Automated Tool
We design and implement a fully automated,
open-source refactoring tool named
Hybridize Functions [11] that transforms
otherwise eagerly-executed imperative
(Python) DL code for enhanced performance.
Contributions
Refactoring approach for automatically
converting imperative DL code to graphs.
Novel tensor analysis for imperative DL.
Fully automated, open-source tool
implemented as a PyDev [15] Eclipse [7]
IDE plug-in that integrates static analyses
from WALA [14] and Ariadne [6].
Architecture & Dependencies
Figure: Overall architecture.
Eclipse is leveraged for its existing, well
documented and integrated refactoring
framework and test engine [3], including
transformation APIs (e.g., ASTRewrite),
refactoring preview pane (Fig. 1),
precondition checking (e.g.,
Refactoring.checkInitialConditions(),
Refactoring.checkFinalConditions()), and
refactoring testing (e.g.,
RefactoringTest).
PyDev is used for its efficient program entity
indexing and extensive refactoring support [3],
and because it is completely open-source for
all Python development.
WALA is used for static analyses, such as
ModRef, upon which we built our side-effect
analysis.
Ariadne, which depends on WALA, is used
for its Python and tensor analysis,
including type inference and (TensorFlow)
library modeling.
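As a rough illustration of what a side-effect check looks for, the sketch below flags functions that write outside their local scope. It is a purely syntactic stand-in built on Python's `ast` module, not the WALA ModRef-based analysis the tool uses; the function name `has_python_side_effects` and the sample source are hypothetical, and the rules are deliberately conservative.

```python
import ast


def has_python_side_effects(func: ast.FunctionDef) -> bool:
    """Conservatively flag writes that may escape the function's local scope."""
    for node in ast.walk(func):
        # Rebinding names from enclosing or global scopes.
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            return True
        # Assignments whose targets touch attributes or subscripts.
        if isinstance(node, (ast.Assign, ast.AugAssign, ast.AnnAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            if any(
                isinstance(part, (ast.Attribute, ast.Subscript))
                for target in targets
                for part in ast.walk(target)
            ):
                return True
        # Console output is treated as a side effect.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            return True
    return False


SOURCE = '''
def pure(x, y):
    z = x * y
    return z + 1

def impure(self, x):
    self.total = self.total + x
    return self.total
'''

report = {
    f.name: has_python_side_effects(f)
    for f in ast.parse(SOURCE).body
    if isinstance(f, ast.FunctionDef)
}
print(report)
```

A syntactic check like this misses effects hidden behind calls and aliases, which is one reason an interprocedural analysis is the more appropriate foundation.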
Challenges Addressed
Reworked much of the existing Java (JDT)
refactoring tooling to work with Python.
Integrated Ariadne with PyDev, chosen for
PyDev's excellent and long-lived refactoring
support for Python, including refactoring
preview pane, element GUI selection, and
refactoring undo history.
Augmented Ariadne to analyze imperative
Deep Learning (Python) code by vastly
expanding the XML summaries to support
a wide variety of popular TensorFlow 2
APIs.
Added support for Python constructs
commonly used in modern imperative DL
programs.
Correlated varying intermediate
representations (IRs) with the original
Python source code.
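The last challenge, mapping analysis results back to source text, can be sketched with Python's own `ast` module standing in for an intermediate representation. This is only an analogy: the tool correlates WALA and Ariadne IRs, whose APIs are not shown here, and the sample function `f` is hypothetical.

```python
import ast

SOURCE = "def f(x):\n    y = x + 1\n    return y * 2\n"

tree = ast.parse(SOURCE)
function = tree.body[0]

# Map each statement's (line, column) position to the exact source text it
# came from, so a finding on a node can be reported against the original code.
positions = {
    (stmt.lineno, stmt.col_offset): ast.get_source_segment(SOURCE, stmt)
    for stmt in function.body
}

for location, text in positions.items():
    print(location, repr(text))
```

Line numbers here are 1-based and column offsets 0-based, following the `ast` module's conventions.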
Modernizing Ariadne: New Enhancements
Python module packages.
Wildcard imports.
Intra-package references (relative imports;
from .. import X).
Package initialization scripts.
Automatic discovery of unit test entry points.
Non-scalar tensor dataset [10] iteration.
Modeling of additional libraries.
Static and class methods analysis.
Analysis of custom decorators.
Callable object (functor) analysis (used in
Keras).
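For readers less familiar with these constructs, the following plain-Python sketch shows three of them together: a custom decorator, static and class methods, and a callable object in the style of a Keras layer. All names (`logged`, `Scale`, `forward`) are hypothetical; package-level features such as relative and wildcard imports are omitted because they need a multi-file layout.

```python
import functools


def logged(fn):
    """Custom decorator: counts calls without changing the result."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


class Scale:
    """Callable object (functor), invoked like a function via __call__."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [self.factor * v for v in values]

    @staticmethod
    def identity(values):
        return list(values)

    @classmethod
    def doubling(cls):
        return cls(2)


@logged
def forward(layer, values):
    # A call graph must resolve layer(values) to Scale.__call__.
    return layer(values)


out = forward(Scale.doubling(), [1, 2, 3])
print(out, forward.calls, Scale.identity((1, 2)))
```

Each construct adds indirection between a call site and the code that runs, which is what a static analysis has to see through before it can reason about the tensors flowing into a function.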
Evaluation Summary
We applied our approach to 19 open-source
Python imperative DL programs of varying
size and domain, ranging from 0.12 to 36.72
thousand source lines of code (KLOC).
Our tool considered 766 Python functions,
automatically refactoring 42.56% of them
despite being highly conservative.
During a run-time performance evaluation,
we measured an average relative model
training speedup of 2.16 (memory
consumption measurement pending).
Differences in model accuracy and loss
before and after refactoring were negligible.
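The arithmetic behind these figures can be sketched as follows. The timings are invented purely for illustration and do not come from the evaluation. The sketch assumes that relative speedup means original training time divided by refactored training time, and that the average is a plain arithmetic mean; the poster does not state the averaging method.

```python
from statistics import mean

# Hypothetical wall-clock training times in seconds: (original, refactored).
# These values are made up and are NOT measurements from the evaluation.
timings = {
    "subject_a": (120.0, 60.0),
    "subject_b": (90.0, 30.0),
    "subject_c": (50.0, 50.0),
}

# Assumed definition: relative speedup = original time / refactored time.
speedups = {name: before / after for name, (before, after) in timings.items()}
average_speedup = mean(speedups.values())

# Function count implied by the reported percentage of the 766 candidates.
refactored_functions = round(766 * 42.56 / 100)

print(speedups, average_speedup, refactored_functions)
```

Under this definition a value above 1 means the refactored program trains faster, so the reported 2.16 would correspond to training in a bit under half the original time on average.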
Conclusion
Open-source, automated refactoring PyDev
Eclipse plug-in, Hybridize Functions,
assists developers with writing optimal
imperative DL Python code.
Integrates an Eclipse refactoring with
WALA and Ariadne Python static analyses.
Future Work
Explore incorporating advanced
container-based analyses.
Automatically split functions.
References
1. Abadi, M. et al.: TensorFlow: A System for Large-Scale Machine Learning. In: OSDI (2016)
2. Apache, Hybridize. Apache MXNet documentation. (2021). https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html (visited on 04/08/2021)
3. Bäumer, D. et al.: “Integrating refactoring support into a Java development tool”.
4. Castro Vélez, T. et al.: Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An
Empirical Study. In: MSR. MSR ’22. ACM (2022). https://doi.org/10.1145/3524842.3528455
5. Chollet, F.: Deep Learning with Python. Manning (2020)
6. Dolby, J. et al.: Ariadne. Analysis for Machine Learning Programs. In: MAPL, pp. 1–10. ACM (2018)
7. Eclipse Foundation, Eclipse IDE. (2024). https://eclipseide.org/ (visited on 09/10/2024)
8. Facebook Inc., PyTorch. TorchScript. (2019). https://pytorch.org/docs/stable/jit.html
9. Google LLC, Better performance with tf.function. (2021). https://tensorflow.org/guide/function
10. Google LLC, tf.data.Dataset. TensorFlow. Version 2.9.3. (2023). https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/data/Dataset (visited on 12/15/2023)
11. Hybridize-Functions-Refactoring. (2024). https://github.com/ponder-lab/Hybridize-Functions-Refactoring (visited on 09/30/2024).
12. Jeong, E. et al.: Speculative Symbolic Graph Execution of Imperative Deep Learning Programs. SIGOPS
Oper. Syst. Rev. 53(1), 26–33 (2019). https://doi.org/10.1145/3352020.3352025
13. Moldovan, D. et al.: AutoGraph: Imperative-style Coding with Graph-based Performance. (2019). arXiv:1810.08061 [cs.PL].
14. T.J. Watson Libraries for Analysis. (2024). https://github.com/wala/WALA (visited on 09/10/2024).
15. Zadrozny, F.: PyDev. (2023). https://www.pydev.org (visited on 05/31/2023)
16. Zhou, W. et al.: HARP: Holistic Analysis for Refactoring Python-Based Analytics Programs. In: ICSE (2020).
https://doi.org/10.1145/3377811.3380434
Acknowledgments This material is supported in part
by the National Science Foundation under awards CCF
2200343, CNS 2213763, and CCF 2343750.
International Conference on Fundamental Approaches to Software Engineering, May 3–8, 2025, Hamilton, Canada
