A Tool for Optimizing Java 8 Stream Software
via Automated Refactoring
Raffi Khatchadourian [1,2], Yiming Tang [2], Mehdi Bagherzadeh [3], Syed Ahmed [3]
IEEE International Working Conference on Source Code Analysis and Manipulation
September 2018, Madrid, Spain
[1] Computer Science, City University of New York (CUNY) Hunter College, USA
[2] Computer Science, City University of New York (CUNY) Graduate Center, USA
[3] Computer Science & Engineering, Oakland University, USA
Introduction
Streaming APIs
• Streaming APIs are widely available in today’s mainstream,
Object-Oriented programming languages [Biboudis et al., 2015].
• Incorporate MapReduce-like operations on native data
structures like collections.
• Can make writing parallel code easier and less error-prone
(avoiding data races and thread contention).
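As an illustration of the MapReduce-like style described above, here is a minimal sketch using the standard java.util.stream API. The class and method names (StreamBasics, sumOfEvenLengths) are made up for this example and do not come from the tool.

```java
import java.util.Arrays;
import java.util.List;

class StreamBasics {
    // Maps each word to its length, keeps the even lengths, and sums them.
    // Assumption: the list is not modified while the pipeline runs.
    static int sumOfEvenLengths(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .map(String::length)        // "map" step
                .filter(n -> n % 2 == 0)    // keep even lengths only
                .reduce(0, Integer::sum);   // "reduce" step; addition is associative
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "reduce", "stream", "java");
        System.out.println(sumOfEvenLengths(words, false));
        System.out.println(sumOfEvenLengths(words, true));
    }
}
```

Because the lambdas are stateless and the reduction is associative, the sequential and parallel runs should produce the same sum; switching between them is a one-call change.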
Problem
• MapReduce traditionally runs in highly-distributed
environments with no shared memory.
• Streaming APIs typically execute on a single node under
multiple threads or cores in a shared memory space.
• Collections reside in local memory.
• Issues may arise from close ties between shared memory and
the operations.
• Developers must manually determine whether running stream
code in parallel is efficient and interference-free.
• Requires thorough understanding of the API.
• Error-prone, possibly requiring complex analysis.
• Omission-prone, optimization opportunities may be missed.
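The interference concern above can be sketched as follows. This is an illustrative example only (the class SharedStateDemo and its methods are hypothetical), contrasting a lambda that writes to shared memory with a side-effect-free pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SharedStateDemo {
    // Side-effecting version: the pipeline writes to a list outside itself.
    // ArrayList is not thread-safe, so this is only safe on a sequential stream.
    static List<Integer> doubledWithSideEffect(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        input.stream().map(x -> x * 2).forEach(out::add); // unsafe if made parallel
        return out;
    }

    // Side-effect-free version: the collector owns the accumulation, so the
    // pipeline can run in parallel and still preserve encounter order.
    static List<Integer> doubledWithCollector(List<Integer> input) {
        return input.parallelStream().map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(doubledWithSideEffect(input));
        System.out.println(doubledWithCollector(input));
    }
}
```

Deciding by hand which of the two shapes a given pipeline has is the kind of manual analysis the slides describe as error-prone.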
Solution
• Fully-automated refactoring tool named Optimize Streams.
• Transforms Java 8 stream code for improved performance.
• Publicly available as an open source Eclipse IDE [1] plug-in [2].
• Includes fully-functional UI, preview pane, and unit tests.
• Based on:
  • Novel ordering analysis.
    • Infers when maintaining ordering is necessary for semantics
    preservation.
  • Typestate analysis [Fink et al., 2008; Strom and Yemini, 1986].
    • Augments the type system with “state.”
    • Traditionally used for preventing resource usage errors.
[1] http://eclipse.org
[2] Available at http://git.io/vpTLk
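To make the ordering question concrete, here is a small sketch of two pipelines: one whose result cannot depend on element order and one whose result does. It only illustrates the idea behind an ordering analysis; it is not the tool's algorithm, and the names (OrderingDemo, countDistinct, firstN) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class OrderingDemo {
    // Order is irrelevant here: the pipeline ends in count(), so dropping
    // the ordering constraint with unordered() cannot change the result.
    static long countDistinct(List<Integer> input) {
        return input.parallelStream().unordered().distinct().count();
    }

    // Order matters here: limit(n) on an ordered stream must yield the first
    // n elements in encounter order, so the ordering constraint has to stay.
    static List<Integer> firstN(List<Integer> input, int n) {
        return input.parallelStream().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(Arrays.asList(3, 1, 3, 2, 1)));
        System.out.println(firstN(Arrays.asList(5, 4, 3, 2, 1), 2));
    }
}
```

In the first method, relaxing ordering gives the parallel implementation of distinct() more freedom; in the second, doing so could change which elements are returned.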
• First to integrate automated refactoring with typestate analysis [3].
• Uses the WALA static analysis framework [4] and the SAFE typestate
analysis engine [5].
• Combines analysis results from different intermediate
representations (SSA, AST).
[3] To the best of our knowledge.
[4] http://wala.sf.net
[5] http://git.io/vxwBs
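The idea of augmenting a type with "state" can be sketched for streams as a tiny state machine that tracks execution mode and ordering across a pipeline. This is a simplified illustration under our own assumptions, not the WALA or SAFE API and not the tool's implementation: the class, the enum names, and the string-based operation list are all hypothetical, and aliasing is ignored entirely.

```java
import java.util.Arrays;
import java.util.List;

class StreamTypestateSketch {
    enum Mode { SEQUENTIAL, PARALLEL }
    enum Ordering { ORDERED, UNORDERED }

    // Hypothetical abstract state of a stream at one point in a pipeline.
    static final class State {
        final Mode mode;
        final Ordering ordering;
        State(Mode mode, Ordering ordering) { this.mode = mode; this.ordering = ordering; }
        @Override public String toString() { return mode + "/" + ordering; }
    }

    // Transition function: how each operation name changes the state.
    // Operations not listed are treated as state-preserving (a simplification).
    static State step(State s, String op) {
        switch (op) {
            case "parallel":   return new State(Mode.PARALLEL, s.ordering);
            case "sequential": return new State(Mode.SEQUENTIAL, s.ordering);
            case "unordered":  return new State(s.mode, Ordering.UNORDERED);
            case "sorted":     return new State(s.mode, Ordering.ORDERED);
            default:           return s;
        }
    }

    // Folds the transition function over a pipeline given as operation names.
    static State run(State initial, List<String> ops) {
        State s = initial;
        for (String op : ops) {
            s = step(s, op);
        }
        return s;
    }

    public static void main(String[] args) {
        State start = new State(Mode.SEQUENTIAL, Ordering.ORDERED);
        System.out.println(run(start, Arrays.asList("filter", "parallel", "unordered", "sorted")));
    }
}
```

A real typestate analysis works over program paths and must account for aliasing, which is what engines such as SAFE are designed for; the sketch only shows the state-transition view.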
Demonstration
Also available at http://youtu.be/YaSYH7n6y5s.
Detailed video entry point links:
• Demo start.
• Refactoring start.
• Refactoring end.
Evaluation
Preliminary Results
• Applied to 11 Java projects of varying size and domain with a
total of ∼642 KSLOC.
• 36.31% of candidate streams were refactorable.
• Observed an initial average speedup of 1.55 during performance
testing.
• See paper for more details, including user feedback, as well as
tool and data set engineering challenges.
Conclusion
• Optimize Streams is an open source, automated refactoring tool
that assists developers with writing optimal Java 8 Stream code.
• Integrates an Eclipse refactoring with the advanced static
analyses offered by WALA and SAFE.
• 11 Java projects totaling ∼642 thousand lines of code were
used in the tool’s assessment.
• A speedup of 1.55 on the refactored code was observed as part
of a preliminary study.
For Further Reading
Biboudis, Aggelos, Nick Palladinos, George Fourtounis, and Yannis Smaragdakis (2015).
“Streams à la carte: Extensible Pipelines with Object Algebras”. In: ECOOP,
pp. 591–613. doi: 10.4230/LIPIcs.ECOOP.2015.591.
Fink, Stephen J., Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay (2008).
“Effective Typestate Verification in the Presence of Aliasing”. In: ACM TOSEM 17.2,
pp. 9:1–9:34. doi: 10.1145/1348250.1348255.
Strom, Robert E. and Shaula Yemini (1986). “Typestate: A programming language
concept for enhancing software reliability”. In: IEEE TSE SE-12.1, pp. 157–171. doi:
10.1109/tse.1986.6312929.
Provocative Statements
1. Actual streaming API usage does not match the usage the API
designers envisioned.
Question
What are the consequences for future versions of such APIs?
2. Using streaming APIs in mainstream, Object-Oriented languages
has many benefits, such as conciseness and succinct
parallelism, but hinders code reuse, thus promoting clones.
Question
Is writing multiple, similar lambda expressions easier than writing
reusable functions?