Uncovering Performance Problems in Java Applications with Reference Propagation Profiling

Dacong Yan¹, Guoqing Xu², Atanas Rountev¹

¹ Ohio State University
² University of California, Irvine

PRESTO: Program Analyses and Software Tools Research Group, Ohio State University
  
Overview

• Performance inefficiencies
   – Often exist in Java applications
   – Excessive memory usage
   – Long running times, even for simple tasks
• Challenges
   – Limited compiler optimizations
   – Complicated behavior
   – Large libraries and frameworks
• Solution: manual tuning assisted with performance analysis tools
  
An Example

    class Vec {
      double x, y;
      sub(v) {
        res = new Vec(x-v.x, y-v.y);
        return res;
      }
    }
    for (i = 0; i < N; i++) {
      t = in[i+2].sub(a[i-1]);
      q[i] = t;
      t = in[i+1].sub(a[i-2]);
      // use of fields of t
    }
    ……
    t = q[*];
    // use of fields of t
  
Implementation

• Reference propagation profiling
   – Implemented in Jikes RVM 3.1.1
   – Modify the runtime compiler for code instrumentation
   – Create shadow locations to track data dependence
   – Instrument method calls to track interprocedural propagation
• Overheads
   – Space: 2-3×
   – Time: 30-50×
  
Reference Propagation Profiling

• Intraprocedural propagation
   – Shadows for every memory location (stack and heap) to record last assignment that writes to it
   – Update shadows and the graph accordingly

   Code               Shadow
   6 a = new A;       aʹ = RefAssign(6, 6)
   7 b = a;           bʹ = RefAssign(6, 7)
   8 c = new C;       cʹ = RefAssign(8, 8)
   9 b.fld = c;       b.fldʹ = RefAssign(8, 9)

   Graph: (figure omitted)
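The shadow updates in the table can be imitated with a small stand-alone simulation. This is a minimal sketch under stated assumptions, not the Jikes RVM instrumentation: the class `ShadowSketch`, the methods `alloc` and `copy`, the string-keyed shadow map, and the edge format are all hypothetical. It also assumes that the first argument of `RefAssign` is the allocation line and the second is the line of the last assignment, which is one reading of the table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical simulation of shadow locations; not the profiler's real data structures. */
class ShadowSketch {
    /** Shadow value: assumed to hold the allocation line and the line of the last assignment. */
    static final class RefAssign {
        final int allocLine;
        final int assignLine;

        RefAssign(int allocLine, int assignLine) {
            this.allocLine = allocLine;
            this.assignLine = assignLine;
        }

        @Override
        public String toString() {
            return "RefAssign(" + allocLine + "," + assignLine + ")";
        }
    }

    // One shadow per memory location, keyed by a printable name such as "a" or "b.fld".
    final Map<String, RefAssign> shadows = new HashMap<>();
    // Graph edges "source node -> new node", recorded whenever a reference is copied.
    final List<String> edges = new ArrayList<>();

    /** Models "line: dst = new T;". */
    void alloc(String dst, int line) {
        shadows.put(dst, new RefAssign(line, line));
    }

    /** Models "line: dst = src;" where dst names a local or a heap field. */
    void copy(String dst, String src, int line) {
        RefAssign from = shadows.get(src);
        RefAssign to = new RefAssign(from.allocLine, line);
        shadows.put(dst, to);
        edges.add(from + " -> " + to);
    }

    /** Replays lines 6 to 9 of the table. */
    static ShadowSketch runTableExample() {
        ShadowSketch s = new ShadowSketch();
        s.alloc("a", 6);         // 6 a = new A;
        s.copy("b", "a", 7);     // 7 b = a;
        s.alloc("c", 8);         // 8 c = new C;
        s.copy("b.fld", "c", 9); // 9 b.fld = c;
        return s;
    }

    public static void main(String[] args) {
        ShadowSketch s = runTableExample();
        System.out.println(s.shadows.get("b.fld"));
        System.out.println(s.edges);
    }
}
```

Keying shadows by name keeps the sketch short; a real profiler would need a shadow per actual stack slot and heap field rather than per source-level name.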
  
• Interprocedural propagation
   – Per-thread scratch space to save and restore shadows for parameters and return variables
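One way to picture the save-and-restore step is a thread-local array that the caller fills before a call and the callee reads on entry, with the same slot reused for the return value. This is only a sketch of the idea: `ScratchSketch`, the `SCRATCH` array, its slot layout, and the use of plain strings as shadows are assumptions, not details from the implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of passing shadows across a call through per-thread scratch space. */
class ScratchSketch {
    // Assumed layout: slot i carries the shadow of parameter i; slot 0 is reused for the return shadow.
    static final ThreadLocal<String[]> SCRATCH = ThreadLocal.withInitial(() -> new String[4]);

    /** Stands in for an instrumented callee whose body is "return p;". */
    static Object identity(Object p) {
        Map<String, String> calleeShadows = new HashMap<>();
        // Method entry: restore the shadow of formal parameter p from the scratch space.
        calleeShadows.put("p", SCRATCH.get()[0]);
        // Before returning: save the shadow of the returned variable.
        SCRATCH.get()[0] = calleeShadows.get("p");
        return p;
    }

    /** Stands in for an instrumented caller; returns the shadow that ends up attached to r. */
    static String callAndTrack() {
        Map<String, String> callerShadows = new HashMap<>();
        Object a = new Object();
        callerShadows.put("a", "alloc@caller");
        // Before the call: save the shadow of the actual argument.
        SCRATCH.get()[0] = callerShadows.get("a");
        Object r = identity(a);
        // After the call: restore the return shadow into the shadow of r.
        callerShadows.put("r", SCRATCH.get()[0]);
        return (r == a) ? callerShadows.get("r") : null;
    }

    public static void main(String[] args) {
        System.out.println(callAndTrack());
    }
}
```

Keeping the scratch space per thread means concurrent calls on different threads cannot overwrite each other's parameter shadows.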
  
Client Analyses

• Not-assigned-to-heap (NATH) analysis
   – Locate producer nodes that do not reach heap propagation nodes (heap reads and writes)
   – Variant: mostly-NATH analysis
• Cost-benefit imbalance analysis
   – Detect imbalance between the cost of interesting operations, and the benefits they produce
   – For example, analysis of write-read imbalance
• Analysis of never-used allocations
   – Identify producer nodes that do not reach the consumer node
   – Variant: analysis of rarely-used allocations
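Two of these analyses, NATH and never-used allocations, reduce to reachability questions on the propagation graph, which can be sketched as follows. The graph encoding is an assumption made for illustration: the class `ClientAnalysisSketch`, the `Kind` labels, and the node names are hypothetical, and the mostly-NATH, rarely-used, and cost-benefit variants are not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical propagation graph with two reachability-based client analyses. */
class ClientAnalysisSketch {
    enum Kind { PRODUCER, STACK_COPY, HEAP_WRITE, HEAP_READ, CONSUMER }

    final Map<String, Kind> kinds = new LinkedHashMap<>(); // keeps insertion order for stable output
    final Map<String, List<String>> succ = new HashMap<>();

    void node(String id, Kind kind) {
        kinds.put(id, kind);
        succ.put(id, new ArrayList<>());
    }

    void edge(String from, String to) {
        succ.get(from).add(to);
    }

    /** True if a node whose kind is in targets is reachable from start. */
    boolean reaches(String start, Set<Kind> targets) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(start);
        while (!work.isEmpty()) {
            String n = work.pop();
            if (!seen.add(n)) continue;
            if (targets.contains(kinds.get(n))) return true;
            work.addAll(succ.get(n));
        }
        return false;
    }

    List<String> producersNotReaching(Set<Kind> targets) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Kind> e : kinds.entrySet()) {
            if (e.getValue() == Kind.PRODUCER && !reaches(e.getKey(), targets)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** NATH: producers that never reach a heap read or heap write. */
    List<String> nath() {
        return producersNotReaching(EnumSet.of(Kind.HEAP_WRITE, Kind.HEAP_READ));
    }

    /** Never-used allocations: producers that never reach the consumer node. */
    List<String> neverUsed() {
        return producersNotReaching(EnumSet.of(Kind.CONSUMER));
    }

    /** Small made-up graph: alloc1 escapes to the heap, alloc2 stays on the stack, alloc3 is unused. */
    static ClientAnalysisSketch example() {
        ClientAnalysisSketch g = new ClientAnalysisSketch();
        g.node("alloc1", Kind.PRODUCER);
        g.node("alloc2", Kind.PRODUCER);
        g.node("alloc3", Kind.PRODUCER);
        g.node("copy1", Kind.STACK_COPY);
        g.node("copy2", Kind.STACK_COPY);
        g.node("copy3", Kind.STACK_COPY);
        g.node("store", Kind.HEAP_WRITE);
        g.node("load", Kind.HEAP_READ);
        g.node("use", Kind.CONSUMER);
        g.edge("alloc1", "copy1"); g.edge("copy1", "store");
        g.edge("store", "load");   g.edge("load", "use");
        g.edge("alloc2", "copy2"); g.edge("copy2", "use");
        g.edge("alloc3", "copy3");
        return g;
    }
}
```

On this made-up graph, `example().nath()` yields `[alloc2, alloc3]` and `example().neverUsed()` yields `[alloc3]`.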
  
A Real Tuning Session

Before tuning:

    class Vec {
      double x, y;
      sub(v) {
        res = new Vec(x-v.x, y-v.y);
        return res;
      }
    }
    for (i = 0; i < N; i++) {
      t = in[i+2].sub(a[i-1]);
      q[i] = t;
      t = in[i+1].sub(a[i-2]);
      // use of fields of t
    }
    ……
    t = q[*];
    // use of fields of t

After tuning:

    class Vec {
      double x, y;
      sub_rev(v, res) {
        res.x = x-v.x;
        res.y = y-v.y;
      }
    }
    nt = new Vec; // reusable
    for (i = 0; i < N; i++) {
      t = in[i+2].sub(a[i-1]);
      q[i] = t;
      in[i+1].sub_rev(a[i-2], nt);
      // use of fields of nt
    }
    ……
    t = q[*];
    // use of fields of t

Reductions: 13% in running time and 73% in #allocated objects
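The effect of the rewrite on allocation counts can be checked with a small runnable version of the two loops. This is a sketch under assumptions, not the benchmark code: the class `TuningSketch`, the `allocations` counter, the `fill` helper, the array sizes, and the loop starting at `i = 2` (so that `a[i-2]` stays in bounds) are all made up for illustration, and the counts it produces say nothing about the reductions reported for the real program.

```java
/** Hypothetical, runnable variant of the slide's before/after loops. */
class TuningSketch {
    static int allocations; // counts Vec objects created, for comparison only

    static final class Vec {
        double x, y;

        Vec(double x, double y) {
            this.x = x;
            this.y = y;
            allocations++;
        }

        /** Original method: allocates a fresh result on every call. */
        Vec sub(Vec v) {
            return new Vec(x - v.x, y - v.y);
        }

        /** Tuned method: writes the result into a caller-supplied object. */
        void subRev(Vec v, Vec res) {
            res.x = x - v.x;
            res.y = y - v.y;
        }
    }

    static Vec[] fill(int len) {
        Vec[] vs = new Vec[len];
        for (int i = 0; i < len; i++) vs[i] = new Vec(i, 2 * i);
        return vs;
    }

    /** Returns {checksum, number of Vec allocations after setup}. */
    static double[] run(int n, boolean tuned) {
        Vec[] in = fill(n + 2);
        Vec[] a = fill(n);
        Vec[] q = new Vec[n];
        allocations = 0;                       // ignore setup allocations
        double sum = 0;
        Vec nt = tuned ? new Vec(0, 0) : null; // reusable result object
        for (int i = 2; i < n; i++) {
            q[i] = in[i + 2].sub(a[i - 1]);    // stored to the heap, so it must stay a fresh object
            if (tuned) {
                in[i + 1].subRev(a[i - 2], nt);
                sum += nt.x + nt.y;            // use of fields of nt
            } else {
                Vec t = in[i + 1].sub(a[i - 2]);
                sum += t.x + t.y;              // use of fields of t
            }
        }
        return new double[] { sum, allocations };
    }

    public static void main(String[] args) {
        System.out.println("untuned allocations: " + (int) run(10, false)[1]);
        System.out.println("tuned allocations:   " + (int) run(10, true)[1]);
    }
}
```

Only the second call in the loop is rewritten, because its result never leaves the iteration; the first result is stored in `q[i]` and read later, so sharing one object there would change the program's behavior.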
  
Examples of Inefficiency Patterns

• Temporary objects for method returns
   – Reductions for euler: 13% in running time and 73% in #allocated objects
• Redundant data representation
   – mst: 63% and 40%
• Unnecessary eager object creation
   – chart: 8% and 8%
   – jflex: 3% and 27%
• Expensive specialization for sanity checks
   – bloat: 10% and 11%
  
Conclusions

• Reference propagation profiling in Jikes RVM
• Understanding reference propagation is a good starting point for performance tuning
• Client analyses can uncover performance inefficiencies, and lead to effective tuning solutions
  
Thank you