Indified String concatenation

As of Java 9, javac no longer compiles string concatenation to a chain of StringBuilder calls by default. Java 9 introduced StringConcatFactory, as described in JEP 280. Another nice writeup is here.
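For reference, here is a rough hand-written sketch of what the older StringBuilder lowering amounts to. The class and method names are made up for illustration, and the exact append chain a given javac 8 emits may differ in detail.

```java
class OldStyleConcat {
    // Hand-written equivalent of "foo " + b1 + false, roughly as javac 8 would lower it.
    static String concat(byte b1) {
        return new StringBuilder()
                .append("foo ")
                .append(b1)      // StringBuilder has no append(byte); this resolves to append(int)
                .append(false)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(concat((byte) 254)); // prints "foo -2false"
    }
}
```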

CFR 133 handles both StringConcatFactory::makeConcat (javac with -XDstringConcat=indy) and StringConcatFactory::makeConcatWithConstants (-XDstringConcat=indyWithConstants - the default).

A few interesting observations

This mostly behaves as expected, but a few things are interesting from a decompilation point of view (as always, nothing in here is new or exciting!)

Constants can become string constants...

Consider

  public static void main(String [] args) {
    byte b1 = (byte)254;
    System.out.println("foo " + b1 + ((byte)254>(byte)1));
  }

Since (byte)254 is -2, the comparison is false, so this will get constant folded to "foo " + b1 + false. In java 8, we'd get

  System.out.println("foo " + b1 + false);

However, now the false is embedded in the string arguments - if we compile with java9+, then decompile with --stringconcat=false, we see

  public static void main(String[] args) {
    byte b1 = -2;
    System.out.println((String)((Object)StringConcatFactory.makeConcatWithConstants(new Object[]{"foo \u0001false"}, (byte)b1)));
  }

Which means CFR will reconstruct this as follows (decompiling without any overrides). Note that the boolean constant has become part of a string literal:

  public static void main(String[] args) {
    byte b1 = -2;
    System.out.println("foo " + b1 + "false");
  }
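To see that the folded constant really is just text in the recipe, the bootstrap method can be called by hand. This is a minimal sketch: RecipeDemo and concatByte are names invented here, and real call sites are linked by the JVM through invokedynamic rather than by a direct call like this one.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class RecipeDemo {
    // Links a concatenation call site for the given recipe taking one byte argument,
    // then invokes it. Each \u0001 in the recipe marks where an argument is spliced in.
    static String concatByte(String recipe, byte value) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, byte.class),
                    recipe);
            MethodHandle target = site.getTarget();
            return (String) target.invoke(value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        byte b1 = (byte) 254; // wraps to -2
        // Same recipe as in the decompiled output: "false" is plain text, not a boolean.
        System.out.println(concatByte("foo \u0001false", b1)); // prints "foo -2false"
    }
}
```

Nothing in the recipe records that "false" started life as a boolean expression, which is why a string literal is the only thing a decompiler can sensibly emit.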

Empty strings get elided

  public void test(int y) {
    int x = y+2;
    int z = x;
    int a = z+1;
    if (a > 3) z = 2;
    System.out.print("" + a + x + "" + z);
  }

Becomes (when decompiled with --stringconcat=false)

  public void test(int y) {
    int x = y + 2;
    int z = x;
    int a = z + 1;
    if (a > 3) {
        z = 2;
    }
    System.out.print((String)((Object)StringConcatFactory.makeConcatWithConstants(new Object[]{"\u0001\u0001\u0001"}, (int)a, (int)x, (int)z)));
  }

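The recipe "\u0001\u0001\u0001" contains no trace of either empty string, so it's not really possible to infer where they were! Similarly, a concatenated char constant such as 'c' can't be distinguished from the string constant "c", as both end up as the same text in the recipe.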
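A toy model makes the loss of information concrete. This is not javac's actual code, just a simplified sketch under the assumption that constants are appended to the recipe as text and every non-constant operand becomes \u0001 (it ignores special cases such as constants that themselves contain the marker character). RecipeModel, ARG and toRecipe are invented names.

```java
class RecipeModel {
    // Stand-in for a dynamic (non-constant) operand.
    static final Object ARG = new Object();

    // Simplified model: constants contribute their text, dynamic operands contribute \u0001.
    static String toRecipe(Object... operands) {
        StringBuilder recipe = new StringBuilder();
        for (Object operand : operands) {
            if (operand == ARG) {
                recipe.append('\u0001');
            } else {
                // "" contributes nothing; 'c' and "c" contribute identical text.
                recipe.append(String.valueOf(operand));
            }
        }
        return recipe.toString();
    }

    public static void main(String[] args) {
        // Models "" + a + x + "" + z against the same operands with no empty strings.
        String withEmpties = toRecipe("", ARG, ARG, "", ARG);
        String withoutEmpties = toRecipe(ARG, ARG, ARG);
        System.out.println(withEmpties.equals(withoutEmpties)); // prints "true"

        // Models s + 'c' against s + "c".
        System.out.println(toRecipe(ARG, 'c').equals(toRecipe(ARG, "c"))); // prints "true"
    }
}
```

Since distinct inputs map to the same recipe, the mapping can't be inverted, and a decompiler has to pick one plausible source form.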


Last updated 09/2018