When I construct a String from a byte[] array and then convert it back to a byte[] using String.getBytes(), some byte values are modified. Below is a piece of code that reproduces my issue:

public static void main(String[] args)
{
    byte[] arr = new byte[] { (byte)0xff, 0x5e};
    String str = new String(arr);
    byte[] arr2 = str.getBytes();
    for(int i = 0; i < 2; i++)
        System.out.print(String.format("%02X ", arr2[i]));
    for(int i = 0; i < 2; i++)
        System.out.print(String.format("%02X ", arr[i]));
}

The output is as follows:

3F 5E FF 5E 

I have tried the conversion with all of the standard charsets, yet the result is the same. For a reason I can't figure out, 0xFF becomes 0x3F. Why, and how do I correct this?

  • Cannot reproduce the problem. This code depends on the default charset encoding of the JVM. Commented Jan 12, 2018 at 21:13
  • Which is why we don't use the default encoding, but specify it when using getBytes() and new String(). A 1-byte encoding such as ISO-8859-1 should work just fine. Commented Jan 12, 2018 at 21:14
  • With a platform default of UTF-8, you see the problem, but using e.g. new String(arr, "ISO-8859-1"); the conversion works okay. Commented Jan 12, 2018 at 21:15
  • passing "ISO-8859-1" to getBytes has solved the problem. Thanks Mick! Commented Jan 12, 2018 at 21:17
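
To check which situation a given machine is in, here is a minimal sketch of what the comments above describe. The class name DefaultCharsetCheck and the helper survivesRoundTrip are made up for illustration; Charset.defaultCharset() is the standard call that reports the charset used by new String(byte[]) and getBytes() when none is passed. The result depends on the platform, so no output is shown.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class DefaultCharsetCheck {
    // Hypothetical helper: true if the bytes are unchanged after
    // decoding to a String and encoding back with the same charset.
    static boolean survivesRoundTrip(byte[] input, Charset cs) {
        byte[] back = new String(input, cs).getBytes(cs);
        return Arrays.equals(input, back);
    }

    public static void main(String[] args) {
        // The charset that the no-argument conversions fall back to.
        Charset def = Charset.defaultCharset();
        byte[] arr = { (byte) 0xff, 0x5e };
        System.out.println("Default charset: " + def);
        System.out.println("Round trip is lossless: " + survivesRoundTrip(arr, def));
    }
}
```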

1 Answer

After some helpful comments, here is how I got it to work:

public static void main(String[] args)
{
    byte[] arr = new byte[] { (byte)0xff, 0x5e};
    String str = new String(arr, Charset.forName("ISO-8859-1"));
    byte[] arr2 = str.getBytes(Charset.forName("ISO-8859-1"));
    for(int i = 0; i < 2; i++)
        System.out.print(String.format("%02X ", arr2[i]));
    for(int i = 0; i < 2; i++)
        System.out.print(String.format("%02X ", arr[i]));
}

ISO-8859-1 has allowed me to round-trip bytes of any value unchanged, because it maps every byte from 0x00 to 0xFF to the character with the same code point. This is useful when working with binary data.
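
To see why 0xFF turned into 0x3F, it may help to run the same round trip under a few charsets side by side. This is only a sketch: the class name CharsetRoundTrip and the helpers roundTrip and hex are my own names, not from any library. The constants come from java.nio.charset.StandardCharsets, which saves the Charset.forName lookup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Decode the bytes with the given charset, then encode the String back.
    static byte[] roundTrip(byte[] input, Charset cs) {
        return new String(input, cs).getBytes(cs);
    }

    // Format bytes as space-separated two-digit upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] arr = { (byte) 0xff, 0x5e };
        // 0xFF is not valid US-ASCII: it decodes to U+FFFD, which encodes back as '?' (0x3F).
        System.out.println("US-ASCII:   " + hex(roundTrip(arr, StandardCharsets.US_ASCII)));
        // 0xFF is not valid UTF-8 either: U+FFFD encodes back as the three bytes EF BF BD.
        System.out.println("UTF-8:      " + hex(roundTrip(arr, StandardCharsets.UTF_8)));
        // ISO-8859-1 maps all 256 byte values, so nothing is replaced.
        System.out.println("ISO-8859-1: " + hex(roundTrip(arr, StandardCharsets.ISO_8859_1)));
    }
}
```

This prints 3F 5E for US-ASCII, EF BF BD 5E for UTF-8, and FF 5E for ISO-8859-1. The 3F in the question is the '?' that the encoder substitutes for a character it cannot represent.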

1 Comment

If you're working with binary data, you don't convert bytes to characters. Characters are for text data.
