I am trying to write a simple program for this interview question:
Write a function that checks for a valid unicode byte sequence. A unicode sequence is encoded as:
- first byte indicates number of subsequent bytes; '11110000' means 4 subsequent data bytes
- data bytes start with a '10xxxxxx'
public static void main(String[] args)
{
    System.out.println(checkUnicode(new byte[] {(byte)'c'}));
}
/**
 * Write a function that checks for a valid unicode byte sequence. A unicode
 * sequence is encoded as: - first byte indicates number of subsequent bytes;
 * '11110000' means 4 subsequent data bytes - data bytes start with a
 * '10xxxxxx'
 *
 * @param unicodeChar the bytes of one encoded character
 * @return true if the bytes form a valid sequence, false otherwise
 */
public static boolean checkUnicode(byte[] unicodeChar)
{
    byte b = unicodeChar[0];
    int len = 0;
    int temp = (int)b<<1;
    while((int)temp<<1 == 0)
    {
        len++;
    }
    System.out.println(len);
    if (unicodeChar.length == len)
    {
        for(int i = 1 ; i < len; i++)
        {
            // Check if the most significant 2 bits in the byte are '10'
            // c0, in base 16, is 11000000 in binary
            // 10000000, in base 2, is 128 in decimal
            if( ( (int)unicodeChar[i]&0Xc0 )==128 )
            {
                continue;
            }
            else
            {
                return false;
            }
        }
        return true;
    }
    else
    {
        return false;
    }
}
The output I get is:
99
false
Changed the conversion from char to byte array based on Chris Jester-Young's comment.
Can someone point me in the right direction?
Thanks
Made some modifications based on input from Ted Hopp.
P.S.:
I got the question from a forum, and I think it wasn't posted correctly there. However, I decided to solve it and use it as is rather than obscure it further, since I did not fully understand it either!
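For reference, here is a minimal sketch of one possible reading of the rule as quoted above. It assumes that the number of leading 1 bits in the first byte is the number of data bytes that follow (so '11110000' is followed by 4 data bytes), and that a first byte of the form '10xxxxxx' is invalid because that pattern is reserved for data bytes. Both points are assumptions about the question, and this is not real UTF-8, where the count in the lead byte includes the lead byte itself. The class name UnicodeCheckSketch and the helper leadingOnes are made up for illustration.

```java
final class UnicodeCheckSketch {

    // Counts the leading 1 bits of a byte, e.g. 11110000 -> 4.
    public static int leadingOnes(byte b) {
        int value = b & 0xFF; // mask so a negative byte is not sign-extended
        int count = 0;
        for (int mask = 0x80; mask != 0 && (value & mask) != 0; mask >>= 1) {
            count++;
        }
        return count;
    }

    // Assumption: leading 1 bits of the first byte = number of data bytes
    // that follow, as the question states literally (not real UTF-8).
    public static boolean checkUnicode(byte[] bytes) {
        if (bytes == null || bytes.length == 0) {
            return false;
        }
        int dataBytes = leadingOnes(bytes[0]);
        if (dataBytes == 1) {
            // Assumption: 10xxxxxx marks a data byte, so it cannot come first.
            return false;
        }
        if (bytes.length != dataBytes + 1) {
            return false;
        }
        for (int i = 1; i < bytes.length; i++) {
            // The top two bits of every data byte must be '10'.
            if ((bytes[i] & 0xC0) != 0x80) {
                return false;
            }
        }
        return true;
    }
}
```

Under this reading, checkUnicode(new byte[] {(byte)'c'}) returns true: 'c' is 01100011, which has no leading 1 bits, so no data bytes are expected after it.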