I am having an issue URL-decoding a UTF-8 string in Java that was encoded with either JavaScript or ActionScript 3. I've set up a test case as follows:

The string in question is Produktgröße

When I encode with JS/AS3 I get the following string:

escape('Produktgröße')

Produktgr%F6%DFe

When I unescape this with JS I get no change

unescape('Produktgr%F6%DFe')

Produktgr%F6%DFe

So from this I assume that JS isn't encoding the string properly?

The following JSP produces this output:

<%@page import="java.net.URLEncoder"%>
<%@page import="java.net.URLDecoder"%>
<%=(URLDecoder.decode("Produktgr%F6%DFe","UTF-8"))%><br/>
<%=(URLEncoder.encode("Produktgröße","UTF-8"))%><br/>
<%=(URLEncoder.encode("Produktgröße"))%><br/>
<%=(URLDecoder.decode(URLEncoder.encode("Produktgröße")))%><br/>
<%=(URLDecoder.decode(URLEncoder.encode("Produktgröße"),"UTF-8"))%><br/>

Produktgr?e

Produktgr%C3%B6%C3%9Fe

Produktgr%C3%B6%C3%9Fe

Produktgröße

Produktgröße

Any idea why I'm having this disparity with the languages and why JS/AS3 isn't behaving as I expect it to?

Thanks.

3 Answers

10

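escape is a deprecated function and does not encode Unicode characters as UTF-8: it writes characters up to U+00FF as a single %XX (effectively Latin-1) and anything above that as a non-standard %uXXXX sequence. Use encodeURI or encodeURIComponent instead, both of which percent-encode the UTF-8 bytes of the string. encodeURIComponent is probably the one most suitable for your needs.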


1

JavaScript's escape is URL-encoding your string using the Latin-1 charset, while Java is encoding and decoding it as UTF-8.

URL encoding really just replaces each byte that is not safe to put in a URL with % followed by that byte's two-digit hex value. Even if you stick to ASCII characters, ( can be encoded as %28. Once you start using non-ASCII characters (anything that does not fit in 7 bits), you have the additional problem of character sets, because the charset decides which bytes stand for each character: ö is the single byte F6 in Latin-1 but the two bytes C3 B6 in UTF-8.
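To make the charset mismatch concrete, here is a minimal sketch in Java. It assumes Java 10 or later for the Charset overloads of URLEncoder.encode and URLDecoder.decode (on older versions the String-named overloads from the question behave the same way), and the class and method names are made up for illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class UrlCharsetDemo {
    // "Produktgröße", written with Unicode escapes so the source file's encoding cannot interfere.
    static final String WORD = "Produktgr\u00f6\u00dfe";

    // Percent-encode the text using the bytes of the given charset.
    static String encode(String text, Charset charset) {
        return URLEncoder.encode(text, charset);
    }

    // Percent-decode the text, interpreting the decoded bytes in the given charset.
    static String decode(String text, Charset charset) {
        return URLDecoder.decode(text, charset);
    }

    public static void main(String[] args) {
        // Latin-1 uses one byte per character, which matches what escape() produced.
        System.out.println(encode(WORD, StandardCharsets.ISO_8859_1)); // Produktgr%F6%DFe

        // UTF-8 uses two bytes each for the two non-ASCII characters.
        System.out.println(encode(WORD, StandardCharsets.UTF_8)); // Produktgr%C3%B6%C3%9Fe

        // Decoding the escape() output as Latin-1 round-trips correctly.
        String latin1 = decode("Produktgr%F6%DFe", StandardCharsets.ISO_8859_1);
        System.out.println(latin1.equals(WORD)); // true

        // Decoding the same input as UTF-8 does not: F6 DF is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        String utf8 = decode("Produktgr%F6%DFe", StandardCharsets.UTF_8);
        System.out.println(utf8.equals(WORD)); // false
    }
}
```

The booleans are printed rather than the decoded strings so that the result does not depend on the console's own encoding.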


1

I struggled with this problem for hours on end. In my case it came from a jQuery Ajax call like this:

return $.ajax({
    url: '/author!getAuthorContent.action',
    type: 'GET',
    data: { author: name, content_type: ct || 'all', start: start || 0 }
});

Here name is a string that contains special characters, like Jérôme-Serrano.

For some reason, the way JS/jQuery encoded these special characters was incompatible with my setup, and I couldn't decode them on the Java back end.

The solution was:

  • Encode the value on the JS side using var encoded = encodeURIComponent(name);
  • Decode it on the Java side using String decoded = java.net.URLDecoder.decode(encoded, "UTF-8");
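As a rough sketch of the Java half of this fix: the string literal below is what encodeURIComponent('Jérôme-Serrano') returns in the browser, and the class and method names are invented for illustration. It assumes Java 10 or later for the Charset overload. With the String overload shown above the result is the same, but you must handle UnsupportedEncodingException.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from the original answer.
class AuthorParamDemo {
    // Decodes a value produced by encodeURIComponent, which always emits UTF-8 bytes.
    static String decodeComponent(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Output of encodeURIComponent('Jérôme-Serrano') on the JS side.
        String fromBrowser = "J%C3%A9r%C3%B4me-Serrano";

        String name = decodeComponent(fromBrowser);

        // Compare against the expected text, written with Unicode escapes
        // so the check does not depend on source or console encoding.
        System.out.println(name.equals("J\u00e9r\u00f4me-Serrano")); // true
    }
}
```

URLDecoder turns a literal + into a space, but encodeURIComponent writes spaces as %20 and a real plus sign as %2B, so the two sides stay compatible.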

Some references:
http://www.programering.com/a/MjN2ADOwATg.html
http://www.theerrormessage.com/2013/10/weird-characters-transmitted-to-and-from-server-through-jquery-ajax-call/
