Is there a native way to HTML escape character entities in javascript?

Question

The Unicode character 𠮵, given by code point 134069, has the HTML escape &#134069;

Is there a (preferably native) way to get the HTML escapes for such characters from JavaScript?

Gremash · Accepted Answer · 2019-04-09 23:40:36Z

2

You can get both the code point and its hex value for the character like this:

var codePoint = '𠮵'.codePointAt(0); //codePoint = 134069
var hexValue = '𠮵'.codePointAt(0).toString(16); //hexValue = 20bb5
var htmlEscape = '&#x' + hexValue + ';'; //htmlEscape = &#x20bb5;
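The decimal escape from the question (&#134069;) and the hex escape above (&#x20bb5;) refer to the same code point, so either form can be used. A minimal sketch of a helper that produces both; the name `toNumericEntity` and its `hex` flag are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: numeric HTML escape for the first code point of `ch`.
function toNumericEntity(ch, hex) {
  var codePoint = ch.codePointAt(0);
  if (codePoint === undefined) {
    return ''; // empty string: nothing to escape
  }
  return hex
    ? '&#x' + codePoint.toString(16) + ';'
    : '&#' + codePoint + ';';
}

console.log(toNumericEntity('\u{20BB5}', false)); // &#134069;
console.log(toNumericEntity('\u{20BB5}', true));  // &#x20bb5;
```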

Here is a working example:

$('#doIt').click(function() {
  var text = $('#inputText').val();
  if (!text) { return; } // empty input: codePointAt(0) would be undefined
  var hex = text.codePointAt(0).toString(16);
  $('#outputHex').html(hex);                      // hex code point
  $('#outputString').html('&amp;#x' + hex + ';'); // the escape, shown literally
  $('#outputChar').html('&#x' + hex + ';');       // the escape, rendered as the character
});

code {
  display: block;
  padding: 4px;
  background-color: #EFEFEF;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="inputText"></textarea>
<button id="doIt">do it</button>

<h3>result</h3>
<code id="outputHex"></code>
<code id="outputString"></code>
<code id="outputChar"></code>

One more thing: codePointAt is an ES6 method and isn't supported in older browsers. In case the browser blocks the snippet from running here, there is also a JSFiddle example.
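For browsers without codePointAt, the code point can be rebuilt from the UTF-16 surrogate pair using charCodeAt, which predates ES6. The following is a sketch under that assumption; `codePointAtCompat` is a hypothetical name, not a built-in:

```javascript
// Hypothetical fallback: code point at `index` using only charCodeAt.
function codePointAtCompat(str, index) {
  var first = str.charCodeAt(index);
  if (isNaN(first)) {
    return undefined; // index out of range
  }
  // A high surrogate (0xD800-0xDBFF) followed by a low surrogate
  // (0xDC00-0xDFFF) encodes one code point above U+FFFF.
  if (first >= 0xD800 && first <= 0xDBFF && index + 1 < str.length) {
    var second = str.charCodeAt(index + 1);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      return (first - 0xD800) * 0x400 + (second - 0xDC00) + 0x10000;
    }
  }
  return first; // BMP character or unpaired surrogate
}

console.log(codePointAtCompat('\uD842\uDFB5', 0)); // 134069
```

The string '\uD842\uDFB5' is the surrogate-pair spelling of 𠮵, written that way so the example does not depend on ES6 escape syntax.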

edited Apr 9, 2019 at 23:40

answered Apr 1, 2016 at 23:01

Gremash


2 Comments

Startec Over a year ago

This is a good idea; however, I do not think it works for all HTML entities (i.e. & is more complicated, correct?).

Gremash Over a year ago

No, this should work for any character. There are special characters, like &, that have named shortcuts, i.e. &amp;, but the numeric form will also work. I will update my answer with a JSFiddle example.

trincot · Accepted Answer · 2016-04-01 23:26:51Z

2

Here is a function that converts all non-ASCII characters (code points 128 and above), as well as <, > and &, to numeric HTML entities:

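function htmlEntities(s) {
    // Array.from splits the string by code point, not by UTF-16 code unit
    return Array.from(s).map(function (c) {
        var codePoint = c.codePointAt(0);
        // keep plain ASCII as-is, except the markup characters <, & and >
        return codePoint < 128 && '<&>'.indexOf(c) === -1
            ? c
            : '&#x' + codePoint.toString(16) + ';';
    }).join('');
}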

var s = 'This is \u{20BB5}, a special character & encoded in HTML.';
document.body.innerHTML = htmlEntities(s);

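Be aware that JavaScript strings are sequences of UTF-16 code units, so a character outside the Basic Multilingual Plane (such as 𠮵) is stored as a surrogate pair and counts as two units, for example in length. ES6 constructs like Array.from(s) and [...s] iterate by code point, which makes sure you get the right chunks.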
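A short illustration of that difference, using the character from the question (the variable name `ch` is only for this example):

```javascript
var ch = '\u{20BB5}'; // one character outside the Basic Multilingual Plane

console.log(ch.length);                      // 2 (UTF-16 code units)
console.log(Array.from(ch).length);          // 1 (code points)
console.log(ch.charCodeAt(0).toString(16));  // d842 (high surrogate only)
console.log(ch.codePointAt(0).toString(16)); // 20bb5 (full code point)
```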

answered Apr 1, 2016 at 23:26

trincot


2 Comments

Alan H. Over a year ago

Wow, this will needlessly bloat HTML for many languages. 99.9% of non-ASCII characters do not need to be encoded as entities. For example, this would make Chinese text take at least 5× more HTML bytes. To anyone reading this, be sure that this is really the solution you need. It probably isn't.

trincot Over a year ago

@AlanH., that seemed to be what the OP was asking for: they gave 𠮵 as an example, which indeed does not need to be encoded as an entity, yet the OP asked for it. It would be more useful to address such a comment to the OP...
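For the case raised in the comments above, where non-ASCII text should pass through unchanged, a minimal sketch could escape only the characters that are significant in HTML markup. It assumes the page is served with an encoding such as UTF-8 that can represent the remaining characters directly; `escapeHtmlMinimal` and `HTML_ESCAPES` are hypothetical names, not part of either answer:

```javascript
// Hypothetical helper: escape only the characters that are significant in
// HTML text and quoted attribute values; everything else passes through.
var HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtmlMinimal(s) {
  return s.replace(/[&<>"']/g, function (c) {
    return HTML_ESCAPES[c];
  });
}

console.log(escapeHtmlMinimal('\u{20BB5} & <b>"x"</b>'));
// 𠮵 &amp; &lt;b&gt;&quot;x&quot;&lt;/b&gt;
```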
