The Unicode character 𠮵, given by code point 134069, has the HTML escape &#x20BB5;
Is there a (preferably native) way to get the HTML escapes for characters like this from JavaScript?
You can get both the point and hex values of the char like this:
var codePoint = '𠮵'.codePointAt(0); //codePoint = 134069
var hexValue = '𠮵'.codePointAt(0).toString(16); //hexValue = 20bb5
var htmlEscape = '&#x' + hexValue + ';'; //htmlEscape = &#x20bb5;
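If the decimal form is wanted as well (the question quotes the decimal code point 134069), the same idea can be wrapped in a small helper. This is only a minimal sketch; the name toHtmlEscape and its useDecimal parameter are made up for illustration and are not part of any library:

```javascript
// Hypothetical helper (not a built-in): build a numeric character reference
// for the first code point of `ch`, hexadecimal by default, decimal on request.
function toHtmlEscape(ch, useDecimal) {
  var codePoint = ch.codePointAt(0);
  return useDecimal
    ? '&#' + codePoint.toString(10) + ';'
    : '&#x' + codePoint.toString(16) + ';';
}

var hexEscape = toHtmlEscape('\u{20BB5}');       // '&#x20bb5;'
var decEscape = toHtmlEscape('\u{20BB5}', true); // '&#134069;'
```

Both forms refer to the same character; HTML accepts either.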
Here is a working example:
$('#doIt').click(function() {
  // Hex code point of the first character typed into the textarea
  var hex = $('#inputText').val().codePointAt(0).toString(16);
  $('#outputHex').text(hex);
  // .text() shows the escape literally; .html() lets the browser render it
  $('#outputString').text('&#x' + hex + ';');
  $('#outputChar').html('&#x' + hex + ';');
});
code {
  display: block;
  padding: 4px;
  background-color: #EFEFEF;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="inputText"></textarea>
<button id="doIt">do it</button>
<h3>result</h3>
<code id="outputHex"></code>
<code id="outputString"></code>
<code id="outputChar"></code>
One more thing: codePointAt is an ES6 function and isn't supported in older browsers. In case the browser blocks the code from running here: JSFiddle Example
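For browsers without codePointAt, the code point can be computed by hand from the two UTF-16 surrogates using charCodeAt, which has been available much longer. A rough sketch, assuming a hypothetical helper name codePointOf (this is not a complete polyfill and has not been tested against old browsers):

```javascript
// Hypothetical fallback for engines without String.prototype.codePointAt.
// Returns the code point that starts at `index` (default 0).
function codePointOf(str, index) {
  index = index || 0;
  var high = str.charCodeAt(index);
  // A high (lead) surrogate followed by a low (trail) surrogate
  // encodes a single code point above U+FFFF.
  if (high >= 0xD800 && high <= 0xDBFF && index + 1 < str.length) {
    var low = str.charCodeAt(index + 1);
    if (low >= 0xDC00 && low <= 0xDFFF) {
      return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
    }
  }
  // Ordinary BMP character (or an unpaired surrogate): return the unit itself.
  return high;
}

// '\uD842\uDFB5' is the surrogate pair for U+20BB5, written with ES5 escapes.
var fallbackPoint = codePointOf('\uD842\uDFB5'); // 134069
var fallbackEscape = '&#x' + fallbackPoint.toString(16) + ';'; // '&#x20bb5;'
```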
& is more complicated correct)?
Here is a function that converts all non-ASCII characters (code points 128 and above), as well as <, > and &, to HTML entities:
function htmlEntities(s) {
  return Array.from(s).map(function (c) {
    return c.codePointAt(0) < 128 && '<&>'.indexOf(c) == -1
      ? c
      : '&#x' + c.codePointAt(0).toString(16) + ';';
  }).join('');
}
var s = 'This is \u{20BB5}, a special character & encoded in HTML.';
document.body.innerHTML = htmlEntities(s);
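Be aware that in JavaScript strings, characters outside the Basic Multilingual Plane (code points above U+FFFF) are stored as surrogate pairs and are counted as two characters (for example in length). ES6 constructs like Array.from(s) and [...s] iterate by code point, so they make sure you get the right chunks.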
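The difference is easy to see on the character from the question. A small illustration (the variable names are arbitrary):

```javascript
var ch = '\u{20BB5}';

var unitCount = ch.length;                    // 2: UTF-16 code units
var pointCount = Array.from(ch).length;       // 1: code points
var spreadCount = [...ch].length;             // 1: spread iterates by code point too

var firstUnit = ch.charCodeAt(0).toString(16);   // 'd842': only the high surrogate
var fullPoint = ch.codePointAt(0).toString(16);  // '20bb5': the whole character
```

This is why splitting with s.split('') or indexing with s[i] would produce two broken halves for such a character, while Array.from keeps it intact.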
𠮵, which indeed does not need to be encoded as an entity, but the OP asked for it anyway. It would be more useful if you addressed such a comment to the OP...