For a long time we used a naive approach to split strings into characters in JS:

someString.split('');

But the popularity of emoji forced us to change this approach: emoji characters (and other non-BMP characters) like 😂 are made of two "characters" (UTF-16 code units).

String.fromCodePoint(128514).split(''); // array of 2 characters; can't embed due to StackOverflow limitations
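
To see where the two "characters" come from, here is a minimal sketch that inspects the UTF-16 code units behind 😂. The variable names are illustrative only.

```javascript
// U+1F602 (decimal 128514) lies outside the BMP, so UTF-16 stores it as a surrogate pair.
const face = String.fromCodePoint(128514);

console.log(face.length); // 2 (code units, not user-visible characters)

// split('') cuts between the two surrogates, producing two lone halves.
const halves = face.split('');
console.log(halves.map((unit) => unit.charCodeAt(0).toString(16))); // ['d83d', 'de02']

// Reading the string by code point recovers the original value.
console.log(face.codePointAt(0)); // 128514
```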

So what is the modern, correct and performant approach to this task?

4
  • I'm curious. Which StackOverflow limitations are you talking about? Commented Feb 5, 2016 at 11:44
  • It seems I couldn't post the question with the result of the JSON.stringify(String.fromCodePoint(128514).split('')) expression: it caused a "Malformed URI" error thrown from jQuery and prevented me from posting the question. Commented Feb 5, 2016 at 11:48
  • @MrLister: I have added Meta post. Commented Feb 5, 2016 at 11:56
  • 1
    see mathiasbynens.be/notes/javascript-unicode for the big picture Commented Feb 5, 2016 at 11:57

5 Answers

25

Using spread in an array literal:

const str = "🌍🤖😸🎉";
console.log([...str]);

Using for...of:

function split(str) {
  const arr = [];
  for (const char of str) {
    arr.push(char);
  }
   
  return arr;
}

const str = "🌍🤖😸🎉";
console.log(split(str));

2 Comments

I think it should be noted that this unfortunately still doesn't cover a lot of emojis currently in use. E.g. [...'👱🏽‍♀️'] becomes ["👱", "🏽", "‍", "♀", "️"]. Which means no e.g. straightforward string reversal or symbol-wise comparison is possible.
See github.com/orling/grapheme-splitter as an example library, mind the open issues regarding zero-width-joiners. Maybe there's a newer library out there.
11

JavaScript has a newer API called Intl.Segmenter (part of the ECMAScript Internationalization API, ECMA-402) that allows you to split strings based on graphemes (the user-perceived characters of a string). With this API, your split might look like so:

const split = (str) => {
  const itr = new Intl.Segmenter("en", {granularity: 'grapheme'}).segment(str);
  return Array.from(itr, ({segment}) => segment);
}
// See browser console for output
console.log(split('😂')); // ['😂']
console.log(split('é')); // ['é']
console.log(split('👨‍👩‍👦')); // ['👨‍👩‍👦']
console.log(split('❤️')); // ['❤️']
console.log(split('👱🏽‍♀️')); // ['👱🏽‍♀️']

This lets you handle not only emojis encoded as two UTF-16 code units, such as 😂, but also composite characters (e.g. é), sequences joined by ZWJs (e.g. 👨‍👩‍👦), characters with variation selectors (e.g. ❤️), and characters with emoji modifiers (e.g. 👱🏽‍♀️). None of these can be handled by invoking the string iterator (via spread ..., for...of, Symbol.iterator, etc.) as in the other answers, because that only iterates the code points of your string.
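
Since Intl.Segmenter may be missing in some environments, one option is to feature-detect it and degrade to code-point splitting. The sketch below is only an illustration: splitGraphemes is a hypothetical helper name, and the fallback branch will still break ZWJ sequences and modifiers apart.

```javascript
// Hypothetical helper: prefers grapheme segmentation, falls back to code points.
function splitGraphemes(str, locale = 'en') {
  if (typeof Intl !== 'undefined' && typeof Intl.Segmenter === 'function') {
    const segmenter = new Intl.Segmenter(locale, { granularity: 'grapheme' });
    return Array.from(segmenter.segment(str), ({ segment }) => segment);
  }
  // Fallback: code points only, so multi-code-point emoji are still split.
  return Array.from(str);
}

// Family emoji written with escapes: man, ZWJ, woman, ZWJ, boy.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}';

// Reversing by grapheme keeps the family intact when Intl.Segmenter is available.
const reversed = splitGraphemes('a' + family + 'b').reverse().join('');
console.log(reversed);
```

This also covers the string-reversal case mentioned in the comments on the other answers, as long as the segmenter branch is the one that runs.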

1 Comment

Upvoted! Unfortunately not supported in Firefox as of this comment.
10

The best approach to this task is to use the native String.prototype[Symbol.iterator], which iterates by Unicode code point rather than by UTF-16 code unit. Consequently, a clean and easy way to split a string into Unicode characters is to call Array.from on it, e.g.:

const string = String.fromCodePoint(128514, 32, 105, 32, 102, 101, 101, 108, 32, 128514, 32, 97, 109, 97, 122, 105, 110, 128514);
Array.from(string);
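
To make the difference visible, this small sketch compares the code-unit length with the number of entries Array.from produces for the same sample string. The names sample, codePoints and hex are illustrative.

```javascript
// Same sample string as above: 18 code points, three of which are U+1F602.
const sample = String.fromCodePoint(128514, 32, 105, 32, 102, 101, 101, 108, 32, 128514, 32, 97, 109, 97, 122, 105, 110, 128514);

const codePoints = Array.from(sample);

console.log(sample.length);     // 21: each emoji counts as two UTF-16 code units
console.log(codePoints.length); // 18: one entry per code point

// Array.from also accepts a mapping function, applied to each code point.
const hex = Array.from(sample, (ch) => ch.codePointAt(0).toString(16));
console.log(hex[0]); // '1f602'
```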

1 Comment

same, the [...'👱🏽‍♀️'] becomes ["👱", "🏽", "‍", "♀", "️"]
5

The u flag was introduced in ECMAScript 2015 to make regular expressions Unicode-aware.

Adding u to your regex makes . match a complete code point instead of a single UTF-16 code unit.

const withFlag = `AB😂DE`.match(/./ug);
const withoutFlag = `AB😂DE`.match(/./g);

console.log(withFlag, withoutFlag);

There's a little more about it here
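
A couple of caveats of the regex approach are worth a quick sketch. The input strings here are made up for illustration; the s (dotAll) flag is a later addition (ES2018), not part of the original u flag.

```javascript
const text = 'AB\u{1F602}DE'; // 'AB😂DE'

console.log(text.match(/./gu)); // ['A', 'B', '😂', 'D', 'E']
console.log(text.match(/./g).length); // 6: the emoji is split into two surrogates

// Caveat: "." skips line terminators, so newlines silently disappear from the result.
console.log('a\nb'.match(/./gu)); // ['a', 'b']

// The s (dotAll) flag keeps them.
console.log('a\nb'.match(/./gsu)); // ['a', '\n', 'b']
```

Note also that match returns null rather than an empty array when nothing matches, so ''.match(/./gu) needs a guard before you iterate over the result.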

1 Comment

same, '👱🏽‍♀️'.match(/./ug) becomes ["👱", "🏽", "‍", "♀", "️"]
0

I did something like this in a project where I had to support older browsers and an ES5 minifier; it will probably be useful to others:

    if (Array.from && window.Symbol && window.Symbol.iterator) {
        array = Array.from(input[window.Symbol.iterator]());
    } else {
        array = ...; // maybe `input.split('');` as fallback if it doesn't matter
    }
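
If the fallback does matter, one possibility is to pair up surrogates by hand using only ES5 features. This is a sketch, not the code from the snippet above: splitCodePointsES5 is a hypothetical name, and like Array.from it works at code-point level only, so ZWJ sequences and modifiers are still split.

```javascript
// Hypothetical ES5-compatible fallback: keeps surrogate pairs together.
function splitCodePointsES5(input) {
  var result = [];
  for (var i = 0; i < input.length; i++) {
    var code = input.charCodeAt(i);
    var next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
    // A high surrogate followed by a low surrogate forms one code point.
    if (code >= 0xD800 && code <= 0xDBFF && next >= 0xDC00 && next <= 0xDFFF) {
      result.push(input.charAt(i) + input.charAt(i + 1));
      i++; // skip the low surrogate that was just consumed
    } else {
      result.push(input.charAt(i));
    }
  }
  return result;
}

console.log(splitCodePointsES5('AB\uD83D\uDE02DE')); // ['A', 'B', '😂', 'D', 'E']
```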
