While profiling my code, I found that the following function allocates more than 1GB of RAM in the latest Chrome version (in private mode) when "array" is about 33MB. The exact size doesn't really matter; that just happens to be the size of the file I was testing with. I don't know how to generate such a big Uint8Array in code for you to test, so the code below cannot be run as is, but maybe you can understand it anyway and help me with this.

    const bytesToString = function (array) {
      let uint8Array = new Uint8Array(array);
      let length = uint8Array.byteLength;

      let stringToEncode = "";

      for (let i = 0; i < length; i++) {
        stringToEncode += String.fromCharCode(uint8Array[i]);
      }

      return stringToEncode;
    };
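
For anyone who wants to reproduce this without the original file, an input of similar size could be generated along these lines. This is only a sketch: `makeTestBytes` is a hypothetical helper, and the repeating 0..255 pattern is an assumed stand-in for the real file contents, which may behave differently.

```javascript
// Hypothetical helper (not part of the original code): builds a Uint8Array
// of the requested size filled with a repeating 0..255 byte pattern.
function makeTestBytes(sizeInBytes) {
  const bytes = new Uint8Array(sizeInBytes);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = i % 256;
  }
  return bytes;
}

// Roughly the 33MB mentioned above.
const testBytes = makeTestBytes(33 * 1024 * 1024);
```

The result can then be passed to `bytesToString(testBytes)` while watching memory in the profiler.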

When I comment out the for loop, RAM consumption stays at the same level while my code runs; as soon as the loop is active, consumption explodes to over 1GB. This memory does get garbage collected at some point, but I have a general memory problem where the browser eventually crashes because of excessive memory consumption, and I am trying to figure out whether this function is the cause. In Chrome's performance analyzer I can see that GC is called many times: there are many "Minor GC" entries and, at some point near the end, a "Major GC". I don't know how Chrome's GC works, so I was wondering whether a "Minor GC" doesn't actually free RAM but only marks it as collectable, with the RAM really being freed only by the later "Major GC". If that is the case, I suppose that between calling this function and the "Major GC" my code runs something else that also needs more RAM than usual, and then the browser crashes. So the question is: is there a better implementation for my function, or can I manipulate the GC? As far as I could read, I cannot.

  • "can I manipulate the GC?" you should rather write more performant code ... Commented Apr 19, 2019 at 15:59
  • @Jonas Wilms Yes, of course, the question is more for curiosity and for testing purposes. Commented Apr 19, 2019 at 16:12

2 Answers

Strings in JS are immutable, so every time you append a character, a new string is created that is 1 character longer than the previous one. Those intermediate strings are garbage, but they are not necessarily all reclaimed right away, so in the meantime you're stuck with tons of strings of various lengths.

You need another way of combining strings. In this case your whole function could be written as String.fromCharCode(...array). (If you actually want to make a string from binary data, though, consider using TextDecoder instead, which supports various encodings; the caveat is that it is not available in every environment, for example IE11 or older Node.js versions.)
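
As a rough illustration of the TextDecoder route, here is a minimal sketch that assumes the bytes really are UTF-8 text. The helper name `bytesToUtf8String` is hypothetical. Note that this is not a drop-in replacement for the per-byte loop: bytes of 0x80 and above are interpreted as parts of multi-byte UTF-8 sequences rather than mapped one-to-one to characters.

```javascript
// Hypothetical helper: decodes the bytes as UTF-8 text in a single call
// instead of concatenating one character per byte.
function bytesToUtf8String(array) {
  const bytes = new Uint8Array(array);
  return new TextDecoder("utf-8").decode(bytes);
}

// Sample input: "Hi " followed by the three-byte UTF-8 encoding of the euro sign.
const sample = new Uint8Array([72, 105, 32, 226, 130, 172]);
console.log(bytesToUtf8String(sample));
```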

Update: String.fromCharCode(...array) doesn't seem to work for very large arrays (there is a limit to the number of arguments a function call can take), so instead you could try mapping the array into 1-character strings and then joining them together:

    Array.prototype.map.call(uint8Array, c => String.fromCharCode(c)).join("")

(Note the use of Array.prototype.map instead of uint8Array.map: the latter returns another Uint8Array, so the 1-character strings would be coerced back into Uint8 values.)
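
Another way around the argument limit is to call String.fromCharCode on fixed-size chunks and join the partial strings, which avoids creating one array element per byte. This is a sketch only: `bytesToBinaryString` is a hypothetical name, `CHUNK_SIZE` is an assumed conservative value rather than a documented limit, and its memory behavior on a 33MB input has not been measured here.

```javascript
// Assumed chunk size, chosen to stay well below typical engine argument limits.
const CHUNK_SIZE = 0x8000; // 32768 bytes per String.fromCharCode call

// Hypothetical helper: converts bytes to a string, one character per byte,
// by processing the input in chunks instead of byte by byte.
function bytesToBinaryString(array) {
  const bytes = new Uint8Array(array);
  const parts = [];
  for (let offset = 0; offset < bytes.length; offset += CHUNK_SIZE) {
    // subarray creates a view on the same buffer; no bytes are copied here.
    const chunk = bytes.subarray(offset, offset + CHUNK_SIZE);
    parts.push(String.fromCharCode.apply(null, chunk));
  }
  return parts.join("");
}
```

The result should be the same string the original loop produces, with far fewer intermediate strings along the way.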

4 Comments

I already tried to work with TextDecoder in the past, but it didn't work out in the end because of how I was then using the result (encoding problems, etc.).
I tested it on the fly and String.fromCharCode(...array) seems much better than the loop: with a 44MB array, a ~740MB RAM peak instead of 1.4GB with the loop.
Thanks for the update, since minutes ago I was surprised to see that fromCharCode(...array) was returning "" after all. Could you please elaborate on your update? I'm still not a JS pro and cannot fully read that solution. Also, what is "c"? The updated solution also doesn't work in IE11.
And I have tested this now, and it is actually worse than my solution in terms of memory consumption. I suppose it is similar to the following, which I had already tried and which actually needs more RAM than the code in my original post: let length = uint8Array.byteLength; let strings = []; strings.length = length; for (let i = 0; i < length; i++) { strings[i] = String.fromCharCode(uint8Array[i]);} let result=strings.join('');

I think TextDecoder is probably the proper solution. But if you insist, you could also try creating a blob and then reading from it.

    let blob = new Blob([arrayBuffer], {type: 'application/octet-stream'});
    let reader = new FileReader();
    reader.onload = function (event) {
      // event.target.result holds the resulting string
      console.log(event.target.result);
    };
    // Call only ONE of the following; a FileReader can run only one read at a time.
    // Use if you want the bytes decoded as UTF-8 text:
    // reader.readAsText(blob);
    // Use if you for example need to use the result with "window.btoa" as it was in my case.
    reader.readAsBinaryString(blob);

2 Comments

I edited the answer to reflect the solution I ended up with based on your proposal. I think this is currently the best I could find.
The only problem is that "readAsBinaryString" is actually more or less deprecated: developer.mozilla.org/en-US/docs/Web/API/FileReader/…
