Is there a better/faster way to achieve this?
Yes: Do it on a server, not in the browser. This also has the advantage that you can do it once and reuse the information.
But assuming for some reason that's not possible:
The first thing you can do is stop making several million unnecessary function calls by only doing $(this) once per loop:
.....each(function () {
var $this = $(this);
words.push({
"page": pagenumber,
"content": word,
"hpos": $this.attr("HPOS"),
"vpos": $this.attr("VPOS"),
"width": $this.attr("WIDTH"),
"height": $this.attr("HEIGHT")
});
});
Normally repeatedly doing that isn't a big deal (though I'd avoid it anyway), but if you're doing this one and a quarter million times...
And if all of those attributes really are attributes, you can avoid the call entirely by cutting jQuery out of the middle:
.....each(function () {
words.push({
"page": pagenumber,
"content": word,
"hpos": this.getAttribute("HPOS"),
"vpos": this.getAttribute("VPOS"),
"width": this.getAttribute("WIDTH"),
"height": this.getAttribute("HEIGHT")
});
});
jQuery's attr function is great and smooths over all sorts of cross-browser hassles with certain attributes, but I don't think any of those four needs special handling, not even on IE, so you can just use DOM's getAttribute directly.
The next thing is that some JavaScript engines execute push more slowly than assigning to the end of the array. These two statements do the same thing:
myarray.push(entry);
// and
myarray[myarray.length] = entry;
But other engines process push as fast or faster than the assignment (it is, after all, a ripe target for optimization). So you might look at whether IE does push more slowly and, if so, switch to using assignment instead.