2

Suppose I have the following string: const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

I'd like to remove the content within all HTML tags in that string. I have tried test.replace(/(<([^>]+)>)/gi, ''), but this only removes the HTML tags themselves, not the content within them. I would expect the result to be just 'This is outside the HTML tag.'.
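
For reference, a minimal reproduction of that attempt (tagsOnly is just an illustrative variable name, not part of any real code base):

```javascript
const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

// The pattern matches each tag (<title> and </title>) on its own,
// so only the tags are removed and the text between them survives.
const tagsOnly = test.replace(/(<([^>]+)>)/gi, '');

console.log(tagsOnly);
// → "This is outside the HTML tag. How to remove an HTML element using JavaScript ?"
```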

Is it possible to remove HTML tags and their contents from a string?

  • Does this answer your question? Strip HTML from Text JavaScript Commented Nov 21, 2022 at 10:46
  • @SimoneRossaini nope, that does the same thing as the regex in my post. Commented Nov 21, 2022 at 10:55

3 Answers

3

Rather than trying to remove the HTML element with a regex, it's much more straightforward to create a detached <div> element and populate it with the string, using:

let myDiv = document.createElement('div');
myDiv.innerHTML = test;

and then remove the <title> element from that, using:

const myDivTitle = myDiv.querySelector('title');
myDiv.removeChild(myDivTitle);

Working Example (One Element):

const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

let myDiv = document.createElement('div');
myDiv.innerHTML = test;
const myDivTitle = myDiv.querySelector('title');
myDiv.removeChild(myDivTitle);
const testAfter = myDiv.innerHTML;
console.log(testAfter);


The above works for one element (<title>) but you stated:

I'd like to remove the content within all HTML tags in that string

so let's try something more ambitious, using:

myDiv.querySelectorAll('*')

Working Example (All Elements):

const test = "<title>How to remove an HTML element using JavaScript ?</title> This is outside the HTML tag. <h1>Here we go...</h1> So is this. <p>This is going to save a lot of time trying to come up with regex patterns</p> This too.";

let myDiv = document.createElement('div');
myDiv.innerHTML = test;
const myDivElements = myDiv.querySelectorAll('*');

// remove() detaches each element from its own parent, so nested elements don't throw
for (let myDivElement of myDivElements) {
  myDivElement.remove();
}

const testAfter = myDiv.innerHTML;
console.log(testAfter);

4 Comments

Thank you very much! This is indeed a very elegant solution.
@Adam - Happy to have saved you from Tony the Pony
Variable declaration missing from the following line: for (myDivElement of myDivElements) { -> for (let myDivElement of myDivElements) {
Nicely caught, @christopher.theagen - thank you. Updated.
2

Try it like this:

var html = "<p>Hello, <b>Friends</b>";
var div = document.createElement("div");
div.innerHTML = html;
alert(div.innerText); // Hello, Friends

2 Comments

Thank you for your answer. However, my question was to remove all the contents within the HTML tags (which I believe this does not do, unless I am mistaken).
Works perfectly for me with both the original example and a much bigger string.
1

You can remove everything from the opening tag through the closing tag by putting a wildcard (.*) between two copies of your tag pattern:

const test = "This is outside the HTML tag. <title>How to remove an HTML element using JavaScript ?</title>";

console.log(test.replace(/(<([^>]+)>).*(<([^>]+)>)/, ''))
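
One caveat: the .* above is greedy and the pattern has no g flag, so on a string with several elements it matches from the first tag to the last one and also deletes the text between elements. A possible variation, offered only as a sketch, uses a lazy quantifier and a backreference so each opening tag is paired with its own closing tag. The removeElements helper name and the whitespace clean-up are assumptions of this sketch, and it does not handle an element nested inside another element of the same name.

```javascript
// Hypothetical helper (not from the answer above): removes each
// <tag ...>...</tag> pair together with its content.
function removeElements(html) {
  // ([a-z][a-z0-9-]*) captures the tag name; \1 requires the same name
  // in the closing tag; [\s\S]*? lazily matches the content in between.
  const element = /<([a-z][a-z0-9-]*)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
  return html
    .replace(element, '')
    .replace(/\s+/g, ' ') // collapse the gaps left behind by removed elements
    .trim();
}

const input = "<title>How to remove an HTML element using JavaScript ?</title> This is outside the HTML tag. <h1>Here we go...</h1> So is this. <p>This is going to save a lot of time trying to come up with regex patterns</p> This too.";

console.log(removeElements(input));
// → "This is outside the HTML tag. So is this. This too."
```

For arbitrary or malformed HTML, a DOM-based approach is likely to be more robust than any regex.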

2 Comments

Thank you for your answer! This works in this particular example (which is what I had asked for). I have merely accepted the other response because it caters for a more complex example.
Here's a simpler regex: test.replace(/<.+?>.*<\/.+>/, '').trim()
