I would like to remove all line comments (//) and block comments (/* */) from a string. However, comment markers that appear inside '', "", [] or `` delimiters should not be removed.

Examples:

='a' &
'//b/*aasdsa //dsfds*/'& //comment
'//d' /*aasdsa //dsfds*/ & '//e' & /*aasdsa //dsfds*/ 
& 'c'
[// /* [] */] & h
`// /* [] */` & p

Should be:

='a' &
'//b'& 
'//d'  & '//e' &  
& 'c'
 & h
 & p

I have tried different solutions but haven't gotten very far.

// a template literal is needed here because the string spans several lines
let text = `='a' &
'//b'& //comment expression 
'//d' /*aasdsa //dsfds*/ & '//e' & /*aasdsa //dsfds*/ 
& 'c'`;
let arrayText = text.split('\n');
// drop lines that start with a line comment
arrayText = arrayText.filter(a => a.indexOf('//') !== 0);
arrayText = arrayText.map(x =>
x.replace(/[^\(?<=\').*(?=\'$)|\(?<=\[).*(?=\]$](\/\*[\s\S]*?\*\/|\/\/.*)/gm, ''));
text = arrayText.join(' ');

With the solution that I tried I just got the following:

Text:

='a' &
'//b/*aasdsa //dsfds*/'& //comment
'//d' /*aasdsa //dsfds*/ & '//e' & /*aasdsa //dsfds*/ 
& 'c'
[// /* [] */] & h
`// /* [] */` & p

Result with my solution (which doesn't work):

='a' &
'//b'&
'//d' & '//e' & 
& 'c'
[//]

Expected result:

='a' &
'//b/*aasdsa //dsfds*/'& 
'//d'  & '//e' &  
& 'c'
[// /* [] */] & h
`// /* [] */` & p

I would appreciate it if somebody could point out what I am missing or offer any other hint.

3 Comments

  • Your 2nd sample input contains a block comment inside of the '//b' string that gets removed. Is that intentional? Commented Feb 13, 2020 at 20:06
  • IMO regex isn't the proper tool for this. I would consider using a tokenizer/parser instead. The regex you have is already heinously complex and difficult to understand, and I fully expect you will continually be running into inputs that break the regex. Commented Feb 13, 2020 at 20:08
  • See stackoverflow.com/questions/42287216/… for an example of how much more complex your regex can get. Commented Feb 13, 2020 at 20:09

1 Answer

As mentioned in the comments, regex isn't a proper tool for something normally done by tokenizers. For this particular use case, you could write a simple rule-based parser like this:

const rules = [
  { start: '[', end: ']', remove: false },
  { start: "'", end: "'", remove: false },
  { start: '"', end: '"', remove: false },
  { start: '`', end: '`', remove: false },
  { start: '//', end: {EOL:true}, remove: true },
  { start: '/*', end: '*/', remove: true },
];

function removeComments(str) {
  let start = -1, rule = null;
  //iterate over the input, character by character
  for(let i=0; i<str.length; i++) {
    if(!rule) {
      //if not currently in a 'group' (either string or comment) search for one
      let test = rules.find(r => str.slice(i).startsWith(r.start));
      if(test) {
        rule = test;
        start = i;
      }
    } else {
      //currently in a string or comment, check if it ended
      let end = -1;
      if(str.slice(i).startsWith(rule.end)) {
        end = i + rule.end.length;
      } else if(rule.end.EOL && (str[i] === '\n' || str.startsWith('\r\n', i))) {
        //line comments end just before the line break, so the break itself is kept
        end = i;
      } else if(rule.end.EOL && i == str.length - 1) {
        //a line comment that runs to the end of the input is removed entirely
        end = i + 1;
      }
      if(end > -1) {
        if(rule.remove) {
          //modify str if it was a comment rule - cut out the comment
          str = str.slice(0,start) + str.slice(end);
          i -= end - start;
        }
        rule = null;
      }
    }
  }
  return str;
}

["='a' &",
"'//b/*aasdsa //dsfds*/'& //comment",
"'//d' /*aasdsa //dsfds*/ & '//e' & /*aasdsa //dsfds*/",
"& 'c'",
"[// /* [] */] & h",
"`// /* [] */` & p"].forEach(str => console.log(removeComments(str)));

Note that the output differs from the first expected output in your question (the one under "Should be:"), because that output does things that defy the rules you laid out: it removes everything bordered by [] and ``, as well as block comments contained in strings ('/* */').
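
If you want to run this over the whole multi-line text at once rather than line by line, a variant of the same idea is to build the output while scanning instead of cutting pieces out of the string in place. The following is only a sketch: stripComments and PROTECTED are names made up for illustration, it assumes there are no escape sequences inside strings and no nested brackets, and it treats an unterminated string or block comment as running to the end of the input.

```javascript
// Hypothetical helper (not part of the code above): removes // and /* */
// comments in a single pass and copies protected spans through unchanged.
// Maps each protected opener to its closer.
const PROTECTED = new Map([["'", "'"], ['"', '"'], ['`', '`'], ['[', ']']]);

function stripComments(text) {
  let out = '';
  let i = 0;
  while (i < text.length) {
    const closer = PROTECTED.get(text[i]);
    if (closer !== undefined) {
      // Copy the whole protected span, delimiters included.
      // Assumption: no escapes or nesting; an unclosed span runs to the end.
      const close = text.indexOf(closer, i + 1);
      const end = close === -1 ? text.length : close + 1;
      out += text.slice(i, end);
      i = end;
    } else if (text.startsWith('//', i)) {
      // Line comment: skip up to, but not including, the line break.
      while (i < text.length && text[i] !== '\n' && text[i] !== '\r') i++;
    } else if (text.startsWith('/*', i)) {
      // Block comment: skip past the closing */ (or to the end if unclosed).
      const close = text.indexOf('*/', i + 2);
      i = close === -1 ? text.length : close + 2;
    } else {
      // Ordinary character: keep it.
      out += text[i];
      i++;
    }
  }
  return out;
}

// Usage: the sample from the question, processed as one multi-line string.
const input = [
  "='a' &",
  "'//b/*aasdsa //dsfds*/'& //comment",
  "'//d' /*aasdsa //dsfds*/ & '//e' & /*aasdsa //dsfds*/ ",
  "& 'c'",
  "[// /* [] */] & h",
  "`// /* [] */` & p",
].join('\n');

console.log(stripComments(input));
```

Because line breaks are never consumed, the line structure of the input is preserved, and a block comment that spans several lines is removed as a unit, which a line-by-line call cannot do.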

1 Comment

@Joe This is already what is done by my solution. I don't understand what other requirements need to be added?
