
I'm trying to turn a string input like this:

{p}This is a paragraph{/p} {img}(path/to/image) {p}Another paragraph{/p}

into an array of objects like this:

[
  {"txt" : "This is a paragraph" },
  {"img" : "path/to/image"},
  {"txt" : "Another paragraph"}
]

I need the array to be indexed by the order they are found – i.e. in the example above the first paragraph gets index 0, image gets index 1 and so forth.

I can extract the individual strings with the code below, but I am unsure how to modify it to loop through the entire string and build the array of objects. Any pointers would be greatly appreciated.

var p = /{p}(.*){\/p}/gmi;
var i = /{img}\((.*)\)/gmi;

var test = "{p} This is a paragraph {/p} {img}(text)";

function returnJson(test) {
  // exec() returns null when nothing matches, so check for null before reading the groups
  var ps = p.exec(test);
  var im = i.exec(test);
  var arr = [];
  if (ps !== null) {
    arr.push({"txt" : ps[1]});
  }
  if (im !== null) {
    arr.push({"img" : im[1]});
  }
  return arr;
}

I was thinking of doing a recursive function, where I replace the found matches with the string. But then I am unsure of how to get the array in the order they were found. Any tips greatly appreciated.

  • Are you only dealing with paragraphs and images? Commented Mar 2, 2019 at 13:59
  • I will have one or two more elements – most likely headings and links that will be marked in a similar fashion. Commented Mar 2, 2019 at 14:01
  • Will they both close like the paragraphs? Commented Mar 2, 2019 at 14:03
  • Yes, I was thinking {h}Heading{/h} and {a}Link Text{/a}(url). I am setting the markup of the strings myself, so I'm very flexible when it comes to altering the structure. Commented Mar 2, 2019 at 14:05
  • And will they ever be nested? Or will links always be siblings to paragraphs instead of children of them? Commented Mar 2, 2019 at 14:11

1 Answer

You could use this regex:

/{(\w+)}([^{]+)(?:{\/\1})?/g

And create an array using exec like this:

let str = "{p}This is a paragraph{/p} {img}(path/to/image) {p}Another paragraph{/p}";

let regex = /{(\w+)}([^{]+)(?:{\/\1})?/g;
let match;
let matches = [];

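// exec() with the g flag resumes after the previous match and returns null at the end
while ((match = regex.exec(str)) !== null) {
    if (match[1] === "p")
      matches.push({ txt: match[2] });
    else
      matches.push({ [match[1]]: match[2] });
}

console.log(matches);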

  • {(\w+)} gets the tag name to a capturing group
  • ([^{]+) gets the content to another capturing group
  • (?:{\/\1})? optionally matches the closing tag. (\1 refers to the first capturing group)
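
To make the three parts of the pattern concrete, here is a minimal sketch that lists what each capturing group holds for every match. It uses `String.prototype.matchAll` instead of the `exec` loop (assuming an ES2020+ runtime), and the helper name `describeMatches` is made up for illustration.

```javascript
// Hypothetical helper (not part of the answer): reports what each group captured.
function describeMatches(input) {
  const regex = /{(\w+)}([^{]+)(?:{\/\1})?/g;
  // matchAll needs the g flag and yields one match array per tag, in source order.
  return Array.from(input.matchAll(regex), (m) => ({
    index: m.index, // where the opening tag starts in the input
    tag: m[1],      // group 1: the tag name
    content: m[2],  // group 2: everything up to the next "{"
  }));
}

const groups = describeMatches(
  "{p}This is a paragraph{/p} {img}(path/to/image) {p}Another paragraph{/p}"
);
console.log(groups);
```

Because group 2 simply runs until the next `{`, the `img` entry's content still includes the surrounding parentheses and the trailing space.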

3 Comments

Very close! There are parens around the image path that need to be removed, and a trailing space.
Thank you! Clever idea with the while-loop and using the different capture groups to add to the array. I modified the expression to {(\w+)}(?:\()?([^{[\)]+)(?:\))?(?:{\/\1})? to exclude a leading parenthesis and not match the ending parenthesis. Does this look like it should work? I am new to these more complex regex patterns, so sorry if it is obvious.
@user2868900 I didn't see the parenthesis bit. That should work: regex101.com/r/TMRVG8/1
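
For reference, a minimal sketch of the modified pattern from the comment above, wrapped in a function. The name `parseMarkup` and the extra `trim()` call are assumptions added for illustration, not part of the original answer.

```javascript
// Hypothetical wrapper around the modified pattern quoted in the comments.
function parseMarkup(input) {
  const regex = /{(\w+)}(?:\()?([^{[\)]+)(?:\))?(?:{\/\1})?/g;
  const result = [];
  let match;
  while ((match = regex.exec(input)) !== null) {
    // Paragraphs are stored under "txt"; any other tag keeps its own name as the key.
    const key = match[1] === "p" ? "txt" : match[1];
    // trim() is an added assumption: it drops stray whitespace around the content.
    result.push({ [key]: match[2].trim() });
  }
  return result;
}

const parsed = parseMarkup(
  "{p}This is a paragraph{/p} {img}(path/to/image) {p}Another paragraph{/p}"
);
console.log(JSON.stringify(parsed));
```

One caveat with this pattern: the character class also excludes `[` and `)`, so paragraph text that contains either character is cut short at that point.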
