
I have a simple project and need some help with it. I need to read an HTML file and convert it to JSON format, extracting the matches as code and text. How can I achieve this?

For example, I have these two HTML tags:

<p>In practice, it is usually a bad idea to modify global variables inside the function scope since it often is the cause of confusion and weird errors that are hard to debug.<br />
If you want to modify a global variable via a function, it is recommended to pass it as an argument and reassign the return-value.<br />
For example:</p>

<pre><code class="{python} language-{python}">a_var = 2

def a_func(some_var):
    return 2**3

a_var = a_func(a_var)
print(a_var)
</code></pre>

My code:

const fs = require('fs')
const showdown  = require('showdown')

var read =  fs.readFileSync('./test.md', 'utf8')

function importer(mdFile) {

    var result = []
    let json = {}

    var converter = new showdown.Converter()
    var text      = mdFile
    var html      = converter.makeHtml(text);

    for (var i = 0; i < html.length; i++) {
        htmlRead = html[i]
        if(html == html.match(/<p>(.*?)<\/p>/g))
            json.text = html.match(/<p>(.*?)<\/p>/g)

       if(html == html.match(/<pre>(.*?)<\/pre>/g))
            json.code = html.match(/<pre>(.*?)<\/pre>/g)

    }

    return html
}
console.log(importer(read))

How do I get these matches on the code?

New code: this writes all the p tags into the same JSON object. How do I write each p tag into a separate JSON block?

$('html').each(function(){
    if ($('p').text != undefined) {
        json.code = $('p').text()
        json.language = "Text"
    }
})

2 Answers


I would recommend using Cheerio. It implements a subset of core jQuery functionality for Node.js.

const cheerio = require('cheerio')

var html = "<p>In practice, it is usually a bad idea to modify global variables inside the function scope since it often is the cause of confusion and weird errors that are hard to debug.<br />If you want to modify a global variable via a function, it is recommended to pass it as an argument and reassign the return-value.<br />For example:</p>"

const $ = cheerio.load(html)
var paragraph = $('p').html(); //Contents of paragraph. You can manipulate this in any other way you like

//...You would do the same for any other element you require

You should check out Cheerio and read its documentation. I find it really neat!

Edit: for the new part of your question

You can iterate over every element and insert it into an array of JSON objects like this:

var jsonObject = []; // An array of objects that will hold everything
$('p').each(function() { // Loop over each paragraph
    // Take the content of the paragraph and add it to the array as its own object
    jsonObject.push({"paragraph": $(this).html()});
});

So the resulting array of JSON objects should look something like this:

[
  {
    "paragraph": "text"
  },
  {
    "paragraph": "text 2"
  },
  {
    "paragraph": "text 3"
  }
]

I believe you should also read up on JSON and how it works.
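As a small illustration of that last point: the array built above is an ordinary JavaScript array of objects, and it only becomes JSON text once it is serialized. A minimal sketch, where `paragraphs`, `jsonText` and `roundTrip` are names chosen just for this example:

```javascript
// A plain JavaScript array of objects; it only becomes JSON once serialized
const paragraphs = [
  { paragraph: 'text' },
  { paragraph: 'text 2' },
];

// JSON.stringify produces the JSON text; the third argument sets the indentation
const jsonText = JSON.stringify(paragraphs, null, 2);
console.log(jsonText);

// JSON.parse turns the JSON text back into JavaScript objects
const roundTrip = JSON.parse(jsonText);
console.log(roundTrip[1].paragraph);
```

The string in `jsonText` is what you would write to a file or send over the network.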


2 Comments

Yeah, that's exactly what I did. But I have a question, I write all the p tags in the same json, how to write each p tag into different json blocks? I updated question.
Does anyone know a pure JS alternative?

hpq is not one of the most common HTML parsing libraries, but I think it is well suited to your request, as its one-line description is:

A utility to parse and query HTML into an object shape.

https://github.com/aduth/hpq

And its functionality is illustrated nicely in this live explorer page:

https://aduth.github.io/hpq/

The issue for you will be that it was made for the browser (it takes an HTML string or DOM element as input), so I'm not sure about using it with Node.js.
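If running a browser-oriented library under Node turns out to be awkward, a dependency-free fallback is possible for simple, predictable markup like the snippet in the question. What follows is only a rough sketch, not a general HTML parser: it assumes paragraphs are bare `<p>` tags and code is wrapped in `<pre><code>`, as in the question's HTML. `htmlToBlocks` and `decodeEntities` are hypothetical helper names made up for this example, and so is the `type` / `language` / `content` object shape.

```javascript
// Dependency-free sketch: split an HTML string into ordered text/code blocks.
// htmlToBlocks and decodeEntities are hypothetical helpers, not library APIs.

function decodeEntities(s) {
  // Handles only a few common entities (assumption); &amp; is decoded last
  return s
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&');
}

function htmlToBlocks(html) {
  const blocks = [];
  // Matches either a <p> element or a <pre><code> element.
  // [\s\S] is used instead of "." because "." does not match newlines.
  const pattern = /<p>([\s\S]*?)<\/p>|<pre><code(?:\s+class="([^"]*)")?>([\s\S]*?)<\/code><\/pre>/g;
  for (const m of html.matchAll(pattern)) {
    if (m[1] !== undefined) {
      // Paragraph: turn <br /> into newlines, then drop any remaining inline tags
      const text = m[1].replace(/<br\s*\/?>\s*/g, '\n').replace(/<[^>]+>/g, '');
      blocks.push({ type: 'text', content: decodeEntities(text).trim() });
    } else {
      // Code block: read the language from a "language-xxx" class if present
      const lang = /language-([^\s"]+)/.exec(m[2] || '');
      blocks.push({
        type: 'code',
        language: lang ? lang[1] : null,
        content: decodeEntities(m[3]).trimEnd(),
      });
    }
  }
  return blocks;
}

const sample = '<p>For example:</p>\n<pre><code class="python language-python">a_var = 2\nprint(a_var)\n</code></pre>';
console.log(JSON.stringify(htmlToBlocks(sample), null, 2));
```

Because the blocks are collected in document order, each paragraph and each code block ends up as its own entry in the array. Regexes break down on nested or attribute-heavy markup, so for anything beyond simple converter output a real parser is the safer choice.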
