I'm scraping multiple URLs of a website with BeautifulSoup and want to generate a file for each URL.

from requests import get
from bs4 import BeautifulSoup

categories = ["NEWS_AND_MAGAZINES", "ART_AND_DESIGN",...,"FAMILY"]
subcategories = ["topselling_free",...,"topgrossing"]
urls = []

for i in range(0, len(categories)):
    for j in range(0, len(subcategories)):
        url = categories_url_prefix + categories[i] + '/collection/' + subcategories[j]
        urls.append(url)

for i in urls:
    response = get(i)
    html_soup = BeautifulSoup(response.text, 'html.parser')
    app_container = html_soup.find_all('div', class_="card no-rationale square-cover apps small")
    file = open("apps.txt", "a+")
    for j in range(0, len(app_container)):
        print(app_container[j].div['data-docid'])
        file.write(app_container[j].div['data-docid'] + "\n")

file.close()

Right now everything goes into the single file 'apps.txt'. How can I generate a separate file for each URL? Thank you.
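As an aside, the nested index loops that build urls can be collapsed with itertools.product. This is only a sketch, and it assumes categories_url_prefix, categories, and subcategories are defined exactly as in the script above:

from itertools import product

# Sketch: build every category/subcategory URL pair in one comprehension.
urls = [
    categories_url_prefix + category + '/collection/' + subcategory
    for category, subcategory in product(categories, subcategories)
]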

1 Answer

Just replace your download loop with this:

from requests import get
from bs4 import BeautifulSoup

for n, url in enumerate(urls):
    response = get(url)
    html_soup = BeautifulSoup(response.text, 'html.parser')
    app_container = html_soup.find_all('div', class_="card no-rationale square-cover apps small")
    # One numbered output file per URL; 'with' closes it automatically.
    with open("file{}.txt".format(n), "a+") as f:
        for app in app_container:
            print(app.div['data-docid'])
            f.write(app.div['data-docid'] + "\n")
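The with statement closes each file as soon as its URL has been processed, and enumerate supplies the running index n, so the output lands in file0.txt, file1.txt, and so on, one per URL.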

2 Comments

"No such file or directory: URL.txt'"
A directory error? That's odd: if you run the program as written, you should get file0.txt, file1.txt, ... in your working directory. Make sure the .txt comes after the {} placeholder in "file{}.txt".
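That error usually means a raw URL was passed to open(): URLs contain '/' characters, which the operating system treats as directory separators. One way around it, sketched below with a hypothetical filename_for helper (not part of the original answer), is to replace every unsafe character before opening the file:

import re

def filename_for(url):
    # Hypothetical helper: anything that is not a letter, digit,
    # '.', '_' or '-' becomes '_', so the URL is a legal filename.
    return re.sub(r'[^\w.-]', '_', url) + '.txt'

for url in urls:
    with open(filename_for(url), "a+") as f:
        ...  # scrape and write exactly as in the answer above

This also makes the output files self-describing, since the category and subcategory survive in the filename.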
