I want to put a list into a DataFrame. My code is:

webpage_urls = ["https://data.gov.au/dataset?q=&sort=extras_harvest_portal+asc%2C+score+desc%2C+metadata_modified+desc&_organization_limit=0&groups=sciences&organization=departmentofagriculturefisheriesandforestry&_groups_limit=0",
                 "https://data.gov.au/dataset?q=&organization=commonwealthscientificandindustrialresearchorganisation&sort=extras_harvest_portal+asc%2C+score+desc%2C+metadata_modified+desc&_organization_limit=0&groups=sciences&_groups_limit=0",
                 "https://data.gov.au/dataset?q=&organization=bureauofmeteorology&sort=extras_harvest_portal+asc%2C+score+desc%2C+metadata_modified+desc&_organization_limit=0&groups=sciences&_groups_limit=0",
                 "https://data.gov.au/dataset?q=&sort=extras_harvest_portal+asc%2C+score+desc%2C+metadata_modified+desc&_organization_limit=0&groups=sciences&organization=tasmanianmuseumandartgallery&_groups_limit=0",
                 "https://data.gov.au/dataset?q=&organization=department-of-industry&sort=extras_harvest_portal+asc%2C+score+desc%2C+metadata_modified+desc&_organization_limit=0&groups=sciences&_groups_limit=0"]

    for i in webpage_urls:
        wiki2 = i
        page= urllib.request.urlopen(wiki2)

        soup = BeautifulSoup(page)

        # fetching organisations

        data3 = soup.find_all('li', class_="nav-item active")

        lobbying1 = []
        for element in data3:
            lobbying1.append(element.span.get_text())
        print(lobbying1)

        df = pd.DataFrame({'Organisation':lobbying1})   

The code above gives this output:

['Reserve Bank of Aus... (24)', 'Business Support an... (24)']
['Department of Finance (16)', 'Business Support an... (16)']
['Department of Agric... (13)', 'Business Support an... (13)']
... and so on

These are separate lists, not one nested list, and the DataFrame I end up with contains only this:

   Organisation
0  Australian Charitie... (1)
1  Business Support an... (1)

I want the output in two columns, with the first element of each list in column 1 and the second element in column 2, and I want all entries:

Organisation            Groups
Australian Cha...      Business Support and...

Can anyone help me with this?

2 Answers

I think you need to wrap the list in [] to make it a list of lists, and then use the DataFrame constructor:

    df = pd.DataFrame([lobbying1], columns=['Organization','Groups'])   
    print (df)

                  Organization        Groups
0  Department of Agric... (35)  Science (35)
                 Organization       Groups
0  Commonwealth Scient... (8)  Science (8)
                Organization       Groups
0  Bureau of Meteorology (4)  Science (4)
                 Organization       Groups
0  Tasmanian Museum an... (1)  Science (1)
                 Organization       Groups
0  Department of Indus... (1)  Science (1)
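
To see why the extra brackets matter, here is a small sketch that uses a hard-coded list in place of the scraped values. The strings are copied from the output above, and the names `flat`, `one_column` and `one_row` are only illustrative.

```python
import pandas as pd

# Stand-in for one page's scraped list (values copied from the output above).
flat = ['Department of Agric... (35)', 'Science (35)']

# A flat list under one key becomes one column with one row per element.
one_column = pd.DataFrame({'Organization': flat})

# Wrapping the list in [] makes it a single row,
# so each element lands in its own column.
one_row = pd.DataFrame([flat], columns=['Organization', 'Groups'])

print(one_column.shape)  # (2, 1)
print(one_row.shape)     # (1, 2)
```

The first form is what the question's code builds, which is why both values end up stacked in a single column.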

If you need one DataFrame for all the data, append lobbying1 to a data list and then call the DataFrame constructor outside the loop:

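import urllib.request

import pandas as pd
from bs4 import BeautifulSoup

# webpage_urls is the list of URLs defined in the question
data = []
for url in webpage_urls:
    page = urllib.request.urlopen(url)
    soup = BeautifulSoup(page, 'html.parser')

    # fetching organisations
    data3 = soup.find_all('li', class_="nav-item active")

    # one list per page: [organisation, group]
    lobbying1 = []
    for element in data3:
        lobbying1.append(element.span.get_text())
    data.append(lobbying1)

# build the DataFrame once, after the loop
df = pd.DataFrame(data, columns=['Organization','Groups'])
print (df)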
                  Organization        Groups
0  Department of Agric... (35)  Science (35)
1   Commonwealth Scient... (8)   Science (8)
2    Bureau of Meteorology (4)   Science (4)
3   Tasmanian Museum an... (1)   Science (1)
4   Department of Indus... (1)   Science (1)
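
This relies on every page yielding exactly two active items. If a page ever returned more or fewer, the rows would no longer line up with the two column names. One possible guard is sketched below with hard-coded sample rows instead of live requests; the helper name `to_pair` and the ragged sample rows are assumptions made up for illustration, not something the site is known to return.

```python
import pandas as pd

def to_pair(items):
    """Return exactly two values: trim extras, pad missing ones with None."""
    padded = list(items[:2])
    padded += [None] * (2 - len(padded))
    return padded

# Hard-coded stand-ins for the per-page lists.
# The last two rows are deliberately ragged (hypothetical cases).
scraped = [
    ['Department of Agric... (35)', 'Science (35)'],
    ['Bureau of Meteorology (4)'],
    ['Tasmanian Museum an... (1)', 'Science (1)', 'extra item'],
]

# Normalise every row to two entries, then build the DataFrame once.
data = [to_pair(row) for row in scraped]
df = pd.DataFrame(data, columns=['Organization', 'Groups'])
print(df)
```

In the scraping loop this would amount to calling `data.append(to_pair(lobbying1))` instead of appending the raw list.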

2 Comments

Thanks a lot. This is exactly what I wanted.
Glad I could help. By the way, I really like Australia ;)

Once your lists are collected into a single list of lists, you can get a two-column DataFrame by simply calling pd.DataFrame as follows:

lobbying1 = [['Reserve Bank of Aus... (24)', 'Business Support an... (24)'],
             ['Department of Finance (16)', 'Business Support an... (16)'],
             ['Department of Agric... (13)', 'Business Support an... (13)']]
df = pd.DataFrame(lobbying1, columns=['Organization','Groups'])

You get this as output

>>> df.head() 
                  Organization                       Groups
0  Reserve Bank of Aus... (24)  Business Support an... (24)
1   Department of Finance (16)  Business Support an... (16)
2  Department of Agric... (13)  Business Support an... (13)
>>> 
