I'm trying to scrape this page in Python to get all of its information into a table. However, this table is unusual: the values I want, the ones that come after the labels 'Autor:', 'Situação', 'Última Ação', etc., are not in separate td elements as they would be in a typical HTML table. How should I proceed?
This is what I've done so far, but the output is very messy:
from bs4 import BeautifulSoup
import requests
url = 'https://sapl.recife.pe.leg.br/generico/materia_pesquisar_proc?page=1&step=10&txt_relator=&txt_numero=&dt_public2=&lst_tip_autor=Parlamentar&txt_num_protocolo=&hdn_txt_autor=&txt_ano=&hdn_cod_autor=&lst_localizacao=&lst_tip_materia=10&txt_assunto=&btn_materia_pesquisar=Pesquisar&incluir=0&lst_cod_partido=&dt_apres2=&chk_coautor=0&txt_npc=&lst_status=&dt_public=&rd_ordenacao=1&rad_tramitando=&existe_ocorrencia=0&dt_apres='
source = requests.get(url).text
soup = BeautifulSoup(source, 'lxml')
lst = []
for item in soup.find_all('tr'):
    try:
        lst.append(item.text)
    except:
        pass
print(lst)
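To make the goal concrete, here is a rough sketch of the kind of splitting I think I need, working on the flat text of one row. It assumes every label is followed by a colon and that the label list is known in advance. The function name `split_labeled_text`, the `LABELS` list, and the sample string are all made up for illustration and are not taken from the real page.

```python
import re

# Labels mentioned above; the real page may use more or different ones (assumption).
LABELS = ["Autor", "Situação", "Última Ação"]

def split_labeled_text(text, labels=LABELS):
    """Split a flat string like 'Autor: X Situação: Y' into {label: value}."""
    # One alternation that matches any known label followed by a colon.
    pattern = re.compile(r"(%s)\s*:" % "|".join(re.escape(label) for label in labels))
    matches = list(pattern.finditer(text))
    record = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        value = text[match.end():end]
        # Collapse newlines and repeated spaces left over from the HTML layout.
        record[match.group(1)] = " ".join(value.split())
    return record

# Hypothetical row text, not copied from the actual page.
sample = "Autor: Fulano de Tal\n Situação: Em tramitação\n Última Ação: Encaminhado"
print(split_labeled_text(sample))
```

Presumably this could be called once per row, for example on `item.get_text(" ", strip=True)` instead of `item.text`, and the resulting dicts collected into a list, but I have not checked it against the real markup.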