PHP - Parsing html tables via DOM

Question

I am using the PHP Simple HTML DOM Parser and I am trying to get the table list of Top Goalscorers from this webpage: http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html (it's the top 5...)

I am trying to parse the Top Goal Scorers table, which has the ID "spieler". I want to get each table row and output the rows in my own format. The problem is that each cell under Name / Club contains a nested <table>, used to lay out the image, name and club name on the webpage.

I am trying to figure out the DOM so I can see what I need to select and get the right player name, club name and the goals.

Here's what I have so far:

<textarea id='txt_out'>
<?php
echo "Player | Team | Goals\n:--|:--|:--:\n";
   
$url = "http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html";
$html = file_get_html($url);

foreach($html->find('#spieler') as $row) {
    if ($i > 0) {
        $player = $row->find('table tr',3)->plaintext;
        echo $player . "|TEST TEAM|0";
    }
    $i++;
}
?>
</textarea>

This prints only the header lines; no player rows are output:

<textarea id="txt_out">Player | Team | Goals
:--|:--|:--:
</textarea>

Won't $html->find('#spieler') return the table with the id of spieler (i.e. an array of one item)? It seems to me that something like #spieler>tbody>tr[class] table tr would get you all (and only) the rows that have data. It probably won't affect the overall result, but it would obviate the need for the counter and all that.
– cHao, Commented May 5, 2013 at 10:59
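
A rough sketch of the idea in the comment above, written with PHP's built-in DOM extension and XPath rather than Simple HTML DOM so that it runs without any library. The HTML fragment is made up: it only assumes the structure described in the question (data rows carrying a class, each with a nested table for name and club), and the real page may differ.

```php
<?php
// Hypothetical fragment; the live page's markup is an assumption.
$sample = <<<HTML
<table id="spieler">
  <tr><th>#</th><th>Name / Club</th><th>Goals</th></tr>
  <tr class="dunkel"><td>1</td>
    <td><table><tr><td>Player A</td></tr><tr><td>Club A</td></tr></table></td>
    <td>5</td></tr>
  <tr class="hell"><td>2</td>
    <td><table><tr><td>Player B</td></tr><tr><td>Club B</td></tr></table></td>
    <td>4</td></tr>
</table>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($sample);
$xpath = new DOMXPath($dom);

// Rough XPath counterpart of "#spieler tr[class] table tr":
// rows of the nested tables inside classed rows of the outer table.
$innerRows = $xpath->query('//table[@id="spieler"]//tr[@class]//table//tr');

$cells = [];
foreach ($innerRows as $row) {
    $cells[] = trim($row->textContent);
}
print_r($cells);
```

Because the header row has no class attribute, it is never matched, so no counter is needed to skip it.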

Adidi · Accepted Answer · 2013-05-05 11:34:58Z

2

There you go (you will have to play with the attributes a bit to get your desired output). In this solution I take all the tds and get their plaintext, after checking that they don't contain the inner table.

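// Assumes the Simple HTML DOM library file sits next to this script.
include 'simple_html_dom.php';

$output = '<table border="1">
                <tr>
                    <td>#</td>
                    <td>Player</td>
                    <td>Team</td>
                    <td>goals-1</td>
                    <td>goals-2</td>
                    <td>goals-3</td>
                    <td>points</td>
                </tr>
            ';

$url = "http://www.transfermarkt.co.uk/en/chinese-super-league/startseite/wettbewerb_CSL.html";
$html = file_get_html($url);

// Passing an index makes find() return a single element instead of an array.
$tbl = $html->find('#spieler',0);

// Data rows carry the class "dunkel" or "hell"; the header row has neither.
$trs = $tbl->find('tr[class=dunkel],tr[class=hell]');

foreach($trs as $tr){
    $output .= '<tr>';
    // find() also returns the cells of the nested table.
    $tds = $tr->find('td');
    foreach($tds as $td){
        // Skip the wrapper cell that holds the nested table.
        $inner_table = $td->find('table',0);
        if(!$inner_table){
            $text = trim($td->plaintext);
            if($text != ''){
                $output .= '<td>' . $text . '</td>';
            }
        }
    }
    $output .= '</tr>';
}

$output .= '</table>';

echo($output);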
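
For the pipe-delimited output the question starts from, the same traversal could be adapted roughly as follows. This is only a sketch: it swaps in PHP's built-in DOMDocument and DOMXPath so it runs without Simple HTML DOM, and the HTML fragment, the position of the player and club rows, and the goals being in the last cell are all assumptions, not facts about the live page.

```php
<?php
// Hypothetical fragment; the real page's markup is assumed, not copied.
$sample = <<<HTML
<table id="spieler">
  <tr><th>#</th><th>Name / Club</th><th>Goals</th></tr>
  <tr class="dunkel"><td>1</td>
    <td><table><tr><td>Player A</td></tr><tr><td>Club A</td></tr></table></td>
    <td>5</td></tr>
  <tr class="hell"><td>2</td>
    <td><table><tr><td>Player B</td></tr><tr><td>Club B</td></tr></table></td>
    <td>4</td></tr>
</table>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($sample);
$xpath = new DOMXPath($dom);

// Header lines taken from the question's own echo statement.
$lines = ['Player | Team | Goals', ':--|:--|:--:'];

$rows = $xpath->query('//table[@id="spieler"]//tr[@class="dunkel" or @class="hell"]');
foreach ($rows as $tr) {
    // Assumed layout: nested table row 0 = player, row 1 = club.
    $inner = $xpath->query('.//table//tr', $tr);
    if ($inner->length < 2) {
        continue; // row does not match the assumed layout
    }
    $player = trim($inner->item(0)->textContent);
    $team   = trim($inner->item(1)->textContent);

    // Assumed: the last direct cell of the outer row holds the goals.
    $goals = trim($xpath->query('./td[last()]', $tr)->item(0)->textContent);

    $lines[] = "$player|$team|$goals";
}

$markdown = implode("\n", $lines) . "\n";
echo $markdown;
```

Building the lines in an array and joining them at the end keeps the delimiter logic in one place, so changing the column set only touches the line inside the loop.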


2 Comments

test Over a year ago

I'm trying to add it as "markdown" inside a textbox... so each row would use a delimiter like | and there's no need for the points or assists sections... hmmm

Adidi Over a year ago

jeeesus - so change it to your needs - do you want me to come to your house and feed you with a spoon also ? this is the global solution - whatever you want from here is very easy to format

M. Haris Azfar · Accepted Answer · 2013-05-05 10:53:02Z

0

Use DOMNodeList::item(). It takes a zero-based index as its argument, so 1 returns the second table:

 $table = $dom->getElementsByTagName('table')->item(1);
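
The line above leaves $dom undefined. A minimal, self-contained sketch of how it might be set up with PHP's built-in DOMDocument follows; the two-table fragment and its id values are invented purely for illustration.

```php
<?php
// Hypothetical fragment with two sibling tables.
$sample = '<table id="first"><tr><td>A</td></tr></table>'
        . '<table id="second"><tr><td>B</td></tr></table>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($sample);

// getElementsByTagName() returns a DOMNodeList in document order.
$tables = $dom->getElementsByTagName('table');

// item() is zero-based and returns null when the index is out of range.
$table = $tables->item(1);

$id = $table instanceof DOMElement ? $table->getAttribute('id') : null;
echo $id, "\n";
```

Note that nested tables are counted in document order as well, so on a page like the one in the question a fixed index can easily land on an inner table; selecting by id is usually less fragile.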


1 Comment

cHao Over a year ago

Unfortunately, he's using "Simple HTML DOM" (barf). It doesn't return a DOMNodeList; it just returns arrays (or single elements, if you specify an index to find or getElementsByTagName).
