I'm writing a Java program that needs to extract links from web pages. I'm using htmlParser (http://htmlparser.sourceforge.net/), but I can only extract HTML links (those defined with <a href="...">). I don't know how to handle JavaScript code so that I can extract links from it as well. Can you help?
- There are some missing parts in your question. Is it a formatting issue? – Grzegorz Oledzki, Aug 13, 2009 at 13:18
- Please edit your question: it's hard to understand what you mean. – Philippe Carriere, Aug 14, 2009 at 14:04
- I can't understand why it's hard to understand what I mean. Is it because of my poor English? Please tell me more. – Raffo, Aug 14, 2009 at 15:10
- @Raffo did my answer help? – quarks, Feb 13, 2016 at 14:35
3 Answers
You can use Rhino together with a DOM environment that is itself written in JavaScript.
By the way, that DOM environment was written by John Resig.
1 Comment
Raffo
I've never played with DOM, but I'll take a look at your link, thanks.
This is probably the most comprehensive tool out there: Rhino. Everything you want to do can be done with Rhino.
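Executing the page's scripts in Rhino is the thorough route. As a much cheaper first pass, and as a different technique that does not run any JavaScript at all, you could scan the script text for string literals that look like absolute URLs. The sketch below assumes that approach; the class name `ScriptLinkScanner`, the method `extractUrls`, and the regular expression are all hypothetical and only illustrative. It will miss any URL that is built at runtime (for example by string concatenation), which is exactly the case where a real interpreter such as Rhino is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a heuristic scan, not a JavaScript interpreter.
public class ScriptLinkScanner {

    // Matches a single- or double-quoted string literal whose content
    // starts with http:// or https:// and contains no quotes or whitespace.
    private static final Pattern URL_LITERAL =
            Pattern.compile("([\"'])(https?://[^\"'\\s]+)\\1");

    // Returns every URL-like string literal found in the script, in order.
    public static List<String> extractUrls(String script) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_LITERAL.matcher(script);
        while (m.find()) {
            urls.add(m.group(2)); // group 2 is the URL without its quotes
        }
        return urls;
    }

    public static void main(String[] args) {
        String script =
                "function go() { window.location = 'http://example.com/a'; }\n"
              + "var next = \"https://example.com/b?page=2\";";
        for (String url : extractUrls(script)) {
            System.out.println(url);
        }
    }
}
```

You would feed `extractUrls` the contents of each `<script>` element and each inline handler attribute (such as `onclick`) that your HTML parser gives you, then merge the results with the ordinary `<a href="...">` links.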