I need to do some screen scraping on a web page where the content I need is generated by AJAX. The initial page has a table with 4 tabs, and clicking any tab changes the content of the table. I need the content from the 3rd tab only. I used Google Chrome's 'Inspect Element' tool to see what the requests and POST data were, and I can get the information I need when I copy the values from the inspector (the session ID, a lot of other cookie data, and the POST data) into a PHP cURL request. But this only works for the 30 minutes that the session lasts. Does anyone know of a way to get this information that keeps working after the session expires?

2 Answers

I won't reproduce the code here, but I will point you to the answer. It's in this book:

http://www.amazon.com/Webbots-Spiders-Screen-Scrapers-Developing/dp/1593273975/ref=dp_ob_image_bk

A must-buy for someone doing what you're doing.

1 Comment

Thanks Aaron, I'll check it out.

In the end I used HtmlUnit to get the content I needed. I also found the HtmlUnit Scripter very useful for generating the required Java code.
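For readers who want to see the general idea behind avoiding the 30-minute expiry, here is a minimal sketch. It is not the HtmlUnit code mentioned above; it swaps in the JDK's built-in `java.net.http.HttpClient` (Java 11+). The approach is to load the initial page first so the server issues a fresh session cookie, then send the tab's AJAX POST with the same client. `PAGE_URL`, `AJAX_URL`, and the `tab` form field are hypothetical placeholders to be replaced with the values seen in the browser's network inspector. The sketch also assumes the server needs nothing beyond cookies; if the page embeds a token that the POST must echo back, that token would have to be parsed out of the first response.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class SessionScraper {
    // Hypothetical endpoints: replace with the URLs observed in the network inspector.
    static final String PAGE_URL = "https://example.com/page";
    static final String AJAX_URL = "https://example.com/ajax/table";

    // Encodes form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    static String fetchThirdTab() throws Exception {
        // The cookie manager stores whatever cookies the server sets and
        // sends them back on later requests made through this client.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // Step 1: load the initial page so the server starts a new session.
        HttpRequest page = HttpRequest.newBuilder(URI.create(PAGE_URL)).GET().build();
        client.send(page, HttpResponse.BodyHandlers.discarding());

        // Step 2: replay the AJAX request for the 3rd tab with the fresh cookies.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("tab", "3"); // hypothetical field name and value
        HttpRequest ajax = HttpRequest.newBuilder(URI.create(AJAX_URL))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("X-Requested-With", "XMLHttpRequest")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
        return client.send(ajax, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchThirdTab());
    }
}
```

Because each run starts a new session, nothing copied from the browser can go stale. A headless browser such as HtmlUnit remains the safer choice when the page's JavaScript computes values that the request depends on.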

2 Comments

The question is about PHP. Your answer is about Java.
@KhomNazid yes I asked the question :) I answered it saying what I eventually used to get what I needed.
