Executing JavaScript in Java - Opening a URL and getting links

Question

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import java.io.FileReader;

public class Main {

    public static void main(String[] args) {

        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("js");
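        // try-with-resources closes the reader even if eval throws
        try (FileReader reader = new FileReader("C:/yourfile.js")) {
            // Expose the URL to the script as the global variable "urlfromjava"
            engine.put("urlfromjava", "http://www.something.com/?asvb");
            engine.eval(reader);
        } catch (Exception e) {
            e.printStackTrace();
        }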
    }
}

Right now, yourfile.js contains the following:

function urlget(url)
{
    print("URL:"+url);
    var loc = window.open(url);
    var link = document.getElementsByTagName('a')["61"].href;
    return ("\nLink is: \n"+link); 

}
var x = urlget(urlfromjava);
print(x);

I get the error

"javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: ReferenceError: "window" is not defined"

How can I open a URL and get its links from Java?

ug_ · Accepted Answer · 2015-08-01 07:40:59Z

6

You can embed Env.js in Rhino to get this kind of functionality.

edited Aug 1, 2015 at 7:40

ug_

answered May 22, 2011 at 10:04

Grooveek

iruediger · Accepted Answer · 2011-05-22 09:51:24Z

2

According to the documentation:

The window object represents an open window in a browser.

Since you are not executing your script in a browser, the window object is not defined.

You can read the URL's content using the URL/URLConnection classes and feed it to the ScriptEngine. There is a tutorial here.
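
As a rough sketch of feeding page content to the ScriptEngine: the snippet below assumes the HTML has already been read into a Java string, and exposes it to the script under the hypothetical variable name pageSource. The class and method names are illustrative only. The script works on the raw markup with a regular expression, since no document or window object exists outside a browser. Note that getEngineByName("js") can return null on JDKs that do not bundle a JavaScript engine (Nashorn was removed in JDK 15), so the sketch checks for that.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

final class PageToScript {

    // Returns the first href value found by the script, or null if there is
    // no match or no JavaScript engine is available on this JDK.
    static String firstLink(String html) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("js");
        if (engine == null) {
            return null; // no bundled JS engine (for example JDK 15 and later)
        }
        // "pageSource" is a hypothetical variable name chosen for this sketch
        engine.put("pageSource", html);
        try {
            Object result = engine.eval(
                "var m = /href=\"([^\"]+)\"/.exec(pageSource); m ? m[1] : null;");
            return result == null ? null : result.toString();
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLink("<a href=\"http://www.something.com/\">x</a>"));
    }
}
```

This only sees links that are present in the markup passed in; it does not run the page's own scripts.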

answered May 22, 2011 at 9:51

iruediger

2 Comments

Kit10 Over a year ago

I like the answer, except that w3schools is as much "the documentation" as Wikipedia or a random web search result. So the first two lines of this answer are incorrect.

Andrew Scott Evans Over a year ago

I'm surprised no one told you to use JavaFX. You can achieve headlessness by using a JFrame.

Tapos · Accepted Answer · 2011-05-22 09:53:16Z

0

In JavaScript, window refers to the browser window. When you execute this script from Java, there is no browser window to find, which is why you get the error. You can use the URL class in Java to get the content of the URL.
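
A minimal sketch of that approach, using only the standard library: fetch reads the page through the URL class, and extractLinks pulls href values out of the resulting string with a regular expression. The class name, method names, and the pattern are assumptions made for illustration, and the regex is deliberately simple rather than a full HTML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLinkExtractor {

    // Matches href="..." or href='...', ignoring case
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Downloads the page at the given address and returns its source as text
    // (UTF-8 is assumed here; a real page may declare another charset)
    static String fetch(String address) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                URI.create(address).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Collects every href value in the order it first appears
    static Set<String> extractLinks(String html) {
        Set<String> links = new LinkedHashSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        for (String link : extractLinks(fetch(args[0]))) {
            System.out.println(link);
        }
    }
}
```

This works on the markup as the server sends it, so links that a page adds later through its own scripts will not show up.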

answered May 22, 2011 at 9:53

Tapos

3 Comments

harihb Over a year ago

Actually, the content of the URL has hyperlinks that I can retrieve only by using document.getElementsByTagName('a'). So I need to load the URL in memory, run that, and get the link.

Tapos Over a year ago

You can parse the string using a regex pattern.

harihb Over a year ago

The link is not in the source of the page. It gets loaded by JavaScript executed on server side.

aviad · Accepted Answer · 2011-05-23 08:23:25Z

0

Try this:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        URL yahoo = new URL("http://www.yahoo.com/");
        URLConnection yc = yahoo.openConnection();
        // try-with-resources closes the stream even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                // Alternatively, append each line to a StringBuilder (sb.append(inputLine))
                // and pass sb.toString() to the method that gets the links out of it:
                // see getLinks below.
                System.out.println(inputLine);
            }
        }
    }
}



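// Needed imports. StringUtils comes from Apache Commons Lang;
// the package name below assumes Commons Lang 3.
import java.util.HashSet;
import java.util.Set;
import org.apache.commons.lang3.StringUtils;

// The members below go inside a class, for example URLConnectionReader.
private static final String CLOSING_QUOTE   = "\"";
private static final String HREF_PREFIX     = "href=\"";
private static final String HTTP_PREFIX     = "http://";

// Returns the absolute http:// links found in double-quoted href attributes.
public static Set<String> getLinks(String page) {
    Set<String> links = new HashSet<String>();
    // Each chunk after the first starts right after an href=" occurrence
    String[] rawLinks = StringUtils.splitByWholeSeparator(page, HREF_PREFIX);
    for (String str : rawLinks) {
        if (str.startsWith(HTTP_PREFIX)) {
            // Keep everything up to the closing quote of the attribute
            links.add(StringUtils.substringBefore(str, CLOSING_QUOTE));
        }
    }
    return links;
}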

edited May 23, 2011 at 8:23

answered May 23, 2011 at 6:01

aviad

3 Comments

harihb Over a year ago

The problem is that the link in the page is generated by JavaScript, so it only appears after the URL is loaded; it is not in the source of the HTML file. That is why, after loading the URL, I call document.getElementsByTagName('a') rather than using the URL class in Java to extract the links.

aviad Over a year ago

URL.openConnection emulates what the client's browser does, so you get exactly the same markup that you get via the browser. Try it; I believe you will see that it works. If it does not, let me know what you get and we can try to work it out further.

harihb Over a year ago

Sure, will do that and tell you.

Jose Wamba · Accepted Answer · 2020-03-24 20:41:28Z

0

You can use HtmlUnit, a Java API. I think it can help you access the content produced by the executed JavaScript as plain HTML.

WebClient webClient = new WebClient();
HtmlPage myPage = (HtmlPage) webClient.getPage(new URL("YourURL"));
System.out.println(myPage.getVisibleText());

answered Mar 24, 2020 at 20:41

Jose Wamba
