0

If we have a URL, e.g. www.google.de, how can I get ONLY the "google" part?

In Java, new URL(url).getHost() does work, but it gives me google.de, which is not what I want.

Thank you

EDIT: If we have something like www.google.co.uk then I also want to have only "google" as result.

I don't want "google.de" or "www.google". I ONLY want "google".

12
  • 1
    That's not called a hostname. Commented Jul 11, 2017 at 15:23
  • 2
    Possible duplicate of Get domain name from given url Commented Jul 11, 2017 at 15:26
  • 3
    You're going to have to create your own rules. On what basis are you choosing "google" from "www.google.co.uk"? Second element? First element but ignoring "www" as a special case? What other special cases do you want to ignore? It's your requirement - you have to define it. Commented Jul 11, 2017 at 15:30
  • 2
    The part you want to ignore, i.e. the www actually is the hostname. Commented Jul 11, 2017 at 15:36
  • 3
    Stepping back, I think this is an XY problem. meta.stackexchange.com/questions/66377/what-is-the-xy-problem -- why do you want to do this? Commented Jul 11, 2017 at 15:47

2 Answers

1

Splitting on a period and selecting the first or second element (whichever is not "www") would work:

URL url = new URL("http://www.host.ext.ext");
String host = url.getHost(); // host = "www.host.ext.ext"
String[] splitHost = host.split("\\."); // splitHost = { "www", "host", "ext", "ext" }

host = splitHost[0].equals("www") ? splitHost[1] : splitHost[0]; // host = "host"

If anything other than http://www. can come before the name (other subdomains, for instance), or the extension can consist of several parts (.co.uk, for instance), then there is no easy way to get just the part you want. As far as I know, you would have to iterate over a list of known extensions and return the part immediately before the longest matching extension.
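The longest-matching-extension idea could be sketched roughly as follows. This is only an illustration under assumptions: the class name DomainLabel, the method extract and the four-entry SUFFIXES list are made up for this example, and a real application would need a far more complete list (a maintained one such as the Public Suffix List, for instance).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class DomainLabel {
    // Assumption: a deliberately tiny extension list, for illustration only.
    private static final List<String> SUFFIXES = Arrays.asList("co.uk", "com", "de", "uk");

    // Returns the label directly before the longest matching extension,
    // or null if no extension from the list matches.
    static String extract(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        String best = null;
        for (String suffix : SUFFIXES) {
            // Match whole labels only by requiring a dot in front of the suffix.
            if (h.endsWith("." + suffix) && (best == null || suffix.length() > best.length())) {
                best = suffix;
            }
        }
        if (best == null) {
            return null;
        }
        // Cut off ".<suffix>", then keep only the last remaining label.
        String rest = h.substring(0, h.length() - best.length() - 1);
        return rest.substring(rest.lastIndexOf('.') + 1);
    }
}
```

With this list, DomainLabel.extract("www.google.co.uk") and DomainLabel.extract("www.google.de") both return "google", and a host with additional subdomains is handled the same way because only the label next to the extension is kept.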


4 Comments

This works but I agree with the comment above that this is likely an XY Problem and there is missing information why "google" is more useful than "google.com".
Yeah, I think I agree as well now.
While it is possible this is an XY, I think it could also be exactly what they're asking. Perhaps they actually want to display the important part, ignoring the extension? Although that might be problem X now that I think about it; "how do I display a website's display name/title?"
Also, could anyone explain the downvote? The answer does exactly what the asker asked for, and explains the one possible pitfall. Help us out here.
0

The most basic solution would be using

 System.out.println(url.split("\\.")[1]);

Or you could try this https://stackoverflow.com/a/23079402/2555419

import java.net.URI;
import java.net.URISyntaxException;

public String getHostName(String url) throws URISyntaxException {
    URI uri = new URI(url);
    String hostname = uri.getHost();
    // getHost() returns null when the URI has no host component, so guard against that
    if (hostname != null) {
        // strips only a leading "www."; the extension (e.g. ".de") is kept
        return hostname.startsWith("www.") ? hostname.substring(4) : hostname;
    }
    return hostname;
}

2 Comments

He needs to specify his question, but he probably actually wants a ccTLD list.
Yeah probably, I just gave him the most basic solution so he can have somewhere to start from if he wants to use String manipulation.
