22

In Java, I'd like to replace the host part of a URL with a new host, where both the host and the URL are supplied as strings.

This should take into account the fact that the host could include a port, as defined in the RFC.

So for example, given the following inputs

I should get the following output from a function that did this correctly

Does anyone know of any libraries or routines that do host replacement in a URL correctly?

EDIT: For my use case, I want the host replacement to match what a Java servlet would respond with. I tested this by running a local Java web server and calling it with curl -H 'Host:superduper.com:80' 'http://localhost:8000/testurl', with the endpoint simply returning request.getRequestURL().toString(), where request is an HttpServletRequest. It returned http://superduper.com/testurl, so it removed the default port for http, and that's the behavior I'm aiming for as well.

7 Answers

27

The Spring Framework provides the UriComponentsBuilder. You can use it like this:

import org.springframework.web.util.UriComponentsBuilder;

String initialUri = "http://localhost/me/out?it=5";
UriComponentsBuilder builder = UriComponentsBuilder.fromHttpUrl(initialUri);
String modifiedUri = builder.host("myserver").port("20000").toUriString();
System.out.println(modifiedUri);
// ==> http://myserver:20000/me/out?it=5

Note that you need to provide the hostname and the port in separate calls to get the right encoding.
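Since the question supplies the new host as a single "host:port" string, it has to be split before the two calls above can be made. A minimal sketch of one way to do that with only java.net.URI follows; the class name HostHeaderParts and its parse method are hypothetical names made up for this illustration, not part of Spring or the JDK. It assumes the value is a plain host with an optional numeric port.

```java
import java.net.URI;

// Hypothetical helper (not part of Spring or the JDK): splits a
// "host[:port]" value into its two parts using java.net.URI.
class HostHeaderParts {
    final String host;
    final int port; // -1 when the value carries no port

    private HostHeaderParts(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static HostHeaderParts parse(String hostHeader) {
        // Prefixing "//" makes the value parse as the authority component.
        URI uri = URI.create("//" + hostHeader);
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("Not a host[:port] value: " + hostHeader);
        }
        return new HostHeaderParts(uri.getHost(), uri.getPort());
    }

    public static void main(String[] args) {
        HostHeaderParts parts = parse("myserver:20000");
        System.out.println(parts.host + " / " + parts.port);
    }
}
```

The resulting host and port could then be passed to the builder's host(...) and port(...) calls separately, skipping port(...) when the port is -1.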


20

You were right to use java.net.URI. The host and port (and user/password, if they exist) are collectively known as the authority component of the URI:

import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

public static String replaceHostInUrl(String originalURL,
                                      String newAuthority)
throws URISyntaxException {

    URI uri = new URI(originalURL);
    uri = new URI(uri.getScheme().toLowerCase(Locale.US), newAuthority,
        uri.getPath(), uri.getQuery(), uri.getFragment());

    return uri.toString();
}

(URI schemes are case-insensitive and their canonical form is lowercase, so while the above code does not perfectly preserve all of the original URL’s non-authority parts, lowercasing the scheme does not change what the URL means. And, of course, it won’t affect the functionality of URL connections.)

Note that some of your tests are in error. For instance:

assertEquals("https://super/me/out?it=5", replaceHostInUrl("https://www.test.com:4300/me/out?it=5","super:443")); 
assertEquals("http://super/me/out?it=5", replaceHostInUrl("http://www.test.com:4300/me/out?it=5","super:80")); 

Although https://super/me/out?it=5 is functionally identical to https://super:443/me/out?it=5 (since the default port for https is 443), if you specify an explicit port in a URI, then the URI has a port specified in its authority and that’s how it should stay.
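To illustrate the point, here is a small sketch showing that java.net.URI treats the explicit and implicit forms as different URIs, even though both reach the same server. The class name ExplicitPortDemo and its two helper methods are made up for this example.

```java
import java.net.URI;

// Illustrative only: shows how java.net.URI reports an explicit port
// versus an omitted one.
class ExplicitPortDemo {
    // Returns the port written in the URL, or -1 if none is written.
    static int portOf(String url) {
        return URI.create(url).getPort();
    }

    // Compares two URLs as URIs; the port is part of the comparison.
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        System.out.println(portOf("https://super:443/me/out?it=5")); // 443
        System.out.println(portOf("https://super/me/out?it=5"));     // -1
        System.out.println(sameUri("https://super:443/me/out?it=5",
                                   "https://super/me/out?it=5"));    // false
    }
}
```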

Update:

If you want an explicit but unnecessary port number to be stripped, you can use URL.getDefaultPort() to check for it:

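import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

public static String replaceHostInUrl(String originalURL,
                                      String newAuthority)
throws URISyntaxException,
       MalformedURLException {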

    URI uri = new URI(originalURL);
    uri = new URI(uri.getScheme().toLowerCase(Locale.US), newAuthority,
        uri.getPath(), uri.getQuery(), uri.getFragment());

    int port = uri.getPort();
    if (port > 0 && port == uri.toURL().getDefaultPort()) {
        uri = new URI(uri.getScheme(), uri.getUserInfo(),
            uri.getHost(), -1, uri.getPath(),
            uri.getQuery(), uri.getFragment());
    }

    return uri.toString();
}

6 Comments

Hmmm... interesting.... I'll have to review this... thanks! and here's a repl of your solution as well, with the tests adjusted as you suggested.
Updated answer with code that strips default port numbers.
Thanks... Here's an updated repl showing that this works with the original test cases too!
The encoding information would be lost if uri.getQuery() contains escaped characters.
@machinarium That's true, but it might be okay for most uses. As far as I can tell, it only seems to change the characters that didn't really need to be encoded. %20 survives the round-trip. But %33 will be changed into a 3.
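If the escapes need to survive exactly as written, one possible workaround is to rebuild the URL from the raw components (getRawPath(), getRawQuery(), getRawFragment()) rather than the decoded ones. This is only a sketch under that assumption, not the answer's code; the class and method names are invented for the example, and it does not strip default ports.

```java
import java.net.URI;

// Sketch: swaps the authority while keeping percent-escapes untouched by
// concatenating the raw (still-encoded) components.
class RawAuthorityReplace {
    static String replace(String originalUrl, String newAuthority) {
        URI uri = URI.create(originalUrl);
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) {
            sb.append(uri.getScheme()).append(':');
        }
        sb.append("//").append(newAuthority);
        if (uri.getRawPath() != null) {
            sb.append(uri.getRawPath());
        }
        if (uri.getRawQuery() != null) {
            sb.append('?').append(uri.getRawQuery());
        }
        if (uri.getRawFragment() != null) {
            sb.append('#').append(uri.getRawFragment());
        }
        // Parsing the result again validates it; the single-argument form
        // keeps the string as written.
        return URI.create(sb.toString()).toString();
    }

    public static void main(String[] args) {
        // %33 stays %33 instead of being turned into 3.
        System.out.println(replace("http://localhost/me/out?it=%33", "myserver:20000"));
    }
}
```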
4

I quickly tried java.net.URI, javax.ws.rs.core.UriBuilder, and org.apache.http.client.utils.URIBuilder. None of them seemed to handle a Host header that may include a port, so from what I could see they all needed extra logic to work correctly. Without it, the port was sometimes "doubled up" and at other times not replaced correctly.

Since java.net.URL doesn't require any extra libraries, I used it. I know that URL.equals could be a problem because it may perform DNS lookups, but I'm not calling it anywhere, so I think this is fine. It covers my use cases, as shown by the pseudo unit test.

I put together the following approach, which you can try out online at repl.it:

import java.net.URL;
import java.net.MalformedURLException;

class Main 
{
  public static void main(String[] args) 
  {
    testReplaceHostInUrl();
  }

  public static void testReplaceHostInUrl()
  {
    assertEquals("http://myserver:20000/me/out?it=5", replaceHostInUrl("http://localhost/me/out?it=5","myserver:20000")); 
    assertEquals("http://myserver:20000/me/out?it=5", replaceHostInUrl("http://localhost:19000/me/out?it=5","myserver:20000")); 
    assertEquals("http://super/me/out?it=5", replaceHostInUrl("http://localhost:19000/me/out?it=5","super")); 
    assertEquals("http://super/me/out?it=5", replaceHostInUrl("http://www.test.com/me/out?it=5","super")); 
    assertEquals("https://myserver:20000/me/out?it=5", replaceHostInUrl("https://localhost/me/out?it=5","myserver:20000")); 
    assertEquals("https://myserver:20000/me/out?it=5", replaceHostInUrl("https://localhost:19000/me/out?it=5","myserver:20000")); 
    assertEquals("https://super/me/out?it=5", replaceHostInUrl("https://www.test.com/me/out?it=5","super")); 
    assertEquals("https://super/me/out?it=5", replaceHostInUrl("https://www.test.com:4300/me/out?it=5","super")); 
    assertEquals("https://super/me/out?it=5", replaceHostInUrl("https://www.test.com:4300/me/out?it=5","super:443")); 
    assertEquals("http://super/me/out?it=5", replaceHostInUrl("http://www.test.com:4300/me/out?it=5","super:80")); 
    assertEquals("http://super:8080/me/out?it=5", replaceHostInUrl("http://www.test.com:80/me/out?it=5","super:8080")); 
    assertEquals("http://super/me/out?it=5&test=5", replaceHostInUrl("http://www.test.com:80/me/out?it=5&test=5","super:80")); 
    assertEquals("https://super:80/me/out?it=5&test=5", replaceHostInUrl("https://www.test.com:80/me/out?it=5&test=5","super:80")); 
    assertEquals("https://super/me/out?it=5&test=5", replaceHostInUrl("https://www.test.com:80/me/out?it=5&test=5","super:443")); 
    assertEquals("http://super:443/me/out?it=5&test=5", replaceHostInUrl("http://www.test.com:443/me/out?it=5&test=5","super:443")); 
    assertEquals("http://super:443/me/out?it=5&test=5", replaceHostInUrl("HTTP://www.test.com:443/me/out?it=5&test=5","super:443")); 
    assertEquals("http://SUPERDUPER:443/ME/OUT?IT=5&TEST=5", replaceHostInUrl("HTTP://WWW.TEST.COM:443/ME/OUT?IT=5&TEST=5","SUPERDUPER:443")); 
    assertEquals("https://SUPERDUPER:23/ME/OUT?IT=5&TEST=5", replaceHostInUrl("HTTPS://WWW.TEST.COM:22/ME/OUT?IT=5&TEST=5","SUPERDUPER:23")); 
    assertEquals(null, replaceHostInUrl(null, null));
  }

  public static String replaceHostInUrl(String url, String newHost)
  {
    if (url == null || newHost == null)
    {
      return url;
    }

    try
    {
      URL originalURL = new URL(url);

      boolean hostHasPort = newHost.indexOf(":") != -1;
      int newPort = originalURL.getPort();
      if (hostHasPort)
      {
        URL hostURL = new URL("http://" + newHost);
        newHost = hostURL.getHost();
        newPort = hostURL.getPort();
      }
      else
      {
        newPort = -1;
      }

      // Use implicit port if it's a default port
      boolean isHttps = originalURL.getProtocol().equals("https");
      boolean useDefaultPort = (newPort == 443 && isHttps) || (newPort == 80 && !isHttps);
      newPort = useDefaultPort ? -1 : newPort;

      URL newURL = new URL(originalURL.getProtocol(), newHost, newPort, originalURL.getFile());
      String result = newURL.toString();

      return result;
    }
    catch (MalformedURLException e)
    {
      throw new RuntimeException("Couldn't replace host in url, originalUrl=" + url + ", newHost=" + newHost, e);
    }
  }

  public static void assertEquals(String expected, String actual)
  {
    if (expected == null && actual == null)
    {
      System.out.println("TEST PASSED, expected:" + expected + ", actual:" + actual);
      return;
    }
      
    if (! expected.equals(actual))
      throw new RuntimeException("Not equal! expected:" + expected + ", actual:" + actual);
      
    System.out.println("TEST PASSED, expected:" + expected + ", actual:" + actual);
  }
}

9 Comments

I'm impressed that the answer and solution were posted at the exact same time :)
@pruntlar Creating a question to directly answer it yourself is encouraged as it helps others with similar problems (SO: self-answer).
Yeah I tend to do this if I searched for something, didn't find the answer, and want to document it somewhere in case I ever need it again. It's supported by StackOverflow directly, to help share knowledge and foster discussion. The reason I do this is because there are probably better answers out there than mine, and if there are, I'll switch to using them, but for the time being, this works for my use case. Thanks!
I'd improve the question though, I find it a bit short and had you not posted an answer I'd be inclined to ask what you've tried and maybe even close it. One thing that would spring to my mind would be to use a regex to replace the host. There obviously are drawbacks/pitfalls with this but you could point out those requirements in the question.
@WilliMentzel - thanks, but I think they need to be there, as it helps verify edge cases, and makes it easy for others to test/compare their solutions in the online java repl too.
4

An alternative answer for people on Android: with the Uri class, you can use buildUpon() to create a builder based on the existing Uri and then set the authority, which covers both host and port.

Uri myUri = Uri.parse("http://localhost/me/out?it=5");
myUri = myUri.buildUpon().authority("myserver:20000").build();


3

I realize this is a pretty old question, but I'm posting a simpler solution in case someone else needs it.

String newUrl = new URIBuilder(URI.create(originalURL)).setHost(newHost).build().toString();

1 Comment

When you rely on external dependencies to do something, it's appropriate to mention which one. I guess you're using httpcomponents:httpclient's URIBuilder?!
1

I've added a method to do this in the RawHTTP library, so you can simply do this:

URI uri = RawHttp.replaceHost(oldUri, "new-host");

Added in this commit: https://github.com/renatoathaydes/rawhttp/commit/cbe439f2511f7afcb89b5a0338ed9348517b9163#diff-ff0fec3bc023897ae857b07cc3522366

Feedback welcome, will release it soon.


-1

Or using some regex magic:

public static String replaceHostInUrl(String url, String newHost) {
    if (url == null || newHost == null) {
        return null;
    }
    String s = url.replaceFirst("(?i)(?<=(https?)://)(www\\.)?\\w*(\\.com)?(:\\d*)?", newHost);
    if (s.contains("http://")) {
        s = s.replaceFirst(":80(?=/)", "");
    } else if (s.contains("https://")) {
        s = s.replaceFirst(":443(?=/)", "");
    }
    Matcher m = Pattern.compile("HTTPS?").matcher(s);
    if (m.find()) {
        s = s.replaceFirst(m.group(), m.group().toLowerCase());
    }
    return s;
}

1 Comment

Nice! And here's a repl for it that shows it works just as well as my answer.
