I have a list of URLs such as

  • http://www.example.com/pk/ca,
  • http://www.example.com/pk,
  • http://www.example.com/anthingcangoeshere/pk, and
  • http://www.example.com/pkisnotnecessaryhere.

Now, I want to find only those URLs that end with /pk or /pk/ and don't have anything in between .com and /pk.

  • Your question isn't very clear. Give many more examples of what you do want to match and what you don't want to match. Commented Apr 18, 2010 at 10:13
  • It's still not clear. Does the URL have to contain .com? Commented Apr 18, 2010 at 10:15
  • @Mark yes, it should contain .com Commented Apr 18, 2010 at 10:17
  • Maybe you mean “URLs whose path is /pk or starts with /pk/” or “URLs whose first path segment is pk”? Commented Apr 18, 2010 at 10:17
  • This is a very useful page to learn regex: zytrax.com/tech/web/regex.htm Commented Apr 18, 2010 at 10:54

4 Answers

Your problem isn't fully defined, so I can't give you an exact answer, but this should be a start you can use:

^[^:]+://[^/]+\.com/pk/?$

These strings will match:

http://www.example.com/pk
http://www.example.com/pk/
https://www.example.com/pk

These strings won't match:

http://www.example.co.uk/pk
http://www.example.com/pk/ca
http://www.example.com/anthingcangoeshere/pk
http://www.example.com/pkisnotnecessaryhere
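
To check the two lists above, here is a minimal Java sketch that runs the pattern against each sample URL. The class name `PkUrlFilter` and the method `isPkUrl` are hypothetical names chosen for illustration; the only change to the pattern is the doubled backslash that a Java string literal requires.

```java
import java.util.regex.Pattern;

public class PkUrlFilter {
    // Pattern from the answer; "\\." is a literal dot in a Java string literal.
    private static final Pattern PK_URL =
            Pattern.compile("^[^:]+://[^/]+\\.com/pk/?$");

    // Hypothetical helper: true if the whole string is scheme://host.com/pk
    // with an optional trailing slash.
    static boolean isPkUrl(String url) {
        return PK_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/pk",
            "http://www.example.com/pk/",
            "https://www.example.com/pk",
            "http://www.example.co.uk/pk",
            "http://www.example.com/pk/ca",
            "http://www.example.com/anthingcangoeshere/pk",
            "http://www.example.com/pkisnotnecessaryhere"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + isPkUrl(url));
        }
    }
}
```

Because `[^/]+` cannot cross a slash, the host must end in `.com` and `/pk` must follow it directly.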

String pattern = "^http://www\\.example\\.com/pk/?$";

Hope this helps.

Some details: the anchors matter when you search with find() (with matches(), the whole string has to match anyway). If you don't add ^ to the beginning of the pattern, then foobarhttp://www.example.com/pk/ will be accepted too. If you don't add $ to the end of the pattern, then http://www.example.com/pk/foobar will be accepted too.
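
The effect of the anchors can be seen in a small Java sketch that uses `Matcher.find()`. The class and method names are hypothetical, and the dots in the host name are escaped here so that they match literal dots only.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // Same pattern with and without the ^ and $ anchors.
    static final Pattern ANCHORED =
            Pattern.compile("^http://www\\.example\\.com/pk/?$");
    static final Pattern UNANCHORED =
            Pattern.compile("http://www\\.example\\.com/pk/?");

    // find() looks for the pattern anywhere in the input.
    static boolean foundAnchored(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean foundUnanchored(String s) {
        return UNANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String prefixed = "foobarhttp://www.example.com/pk/";
        String suffixed = "http://www.example.com/pk/foobar";
        System.out.println(foundUnanchored(prefixed)); // true: junk before is ignored
        System.out.println(foundUnanchored(suffixed)); // true: junk after is ignored
        System.out.println(foundAnchored(prefixed));   // false
        System.out.println(foundAnchored(suffixed));   // false
    }
}
```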

Directly translating your request "[...] URLs that end with /pk or /pk/ and don't have anything in between .com and /pk", with the additional assumption that there must always be a ".com", yields these regexes:

If you use find():

\.com/pk/?$

If you use matches():

.*\.com/pk/?

Other answers given here use more restrictive patterns that only allow URLs closer to your examples. In particular, my pattern does not validate that the given string is a syntactically valid URL.
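
As a rough illustration of the two variants, the following Java sketch applies each pattern with the method it was written for. The class and method names are hypothetical; `Matcher.find()` and `Matcher.matches()` are the standard `java.util.regex` calls.

```java
import java.util.regex.Pattern;

public class FindVsMatches {
    // For find(): only the tail of the string is described, so $ is needed.
    static final Pattern FOR_FIND = Pattern.compile("\\.com/pk/?$");
    // For matches(): the whole string must match, so a leading .* is needed.
    static final Pattern FOR_MATCHES = Pattern.compile(".*\\.com/pk/?");

    static boolean viaFind(String url) {
        return FOR_FIND.matcher(url).find();
    }

    static boolean viaMatches(String url) {
        return FOR_MATCHES.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.example.com/pk",
            "http://www.example.com/pk/ca",
            "http://www.example.com/anthingcangoeshere/pk",
            "notaurl.com/pk/"   // accepted: the pattern does not validate URLs
        };
        for (String s : samples) {
            System.out.println(s + " -> " + viaFind(s) + " / " + viaMatches(s));
        }
    }
}
```

Both variants should agree on every input; the last sample shows that a string without a scheme is accepted as well.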

String pattern = "^https?://(www\\.)?.+\\.com/pk/?$";
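
A short Java sketch of how this pattern behaves, with the backslashes doubled as a Java string literal requires. The class and method names are hypothetical. Note that `.+` can also match slashes, so this pattern is looser than one that uses `[^/]+` for the host.

```java
import java.util.regex.Pattern;

public class HttpPkCheck {
    // Requires an http or https scheme and a ".com/pk" tail.
    static final Pattern PATTERN =
            Pattern.compile("^https?://(www\\.)?.+\\.com/pk/?$");

    static boolean accepts(String url) {
        return PATTERN.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("http://www.example.com/pk"));     // true
        System.out.println(accepts("https://example.com/pk/"));       // true
        System.out.println(accepts("ftp://www.example.com/pk"));      // false: scheme
        System.out.println(accepts("http://www.example.com/pk/ca"));  // false
        // true: ".+" swallows "example.org/foo", slash included
        System.out.println(accepts("http://example.org/foo.com/pk"));
    }
}
```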
