0

Possible Duplicate:
Java - Regex problem

I have a list of URLs of the following forms:

  • http://www.example.com/pk/etc
  • http://www.example.com/pk/etc/
  • http://www.example.com/pk/etc/etc

where etc can be anything.

So I want to search only for those URLs that contain www.example.com/pk/etc or www.example.com/pk/etc/.

Note for anyone who thinks this is a duplicate question: kindly read both questions carefully before marking this one as a duplicate. If, even after reading them, you can't see that the two questions are different, then kindly leave without marking it as a duplicate, because I can't explain the difference in any more detail.

8
  • Why are you asking the same question again? stackoverflow.com/questions/2661849/java-regex-problem Commented Apr 18, 2010 at 10:22
  • You know that if a string contains www.example.com/pk/etc/ it also contains www.example.com/pk/etc since that’s a prefix of www.example.com/pk/etc/. Commented Apr 18, 2010 at 10:23
  • @Cletus Look at the questions carefully... they are different questions. Commented Apr 18, 2010 at 10:23
  • @Yatendra Goel: Then please tell us the difference. Commented Apr 18, 2010 at 10:24
  • 1
    Maybe you should take a more systematic approach than regular expressions. Java has an excellent URI class that can be used to parse and process URIs. Commented Apr 18, 2010 at 10:36
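
As a rough sketch of the URI-based approach suggested in the last comment, each URL could be parsed with java.net.URI and its path inspected directly instead of matching the whole string with one regex. The class and method names below are hypothetical, the host www.example.com is taken from the question, and the rule "exactly one non-empty segment after /pk/, with an optional trailing slash" is an assumption about what the asker wants.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriSegmentCheck {
    // Hypothetical helper: true if the URL is on www.example.com and its path
    // is /pk/<segment> or /pk/<segment>/ with exactly one non-empty segment.
    static boolean hasSingleSegmentAfterPk(String url) {
        try {
            URI uri = new URI(url);
            if (!"www.example.com".equalsIgnoreCase(uri.getHost())) {
                return false;
            }
            String path = uri.getPath();
            if (path == null || !path.startsWith("/pk/")) {
                return false;
            }
            String rest = path.substring("/pk/".length());
            // Allow a single optional trailing slash.
            if (rest.endsWith("/")) {
                rest = rest.substring(0, rest.length() - 1);
            }
            // The remainder must be one non-empty segment with no further slashes.
            return !rest.isEmpty() && rest.indexOf('/') < 0;
        } catch (URISyntaxException e) {
            // Strings that are not valid URIs are treated as non-matching.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/pk/etc",
            "http://www.example.com/pk/etc/",
            "http://www.example.com/pk/etc/etc"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + hasSingleSegmentAfterPk(url));
        }
    }
}
```

With this rule, the first two URL forms from the question are accepted and the third is rejected.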

3 Answers

1
String pattern = "http://www.example.com/pk/[^/]+/?$";

I am assuming http://www.example.com/pk// is not accepted. If this should be accepted too, then use

String pattern = "http://www.example.com/pk/[^/]*/?$";

7 Comments

@Bytecode Does + mean "one or more" or "zero or more"?
@Yatendra: + means "one or more"
@Bytecode... I think that it will also match example.com/pk/anything/anything. If that is true, then it is not acceptable: the URL must have only one /.../ or /... segment after pk.
Then you should add $ after the ?: "example.com/pk/[^/]+/?$". Please see the updated answer.
Note that in regular expressions . means any character. If you mean a literal dot then you must escape it.
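
To illustrate the point about the unescaped dot, here is a small sketch comparing the pattern from this answer as written with a variant in which the dots in the host name are escaped. The class name, method names, and the look-alike URL are made up for the demonstration. Matcher.matches() is used, which requires the whole string to match.

```java
import java.util.regex.Pattern;

class DotEscapeDemo {
    // Pattern from the answer: each "." matches any single character.
    static final Pattern UNESCAPED =
            Pattern.compile("http://www.example.com/pk/[^/]+/?$");

    // Same pattern with the dots escaped so they match only a literal ".".
    static final Pattern ESCAPED =
            Pattern.compile("http://www\\.example\\.com/pk/[^/]+/?$");

    static boolean matchesUnescaped(String url) {
        return UNESCAPED.matcher(url).matches();
    }

    static boolean matchesEscaped(String url) {
        return ESCAPED.matcher(url).matches();
    }

    public static void main(String[] args) {
        // Hypothetical look-alike host with "X" where the first dot should be.
        String lookalike = "http://wwwXexample.com/pk/etc";
        System.out.println(matchesUnescaped(lookalike)); // true
        System.out.println(matchesEscaped(lookalike));   // false
    }
}
```

If the host is a fixed literal, Pattern.quote("www.example.com") is another way to avoid escaping each dot by hand.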
1

Your problem isn't fully defined, so I can't give you an exact answer, but this should be a start you can use:

^[^:]+://[^/]+\.com/pk/[^/]+/?$

The difference is that the / is no longer optional and there must be at least one more character after pk/.

These strings will match:

http://www.example.com/pk/ca
http://www.example.com/pk/ca/
https://www.example.com/pk/ca/

These strings won't match:

http://www.example.com/pk//
http://www.example.co.uk/pk/ca
http://www.example.com/pk
http://www.example.com/pk/
http://www.example.com/anthingcangoeshere/pk
http://www.example.com/pkisnotnecessaryhere
http://www.example.com/pk/ca/sf
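
The two lists above can be checked in Java with a short sketch like the following. The class and method names are hypothetical. The regex is the one from this answer, with the backslash doubled as required inside a Java string literal.

```java
import java.util.regex.Pattern;

class PkUrlPatternCheck {
    // The pattern from this answer; "\\." in the string literal is "\." in the regex.
    static final Pattern PK_URL =
            Pattern.compile("^[^:]+://[^/]+\\.com/pk/[^/]+/?$");

    static boolean matches(String url) {
        return PK_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/pk/ca",
            "http://www.example.com/pk/ca/",
            "https://www.example.com/pk/ca/",
            "http://www.example.com/pk//",
            "http://www.example.co.uk/pk/ca",
            "http://www.example.com/pk",
            "http://www.example.com/pk/",
            "http://www.example.com/anthingcangoeshere/pk",
            "http://www.example.com/pkisnotnecessaryhere",
            "http://www.example.com/pk/ca/sf"
        };
        // Print each URL together with whether the pattern accepts it.
        for (String url : urls) {
            System.out.println(url + " -> " + matches(url));
        }
    }
}
```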

1 Comment

@Mark Yes, you are right that at least one more character must be there after pk/. There should be only one segment, /... or /.../, after pk and not more than one, so example.com/pk/ca/sf must not be matched.
0

So I want to search only for those URLs that contain www.example.com/pk/etc or www.example.com/pk/etc/.

Update

I think this will work:

https?://.*\\.?[A-Za-z0-9]+\\.com/pk/etc/?[^.]

But every item in the list you gave contains what you are searching for.
