3

Using binarySearch never returns the right index

int j = Arrays.binarySearch(keys,key);

where keys is type String[] and key is type String

I read something about needing to sort the Array, but how do I even do that if that is the case?

Given all this I really just need to know:

How do you search for a String in an array of Strings (less than 1000) then?

11
  • If you read the documentation (j2ee.me/javase/6/docs/api/java/util/…, byte)), it says the array must be sorted and suggests using sort(). Commented Nov 21, 2009 at 0:47
  • 1
    It's not java that has to be smarter, you know Commented Nov 21, 2009 at 0:48
  • Damn, it ate my url: j2ee.me/javase/6/docs/api/java/util/… Commented Nov 21, 2009 at 0:48
  • 1
    How the heck did you think it could do a binary search if the data you were searching wasn't sorted? How do you think a binary search works? Commented Nov 21, 2009 at 0:52
  • 1
    If the array is always short then binarySearch would probably be overkill. HOWEVER, you should profile it to be sure. The array may be short, but if you search through it many times, a linear search over a short array may be significantly slower than a binarySearch on it. Don't assume that because something is small it will be OK to do it the slower way. Sometimes it does matter. Commented Nov 21, 2009 at 1:12

5 Answers 5

12

From Wikipedia:

"In computer science, a binary search is an algorithm for locating the position of an element in a sorted list by checking the middle, eliminating half of the list from consideration, and then performing the search on the remaining half.[1][2] If the middle element is equal to the sought value, then the position has been found; otherwise, the upper half or lower half is chosen for search based on whether the element is greater than or less than the middle element."

So the prerequisite for binary search is that the data is sorted. It has to be sorted because the algorithm cuts the array in half and looks at the middle element. If the middle element is what it is looking for, it is done. If the middle element is larger, it takes the lower half of the array. If the middle element is smaller, it takes the upper half of the array. Then the process is repeated (look in the middle, etc.) until the element is found (or not).

If the data isn't sorted the algorithm cannot work.
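The halving procedure described above can be sketched as follows. This is only an illustration of the idea, not the JDK's implementation; the class name BinarySearchSketch and the method search are made up for this example, and the array is assumed to be sorted in ascending natural order with no null elements.

```java
public class BinarySearchSketch {
    // Hypothetical helper: returns the index of key in a sorted array, or -1 if absent.
    // Assumes 'sorted' is in ascending natural order and contains no nulls.
    public static int search(String[] sorted, String key) {
        int low = 0;
        int high = sorted.length - 1;

        while (low <= high) {
            int mid = low + (high - low) / 2;      // middle of the remaining range
            int cmp = sorted[mid].compareTo(key);

            if (cmp == 0) {
                return mid;                        // found it
            } else if (cmp > 0) {
                high = mid - 1;                    // middle is larger: keep the lower half
            } else {
                low = mid + 1;                     // middle is smaller: keep the upper half
            }
        }

        return -1;                                 // range is empty: not found
    }

    public static void main(String[] args) {
        String[] sorted = { "apple", "banana", "cherry", "date", "fig" };
        System.out.println(search(sorted, "date"));   // prints 3
        System.out.println(search(sorted, "grape"));  // prints -1
    }
}
```

On unsorted input the comparisons would discard the wrong half, which is why the result looks random.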

So you would do something like:

final String[] data;
final int      index;

data = new String[] { /* init the elements here or however you want to do it */ };
Arrays.sort(data); // data is an array, so use Arrays.sort (Collections.sort only accepts a List)
index = Arrays.binarySearch(data, value);

or, if you do not want to sort it do a linear search:

int index = -1; // not found

for(int i = 0; i < data.length; i++)
{
    if(data[i].equals(value))
    {
        index = i;
        break; // stop looking
    }
}

And for completeness here are some variations with the full method:

// strict one - disallow nulls for everything
public static <T> int linearSearch(final T[] data, final T value)
{
    int index;

    if(data == null)
    {
        throw new IllegalArgumentException("data cannot be null");
    }

    if(value == null)
    {
        throw new IllegalArgumentException("value cannot be null");
    }

    index = -1;

    for(int i = 0; i < data.length; i++)
    {
        if(data[i] == null)
        {
            throw new IllegalArgumentException("data[" + i + "] cannot be null");
        }

        if(data[i].equals(value))
        {
            index = i;
            break; // stop looking
        }
    }    

    return (index);
}

// allow null for everything

public static <T> int linearSearch(final T[] data, final T value)
{
    int index;

    index = -1;

    if(data != null)
    {
        for(int i = 0; i < data.length; i++)
        {
            if(value == null)
            {
                if(data[i] == null)
                {
                    index = i;
                    break;
                }
            } 
            else
            {            
                if(value.equals(data[i]))
                {
                    index = i;
                    break; // stop looking
                }
            }
        }    
    }

    return (index);
}

You can fill in the other variations, like not allowing a null data array, or not allowing null in the value, or not allowing null in the array. :-)

Based on the comments, the following version behaves the same as the permissive one, and since the library does most of the work it is preferable to the permissive version above. If you want to be paranoid and not allow null for anything, you are stuck with the strict version at the top. (This version is basically as fast as the hand-written one, since the overhead of the extra method call (asList) probably goes away at runtime.)

public static <T> int linearSearch(final T[] data, final T value)
{
    final int index;

    if(data == null)
    {
        index = -1;
    }
    else
    {
        final List<T> list;

        list  = Arrays.asList(data);
        index = list.indexOf(value);
    }

    return (index);
}

20 Comments

+1, though I'm afraid your answer might have fallen on deaf ears.
So much latent hostility here I hesitate to respond. :) Built in linear search in Java: int index = Arrays.asList(array).indexOf(key); The one extra allocation is a cheap pass-through so don't let it fool you. It's cheap and easy.
It equates to a loop almost identical to yours except without the extra stack var since they early return versus set and break. Which to me seems like it would be lost in the noise, so to speak.
No, Arrays.asList() doesn't copy the array... it just wraps it in a thin wrapper that passes directly through to the array. The only additional overhead is the creation of that wrapper (and one more method indirection).
And it doesn't vary from JVM to JVM as it's documented to act that way.
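A small sketch of the point made in these comments: Arrays.asList wraps the array instead of copying it, so a change to the array shows through the list. The class name AsListView and the helper indexOf are invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;

public class AsListView {
    // Hypothetical helper: linear search using the built-in List.indexOf.
    // Returns -1 when key is not present; null keys and null elements are handled.
    public static int indexOf(String[] array, String key) {
        return Arrays.asList(array).indexOf(key);
    }

    public static void main(String[] args) {
        String[] keys = { "pear", "apple", "mango" };
        List<String> view = Arrays.asList(keys);

        keys[0] = "kiwi";                            // modify the array after wrapping it
        System.out.println(view.get(0));             // prints kiwi: the list is a view, not a copy

        System.out.println(indexOf(keys, "mango"));  // prints 2
        System.out.println(indexOf(keys, "pear"));   // prints -1 (it was overwritten above)
    }
}
```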
8

java.util.Arrays.sort(myArray);

That's how binarySearch is designed to work - it assumes sorting so that it can find faster.
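As a rough sketch of that sort-then-search sequence (the class name SortThenSearch and the method sortAndFind are invented for this example): note that Arrays.sort reorders the array in place, so the returned index refers to the sorted order, not the original one, and a key that is absent yields a negative number rather than a fixed -1.

```java
import java.util.Arrays;

public class SortThenSearch {
    // Hypothetical helper: sorts keys in place, then binary-searches for key.
    // The result is an index into the SORTED array, or a negative value if key is absent.
    public static int sortAndFind(String[] keys, String key) {
        Arrays.sort(keys);                      // natural (compareTo) order
        return Arrays.binarySearch(keys, key);
    }

    public static void main(String[] args) {
        String[] keys = { "pear", "apple", "mango" };

        int found = sortAndFind(keys, "mango");
        System.out.println(Arrays.toString(keys)); // prints [apple, mango, pear]
        System.out.println(found);                 // prints 1

        int missing = sortAndFind(keys, "kiwi");
        System.out.println(missing < 0);           // prints true: negative means "not found"
    }
}
```

If the original positions matter, sort a copy (for example one made with Arrays.copyOf) or fall back to a linear search.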

If you just want to find something in a list in O(n) time, don't use BinarySearch, use indexOf. All other implementations of this algorithm posted on this page are wrong because they fail when the array contains nulls, or when the item is not present.

public static int indexOf(final Object[] array, final Object objectToFind, int startIndex) {
    if (array == null) {
        return -1;
    }
    if (startIndex < 0) {
        startIndex = 0;
    }
    if (objectToFind == null) {
        for (int i = startIndex; i < array.length; i++) {
            if (array[i] == null) {
                return i;
            }
        }
    } else {
        for (int i = startIndex; i < array.length; i++) {
            if (objectToFind.equals(array[i])) {
                return i;
            }
        }
    }
    return -1;
}

4 Comments

It doesn't return the right index because it assumes sorting (for performance) and your array is not sorted.
He did answer your question! It doesn't find it because the array is not sorted. To do a binary search the array must be sorted. The answer is to first sort the array using Arrays.sort and then call binarySearch on it.
If you just want a linear search, use indexOf from ArrayUtils.
No it does not. If you do not provide a comparator it uses the "natural order" of the elements, which means it uses the compareTo method that String implements from the Comparable interface: java.sun.com/javase/6/docs/api/java/util/…
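To illustrate the ordering point in the last comment: without a comparator, both Arrays.sort and Arrays.binarySearch use String's compareTo. If a different order is wanted, the same Comparator has to be passed to both calls. The sketch below uses String.CASE_INSENSITIVE_ORDER as an example; the class name CaseInsensitiveSearch and the method find are made up for this illustration.

```java
import java.util.Arrays;

public class CaseInsensitiveSearch {
    // Hypothetical helper: sorts and searches using the same case-insensitive comparator.
    // The result is an index into the sorted array, or a negative value if key is absent.
    public static int find(String[] keys, String key) {
        Arrays.sort(keys, String.CASE_INSENSITIVE_ORDER);
        return Arrays.binarySearch(keys, key, String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        String[] keys = { "delta", "Alpha", "charlie", "Bravo" };

        System.out.println(find(keys, "CHARLIE"));   // prints 2
        System.out.println(Arrays.toString(keys));   // prints [Alpha, Bravo, charlie, delta]
    }
}
```

Sorting with one order and searching with another breaks the sortedness assumption just as an unsorted array does.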
1

To respond correctly to your question as you have put it: use brute force (a linear search).

Comments

0

I hope it will help

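public int find(String first[], int start, int end, String searchString){
    // The array must already be sorted. An empty range means the string is not present;
    // without this check the recursion never terminates for a missing string.
    if(start > end){
        return -1;
    }
    int mid = start + (end-start)/2;
    int cmp = first[mid].compareTo(searchString);
    if(cmp == 0){
        return mid;
    }
    if(cmp > 0){
        // middle element is larger: search the lower half
        return find(first, start, mid-1, searchString);
    }
    // middle element is smaller: search the upper half
    return find(first, mid+1, end, searchString);
}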

Comments

-2

Of all the overloaded versions of binarySearch in Java, there is no version that takes a String argument. However, there are three overloads of binarySearch that might be helpful in your situation:

static int binarySearch(char[] a, char key)
static int binarySearch(Object[] a, Object key)
static <T> int binarySearch(T[] a, T key, Comparator<? super T> c)

5 Comments

Oh yeah, char one in particular would work exceptionally well.
@Michael: it's an array of strings not a string.
And String in java ain't a char[] either.
Out of those 3, only the (Object[], Object) one is what would be needed for this situation. The char[] one won't work (a String is not a char[]), and the Comparator version is not needed unless you want to change the default order (alphabetical) in which Strings are compared to one another.
Now if only a String were an Object!
