4

I'm looking for a way to convert a String to a map from each char to its int index (a Map<Character, Integer>). Obviously this assumes the String has unique characters.

Is there a way to do it with Streams? I thought this would work but am running into errors trying to compile it:

String alphabet = "0123456789";
alphabet.chars().collect(Collectors.toMap(alphabet::charAt, Function.identity()));

Basically I want something equivalent to the Python code:

{c: i for i, c in enumerate("0123456789")}

  • What happens for duplicates in the input, like "bee"? What should 'e' map to? Commented Apr 19, 2015 at 4:37
  • Beware that Map<Character, Integer> is a very expensive data structure due to boxing. Do not use it in performance-critical code. Consider e.g. Goldman Sachs Collections instead (though conversion will be more expensive). enumerate in Python isn't cheaper either. Commented Apr 19, 2015 at 7:18

4 Answers

6

This works.

Map<Integer, Character> map =
  IntStream.range(0, alphabet.length()).boxed()
           .collect(Collectors.toMap(x -> x, alphabet::charAt));

edit:

Of course, if you used Scala instead (which also runs on the JVM), you could write this much more concisely: alphabet.zipWithIndex.toMap


4 Comments

This is pretty elegant, but elaborating a bit more on why it works the way it does would make it really complete.
Thanks. The .boxed() call is what I was missing. Just for clarification I wanted a Map<Character, Integer> but yours is pretty much what I want so I accepted.
As with most Java related questions, this can be answered by looking at the API documentation. docs.oracle.com/javase/8/docs/api . An IntStream object doesn't have the type of collect method you need. boxed() converts it to a normal Stream.
Note that boxed streams are even more expensive; but if you want to build a boxed map, you'll have to pay that price eventually.
6

You asked in comments how to do it with 3-argument IntStream#collect. Here it is:

Map<Character, Integer> map = IntStream.range(0, alphabet.length())
    .collect(HashMap::new, (m, i) -> m.put(alphabet.charAt(i), i), Map::putAll);
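
To make the three roles explicit, here is a minimal, self-contained sketch of the same call with each argument written out as a lambda instead of a method reference. The class name ThreeArgCollectDemo and the method name indexMap are made up for illustration; the behaviour should match the snippet above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

// Hypothetical demo class; only the collect(...) call mirrors the answer.
class ThreeArgCollectDemo {

    // Builds a char -> index map with IntStream's three-argument collect.
    static Map<Character, Integer> indexMap(String alphabet) {
        return IntStream.range(0, alphabet.length())
            .collect(
                () -> new HashMap<Character, Integer>(),  // supplier, same role as HashMap::new
                (m, i) -> m.put(alphabet.charAt(i), i),   // accumulator (ObjIntConsumer): add one index
                (m1, m2) -> m1.putAll(m2));               // combiner, same as Map::putAll
    }

    public static void main(String[] args) {
        System.out.println(indexMap("0123456789"));
    }
}
```

The combiner only runs when the stream is parallel, which is why a sequential run never notices what it does.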

If you want to do a duplicate check like toMap does and still use the 3-argument form, you will have to update both the 2nd (accumulator) and 3rd (combiner) arguments to do the check:

map = IntStream.range(0, alphabet.length())
    .collect(
        HashMap::new, 
        (m, i) -> m.merge(alphabet.charAt(i), i, 
            (a,b) -> {throw new IllegalStateException("duplicate!");}
        ), 
        (m1, m2) -> m2.forEach((c,i) -> 
            m1.merge(c, i, (a,b) -> {throw new IllegalStateException("duplicate!");})
        )
    );

As you can see, this gets ugly, and you would probably want to define a helper method to simplify it. If you are willing to disallow parallel streams, you can get away with checking for duplicates only in the accumulator, but then the combiner has to throw an exception:

map = IntStream.range(0, alphabet.length())
    .collect(
        HashMap::new,
        (m, i) -> m.merge(alphabet.charAt(i), i,
            (a,b) -> {throw new IllegalStateException("duplicate!");}
        ),
        (m1, m2) -> {throw new AssertionError("parallel not allowed");}
    );

Unless you are doing this as an exercise, it's better to just leave all this mess for toMap to deal with. That's what it's there for.
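
For what it's worth, the helper method mentioned above could look roughly like the sketch below. The names StrictIndexMap, putUnique and indexMap are hypothetical; the idea is just to move the throwing merge call into one place so the accumulator and combiner stay short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

// Hypothetical helper class, not part of any library.
class StrictIndexMap {

    // Inserts key -> value, or throws if the key is already present.
    static <K, V> void putUnique(Map<K, V> map, K key, V value) {
        map.merge(key, value, (oldValue, newValue) -> {
            throw new IllegalStateException("duplicate key: " + key);
        });
    }

    // char -> index map that rejects repeated characters,
    // both while accumulating and while combining partial maps.
    static Map<Character, Integer> indexMap(String alphabet) {
        return IntStream.range(0, alphabet.length())
            .collect(
                HashMap::new,
                (m, i) -> putUnique(m, alphabet.charAt(i), i),
                (m1, m2) -> m2.forEach((c, i) -> putUnique(m1, c, i)));
    }
}
```

Calling StrictIndexMap.indexMap("bee") should fail with an IllegalStateException on the second 'e', which is the same kind of failure toMap gives you.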

7 Comments

I'm trying to work it out now. The only difference is that using boxed() and then toMap() enforces a unique-key constraint, whereas yours won't throw an IllegalStateException on duplicates. Thanks though.
Logic for duplicates just means turning the (m, i) into an expression and checking for duplicates though, you put me on the right track. Thanks again!
@aamit915 be careful how you do it. I will update my answer.
Quick question, since I'm still pretty new to lambdas: why does Map::putAll work? I would've used (m1,m2) -> m1.putAll(m2) since putAll accepts a Map, it's an instance method, and the argument lengths of putAll and accept are different.
@TNT Map::putAll is a method reference that's equivalent to (m1,m2) -> m1.putAll(m2). Take a look at JLS for all the possible method reference expressions: docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.13
5

This creates a map from each character to its index:

Map<Character, Integer> map = IntStream.range(0, alphabet.length())
    .boxed().collect(Collectors.toMap(alphabet::charAt, Function.identity()));

Note that it won't work if you have repeated characters.
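
If repeated characters are possible, one option is the three-argument Collectors.toMap overload, which takes a merge function. The sketch below assumes you want to keep the first index of each character; that is a guess at the desired semantics, not something the question specifies. The names FirstIndexMap and firstIndexMap are made up for illustration.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class showing toMap with a merge function.
class FirstIndexMap {

    // char -> index of its first occurrence (assumed policy for duplicates).
    static Map<Character, Integer> firstIndexMap(String s) {
        return IntStream.range(0, s.length())
            .boxed()
            .collect(Collectors.toMap(
                s::charAt,                // key: the character at index i
                Function.identity(),      // value: the index itself
                (first, later) -> first   // on a duplicate key, keep the earlier index
            ));
    }
}
```

To keep the last index instead, swap the merge function for (first, later) -> later.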

3 Comments

Thanks, exactly what I wanted. Any idea why it won't work without the call to .boxed()?
Because IntStream#collect takes in 3 arguments (Supplier<R> supplier, ObjIntConsumer<R> accumulator, BiConsumer<R,R> combiner), and you need to use .boxed() to convert it to a Stream<T> which can accept Collectors.
Note also that boxed() is simply shorthand for mapToObj(Integer::valueOf).
0

As you can see from the various solutions posted here, stream() doesn't make things a whole lot easier.

While functional-style spaghetti code is all the rage right now, I suggest using imperative programming for readability.

for(int i=0; i<s.length(); i++) map.put(s.charAt(i), i);

Seriously, procedural can be much nicer sometimes! People underappreciate the clarity of procedural code, and its efficiency.
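
Spelled out as a self-contained method, the loop might look like this. The names LoopIndexMap and indexMap are hypothetical, and the choice of HashMap is an assumption since the one-liner doesn't say which Map it uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper around the one-line loop.
class LoopIndexMap {

    static Map<Character, Integer> indexMap(String s) {
        Map<Character, Integer> map = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            // put overwrites silently, so a repeated character keeps its last index
            map.put(s.charAt(i), i);
        }
        return map;
    }
}
```

Unlike toMap, this version does not complain about duplicates; whether that is acceptable depends on the input.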

However, you probably want to use this code instead for your scenario:

if (c<'0' || c>'9') throw new RuntimeException("Not a digit");
int digit = c - '0';

which is much cheaper (faster, and 0 memory) than a Map.
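
Wrapped in a method, that check could look like the sketch below. It assumes the alphabet really is just the ASCII digits "0123456789"; the names DigitLookup and digitValue are made up, and IllegalArgumentException (a RuntimeException subclass) is used here only as a more specific choice.

```java
// Hypothetical helper for the digits-only case; no Map and no boxing involved.
class DigitLookup {

    // Returns 0..9 for '0'..'9' and rejects everything else.
    static int digitValue(char c) {
        if (c < '0' || c > '9') {
            throw new IllegalArgumentException("Not a digit: " + c);
        }
        return c - '0'; // the digit characters are contiguous, so subtraction gives the index
    }

    public static void main(String[] args) {
        System.out.println(digitValue('7'));
    }
}
```

This only works because the digits are contiguous and in order; for an arbitrary alphabet you would still need some kind of lookup structure.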
