I'm new to Python and have a question about vectorizing some code:

def makeNames2(nList):
    for nLi in nList:
        nLIdx=[i for i,j in enumerate(nList) if j==nLi]
        if nLIdx.__len__()>1:
            for i,j in enumerate(nLIdx):
                if i>0: nList[j]=nList[j]+str(i)
    return nList

which does the following:

>>> nLTest=['asda','asda','test','ada','test','yuil','test']
>>> print(makeNames2(nLTest))
['asda', 'asda1', 'test', 'ada', 'test1', 'yuil', 'test2']

The code works fine, but I was wondering whether there is a way to vectorize the for loops.

EDIT

Thanks, everyone, for all three answers. This is exactly what I was interested in, and I would have liked to select all of them. I can't accept more than one, but all of them work.

  • Could you explain what you mean by vectorize? Commented Feb 19, 2014 at 14:26
  • possible duplicate of how do I parallelize a simple python loop? Commented Feb 19, 2014 at 14:27
  • If you mean vectorize as in SSE3 stuff -- not with python out of the box. For some problems, you might be able to do it using 3rd party packages, but even then, it's hard to say when things are actually being vectorized vs. just pushed into a different language (e.g. C) in the implementation. You can parallelize it using multiprocessing (or sometimes threading depending on the problem and python implementation) Commented Feb 19, 2014 at 14:27
  • Also, if nLIdx.__len__()>1 can be written as if len(nLIdx)>1 or just if nLIdx Commented Feb 19, 2014 at 14:28
  • @NoelEvans if len(nLIdx) > 1 (greater than 1, meaning at least 2 elements) is not equivalent to if nLIdx. Commented Feb 19, 2014 at 14:31

3 Answers

nLTest, items = ['asda','asda','test','ada','test','yuil','test'], {}
for idx, item in enumerate(nLTest):
    nLTest[idx] += str(items.setdefault(item, 0) or "")
    items[item] += 1
print(nLTest)

Output

['asda', 'asda1', 'test', 'ada', 'test1', 'yuil', 'test2']

2 Comments

I think you can further avoid the if by using nLTest[idx] += str(items.setdefault(item, 0) or "")
@goncalopp Awesome :) Thanks man. Updated my answer with that.

You could simplify it a bit:

def makenames(lst):
    seen = {}
    for index, name in enumerate(lst):
        if name in seen:
            seen[name] += 1
            lst[index] = "{0}{1}".format(name, seen[name])
        else:
            seen[name] = 0
    return lst

This removes one of the for loops, operating in O(n) (dictionary access is O(1)).

Note that this modifies the list in-place; you may wish to have a new output list to append to instead. You could also simplify this slightly using defaultdict or Counter from the collections module.
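
The note above can be sketched as follows. This is only an illustration, not part of the original answer: the name make_names_copy is hypothetical, and Counter is used here as one of the two collections helpers mentioned. It builds a new list and leaves the input untouched.

```python
from collections import Counter


def make_names_copy(names):
    """Return a new list in which repeated names get a numeric suffix.

    Hypothetical helper: the input list is not modified.
    """
    seen = Counter()  # a Counter returns 0 for names it has not seen yet
    result = []
    for name in names:
        count = seen[name]
        # The first occurrence keeps its name; later ones get 1, 2, ...
        result.append(name + str(count) if count else name)
        seen[name] += 1
    return result


names = ['asda', 'asda', 'test', 'ada', 'test', 'yuil', 'test']
renamed = make_names_copy(names)
print(renamed)
```

Like the function above, this makes a single pass over the list, so it stays O(n).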


This is arguably more readable and avoids the O(n^2) behavior. It also does not modify the list in place.

from collections import defaultdict
def makeNames3(nList):
    counter= defaultdict(lambda:0)
    def posfix(x):
        n= counter[x]
        counter[x]+=1
        return str(n) if n>0 else ""
    return [x+posfix(x) for x in nList]

2 Comments

You can simply do defaultdict(int)
@thefourtheye You're right, I didn't know int had a default value. I'm not sure if this is a case of explicit is better than implicit, though
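
For reference, a small sketch of the point made in these two comments: calling int() with no arguments returns 0, so defaultdict(int) and defaultdict(lambda: 0) should behave the same way for counting. The variable names are only illustrative.

```python
from collections import defaultdict

# int() called with no arguments returns 0, so it works as a default factory.
zero = int()

by_lambda = defaultdict(lambda: 0)
by_int = defaultdict(int)

for word in ['test', 'ada', 'test']:
    by_lambda[word] += 1
    by_int[word] += 1

print(dict(by_lambda) == dict(by_int))
```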
