1

I'm having trouble indexing a result in a list.

The function similar() calculates the similarity between two strings as a value from 0 to 1. (Example: similar('dino','bino') = 0.75)
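For reference, here is a minimal sketch of what such a function might look like. The definition of similar() is not shown here, so building it on difflib.SequenceMatcher from the standard library is an assumption; it does happen to reproduce the 0.75 from the example.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Assumed definition: ratio of matching characters, a float between 0 and 1
    return SequenceMatcher(None, a, b).ratio()

print(similar('dino', 'bino'))  # 0.75
```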

What I want to do is compare every item in each sublist of x against every item in list y, and find the domain with the highest similarity for each sublist of x.

My expected output would be:

['smart phones', NaN, 'fruits']

Here's my code so far:

x = [['phones', 'galaxy samsung', 'iphone'],[],['fruit', 'food']] ##each sublist refers to one user
y = ['fruits', 'smart phones', 'fishing', 'cars']                 ##domains

point = [0] * len(x)
best_dom = ['n'] * len(x)

for list in x:
  i=0
  for query in list:
    for dom in y:
        sim = similar(query,dom)
        if sim > point[i]:
            point[i] = sim
            best_dom[i] = dom
i = i+1

print(best_dom)

But this is the output I'm getting:

['fruits', 'n', 'n']

2 Answers

1

My solution script consists of two parts:

  1. Generating a list shaped like x, sublist for sublist.
    Each sublist holds tuples of the form (<domain>, <similarity_value>): for every item in x, the best-matching domain in y and its similarity value.
  2. Getting the entry with the highest similarity value from each sublist, or 'NaN' if the sublist is empty.

Afterwards, the output list is printed.

# similar in shape to x, but storing (domain, similarity_value) for each item
similarity_values = []

# iterating over x to generate the values of similarity_values
for i, sublist in enumerate(x):
    similarity_values.append([])  # creating a new sublist inside
    for item in sublist:  # iterating over each item in a sublist
        # best-matching domain in y for this item (similar() needs both strings)
        best_dom = max(y, key=lambda dom: similar(item, dom))
        similarity_values[i].append((best_dom, similar(item, best_dom)))

# outputting the max values
output_list = []
for sublist in similarity_values:
    if not sublist:  # 'NaN' if the sublist is empty
        output_list.append('NaN')
    else:  # otherwise the first element of the tuple with the highest similarity_value
        output_list.append(max(sublist, key=lambda e: e[1])[0])
        # key for max() is set to the second element of the tuple

print(output_list)

I hope this will solve your problem!
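The two steps described above (score every pair, then take the maximum or 'NaN') can also be folded into a single pass. This is only a sketch: best_domains is a hypothetical helper name, and similar() is assumed to be based on difflib.SequenceMatcher, since its definition is not given in the question.

```python
from difflib import SequenceMatcher
from itertools import product

def similar(a, b):
    # Assumed stand-in for the question's similar() function
    return SequenceMatcher(None, a, b).ratio()

def best_domains(users, domains):
    # Hypothetical helper: one best-matching domain per user,
    # or 'NaN' for a user with no queries
    result = []
    for queries in users:
        # every (query, domain) pair for this user; empty if there are no queries
        pairs = product(queries, domains)
        best = max(pairs, key=lambda p: similar(*p), default=None)
        result.append(best[1] if best is not None else 'NaN')
    return result

x = [['phones', 'galaxy samsung', 'iphone'], [], ['fruit', 'food']]
y = ['fruits', 'smart phones', 'fishing', 'cars']

print(best_domains(x, y))  # ['smart phones', 'NaN', 'fruits']
```

The default=None argument is what lets max() cope with the empty sublist instead of raising a ValueError.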


0

I got it! I was not addressing the index in the correct way.

x = [['phones', 'galaxy samsung', 'iphone'],[],['fruit', 'food']] ##each sublist refers to one user
y = ['fruits', 'smart phones', 'fishing', 'cars']                 ##domains

point = [0] * len(x)
best_dom = ['n'] * len(x)

for user in x:
  for query in user:
    for dom in y:
      i = x.index(user)
      sim = similar(query,dom)
      if sim > point[i]:
        point[i] = sim
        best_dom[i] = dom
                
print(best_dom)
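One caveat with the loop above: x.index(user) returns the position of the first sublist equal to user, so two users with identical query lists would share one index. A possible variant, sketched here with enumerate to get the index directly and with float('nan') as the placeholder so the result matches the expected output, might look like this. The similar() definition is again an assumption based on difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Assumed stand-in for the question's similar() function
    return SequenceMatcher(None, a, b).ratio()

x = [['phones', 'galaxy samsung', 'iphone'], [], ['fruit', 'food']]
y = ['fruits', 'smart phones', 'fishing', 'cars']

point = [0] * len(x)                 # best similarity seen so far, per user
best_dom = [float('nan')] * len(x)   # stays NaN for users with no queries

for i, user in enumerate(x):         # i is the user's position, even for duplicates
    for query in user:
        for dom in y:
            sim = similar(query, dom)
            if sim > point[i]:
                point[i] = sim
                best_dom[i] = dom

print(best_dom)  # ['smart phones', nan, 'fruits']
```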
