This is an exercise from DataQuest.

I guess I'm trying to iterate over an array but it won't let me. How is an array different from a list?

32561 is the sample size, and 16280.5 is the expected count for each group if the sample were exactly 50% male and 50% female.

import numpy as np
import matplotlib.pyplot as plt

chi_squared_values = []

for i in range(1000):
    random_n = np.random.random((32561,))
    for array in random_n:
        male_count = 0
        female_count = 0
        for n in array: # Error on this line
            if n < 0.5:
                male_count =+ 1
            else:
                female_count =+ 1
        male_diff = (male_count - 16280.5) ** 2 / 16280.5
        female_diff = (female_count - 16280.5) ** 2 / 16280.5
        chi_squared_value = male_diff + female_diff
        chi_squared_values.append(chi_squared_value)

plt.hist(chi_squared_values)
plt.show()

# Output: TypeError: 'numpy.float64' object is not iterable

The correct answer for reference is:

chi_squared_values = []
from numpy.random import random
import matplotlib.pyplot as plt

for i in range(1000):
    sequence = random((32561,))
    sequence[sequence < .5] = 0
    sequence[sequence >= .5] = 1
    male_count = len(sequence[sequence == 0])
    female_count = len(sequence[sequence == 1])
    male_diff = (male_count - 16280.5) ** 2 / 16280.5
    female_diff = (female_count - 16280.5) ** 2 / 16280.5
    chi_squared = male_diff + female_diff
    chi_squared_values.append(chi_squared)

plt.hist(chi_squared_values)
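
For comparison, the same counts can be taken without overwriting the random values with 0 and 1. This is only a sketch of one alternative, not the DataQuest solution: the helper name `chi_squared_once` and the constant `EXPECTED` are made up for illustration, and `np.count_nonzero` on a boolean mask is swapped in for the `len(sequence[...])` counting used above.

```python
import numpy as np

EXPECTED = 16280.5  # hypothetical constant: half of the 32561-row sample


def chi_squared_once(sample_size=32561, expected=EXPECTED):
    """Return one simulated chi-squared value for a 50/50 split."""
    sequence = np.random.random((sample_size,))
    # A boolean mask counts as True/False, so count_nonzero gives the
    # number of values below 0.5 directly.
    male_count = np.count_nonzero(sequence < 0.5)
    female_count = sample_size - male_count
    male_diff = (male_count - expected) ** 2 / expected
    female_diff = (female_count - expected) ** 2 / expected
    return male_diff + female_diff


# Repeat the simulation 1000 times, as in the exercise.
chi_squared_values = [chi_squared_once() for _ in range(1000)]
```

The resulting list can be passed to `plt.hist` exactly as before.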

That's not an array. That's a single number! Why do you think iterating over random_n would give you arrays? Commented Mar 1, 2019 at 0:13

1 Answer

Reduce the quantities so you can see what's happening:

for i in range(1):
    random_n = np.random.random((5,))
    for array in random_n:
        print("array", array)

Output:

array 0.134163286857
array 0.872361053661
array 0.794873889688
array 0.68134812363
array 0.726452821311

random_n is simply an array of floats. Thus, what you've named array is a single float. You cannot iterate over that.
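
To make the difference concrete, here is a small sketch (the variable names and the `is_iterable` helper are mine, not from the exercise) contrasting a 1-D array, whose elements are single floats, with a 2-D array, whose elements are rows you could loop over again:

```python
import numpy as np


def is_iterable(obj):
    """Hypothetical helper: True if a for loop over obj would work."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False


# 1-D array, same shape style as in the question: each element is one float.
flat = np.random.random((5,))
elements = [x for x in flat]
print(type(elements[0]), is_iterable(elements[0]))

# 2-D array: each element is a 1-D row, which can be iterated again.
matrix = np.random.random((3, 5))
rows = [row for row in matrix]
print(rows[0].shape, is_iterable(rows[0]))
```

A nested loop like the one in the question would only make sense for the second case.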

What are you trying to achieve by changing the structure of the solution? What is your inner loop supposed to do?

3 Comments

Okay. For some reason, I thought that range(1000) gave me 1000 extra arrays in the 32561 array. I removed the second for statement and removed one level of indentation from the if statement. The code is taking a long time to run. Does appending many values to a list slow down the program, or do I have an infinite loop?
If you thought that, then you need to back up a little and review what the for statement does. All you get is to repeat the loop body 1000 times.
I don't know for sure what your slowdown is. First, this is a separate problem; we need you to properly retire this one (for instance, accept an answer) and post your new problem as exactly that, with the code you're using. If I make the changes you describe verbatim, the program fails for an undefined symbol.
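
For anyone following along, here is a guess at what the loop version might look like with the extra level of iteration removed. It is a sketch under assumptions, not the asker's actual revised code: the function name `chi_squared_loop` is made up, and the outer loop runs 20 times rather than 1000 to keep the demo quick. Note that `=+ 1` in the question assigns the value +1 on every pass instead of incrementing, so `+=` is used here.

```python
import numpy as np


def chi_squared_loop(sample, expected):
    """Count values below/above 0.5 one element at a time (hypothetical helper)."""
    male_count = 0
    female_count = 0
    for n in sample:              # sample is 1-D, so n is a single float
        if n < 0.5:
            male_count += 1       # += increments; "=+ 1" would just assign +1
        else:
            female_count += 1
    male_diff = (male_count - expected) ** 2 / expected
    female_diff = (female_count - expected) ** 2 / expected
    return male_diff + female_diff, male_count, female_count


chi_squared_values = []
for i in range(20):               # 20 repetitions here; the exercise uses 1000
    sample = np.random.random((32561,))
    value, males, females = chi_squared_loop(sample, 16280.5)
    chi_squared_values.append(value)
```

With the full 1000 repetitions this makes about 32.5 million Python-level comparisons, which may account for a long run time. Appending 1000 values to a list is unlikely to be the bottleneck.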
