276

I am trying to build a histogram of counts, so I create buckets. I know I could just go through and append a bunch of zeros, i.e. something along these lines:

buckets = []
for i in xrange(0,100):
    buckets.append(0)

Is there a more elegant way to do it? I feel like there should be a way to just declare an array of a certain size.

I know numpy has numpy.zeros but I want the more general solution

4
  • 5
    Python's lists are lists, not arrays. And in Python you don't declare stuff like you do in C: you define functions and classes (via def and class statements), and assign to variables which, if they don't exist already, are created magically on first assignment. Also, variables (and lists) are not memory regions that contain objects, but names referring to them. One object can be contained in only one memory region but can be referenced by several names. Commented Oct 30, 2010 at 1:10
  • 1
    Python doesn't have "declarations", especially of containers with a size but unspecified contents. You want something, you write an expression. Commented Oct 30, 2010 at 1:14
  • 3
    ...and the semicolons are completely unnecessary Commented Oct 30, 2010 at 2:03
  • 1
    Not a duplicate. The perceived need for an air-quotes empty list starts a different conversation about list allocation and assignment. Also there should be two landing pages for the different search terms, which the stats indicate are common. Commented Feb 24, 2022 at 13:32

10 Answers

532
buckets = [0] * 100

Careful: this technique doesn't generalize to multidimensional arrays or lists of lists. Multiplying a list of lists repeats references to the same inner list, which leads to the "List of lists changes reflected across sublists unexpectedly" problem.
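A minimal sketch of that pitfall, written for Python 3. The names `shared` and `independent` are illustrative and not from the original answer. Multiplying the outer list copies references to one inner list, so a write to one row shows up in every row:

```python
# Outer multiplication repeats a reference to the SAME inner list.
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)  # [[1, 0, 0], [1, 0, 0]]

# A list comprehension builds a fresh inner list for each row.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)  # [[1, 0, 0], [0, 0, 0]]

# Every row of `shared` is one object; rows of `independent` are distinct.
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```

The flat `[0] * 100` form is safe because integers are immutable: assigning to `buckets[i]` rebinds that slot rather than mutating a shared object.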


156

Just for completeness: to declare a multidimensional list of zeros in Python, you have to use a list comprehension like this:

buckets = [[0 for col in range(5)] for row in range(10)]

to avoid reference sharing between the rows.

This looks more clumsy than chester1000's code, but is essential if the values are supposed to be changed later. See the Python FAQ for more details.

1 Comment

Yup, you're right, unless for some strange reason you want to operate on n copies of the same array :)
31

You can multiply a list by an integer n to repeat the list n times:

buckets = [0] * 100
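As a rough illustration of how such a pre-sized list serves the histogram use case from the question, here is a small Python 3 sketch. The sample data, the bucket count, and the bucket width are all assumptions made up for this example:

```python
# Hypothetical sample data: integer scores in the range 0..99.
samples = [3, 17, 17, 42, 99, 3, 3, 58]

num_buckets = 10    # assumed number of buckets
bucket_width = 10   # each bucket covers 10 consecutive values

# Pre-size the list with zeros, then count in place.
buckets = [0] * num_buckets
for value in samples:
    buckets[value // bucket_width] += 1

print(buckets)  # [3, 2, 0, 0, 1, 1, 0, 0, 0, 1]
```

Because the list already has its final length, each sample can be counted with a plain index assignment and no `append` calls.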


26

Use this:

bucket = [None] * 100
for i in range(100):
    bucket[i] = [None] * 100

OR

w, h = 100, 100
bucket = [[None] * w for i in range(h)]

Both produce a 100x100 list of lists in which every cell is None and every row is an independent list.

2 Comments

What about bucket = [[0] * w] * h?
@RobinHood With your idea, bucket[0][x] would always be the same as e.g. bucket[1][x], even after changing bucket[0][x], because every row refers to the same list.
21

use numpy

import numpy
zarray = numpy.zeros(100)

And then use NumPy's histogram function (numpy.histogram).

1 Comment

Sorry, but numpy.zeros was explicitly excluded.
8

The question says "How to declare array of zeros ..." but then the sample code references the Python list:

buckets = []   # this is a list

However, if someone is actually wanting to initialize an array, I suggest:

from array import array

my_arr = array('I', [0] * count)

The Python purist might claim this is not pythonic and suggest:

my_arr = array('I', (0 for i in range(count)))

The pythonic version is very slow and when you have a few hundred arrays to be initialized with thousands of values, the difference is quite noticeable.
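One further option, not part of the original answer and not benchmarked here: `array` objects support repetition with `*` just like lists, so a one-element array can be repeated directly without first building a temporary list. A Python 3 sketch, with `count` set to an arbitrary example size:

```python
from array import array

count = 28000  # arbitrary example size

# Three ways to build the same zero-filled unsigned-int array.
from_list = array('I', [0] * count)               # via a temporary list
from_gen = array('I', (0 for _ in range(count)))  # via a generator
repeated = array('I', [0]) * count                # repeat a 1-element array

print(from_list == from_gen == repeated)  # True
print(len(repeated))                      # 28000
```

All three produce equal arrays, so the choice comes down to measured speed and memory use on the interpreter in question.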

2 Comments

Hi, and why isn't the array('I', [0] * count) slow? I'd assume it will first create a full list and based on that full list then create the array, which sounds... terrible. I would guess that laziness would actually be quite beneficial in your last code snippet and make it run faster?
I can't attest to the why ... I can modify my code based on performance results. I didn't include results already confirmed by consensus.
3

The simplest solution would be

"\x00" * size # for a buffer of binary zeros
[0] * size # for a list of integer zeros

In general, you should use more Pythonic code such as a list comprehension (in your example: [0 for unused in xrange(100)]), or string.join for buffers.

2 Comments

I agree that the list comprehension looks more Pythonic. However, I timed it, and found that it's about 10x slower than the multiplication syntax. I know, something something preoptimization evil.
I am creating an array('I') and was using (0 for i in range(count)) to fill it ... and it is very slow with 28000 items in the array. The multiplication syntax is much faster. If 'pythonic' equates to slow, then it's out with the 'pythonic' and in with fast.
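The speed difference described in these comments can be checked locally with the standard `timeit` module. This is only a sketch for Python 3: the size and repetition count are arbitrary choices, and the measured ratio will vary by machine and interpreter version, so no expected numbers are shown.

```python
import timeit

size = 100  # matches the bucket count in the question

# Time list multiplication against the equivalent list comprehension.
multiply_time = timeit.timeit(
    "[0] * size", globals={"size": size}, number=10000
)
comprehension_time = timeit.timeit(
    "[0 for _ in range(size)]", globals={"size": size}, number=10000
)

print("multiplication: %.4f s" % multiply_time)
print("comprehension:  %.4f s" % comprehension_time)
```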
0

Depending on what you're actually going to do with the data after it's collected, collections.defaultdict(int) might be useful.
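A minimal Python 3 sketch of that approach, using made-up sample data. With `defaultdict(int)`, a missing key starts at 0, so no buckets need to be created in advance:

```python
from collections import defaultdict

# Hypothetical sample data; any hashable values work as keys.
samples = [3, 17, 17, 42, 99, 3, 3, 58]

counts = defaultdict(int)  # missing keys default to int() == 0
for value in samples:
    counts[value] += 1

print(dict(counts))  # {3: 3, 17: 2, 42: 1, 99: 1, 58: 1}
```

This suits sparse or open-ended data, where the range of values is not known ahead of time. `collections.Counter(samples)` yields the same counts in a single call.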


-1

Well I would like to help you by posting a sample program and its output

Program :-

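t = int(input(""))

x = [None] * t
# Build each row separately; [[None] * t] * t would repeat one shared row.
y = [[None] * t for _ in range(t)]

for i in range(1, t + 1):
    x[i - 1] = i
    for j in range(1, t + 1):
        y[i - 1][j - 1] = j

print(x)
print(y)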

Output :-

2

[1, 2]

[[1, 2], [1, 2]]

I hope this clears up the basics of how these lists are created. To initialize them with another specific value, such as 0, you can write:

x=[0]*10

Hope it helps..!! ;)


-4

If you need more columns:

buckets = [[0., 0., 0., 0., 0.] for x in range(10)]  # 10 rows as an example; range(0) would give an empty list
