Break string into list of characters in Python [duplicate]

Question

Essentially, I want to read a line of text from a file, split it into its individual characters, and collect those characters in a list. Doing this for every line would give a list of lists.

At the moment, I've tried this:

fO = open(filename, 'r')
fL = fO.readlines()

That's all I've got. I don't quite know how to extract the single characters and assign them to a new list.

The line I get from the file will be something like:

fL = 'FHFF HHXH XXXX HFHX'

I want to turn it into this list, with each single character on its own:

['F', 'H', 'F', 'F', 'H', ...]

cs95 · Accepted Answer · 2019-06-08 07:26:00Z

174

You can do this using list:

new_list = list(fL)

Be aware that any spaces in the line will be included in this list, as will the trailing newline if the line was read from a file.
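
A minimal sketch of that caveat, assuming the sample line from the question with a trailing newline added (the names all_chars and no_spaces are illustrative and not part of the original answer). It shows that list() keeps whitespace, and one possible way to drop it first:

```python
# Sample line from the question, with the newline that reading from a file would leave on it.
fL = 'FHFF HHXH XXXX HFHX\n'

# list() keeps every character, whitespace included.
all_chars = list(fL)
has_space = ' ' in all_chars      # True
has_newline = '\n' in all_chars   # True

# One possible way to drop the newline and the spaces before splitting.
no_spaces = list(fL.rstrip('\n').replace(' ', ''))

print(len(all_chars))   # 20
print(len(no_spaces))   # 16
```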

edited Jun 8, 2019 at 7:26

cs95

answered Mar 23, 2012 at 2:34

Elliot Bonneville

2 Comments

Ali Shah Ahmed Over a year ago

with utf-8 characters it doesn't work as expected. For string "zyć", i was expecting a list of 3 characters, instead i got this list: ['z', 'y', '\xc4', '\x87']. Could you please guide on what could be done to resolve this issue. Thanks

Ali Shah Ahmed Over a year ago

i've got my answer, i forgot to add 'u' before my string, so it was not getting treated as unicode. thanks.
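
The output in the comments above comes from a Python 2 byte string. As a rough sketch of how the same input behaves in Python 3, where str is already Unicode (the variable names are illustrative): splitting the text gives one item per character, while splitting its UTF-8 encoding gives one integer per byte.

```python
text = "zyć"

# A Python 3 str is a sequence of Unicode code points: three items.
chars = list(text)

# The UTF-8 encoding of the same text is four bytes, because 'ć' takes two.
raw = list(text.encode("utf-8"))

print(chars)  # ['z', 'y', 'ć']
print(raw)    # [122, 121, 196, 135]
```

The last two integers are 0xc4 and 0x87, the same two bytes that appear in the comment.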

Oscar · Accepted Answer · 2016-06-11 16:12:09Z

64

It seems I'm a bit late, but...

a = 'hello'
print(list(a))
# ['h', 'e', 'l', 'l', 'o']

answered Jun 11, 2016 at 16:12

Oscar

koblas · Accepted Answer · 2012-03-23 02:45:54Z

31

Strings are iterable (just like a list).

I'm interpreting that you really want something like:

chars = []
with open(filename, 'r') as fd:
    for line in fd:
        for c in line:
            chars.append(c)

or

chars = []
with open(filename, 'r') as fd:
    for line in fd:
        chars.extend(line)

or

chars = []
with open(filename, 'r') as fd:
    list(map(chars.extend, fd))  # list() forces the lazy Python 3 map to run

chars would contain all of the characters in the file.
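
Since the question also mentions a list of lists, here is a self-contained sketch of that variant. It assumes a small temporary file with made-up contents, and the names rows and chars are illustrative. Unlike the loops above, it strips the line endings:

```python
import os
import tempfile

# Write a small sample file so the sketch needs no existing file (contents are made up).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('FHFF\nHHXH\n')
    filename = tmp.name

try:
    with open(filename, 'r') as fd:
        # One inner list per line; rstrip('\n') drops the line ending.
        rows = [list(line.rstrip('\n')) for line in fd]
finally:
    os.remove(filename)

# Flatten the per-line lists into a single list of characters.
chars = [c for row in rows for c in row]

print(rows)   # [['F', 'H', 'F', 'F'], ['H', 'H', 'X', 'H']]
print(chars)  # ['F', 'H', 'F', 'F', 'H', 'H', 'X', 'H']
```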

edited Mar 23, 2012 at 2:45

answered Mar 23, 2012 at 2:37

koblas

2 Comments

agf Over a year ago

@FlexedCookie itertools.chain is really the simplest for this -- chars = list(itertools.chain.from_iterable(open(filename, 'r'))).

user2489252 Over a year ago

The code above does not account for the whitespaces, i.e., " "

cs95 · Accepted Answer · 2020-01-10 08:29:28Z

22

python >= 3.5

Version 3.5 onwards allows the use of PEP 448 - Additional Unpacking Generalizations:

>>> string = 'hello'
>>> [*string]
['h', 'e', 'l', 'l', 'o']

This is part of the language syntax, so it avoids the name lookup and function call that list() needs, and it is typically faster than calling list:

>>> from timeit import timeit
>>> timeit("list('hello')")
0.3042821969866054
>>> timeit("[*'hello']")
0.1582647830073256
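
To reproduce the comparison as a script rather than in the interpreter, something like the sketch below should work. The number=100_000 setting is an assumption chosen to keep the run short; absolute timings will differ by machine and Python version, so no expected figures are shown.

```python
from timeit import timeit

# Both expressions build the same list.
same_result = list('hello') == [*'hello']

# number= is lowered from timeit's default of 1,000,000 to keep the run short.
t_call = timeit("list('hello')", number=100_000)
t_unpack = timeit("[*'hello']", number=100_000)

print(same_result)
print(f"list(): {t_call:.4f}s, unpacking: {t_unpack:.4f}s")
```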

edited Jan 10, 2020 at 8:29

answered Jun 8, 2019 at 7:28

cs95

Ry- · Accepted Answer · 2019-10-18 05:17:17Z

10

So to add the string hello to a list as individual characters, try this:

newlist = []
newlist[:0] = 'hello'
print(newlist)

['h', 'e', 'l', 'l', 'o']

However, it is easier to do this:

splitlist = list('hello')
print(splitlist)
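
Slice assignment is mainly useful when the list already has contents, because it inserts the characters instead of replacing the list. A small sketch, assuming a made-up starting list:

```python
# Hypothetical starting list, to show that slice assignment inserts rather than replaces.
newlist = ['!']

# Assigning a string to the empty slice at position 0 inserts its characters at the front.
newlist[:0] = 'hi'
front_inserted = list(newlist)

# Assigning to the empty slice at the end appends the characters instead.
newlist[len(newlist):] = 'yo'

print(front_inserted)  # ['h', 'i', '!']
print(newlist)         # ['h', 'i', '!', 'y', 'o']
```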

edited Oct 18, 2019 at 5:17

Ry-♦

answered Jan 14, 2014 at 16:21

Tim

2 Comments

tim Over a year ago

But even easier is: newlist = list('hello')

Tim Over a year ago

@tim Yeah, just noticed I hadn't put that in :)

John La Rooy · Accepted Answer · 2012-03-23 03:04:17Z

7

with open(filename, 'r') as fO:
    lst = list(fO.read())

answered Mar 23, 2012 at 3:04

John La Rooy

user2489252 · Accepted Answer · 2013-07-25 04:33:09Z

5

Or use a list comprehension, which is supposed to be "computationally more efficient" when working with very large files/lists:

with open(filename, 'r') as fd:
    chars = [c for line in fd for c in line if c != " "]

Btw: The answer that was accepted does not account for the whitespaces...
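
Note that filtering on the space character alone still leaves newlines and tabs in the result. A possible variation, sketched here with io.StringIO standing in for the open file (the sample text is made up), uses str.isspace() to drop every kind of whitespace:

```python
import io

# io.StringIO stands in for an open text file so the sketch needs no file on disk.
fd = io.StringIO('FH FF\nX\tX\n')

# str.isspace() is True for spaces, tabs and newlines alike.
chars = [c for line in fd for c in line if not c.isspace()]

print(chars)  # ['F', 'H', 'F', 'F', 'X', 'X']
```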

answered Jul 25, 2013 at 4:33

user2489252

Zhou Shuai-Ming · Accepted Answer · 2015-09-08 06:48:13Z

5

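a = 'hello world'
list(map(lambda x: x, a))

['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']

An easy way is to use the map() function. In Python 3, map() returns an iterator, so wrap it in list() to get the list shown above.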

edited Sep 8, 2015 at 6:48

answered Jul 22, 2015 at 2:55

Zhou Shuai-Ming

hostingutilities.com · Accepted Answer · 2012-03-23 03:44:21Z

In Python many things are iterable, including files and strings. Iterating over a file handle yields the lines of that file one at a time. Iterating over a string yields the characters of that string one at a time.

charsFromFile = []
filePath = r'path\to\your\file.txt' #the r before the string lets us use backslashes

for line in open(filePath):
    for char in line:
        charsFromFile.append(char) 
        #apply code on each character here

or if you want a one liner

#the [0] at the end is the line you want to grab.
#the [0] can be removed to grab all lines
[list(a) for a in list(open('test.py'))][0]

Edit: as agf mentions, you can use itertools.chain.from_iterable.

His method is better, unless you want the ability to specify which lines to grab:

list(itertools.chain.from_iterable(open(filename, 'r')))

This does, however, require familiarity with itertools, and as a result loses some readability.

If you only want to iterate over the chars, and don't care about storing a list, then I would use the nested for loops. This method is also the most readable.
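
A runnable sketch of the itertools.chain.from_iterable approach, with io.StringIO standing in for the open file so that no file on disk is needed (the sample text is made up):

```python
import io
import itertools

# io.StringIO stands in for an open text file; iterating it yields one line at a time.
fake_file = io.StringIO('ab\ncd\n')

# Each line is itself an iterable of characters; chain.from_iterable joins them lazily.
chars = list(itertools.chain.from_iterable(fake_file))

print(chars)  # ['a', 'b', '\n', 'c', 'd', '\n']
```

As with the nested loops, the newline characters are kept.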

ol mighty · Accepted Answer · 2019-03-26 15:14:08Z

0

Because strings are (immutable) sequences, they can be unpacked in the same way as lists:

with open(filename, 'r') as fd:
    multiLine = fd.read()
    *lst, = multiLine

map(lambda x: x, multiLine) may look more efficient, but that is only because it returns a lazy map object instead of a list.

with open(filename, 'r') as fd:
    multiLine = fd.read()
    list(map(lambda x: x, multiLine))

Turning the map object into a list will take longer than the unpacking method.
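
A short sketch of the difference, assuming a small in-memory string in place of the file contents (the names mapped and as_list are illustrative):

```python
multiLine = 'ab\ncd'

# Starred assignment collects every character into a list.
*lst, = multiLine

# map() returns a lazy iterator in Python 3; nothing is computed until it is consumed.
mapped = map(lambda x: x, multiLine)
is_list = isinstance(mapped, list)   # False

# Materializing the map object gives the same list as the unpacking.
as_list = list(mapped)

print(lst)             # ['a', 'b', '\n', 'c', 'd']
print(lst == as_list)  # True
```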

edited Mar 26, 2019 at 15:14

answered Mar 26, 2019 at 12:11

ol mighty
