2

I am writing a script to validate that a deb package installs to a specific folder. I am fairly new to Python. I had a script that used the python-apt module, which returned a list of file paths for all files in the package. Due to some dependency issues I can no longer use python-apt, so instead I am trying to call dpkg to collect the information and parse it into a list of file paths. Below is what I am using to get the list of items returned by the dpkg command. I need to discard everything except the text to the right of the last space. What would be the most efficient way to parse this?

self.lists = commands.getoutput("dpkg -c "+deb).split('\n')

The result is this list:

list: ['drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:24 ./', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/usr/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:15 ./opt/usr/apps/', '-r--r--r-- ejohnson/ejohnson 179491 2012-03-06 15:15 ./opt/usr/apps/xbean-spring-2.8.jar', '-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class', '-r--r--r-- ejohnson/ejohnson 1901653 2012-03-06 15:15 ./opt/usr/apps/spring-1.2.6.jar']

I want to reformat the list so that each item is the text after the last space, for example ['./', './opt/', './opt/usr/', './opt/usr/apps/', ...]

Thanks for looking

2
  • self.lists = [line.split(' ')[-1] for line in commands.getoutput("dpkg -c "+deb).split('\n')]. Can be optimized though. Commented Mar 20, 2012 at 12:39
  • I suggest using ... [line.split()[-1] .... Commented Mar 20, 2012 at 12:50

3 Answers

6

Simple: put your list in a variable named l, and this code should work for you.

[el.split()[-1] for el in l]
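
As a minimal sketch of how this could fit the question's use case, the snippet below wraps the comprehension in two hypothetical helpers, `paths_from_listing` and `list_deb_paths`; neither name comes from the original script. It assumes Python 3.7 or later (the question's `commands` module exists only in Python 2) and that no path in the package contains whitespace. Symlink entries, which the listing may show as `link -> target`, would also need extra handling.

```python
import subprocess


def paths_from_listing(listing):
    """Return the last whitespace-separated field of each non-empty line."""
    return [line.split()[-1] for line in listing.splitlines() if line.strip()]


def list_deb_paths(deb):
    """Run `dpkg -c` on a .deb file and return the paths it lists.

    Hypothetical helper: it needs dpkg on the PATH and is not called below.
    """
    # check_output raises CalledProcessError if dpkg exits with an error.
    listing = subprocess.check_output(["dpkg", "-c", deb], text=True)
    return paths_from_listing(listing)


# Two lines taken from the question's sample output.
sample = (
    "drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:24 ./\n"
    "-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class\n"
)
print(paths_from_listing(sample))  # ['./', './opt/usr/apps/Hello.class']
```

Passing the command as a list avoids building a shell string from the file name, which is one reason to prefer it over concatenating "dpkg -c " and the path.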

3 Comments

Don't suggest that anyone ever create a variable with the name of a built-in unless the intention is explicitly to shadow that built-in.
Looks like the OP already used that name. Don't hammer zeroos for that.
@Marcin, good point actually, just changed the name of this variable.
2

Turn each string into a list using split. Take the last element of each list with a negative index ([-1]).

For extra credit, do it in a one line list comprehension.
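
The two steps described in this answer could look roughly like the following. This is only an illustration, assuming Python 3; the variable names `lines`, `fields`, `last`, and `paths` are made up for the example, and the data is two entries from the question's list.

```python
lines = [
    "drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/",
    "-r--r--r-- ejohnson/ejohnson 179491 2012-03-06 15:15 ./opt/usr/apps/xbean-spring-2.8.jar",
]

# Step 1: split one string into a list of whitespace-separated fields.
fields = lines[0].split()

# Step 2: a negative index counts from the end, so -1 is the last field.
last = fields[-1]
print(last)  # ./opt/

# Both steps combined in a one-line list comprehension.
paths = [line.split()[-1] for line in lines]
print(paths)
```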


2

str.rpartition may be more efficient than str.split:

[x.rpartition(" ")[2] for x in your_list]

For the sample here, rpartition is more than twice as fast as split (2.55 usec vs. 5.2 usec per loop), with rsplit in between:

$ python -m timeit -s "L=['drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:24 ./', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/usr/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:15 ./opt/usr/apps/', '-r--r--r-- ejohnson/ejohnson 179491 2012-03-06 15:15 ./opt/usr/apps/xbean-spring-2.8.jar', '-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class', '-r--r--r-- ejohnson/ejohnson 1901653 2012-03-06 15:15 ./opt/usr/apps/spring-1.2.6.jar']" \
> "[x.split()[-1] for x in L]"
100000 loops, best of 3: 5.2 usec per loop

$ python -m timeit -s "L=['drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:24 ./', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/usr/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:15 ./opt/usr/apps/', '-r--r--r-- ejohnson/ejohnson 179491 2012-03-06 15:15 ./opt/usr/apps/xbean-spring-2.8.jar', '-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class', '-r--r--r-- ejohnson/ejohnson 1901653 2012-03-06 15:15 ./opt/usr/apps/spring-1.2.6.jar']" \
> "[x.rpartition(' ')[2] for x in L]"
100000 loops, best of 3: 2.55 usec per loop

$ python -m timeit -s "L=['drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:24 ./', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:14 ./opt/usr/', 'drwxr-xr-x ejohnson/ejohnson 0 2012-03-06 15:15 ./opt/usr/apps/', '-r--r--r-- ejohnson/ejohnson 179491 2012-03-06 15:15 ./opt/usr/apps/xbean-spring-2.8.jar', '-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class', '-r--r--r-- ejohnson/ejohnson 1901653 2012-03-06 15:15 ./opt/usr/apps/spring-1.2.6.jar']" \
> "[x.rsplit(' ',1)[1] for x in L]"
100000 loops, best of 3: 3.5 usec per loop

4 Comments

x.rsplit(' ',1)[1] was fastest for me.
@Steven: You could also use rsplit(None, 1)[1] to keep the any-amount-of-whitespace semantics.
@StevenRumbalski, it might depend on the CPU and the version of Python, among other things. I ran it with Python 2.7.2 on a Core 2 CPU.
rpartition is also fastest with Python 2 and Python 3 on a Pentium G850.
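
The variants discussed here agree on the sample data but can differ on edge cases. The sketch below, assuming Python 3, shows this with one line from the question plus two made-up inputs (`padded` and `bare`) that do not appear in the original output.

```python
line = "-rw-r--r-- ejohnson/ejohnson    518 2012-03-06 15:15 ./opt/usr/apps/Hello.class"

# On well-formed input all three approaches return the same path.
by_split = line.split()[-1]
by_rpartition = line.rpartition(" ")[2]
by_rsplit = line.rsplit(None, 1)[1]
print(by_split == by_rpartition == by_rsplit)  # True

# Hypothetical input with a trailing space: splitting on a literal " "
# yields an empty string, while whitespace splitting still finds the path.
padded = line + " "
padded_rpartition = padded.rpartition(" ")[2]  # ''
padded_rsplit = padded.rsplit(None, 1)[1]      # './opt/usr/apps/Hello.class'

# Hypothetical input with no space at all: rpartition returns the whole
# string, whereas rsplit(" ", 1)[1] raises IndexError.
bare = "./opt/"
bare_rpartition = bare.rpartition(" ")[2]      # './opt/'
try:
    bare.rsplit(" ", 1)[1]
    bare_rsplit_failed = False
except IndexError:
    bare_rsplit_failed = True
```

Which behavior is preferable depends on how strictly the input format can be trusted, so the fastest variant is not automatically the right one.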
