1

I have a list like this:

["<name:john student male age=23 subject=\computer\sience_{20092973}>",
"<name:Ahn professor female age=61 subject=\computer\math_{20092931}>"]

I want to look up the role (for example, student) using the ID in braces, such as {20092973} or {20092931}.

So I want to split each string and extract that value.

Expected result 1 (input is {20092973}):

"student"

Expected result 2 (input is {20092931}):

"professor"

I have already searched but couldn't find anything, sorry.

How can I do this?

7
  • Where did you get this list? The format is presumably defined somewhere; if that's available, it's always better to use it than to guess at it. Commented May 4, 2015 at 8:14
  • I get this list from Scapy. I wrote a function that modifies Scapy. Commented May 4, 2015 at 8:16
  • The original list is ['<NetworkInterface: eth6 11.4.9.22 24:f5:aa:e4:fb:2f pcap_name=eth0 win_name=\\Device\\NPF_{CDC97813-CC28-4260-BA1E-F0CE3081DEC7}>'] Commented May 4, 2015 at 8:16
  • I want to get eth6 using {CDC97813-CC28-4260-BA1E-F0CE3081DEC7} Commented May 4, 2015 at 8:17
  • Scapy doesn't give you a string, it gives you an object that has attributes. If you save that by turning it into a string, then you have to parse it again, which is a pain. Why not just save the values you want in the first place? Commented May 4, 2015 at 8:22

3 Answers

5

I don't think you should be doing this in the first place. Unlike your toy example, your real problem doesn't involve a string in some clunky format; it involves a Scapy NetworkInterface object. Which has attributes that you can just access directly. You only have to parse it because for some reason you stored its string representation. Just don't do that; store the attributes you actually want when you have them as attributes.

The NetworkInterface object isn't described in the documentation (because it's an implementation detail of the Windows-specific code), but you can interactively inspect it like any other class in Python (e.g., dir(ni) will show you all the attributes), or just look at the source. The values you want are name and win_name. So, instead of print ni, just do something like print '%s,%s' % (ni.name, ni.win_name). Then, parsing the results in some other program will be trivial, instead of a pain in the neck.

Or, better, if you're actually using this in Scapy itself, just make the dict directly out of {ni.win_name: ni.name for ni in nis}. (Or, if you're running Scapy against Python 2.5 or something, dict((ni.win_name, ni.name) for ni in nis).)
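To make the dict idea concrete without needing Scapy installed, here is a minimal sketch that uses a hypothetical stand-in object. `FakeInterface` is not a Scapy class; it only models the two attributes mentioned above (`name` and `win_name`), and the sample values are copied from the string quoted in the comments under the question.

```python
from collections import namedtuple

# Hypothetical stand-in for Scapy's NetworkInterface; only the two
# attributes discussed in this answer (name, win_name) are modelled.
FakeInterface = namedtuple("FakeInterface", ["name", "win_name"])

WIN_NAME = r"\Device\NPF_{CDC97813-CC28-4260-BA1E-F0CE3081DEC7}"
nis = [FakeInterface(name="eth6", win_name=WIN_NAME)]

# Build the lookup directly from attributes; no string parsing needed.
by_win_name = {ni.win_name: ni.name for ni in nis}
print(by_win_name[WIN_NAME])
```

With real Scapy objects the comprehension would stay the same, assuming the attributes are named as described above.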


But to answer the question as you asked it (maybe you already captured all the data and it's too late to capture new data, so now we're stuck working around your earlier mistake…), there are three steps to this: (1) Figure out how to parse one of these strings into its component parts. (2) Do that in a loop to build a dict mapping the numbers to the names. (3) Just use the dict for your lookups.

For parsing, I'd use a regular expression. For example:

<name:\S+\s(\S+).*?\{(\d+)\}>
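If the pattern looks cryptic, the same expression can be written with `re.VERBOSE`, which lets each piece carry a comment. This is only an annotated restatement of the pattern above; the name `PATTERN` and the sample string are illustrative.

```python
import re

PATTERN = re.compile(r"""
    <name:\S+   # literal <name: followed by the name (a run of non-spaces)
    \s          # one whitespace character
    (\S+)       # group 1: the next word, i.e. the role
    .*?         # anything else, as little as possible
    \{(\d+)\}   # group 2: the digits between the braces
    >           # closing angle bracket
""", re.VERBOSE)

sample = r"<name:john student male age=23 subject=\computer\sience_{20092973}>"
m = PATTERN.match(sample)
print(m.group(1), m.group(2))
```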

Now, let's build the dict:

import re

# things is the list of strings from the question
r = re.compile(r'<name:\S+\s(\S+).*?\{(\d+)\}>')
matches = (r.match(thing) for thing in things)
d = {match.group(2): match.group(1) for match in matches if match}

And now:

>>> d['20092973']
'student'
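The pattern above is written for the toy format in the question. For the real string quoted in the comments under the question, a sketch along these lines might work. The pattern and the variable names are assumptions of mine: it assumes the interface name contains no spaces and that the ID in braces consists only of hex digits and hyphens.

```python
import re

# Assumed pattern for the real format: capture the interface name and
# the GUID between the braces at the end of the string.
ni_pattern = re.compile(r'<NetworkInterface: (\S+).*?\{([0-9A-Fa-f-]+)\}>')

things = ['<NetworkInterface: eth6 11.4.9.22 24:f5:aa:e4:fb:2f pcap_name=eth0 '
          'win_name=\\Device\\NPF_{CDC97813-CC28-4260-BA1E-F0CE3081DEC7}>']

matches = (ni_pattern.match(thing) for thing in things)
# Skip strings that do not match instead of failing on None.
names = {m.group(2): m.group(1) for m in matches if m}
print(names['CDC97813-CC28-4260-BA1E-F0CE3081DEC7'])
```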

3 Comments

d = {match.group(2): match.group(1) for match in matches} gives me an invalid syntax error, sorry.
@user3683061: There is no invalid syntax error. At least in Python 2.7, which is what you claim you're using.
@user3683061: Also, note that the pattern I gave you is for the toy format you asked about, not the real format you have.
2

Code:

def grepRole(role, lines):
    # Second whitespace-separated field of the first line containing role
    return [line.split()[1] for line in lines if role in line][0]

l = [r"<name:john student male age=23 subject=\computer\sience_{20092973}>",
     r"<name:Ahn professor female age=61 subject=\computer\math_{20092931}>"]
print(grepRole("{20092973}", l))
print(grepRole("{20092931}", l))

Output:

student
professor
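One caveat with indexing `[0]` on the list is that it raises IndexError when no line contains the ID. A possible variant, sketched here with a hypothetical `grep_role` helper and `default` parameter, uses `next()` so that a missing ID returns a fallback value instead:

```python
def grep_role(code, lines, default=None):
    """Return the second field of the first line containing code, else default."""
    return next((line.split()[1] for line in lines if code in line), default)

lines = [
    r"<name:john student male age=23 subject=\computer\sience_{20092973}>",
    r"<name:Ahn professor female age=61 subject=\computer\math_{20092931}>",
]
print(grep_role("{20092973}", lines))
print(grep_role("{00000000}", lines))  # no such ID, so the default is returned
```

It also stops at the first matching line rather than scanning the whole list.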

2
current_list = [r"<name:john student male age=23 subject=\computer\sience_{20092973}>",
                r"<name:Ahn professor female age=61 subject=\computer\math_{20092931}>"]

def get_identity(code):
    print([row.split(' ')[1] for row in current_list if code in row][0])


get_identity("{20092973}")

Regular expressions are good, but for me, a rookie, regular expressions are another big problem...
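If many lookups are needed, the same no-regex approach can build a dict once up front. This is a rough sketch, assuming every row has the role as its second word and ends with an ID in braces; `build_lookup` is a hypothetical helper name.

```python
def build_lookup(rows):
    """Map the text inside the last {...} of each row to the row's second word."""
    lookup = {}
    for row in rows:
        role = row.split()[1]          # second whitespace-separated field
        start = row.rindex("{") + 1    # just after the last opening brace
        end = row.rindex("}")          # position of the last closing brace
        lookup[row[start:end]] = role
    return lookup

rows = [
    r"<name:john student male age=23 subject=\computer\sience_{20092973}>",
    r"<name:Ahn professor female age=61 subject=\computer\math_{20092931}>",
]
lookup = build_lookup(rows)
print(lookup["20092973"])
```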
