1

I have a string:

V51M229D180728T132714_ACCEPT_EC_NC

This needs to be split into:

String 1: V51 (Can be variable but always ends before M)
String 2: M229 (Can be variable but always ends before D)
String 3: D180728 (Date in YYMMDD format)
String 4: 132714 (Timestamp in HHMMSS format)
String 5: ACCEPT (Occurs between "_")
String 6: EC (Occurs between "_")
String 7: NC (Occurs after the last "_")

I am new to Python and hoping to get some help with this.

Thanks.

5 Answers

1

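Use the re module:

import re
a = 'V51M229D180728T132714_ACCEPT_EC_NC'
re.search(r'(\w+)(M\w+)(D\d+)(T\d+)_(\w+)_(\w+)_(\w+)', a).groups()

You will get:

('V51', 'M229', 'D180728', 'T132714', 'ACCEPT', 'EC', 'NC')

Note that the fourth group keeps the leading T; write T(\d+) instead of (T\d+) if only the digits are wanted.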

1

If your data follows a fixed-width pattern, plain string slicing and list indexing work.

  aa = "V51M229D180728T132714_ACCEPT_EC_NC"
  a = aa.split("_")      # ['V51M229D180728T132714', 'ACCEPT', 'EC', 'NC']
  str1 = a[0][0:3]       # 'V51'
  str2 = a[0][3:7]       # 'M229'
  str3 = a[0][7:14]      # 'D180728'
  str4 = a[0][15:21]     # '132714' (index 14 is the 'T')
  str5 = a[1]
  str6 = a[2]
  str7 = a[3]
  print(str1, str2, str3, str4, str5, str6, str7)

Output

V51 M229 D180728 132714 ACCEPT EC NC

0

Use split(). From the docs:

str.split(sep=None, maxsplit=-1)

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

So you can use split('M', 1) to get the list ['V51', '229D180728T132714_ACCEPT_EC_NC'], then split the second entry of that list on the 'D' delimiter to get ['229', '180728T132714_ACCEPT_EC_NC']...

Hope you get the idea.
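
The chain of splits described above could be sketched roughly as follows. The function name split_code and the variable names are made up for illustration, and the sketch assumes the first field contains no 'M' and the second no 'D', as in the example string.

```python
def split_code(s):
    """Split a code like 'V51M229D180728T132714_ACCEPT_EC_NC' into seven parts."""
    # Separate the packed head from the underscore-delimited tail first.
    head, *tail = s.split('_')           # 'V51M229D180728T132714', ['ACCEPT', 'EC', 'NC']
    first, rest = head.split('M', 1)     # 'V51', '229D180728T132714'
    second, rest = rest.split('D', 1)    # '229', '180728T132714'
    date, time = rest.split('T', 1)      # '180728', '132714'
    # split() drops the delimiter, so the leading letters are put back by hand.
    return [first, 'M' + second, 'D' + date, time, *tail]


print(split_code('V51M229D180728T132714_ACCEPT_EC_NC'))
# ['V51', 'M229', 'D180728', '132714', 'ACCEPT', 'EC', 'NC']
```

Passing maxsplit=1 each time keeps a later occurrence of the same letter (for example in the trailing words) from producing extra pieces.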

0

As mxmt said, use regular expressions. Here is another equivalent regex, which might be a little easier to read:

import re

s = 'V51M229D180728T132714_ACCEPT_EC_NC'

pattern = re.compile(r'''
    ^        # beginning of string
    (V\w+)   # first pattern, starting with V
    (M\w+)   # second pattern, starting with M
    (D\d{6}) # third string pattern, six digits starting with D
    T(\d{6}) # time, six digits after T
    _(\w+)
    _(\w+)
    _(\w+)   # final three patterns
    $        # end of string
    ''', re.VERBOSE
)

pattern.match(s).groups()
# ('V51', 'M229', 'D180728', '132714', 'ACCEPT', 'EC', 'NC')
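
If tuple positions are hard to keep track of, named groups are one option. The following is only a sketch built on the same sub-patterns; the group names (first, second, date, time, status, tag1, tag2) and the helper parse_code are invented for illustration and are not part of the question.

```python
import re

PATTERN = re.compile(r'''
    ^
    (?P<first>V\w+)      # starts with V
    (?P<second>M\w+)     # starts with M
    (?P<date>D\d{6})     # D followed by six digits
    T(?P<time>\d{6})     # six digits after T (T itself is not captured)
    _(?P<status>\w+)
    _(?P<tag1>\w+)
    _(?P<tag2>\w+)
    $
    ''', re.VERBOSE)


def parse_code(s):
    """Return a dict of the named fields, or None if s does not match."""
    m = PATTERN.match(s)
    return m.groupdict() if m else None


fields = parse_code('V51M229D180728T132714_ACCEPT_EC_NC')
print(fields['date'], fields['time'])   # D180728 132714
```

Returning None for a non-matching string avoids the AttributeError that calling .groups() directly on a failed match would raise.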

0

You probably want to use a regex with matching groups. See the re module.

For example,

>>> import re
>>> mystr = 'V51M229D180728T132714_ACCEPT_EC_NC'
>>> re.match(r'(.*?)(M.*?)(D.*?)T(.*?)_(.*?)_(.*?)_(.*?)$', mystr).groups()
('V51', 'M229', 'D180728', '132714', 'ACCEPT', 'EC', 'NC')

In the pattern, the () indicate a group, and .*? matches the minimal number of characters needed to make the pattern fit. The trailing $ forces the last group to extend to the end of the string.
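
One detail worth checking with lazy quantifiers: a lazy group at the very end of a pattern is free to match nothing. The short sketch below compares the pattern without an end anchor against the same pattern ending in $ (the variable names are illustrative).

```python
import re

mystr = 'V51M229D180728T132714_ACCEPT_EC_NC'
lazy = r'(.*?)(M.*?)(D.*?)T(.*?)_(.*?)_(.*?)_(.*?)'

# Without an anchor, nothing forces the final .*? to consume any characters.
unanchored = re.match(lazy, mystr).groups()
# With $, the final group has to grow until the end of the string is reached.
anchored = re.match(lazy + '$', mystr).groups()

print(repr(unanchored[-1]))   # ''
print(repr(anchored[-1]))     # 'NC'
```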
