5

So my problem is this, I have a file that looks like this:

[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1

This would of course translate to

' This is an example file!'

I am looking for a way to parse the original content into the end content, so that a [BACKSPACE] will delete the last character(spaces included) and multiple backspaces will delete multiple characters. The [SHIFT] doesnt really matter as much to me. Thanks for all the help!

1
  • Are [BACKSPACE] and [SHIFT] the only markups that you need to worry about? Commented Feb 3, 2011 at 3:12

5 Answers 5

1

Here's one way, but it feels hackish. There's probably a better way.

def process_backspaces(input, token='[BACKSPACE]'):
    """Delete character before an occurence of "token" in a string."""
    output = ''
    for item in (input+' ').split(token):
        output += item
        output = output[:-1]
    return output

def process_shifts(input, token='[SHIFT]'):
    """Replace characters after an occurence of "token" with their uppecase 
    equivalent. (Doesn't turn "1" into "!" or "2" into "@", however!)."""
    output = ''
    for item in (' '+input).split(token):
        output += item[0].upper() + item[1:]
    return output

test_string = '[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1'
print process_backspaces(process_shifts(test_string))
Sign up to request clarification or add additional context in comments.

Comments

1

If you don't care about the shifts, just strip them, load

(defun apply-bspace ()
  (interactive)
  (let ((result (search-forward "[BACKSPACE]")))
    (backward-delete-char 12)
    (when result (apply-bspace))))

and hit M-x apply-bspace while viewing your file. It's Elisp, not python, but it fits your initial requirement of "something I can download for free to a PC".

Edit: Shift is trickier if you want to apply it to numbers too (so that [SHIFT]2 => @, [SHIFT]3 => #, etc). The naive way that works on letters is

(defun apply-shift ()
  (interactive)
  (let ((result (search-forward "[SHIFT]")))
    (backward-delete-char 7)
    (upcase-region (point) (+ 1 (point)))
    (when result (apply-shift))))

2 Comments

+1 for an Elisp answer! It's (not too suprisingly, I guess) quite good at this sort of thing... I'm a vim person, personally, but things like this sometimes pull me towards emacs.
@Joe Kington - Hehe. To be truthful, this is the sort of thing I'd handle with a keyboard macro and maybe an alist unless there were multiple, large files that needed parsing. It's just that a function is easier to share and explain.
1

This does exactly what you want:

def shift(s):
    LOWER = '`1234567890-=[];\'\,./'
    UPPER = '~!@#$%^&*()_+{}:"|<>?'

    if s.isalpha():
        return s.upper()
    else:
        return UPPER[LOWER.index(s)]

def parse(input):
    input = input.split("[BACKSPACE]")
    answer = ''
    i = 0
    while i<len(input):
        s = input[i]
        if not s:
            pass
        elif i+1<len(input) and not input[i+1]:
            s = s[:-1]
        else:
            answer += s
            i += 1
            continue
        answer += s[:-1]
        i += 1

    return ''.join(shift(i[0])+i[1:] for i in answer.split("[SHIFT]") if i)

>>> print parse("[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1")
>>> This is an example file!

2 Comments

Oops, I just spotted a bug... sorry. Fixing it now
The debug is complete and the result is exactly what you want
0

It seems that you could use a regular expression to search for (something)[BACKSPACE] and replace it with nothing...

re.sub('.?\[BACKSPACE\]', '', YourString.replace('[SHIFT]', ''))

Not sure what you meant by "multiple spaces delete multiple characters".

4 Comments

-1 How will this work for "blah[BACKSPACE][BACKSPACE][BACKSPACE]arf"?
But it needs to delete one space BEFORE the backspace as well as the '[BACKSPACE]' itslef
That's my point -- gahooa's solution won't work for my blah-barf example.
Yeah, i just saw what you were saying, so far the only way I can think of to do it would combine python with autoit or another manual macro/automation service, but the results would be tedious at best, and possibly not 100% functioning
0

You need to read the input, extract the tokens, recognize them, and give a meaning to them.

This is how I would do it:

# -*- coding: utf-8 -*-

import re

upper_value = {
    1: '!', 2:'"',
}

tokenizer = re.compile(r'(\[.*?\]|.)')
origin = "[SHIFT]this isrd[BACKSPACE][BACKSPACE] an example file[SHIFT]1"
result = ""

shift = False

for token in tokenizer.findall(origin):
    if not token.startswith("["):
        if(shift):
            shift = False
            try:
                token = upper_value[int(token)]
            except ValueError:
                token = token.upper()

        result = result + token
    else:
        if(token == "[SHIFT]"):
            shift = True
        elif(token == "[BACKSPACE]"):
            result = result[0:-1]

It's not the fastest, neither the elegant solution, but I think it's a good start.

Hope it helps :-)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.