2

I have to read a file (existing format not under my control) that contains an XML document and encoded data. This file unfortunately includes MQ-related data around it including hex zeros (end of files).

Using Java, how can I read this file and strip or ignore the "garbage" I don't need, so that I can get at the XML and encoded data? I believe an acceptable solution is to just leave out the hex zeros (are there other values that will stop my reading?), since I don't need the MQ information (RFH header) anyway and the counts are meaningless for my purposes.

I have searched a lot and only find really heinous complicated "solutions". There must be a better way...

2
  • Is the pre/post garbage of a known length? Commented Jan 20, 2012 at 21:34
  • I have no documentation of it, so I'd have to say "no" Commented Jan 20, 2012 at 22:17

2 Answers

1

What worked was to pull out the XML documents - Groovy code:

    public static final String REQUEST_XML     = "<Request>";
    public static final String REQUEST_END_XML = "</Request>";

    /**
     * Finds every <Request>...</Request> element in the message and adds
     * an EncodedRequest for each one to the requests collection.
     *
     * @param xmlMessage the raw message, including any surrounding MQ data
     */
    private void extractRequests( String xmlMessage ) {
        int start = xmlMessage.indexOf(REQUEST_XML);
        while( start >= 0 ) {   // each <Request>
            int end = xmlMessage.indexOf(REQUEST_END_XML, start);
            if( end < 0 ) {
                break;          // unterminated <Request>: stop instead of using a bad index
            }
            end += REQUEST_END_XML.length();
            requests.add(new EncodedRequest(xmlMessage.substring(start, end)));
            start = xmlMessage.indexOf(REQUEST_XML, end);
        }
    }

and then decode the base64 portion:

    public String getDecodedContents() {
        if( decodedContents == null ) {
            // getBytes() and new String() both use the platform default charset
            byte[] decoded = Base64.decodeBase64(getEncodedContents().getBytes());
            decodedContents = new String(decoded).replace('\r','\t');
        }
        return decodedContents;
    }
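
Both snippets assume the file contents are already in a String. Here is a minimal sketch of that step, under assumptions the answer does not state: the names `MqFileReader`, `stripNulls`, and `readWithoutNulls` are hypothetical, the file is small enough to read into memory, and the XML uses a single-byte, ASCII-compatible encoding, so decoding as ISO-8859-1 maps each byte to exactly one character. It drops the 0x00 bytes before decoding, which is what the question proposed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class MqFileReader {

    /** Drops every 0x00 byte and decodes the rest as ISO-8859-1 (an assumed encoding). */
    static String stripNulls(byte[] raw) {
        byte[] kept = new byte[raw.length];
        int n = 0;
        for (byte b : raw) {
            if (b != 0) {
                kept[n++] = b;   // keep everything except hex zeros
            }
        }
        return new String(kept, 0, n, StandardCharsets.ISO_8859_1);
    }

    /** Reads the whole file into memory, then strips the zero bytes. */
    static String readWithoutNulls(String path) throws IOException {
        return stripNulls(Files.readAllBytes(Paths.get(path)));
    }
}
```

The resulting String could then be passed to `extractRequests`. Note that `indexOf` and `substring` also work on a String that still contains NUL characters, so the stripping is optional for this approach; it matters mainly if the text is later handed to code that treats a zero as a terminator.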

0

I've hit this issue before (well ... something similar). Have a look at my FilterInputStream for a file filter that you should be able to modify to suit your needs.

Essentially it implements a push-back buffer that chucks away anything you don't want.
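
The linked class is not reproduced here, so the following is only a rough sketch of the general idea and not that implementation. It assumes the only bytes to discard are hex zeros, and the class name `NullStrippingInputStream` is hypothetical. It also skips the push-back buffer, since dropping single bytes does not need lookahead; filtering out multi-byte markers such as an RFH header would.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class NullStrippingInputStream extends FilterInputStream {

    NullStrippingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = super.read();
        } while (b == 0);        // skip zeros; -1 (end of stream) passes through
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n;        // -1 at end of stream
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {
                byte b = buf[off + i];
                if (b != 0) {
                    buf[off + kept++] = b;   // compact non-zero bytes in place
                }
            }
            if (kept > 0) {
                return kept;
            }
            // the whole chunk was zeros, so read the next chunk
        }
    }
}
```

Wrapping a `FileInputStream` in this class and then in an `InputStreamReader` would give a reader that never sees the zero bytes.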
