
I am using COPY to copy a field from a table to a file. The field holds a zipped text file, so I use a binary copy. The file is created; the only problem is that COPY adds a header and a trailer (?) to the file, which I don't need. Can this be changed? Is there a parameter that can make COPY write the field exactly as it is in the database?

If I manually delete the unwanted header, I can extract the file with zcat or gunzip.

I am doing something like this:

psql -d some_database -c \
 "copy  (select some_column from a_table where id=900) to stdout with BINARY;" > /tmp/tmp.gz

And then I want to do

gunzip /tmp/tmp.gz

Any ideas?

6 Answers


One possibility, which works although you may not like it:

psql -At -c "select encode(content, 'base64') from t where ..."  | base64 -d

i.e. print the content as base64 and decode it. I think the reality is that psql is intended to produce readable output, and persuading it to disgorge raw binary data is intentionally difficult.
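To make the round trip concrete, here is a small Python sketch of what the pipeline above does (the payload bytes are a made-up stand-in for the real bytea content). PostgreSQL's encode(..., 'base64') wraps its output at 76 characters per line, which both base64 -d and Python's b64decode tolerate:

```python
# Sketch of the encode()/base64 -d round trip from the answer above.
# The payload is a made-up stand-in for the real bytea content.
import base64

payload = b"\x1f\x8b\x08\x00...pretend gzip data..."

# What `select encode(content, 'base64')` would print via `psql -At`:
# base64 text, wrapped at 76 characters per line.
psql_output = base64.encodebytes(payload).decode("ascii")

# What `base64 -d` does on the other end of the pipe:
# decode, ignoring the embedded newlines.
decoded = base64.b64decode(psql_output)
assert decoded == payload
```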

I suppose if you want it enough, you could write a small tool (a Perl or Python script) to connect to the database and print the raw output directly.

The "WITH BINARY" option to COPY doesn't produce a simple raw dump; it adds its own header and per-field framing, which is probably dubious to rely on.
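For what it's worth, that framing is documented: a binary COPY dump starts with an 11-byte signature, a 4-byte flags field, and a 4-byte header-extension length; each row is a 16-bit field count followed by, per field, a 32-bit length and the raw bytes; a 16-bit -1 ends the file. A rough Python sketch of stripping it for the single-row, single-column case in the question (not a general parser; NULL fields are rejected rather than handled):

```python
# Rough sketch: strip the PGCOPY header/trailer from a single-row,
# single-column binary COPY dump and recover the raw bytea bytes.
import struct

SIGNATURE = b"PGCOPY\n\xff\r\n\x00"   # fixed 11-byte signature

def extract_single_field(dump: bytes) -> bytes:
    assert dump[:11] == SIGNATURE, "not a binary COPY file"
    pos = 11
    flags, ext_len = struct.unpack_from("!iI", dump, pos)
    pos += 8 + ext_len                        # skip flags + header extension
    nfields, = struct.unpack_from("!h", dump, pos)
    pos += 2
    assert nfields == 1, "expected exactly one column"
    field_len, = struct.unpack_from("!i", dump, pos)
    pos += 4
    assert field_len >= 0, "field is NULL"
    return dump[pos:pos + field_len]

# Synthetic dump standing in for /tmp/tmp.gz from the question:
data = b"\x1f\x8b\x08\x00pretend gzip bytes"
dump = (SIGNATURE
        + struct.pack("!iI", 0, 0)            # flags, extension length
        + struct.pack("!h", 1)                # one field in the tuple
        + struct.pack("!i", len(data)) + data # field length + raw bytes
        + struct.pack("!h", -1))              # trailer
assert extract_single_field(dump) == data
```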




Are you sure it's the best way to store zipped text in the database as binary? According to the documentation, long text is compressed implicitly/automatically:

Long strings are compressed by the system automatically, so the physical requirement on disk might be less. Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values. In any case, the longest possible character string that can be stored is about 1 GB.

1 Comment

I agree it isn't the best way :-) but I am not the one who decided to do it, I just have to get it out of there :-)

I don't know a straightforward way... COPY has a binary format with a variable-length header, not very easy to "trim". Outside of that, PG is rather text-centric, and I don't think there is a way to force raw (binary) output from a SELECT on a BYTEA field.

You could get a textual hexadecimal output and write yourself a little program (C, Perl, or whatever) to convert it from, say, \x000102414243 to binary. Not difficult, but not trivial either (and the hex format is new in PostgreSQL 9.0):

psql  -t -q -c "select binaryfield from.. where ..." mydb  |  myhextobin > tmp.gz

BTW, Grzegorz's answer is very pertinent.

Added: not very clean, nor foolproof; posting it just in case someone finds it useful...

/* Expects a PostgreSQL hexadecimal string in "\x....." format on stdin
   and writes the decoded binary to stdout.
   Warning: no validation! Characters outside [0-9a-f] are simply skipped. */
#include <stdio.h>

int main(void) {
    int x, pos = 0, v;
    char hex[3] = {0, 0, 0};
    while ((x = getchar()) != EOF) {
        if ((x >= '0' && x <= '9') || (x >= 'a' && x <= 'f')) {
            hex[pos++] = (char)x;
            if (pos == 2) {          /* two hex digits -> one output byte */
                sscanf(hex, "%x", &v);
                putchar(v);
                pos = 0;
            }
        }
    }
    return pos == 0 ? 0 : 1;         /* odd number of hex digits -> error */
}



The psql command does the job. You only need to pass --no-align and --tuples-only.

For compression, put gzip between psql and the file:

psql --tuples-only --no-align -d some_database -c \
 "copy (select some_column from a_table where id=900) to stdout with BINARY;" | gzip > /tmp/tmp.gz



It is not a good idea to attempt to decode the PostgreSQL binary format by hand. Just because the test file you are using works doesn't mean everything will work; for instance, certain byte sequences (not appearing in your test file) might get escaped.

2 Comments

I don't get it. If b is a bytea field (which I assumed), these would output a hexadecimal string, not the raw binary content. (I didn't downvote)
@leonbloy: Yes, the type of the data was not specified. The actual output format depends on the bytea_output setting, though the default formats are all encoded formats. PostgreSQL doesn't support the blob type inline, just references to large objects.

You may find it easier to do this by using a language that has client drivers and the ability to read the bytea type: PHP, python, ruby, perl, javascript, java, etc. Just perform your query there, use gzip libraries that probably already exist in that language, and write out the file.
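As a sketch of that approach in Python: the gunzip-and-write step is standard library; the fetch is shown as a comment because the connection string and driver choice (psycopg2 here) are placeholders, not something from the question:

```python
# Sketch: fetch the bytea with a client driver, gunzip it, write it out.
import gzip

def write_decompressed(blob: bytes, path: str) -> None:
    """Gunzip a bytea payload and write the plain contents to `path`."""
    with open(path, "wb") as f:
        f.write(gzip.decompress(blob))

# Hypothetical fetch with psycopg2 (placeholder connection details, not run here):
#   import psycopg2
#   conn = psycopg2.connect("dbname=some_database")
#   with conn.cursor() as cur:
#       cur.execute("select some_column from a_table where id = %s", (900,))
#       blob = bytes(cur.fetchone()[0])   # driver returns a memoryview
#   write_decompressed(blob, "/tmp/tmp.txt")
```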

Alternately, you could use a procedural language inside the database and create a stored procedure. You would pass the requested filename to the stored procedure.

1 Comment

Alternately, I like Araqnid's answer about using encode() with type base64.
