I have a table with a text array (text[]) type column in it. I want to use the COPY command to copy a CSV in. I'm using Psycopg2's copy capability, but the question is relevant to Postgres in general.

It seems that Postgres only accepts arrays formatted like {"string1","string2","string3"}, not ARRAY['string1', 'string2', 'string3'] (see below). This is a problem because the string escaping in the former format is a huge pain, and Psycopg2's mogrify function outputs arrays in the latter format. Manual escaping in the first format is my last resort, but I really don't want to go there...

Is there any way to make Postgres take the latter format for copying or some other workaround?

Here are my tests:

-- proof that both syntaxes are valid and compare equal
db=# SELECT ARRAY['string1', 'string2', 'string3']::text[] = '{"string1","string2","string3"}'::text[];
 ?column?
----------
 t
(1 row)

-- COPY works with {} syntax
db=# CREATE TEMP TABLE test(words text[]);
CREATE TABLE
db=# COPY test FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {"string1","string2","string3"}
>> \.
COPY 1

-- COPY fails with ARRAY syntax
db=# COPY test FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> ARRAY['string1', 'string2', 'string3']
>> \.
ERROR:  malformed array literal: "ARRAY['string1', 'string2', 'string3']"
DETAIL:  Array value must start with "{" or dimension information.
CONTEXT:  COPY test, line 1, column words: "ARRAY['string1', 'string2', 'string3']"
  • Are you using mogrify to generate the file? Commented Aug 18, 2016 at 19:13
  • No, I'm passing data into Python's csv writer. mogrify formats for queries, which have different rules (e.g. single quotes around strings). I did try using mogrify just for the array values, but, as I said, it gives me the ARRAY syntax. Commented Aug 18, 2016 at 19:24

2 Answers

3

Make your data a list of tuples:

data = [
    (1, ['a','b']),
    (2, ['c','d'])
]

Create a values syntax template to receive the data tuples:

values_template = ','.join(['%s'] * len(data))

Place it into a copy command:

copy_values = "copy (values {0}) to stdout (format csv)".format(values_template)

Use mogrify to adapt the Python types to PostgreSQL types:

copy_values = cursor.mogrify(copy_values, data)

Use copy_expert to write the result to a file:

# The context manager closes the file even if the copy fails.
with open('file.csv', 'wb') as f:
    cursor.copy_expert(copy_values, f, size=8192)
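
The steps above only produce the file. Loading it back could look roughly like the sketch below. It assumes a target table with an integer column and a text[] column matching the data tuples; the table name `words_by_id` and the helper name `load_csv` are hypothetical, and the table name is interpolated directly, so it must come from trusted code rather than user input.

```python
def load_csv(cursor, path, table="words_by_id"):
    """Load a CSV file produced by the export above into `table`.

    `cursor` is expected to be a psycopg2 cursor. `table` is a
    hypothetical name and is not escaped, so pass only trusted values.
    """
    sql = "COPY {0} FROM STDIN (FORMAT csv)".format(table)
    # newline="" leaves line endings inside the CSV untouched.
    with open(path, "r", newline="") as f:
        cursor.copy_expert(sql, f)
```

Because the file was written by Postgres itself, the array column is already in the {} form that COPY FROM expects.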
2 Comments

Ah, solved! This is a cool workaround. Weird that I never thought of it even though I was debugging by copying values to stdout. I've got a COPY utility in my Python project that writes to a file to be copied, and I'll change it to do it this way.
One drawback is that this is slower than Python's csv library, at least with my setup (database local on machine).
1

The first test (the proof) is not really correct. In this case this should be the test:

SELECT 'ARRAY["string1", "string2", "string3"]'::text[] = '{"string1","string2","string3"}'::text[]

and that does not work. So I would assume that no, this format cannot be used with COPY FROM stdin.

3 Comments

The {} syntax expects single quotes surrounding it in a query, and ARRAY does not. SELECT 'ARRAY["string1", "string2", "string3"]'::text[]; returns an error. But {} syntax does not expect single quotes when copying from a CSV.
That is exactly the problem. When you read it from stdin it behaves as if it was a quoted string, so you end up with an error.
Ah, I see. Thanks. Looks like I'll have to find another way to do this.
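
For readers who would rather stay with Python's csv writer, the {} literal can also be built by hand. The following is only a minimal sketch, assuming one-dimensional text arrays and a file loaded with COPY ... (FORMAT csv); the helper names `to_pg_array_literal` and `rows_to_csv` are hypothetical and not part of psycopg2. Inside a double-quoted array element, backslashes and double quotes are backslash-escaped; the csv module then applies its own quoting on top.

```python
import csv
import io


def to_pg_array_literal(items):
    """Render a flat list as a Postgres array literal such as {"a","b"}."""
    parts = []
    for item in items:
        if item is None:
            # An unquoted NULL marks a null array element.
            parts.append("NULL")
        else:
            # Escape backslashes first, then double quotes.
            escaped = str(item).replace("\\", "\\\\").replace('"', '\\"')
            parts.append('"' + escaped + '"')
    return "{" + ",".join(parts) + "}"


def rows_to_csv(rows):
    """Write (id, list_of_strings) rows as CSV text for COPY (FORMAT csv)."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # default dialect doubles embedded quotes
    for row_id, words in rows:
        writer.writerow([row_id, to_pg_array_literal(words)])
    return buf.getvalue()
```

This sketch targets the CSV format only. COPY's default text format treats backslashes specially, so it would need a further escaping pass.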