How to insert bytestrings into HANA

2.3k Views Asked by At

I am trying to insert two byte strings into a HANA table with VARBINARY columns, but I keep getting a syntax error, e.g.

SAP DBTech JDBC: [257]: sql syntax error: incorrect syntax near "G\xa2ac\xa0av\xf6": line 1 col 98 (at pos 98)

My two byte strings look like this:

STRING1 = b'G\xa2ac\xa0av\xf6'  
type(STRING1) == <class 'bytes'>
STRING2 = b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' 
type(STRING2) == <class 'bytes'>

My query to insert the values looks like this:

INSERT INTO testTable VALUES(
CAST(b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' AS VARBINARY),
CAST(b'G\xa2ac\xa0av\xf6' AS VARBINARY));

I've also tried to do a query how the documentation suggests:

INSERT INTO testTable VALUES(
CAST(x'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' AS VARBINARY),
CAST(x'G\xa2ac\xa0av\xf6' AS VARBINARY));

As well as:

INSERT INTO testTable VALUES(
b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6',
b'G\xa2ac\xa0av\xf6');

But all of these give me some syntax error. Any help would be greatly appreciated. Thanks!

2

There are 2 best solutions below

1
Lars Br. On

The problem here lies with your STRING1 value ( b'G\xa2ac\xa0av\xf6' ). It is not a valid hexadecimal string that can represent a binary value in SAP HANA. That's why any type casting will fail here. Instead, it seems that it is actually a string and some of the characters are represented hexadecimal values (UNICODE codepoints maybe?).
At least that's what I make of the \x escpace sequence in the string.

So, you can do different things now.

  1. you can store the string as-is with the escape sequences in the VARBINARY column. To do that, you can use to_binary('G\xa2ac\xa0av\xf6') in the insert statement.
  2. you can convert this string into a valid UNICODE string in your application code and store the data in an NVARCHAR column instead.
0
Goldfishslayer On

As far as I am aware HANA does not understand byte encode like python so I think there is the mix up if you use that representation within the sql console. So in python when printing b'G\xa2ac\xa0av\xf6' a byte that is non presentable in ascii (your local encoding?) is prefixed with \x.

If you want to do that you might first want to convert that to a hex representation in python

>>> import binascii
>>> binascii.hexlify(b'\xa2ac\xa0av\xf6')
b'47a26163a06176f6'

This will give you a uniform representation of your bytearray in hex which you can now use in your SQL console (as HANA Studio and the likes):

INSERT INTO TestTable VALUES(x'47a26163a06176f6');
-- OR
INSERT INTO TestTable VALUES(HEXTOBIN('47a26163a06176f6'));

Note that the prefix b changes to x in the first case to indicate HANA that it should consider this as binary data in hexadecimal representation.

To insert the value from Python 2 as prepared statement:

>>> cursor.execute("INSERT INTO TestTable Values(?)", \
        parameters=[binascii.hexlify(b'G\xa2ac\xa0av\xf6')])

PyHDB seems to expect a string to cope correctly, but in Python 3 hexlify will yield a byte array so you need to turn the result into a string again

>>> param = str(binascii.hexlify(b'G\xa2ac\xa0av\xf6'), 'ascii')
>>> cursor.execute("INSERT INTO TestTable Values(?)", parameters=[param])

I guess this could be considered a bug in PyHDB or at least an inconsistency. Just for completeness sake, in SAP's dbapi client there is a Binary class to wrap bytearrays for this purpose.

Now query that with your client

>>> import pyhdb
>>> con = pyhdb.connect(....)
>>> cursor = con.cursor()
>>> cursor.execute('SELECT * FROM TestTable')
>>> cursor.fetchall()
[(b'G\xa2ac\xa0av\xf6',)]

To sum the entire thing up: b'G\xa2ac\xa0av\xf6' is not a representation HANA understands as such when using it in a SQL statement. We need to find a common ground, for that we converted the bytearray to a hex representation (hexlify) and told HANA to handle it as such (x-prefix / HEXTOBIN).

As Lars Br. mentioned, if those are indeed unicode literals you might want to consider NVARCHAR as datatype.