According to spec VARCHAR2(max_size CHAR)
should store max_size
chars. I observe other/strange behavior for Unicode texts.
Let's consider that example:
create table test (id varchar2(3 char) not null, primary key (id));
insert into test(id) values('abc');
insert into test(id) values('ффф');
Query 1 ERROR: ORA-12899: value too large for column "TEST"."TEST"."ID" (actual: 6, maximum: 3)
So varchar2 3 chars actually mean the same as byte? NO :)
create table test (id varchar2(3 byte) not null, primary key (id))
insert into test(id) values('abc')
insert into test(id) values('ффф')
Query 1 ERROR: ORA-12899: value too large for column "TEST"."TEST"."ID" (actual: 18, maximum: 3)
And my question remains how to tell Oracle that varchar2 length is for Unicode text (UTF8 to be more precise)?
Update: Is it possible to write down a SQL query that will show all tables/columns that length was in bytes?
Actually, my issue split into 2 parts incorrect query encoding of TablePlus, length in bytes (w/o char suffix) for random columns :)
Update 2: Thanks to @Wernfried Domscheit!
The query show table and columns with varchar2
that length is provided in byte
s:
SELECT TABLE_NAME, COLUMN_NAME, DATA_LENGTH, CHAR_USED
FROM USER_TAB_COLUMNS WHERE DATA_TYPE = 'VARCHAR2' AND CHAR_USED = 'B'
Your example is working for me:
Maybe a typo on your side?
Update:
Looks like your client uses wrong character settings.
ф (U+0444: Cyrillic Small Letter Ef) has these byte values:
DUMP
should returnTyp=1 Len=6 CharacterSet=AL32UTF8: d1,84,d1,84,d1,84
but you getef,bf,bd
which is U+FFFD: Replacement CharacterYou don't insert
ффф
, it is converted to���
.I guess actually your client uses UTF-8 but you did not tell the database, so most likely the database assumes the client uses default
US7ASCII
(or something else). The client sends 6 Bytes (d1,84,d1,84,d1,84
) but the Oracle database interprets it as 6 Single-Byte characters.Typically you use the
NLS_LANG
environment variable to define this. However, dbeaver is Java based and Java/JDBC does not use theNLS_LANG
settings - at least not by default.