oracle value exists in collection

4.1k Views Asked by At

I have function (used within a view) with the end result being a list of unique values from a row within a table of comma separated values.

Essentially given a table of:

studentid    classes
12345        MATH 1301, HIST 1301, POLS 1301
57495        MATH 2309, HIST 1301
39485        MATH 1301, HIST 1301

I want to see

MATH 1301
MATH 2309
HIST 1301
POLS 1301

The below code works perfect if the source table is small, but when looking at a table of 30,000 rows I get the following error. ORA-06532: Subscript outside of limit

I'm pretty sure my problem is the collection is getting too large since it's getting duplicate values. The duplicate values in themselves only become a problem when the collection becomes too large. How do I keep the duplicate values out of the collection?

I've tried childnames.exists(element) but I believe this only works for seeing if there exists an element at the index value element correct? of I've looked at member of but I don't understand how to implement it.. Is there an easy way to check if a collection element exists? Or am I over looking something simple? Is there a different type other than odcivarchar2list that would allow a larger collection?

CREATE OR REPLACE FUNCTION unique_values_from_csv ( p_del VARCHAR2 := ',')
   RETURN SYS.odcivarchar2list
IS
   childnames   SYS.odcivarchar2list := sys.odcivarchar2list ();
   tempvar VARCHAR2(255);
    l_idx    PLS_INTEGER;
    l_list2    VARCHAR2(32767) ;
    l_value VARCHAR2(32767);

   CURSOR tablewalker_cur
   IS
      SELECT distinct classes
        FROM studentclasses;

BEGIN
   FOR recordwalker_rec IN tablewalker_cur
   LOOP
   l_list2 := recordwalker_rec.classes;
      LOOP
         l_idx := INSTR (l_list2, p_del);

         IF l_idx > 0
         THEN

            childnames.EXTEND;            
            tempvar := (LTRIM (RTRIM (SUBSTR (l_list2, 1, l_idx - 1))));
            childnames (childnames.COUNT) :=  tempvar;
            l_list2 := SUBSTR (l_list2, l_idx + LENGTH (p_del));
           end if;

            childnames.EXTEND;
            childnames (childnames.COUNT) :=  (LTRIM (RTRIM (l_list2)));
            EXIT;

      END LOOP;
   END LOOP;

   RETURN childnames;
END unique_values_from_csv;
/
2

There are 2 best solutions below

0
Jon Heller On
create or replace function unique_values_from_csv(p_del varchar2 := ',')
return sys.dbms_debug_vc2coll is
    childnames sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll();
    l_idx    pls_integer;
    l_list2    varchar2(32767) ;

    procedure add_if_not_member(new_element varchar2) is
    begin
        if new_element not member of childnames then
            childnames.extend;
            childnames(childnames.count) := new_element;
        end if;
    end;
begin
    for recordwalker_rec in (select distinct classes from studentclasses)
    loop
        l_list2 := recordwalker_rec.classes;
        loop
            l_idx := instr (l_list2, p_del);
            if l_idx > 0 then
                add_if_not_member(trim(substr (l_list2, 1, l_idx - 1)));
                l_list2 := substr(l_list2, l_idx + length(p_del));
            else
                add_if_not_member(trim(l_list2));
            exit;
            end if;
        end loop;
    end loop;
    return childnames;
end unique_values_from_csv;
/
  • Used SYS.DBMS_DEBUG_VC2COLL, which is a TABLE OF VARCHAR2(1000) and should support any number of elements. Although l_list2 varchar2(32767) will limit the results.
  • MEMBER OF is the correct condition.
  • Added an ELSE - the original function was only splitting the list in two.
  • Removed the cursor - for such a small query, another level of indirection isn't worth it.
  • Used TRIM() instead of LTRIM(RTRIM())
  • The best solution would be to throw out this function and stop storing non-atomic data in your database.

Here's some sample data and a query using the function:

create table studentclasses(studentid number, classes varchar2(4000));

insert into studentclasses
select  12345, 'MATH 1301,HIST 1301,POLS 1301' from dual union all
select  57495, 'MATH 2309,HIST 1301' from dual union all
select  39485, 'MATH 1301,HIST 1301' from dual;
commit;

select unique_values_from_csv from dual;

COLUMN_VALUE
MATH 1301
HIST 1301
POLS 1301
MATH 2309
2
Egor Skriptunoff On
select distinct
  regexp_substr(classes, '([^,]+)(,\s*|$)', 1, occ, '', 1) as class
from
  studentclasses,
  (
    select level as occ
    from dual
    connect by level <= (
      select max(regexp_count(classes, ','))+1 
      from studentclasses
  )
)
where
  regexp_substr(classes, '([^,]+)(,\s*|$)', 1, occ, '', 1) is not null
order by 1