My data looks like this

ID      Value
A       123, 456, 789
B       234, 567

I need my output to look like

ID  Value1 Value2  Value3
A    123    456     789
B    234    567     NULL 

I can do this using regexp_substr

For example

select regexp_substr(Value, '[^,]+', 1) AS Value1,
       regexp_substr(Value, '[^,]+', 1, 2) AS Value2,
       regexp_substr(Value, '[^,]+', 1, 3) AS Value3
.....

But this requires me to know the maximum number of items in Value, which may change in the future and would then require regexp_substr expressions to be added or deleted.

Is there a way to write this SQL so that the number of columns created is based on the maximum number of items in Value and not on the number of regexp_substr expressions?

For example, if I were doing this in python I would just use a for loop to loop

Asked Feb 24 at 13:27 by Gingerhaze
  • IMHO it would be easier and more stable to instead use a good scripting language like Python to edit your original CSV to make the commas balanced. Then, import this balanced CSV file into Oracle. – Tim Biegeleisen Commented Feb 24 at 13:38
  • @TimBiegeleisen Unfortunately the source of the data is not a CSV – Gingerhaze Commented Feb 24 at 13:55

2 Answers

You cannot do that with a static SQL statement. SQL (in all dialects, not just Oracle) requires a fixed, known number of columns, so you either need to make a judgement call and generate a fixed number of columns or revise your requirements.


If you know you are going to have at most 3 values then you can use:

SELECT id,
       REGEXP_SUBSTR(value, '(.*?)(,|$)', 1, 1, NULL, 1) AS Value1,
       REGEXP_SUBSTR(value, '(.*?)(,|$)', 1, 2, NULL, 1) AS Value2,
       REGEXP_SUBSTR(value, '(.*?)(,|$)', 1, 3, NULL, 1) AS Value3
FROM   table_name

Which, for the sample data:

CREATE TABLE table_name (id, value) AS
SELECT 'A', '1,2,3' FROM DUAL UNION ALL
SELECT 'B', ',4' FROM DUAL;

Outputs:

ID VALUE1 VALUE2 VALUE3
A 1 2 3
B null 4 null
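
The sample row ',4' shows what happens when an item is empty. As a rough illustration only, here is the same idea in Python rather than Oracle. Python's re module stands in for the regex engine, and both helper names are hypothetical. A pattern such as '[^,]+' matches only runs of non-comma characters, so an empty item is skipped and later values shift left. Splitting on the delimiter keeps every position.

```python
import re

def items_skipping_empties(value):
    # Same idea as '[^,]+': only runs of non-comma characters match,
    # so an empty item between delimiters produces no match at all.
    return re.findall(r"[^,]+", value)

def items_keeping_positions(value):
    # Splitting on the delimiter keeps one slot per item, empty or not.
    # Empty strings are mapped to None here because Oracle treats '' as NULL.
    return [item if item != "" else None for item in value.split(",")]

print(items_skipping_empties(",4"))   # ['4']
print(items_keeping_positions(",4"))  # [None, '4']
```

With the first approach the 4 would land in Value1. With the second it stays in Value2, which matches the output shown above.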

If you later find that you can have up to 5 items then modify the query to generate 5 values.


If you want to dynamically generate the output for any number of items then this is not something that SQL is designed for. Instead of generating the values as columns, generate the values as rows:

WITH bounds (id, value, idx, spos, epos) AS (
  SELECT id, value, 1, 1, INSTR(value, ',', 1)
  FROM   table_name
UNION ALL
  SELECT id, value, idx + 1, epos + 1, INSTR(value, ',', epos + 1)
  FROM   bounds
  WHERE  epos > 0
)
SEARCH DEPTH FIRST BY id SET order_id
SELECT id,
       idx,
       CASE epos
       WHEN 0
       THEN SUBSTR(value, spos)
       ELSE SUBSTR(value, spos, epos - spos)
       END AS item
FROM   bounds

Which outputs:

ID IDX ITEM
A 1 1
A 2 2
A 3 3
B 1 null
B 2 4

Then, if you want it in columns, you can pivot the output in whatever third-party programming language (C++, Python, Java, Perl, etc.) you are using to access the database and display the output to the user in columns (after generating the initial result set in rows).
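
A minimal sketch of that client-side pivot in Python, assuming the (id, idx, item) rows have already been fetched. The rows are hard-coded here, and pivot_rows is a hypothetical helper, not part of any driver.

```python
from collections import defaultdict

def pivot_rows(rows):
    """Pivot (id, idx, item) rows into one row per id, padded with None."""
    by_id = defaultdict(dict)
    for row_id, idx, item in rows:
        by_id[row_id][idx] = item

    # The widest id determines how many value columns are needed.
    width = max((max(items) for items in by_id.values()), default=0)

    header = ["ID"] + [f"VALUE{i}" for i in range(1, width + 1)]
    table = [
        [row_id] + [items.get(i) for i in range(1, width + 1)]
        for row_id, items in sorted(by_id.items())
    ]
    return header, table

# Rows shaped like the result set above (hard-coded for illustration).
rows = [("A", 1, "1"), ("A", 2, "2"), ("A", 3, "3"), ("B", 1, None), ("B", 2, "4")]
header, table = pivot_rows(rows)
print(header)
for line in table:
    print(line)
```

The number of columns is worked out from the data at display time, so the SQL never has to change when longer lists appear.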



If you must have the data in columns and cannot pivot in a third-party language, then you will have to use PL/SQL to count the number of columns required, generate an SQL statement with that number of columns, and execute it dynamically. This is not a trivial task, so my advice is to avoid a dynamic solution and generate the output either as rows (and pivot in a third-party application when you want to display the data) or as a known, fixed number of columns.
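
To make the "count, then generate" steps concrete, here is a sketch of the statement-generation logic only. It is written in Python rather than PL/SQL and does not execute anything against a database. The helper names are hypothetical, and the table and column names are taken from the fixed-column query earlier in this answer. A PL/SQL version would follow the same two steps and then run the generated text with dynamic SQL.

```python
def max_item_count(values, delimiter=","):
    # Number of items in the longest delimited string; empty items count too.
    return max(
        (value.count(delimiter) + 1 for value in values if value is not None),
        default=0,
    )

def build_split_query(n_columns, table="table_name", id_col="id", value_col="value"):
    # Identifiers are interpolated straight into the SQL text, so they must
    # come from trusted code, never from user input.
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    columns = [
        f"       REGEXP_SUBSTR({value_col}, '(.*?)(,|$)', 1, {i}, NULL, 1) AS Value{i}"
        for i in range(1, n_columns + 1)
    ]
    return f"SELECT {id_col},\n" + ",\n".join(columns) + f"\nFROM   {table}"

sql = build_split_query(max_item_count(["1,2,3", ",4"]))
print(sql)
```

Whatever runs the generated statement still has to cope with a result set whose shape is only known at run time, which is a large part of why the dynamic route is not trivial.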

Dynamically creating columns from table data can be done in a SQL statement if you use a combination of a custom function to aggregate the data into a CLOB, a function to convert the CLOB into a BLOB, and the package APEX_DATA_PARSER.

Function to convert a CLOB into a BLOB.

CREATE OR REPLACE FUNCTION clob_to_blob (p_data  IN  CLOB)
  RETURN BLOB
-- -----------------------------------------------------------------------------------
-- File Name    : https://oracle-base/dba/miscellaneous/clob_to_blob.sql
-- Author       : Tim Hall
-- Description  : Converts a CLOB to a BLOB.
-- Last Modified: 26/12/2016
-- -----------------------------------------------------------------------------------
AS
  l_blob         BLOB;
  l_dest_offset  PLS_INTEGER := 1;
  l_src_offset   PLS_INTEGER := 1;
  l_lang_context PLS_INTEGER := DBMS_LOB.default_lang_ctx;
  l_warning      PLS_INTEGER := DBMS_LOB.warn_inconvertible_char;
BEGIN

  DBMS_LOB.createtemporary(
    lob_loc => l_blob,
    cache   => TRUE);

  DBMS_LOB.converttoblob(
   dest_lob      => l_blob,
   src_clob      => p_data,
   amount        => DBMS_LOB.lobmaxsize,
   dest_offset   => l_dest_offset,
   src_offset    => l_src_offset, 
   blob_csid     => DBMS_LOB.default_csid,
   lang_context  => l_lang_context,
   warning       => l_warning);

   RETURN l_blob;
END;
/

Custom function to convert table data into a CLOB.

Modify this function to fit your needs. It may be useful to pass in a table name.

create or replace function get_table_as_blob return blob is
    v_blob blob;
begin
    -- #3: Convert the CLOB into a BLOB.
    select clob_to_blob(the_clob) the_blob
    into v_blob
    from
    (
        -- #2: convert all the csv rows into a single clob.
        select
            rtrim
            (
                xmlagg
                (
                    xmlelement(e, row_value, chr(10)).extract('//text()')
                ).getclobval(),
                chr(10)
            ) as the_clob
        from
        (
            -- #1: Convert each row into a single CSV.
            select id || ', ' || value row_value
            from test1
        )
    );

    return v_blob;
end get_table_as_blob;
/

SQL statement that calls APEX_DATA_PARSER and ties it all together.

If it's not already installed, install Oracle Application Express 19.1, which comes with the useful package APEX_DATA_PARSER. Unfortunately, this package only works on Oracle 11g and above.

select *
from table(apex_data_parser.parse
(
    p_content => get_table_as_blob(),
    p_file_name => 'meaningless_file_name.csv'
));

The results include 300 columns with simple names like "COL001".

LINE_NUMBER COL001 COL002 COL003 COL004 COL005 COL006 ...
1 A 123 456 789 ...
2 B 234 567 ...

This can be useful for some quick and dirty data analysis. But I agree with MT0 that you should avoid this kind of dynamic solution if possible, especially in a production environment.

If you really need to do this in pure SQL and PL/SQL, and need only the minimum number of columns, and need to do it in Oracle 10g, then you'll need an Oracle Data Cartridge solution. Let me know if you need that solution.
