SELECT statement to remove empty columns from resultset with variable data

3.1k Views Asked by At

I've got a table that has been imported from a CSV as a midstep to gather data into other tables. All columns in this table may be empty but as long as one column has data, is a valid row. Because of this, there may be full rows with empty data, that are not valid to take into consideration. Here's a (simplified) example of such a table:

/* Table Structure */
CREATE TABLE `imported_data` (
  `title` varchar(45) DEFAULT NULL,
  `description` varchar(45) DEFAULT NULL,
  `language` varchar(45) DEFAULT NULL,
  `url` varchar(45) DEFAULT NULL,
  `category` varchar(45) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8

/* Table Data */
INSERT INTO imported_data (title, `description`, `language`, url, category) VALUES
('Le Titre', '', 'fra', '', '1'),
('', 'English Letters for Kids', '', 'https://anything.net', '2'),
('', '', '', '', ''),
('Master', 'The greatest master.', 'eng', 'http://www.master.com', '3'),
('', '', '', '', ''),
('', '', 'spa', '', '');


╔══════════╤══════════════════════════╤══════════╤═══════════════════════╤══════════╗
║ title    │ description              │ language │ url                   │ category ║
╠══════════╪══════════════════════════╪══════════╪═══════════════════════╪══════════╣
║          │                          │ fra      │                       │ 1        ║
╟──────────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║          │ English letters for Kids │          │ https://anything.net  │ 2        ║
╟──────────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║          │                          │          │                       │          ║
╟──────────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║          │ The greatest master.     │ eng      │ http://www.master.com │ 3        ║
╟──────────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║          │                          │          │                       │          ║
╟──────────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║          │                          │ spa      │                       │          ║
╚══════════╧══════════════════════════╧══════════╧═══════════════════════╧══════════╝

NOTE: Please take into consideration the above table/data comes from an external source and is not modifiable, it has no primary key either. It is just imported as is in MySQL. I won't get NULL values as all columns are VARCHAR.

I use the following SELECT statement to get this same table but without full empty rows this way:

SELECT title, `description`, `language`, url, category
FROM imported_data2 WHERE
title != ''  OR `description` != ''  OR `language` != ''  OR url != ''  OR category != '' 

╔═══════╤══════════════════════════╤══════════╤═══════════════════════╤══════════╗
║ title │ description              │ language │ url                   │ category ║
╠═══════╪══════════════════════════╪══════════╪═══════════════════════╪══════════╣
║       │                          │ fra      │                       │ 1        ║
╟───────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║       │ English letters for Kids │          │ https://anything.net  │ 2        ║
╟───────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║       │ The greatest master.     │ eng      │ http://www.master.com │ 3        ║
╟───────┼──────────────────────────┼──────────┼───────────────────────┼──────────╢
║       │                          │ spa      │                       │          ║
╚═══════╧══════════════════════════╧══════════╧═══════════════════════╧══════════╝

What do I need to do to remove columns whose rows are all empty? In this example, I need the table as above but without the empty column title.

As this data is dynamic, there may be times when another column or columns will be empty and I cannot know beforehand which one will be that way.

2

There are 2 best solutions below

0
On BEST ANSWER

Using @GMB answer as a base, I added to the prepared statement a dynamic way to check if any of the verified fields are empty in the where clause, thus cutting out the full empty rows, like this:

set @sql = null;

select concat_ws(', ',
    case when count(nullif(title, ''))       > 0 then 'title'       end,
    case when count(nullif(description, '')) > 0 then 'description' end,
    case when count(nullif(language, ''))    > 0 then 'language'    end,
    case when count(nullif(url, ''))         > 0 then 'url'         end,
    case when count(nullif(category, ''))    > 0 then 'category'    end
) into @sql
from imported_data;

set @sql = concat('select ', @sql, ' from imported_data where',
                 (
                    SELECT INSERT( GROUP_CONCAT('OR `', `COLUMN_NAME`, '`  != \'\' ' SEPARATOR ' '), 1, 3, '')
                    FROM `information_schema`.`COLUMNS`
                    WHERE `TABLE_SCHEMA` = 'mydb'
                        AND `TABLE_NAME` = 'imported_data'
                )
);  
prepare stmt from @sql;
execute stmt;
deallocate prepare stmt;
5
On

You can identify the "empty" columns with:

select 
    count(title), 
    count(description),
    count(language),
    count(url),
    count(cagegory)
from imported_data

Columns that come back with a count of 0 contain only null values. You can then drop the related columns, if that's what you want.

I won't get NULL values as all columns are VARCHAR.

A string column may sure contain null values. If you really mean empty string, not null, then use nullif():

select 
    count(nullif(title, '')), 
    count(nullif(description, '')),
    count(nullif(language, '')),
    count(nullif(url, '')),
    count(nullif(category, ''))
from imported_data

If you want something entirely dynamic, then consider building use a prepared statement:

set @sql = null;

select concat_ws(', ',
    case when count(nullif(title, ''))       > 0 then 'title'       end,
    case when count(nullif(description, '')) > 0 then 'description' end,
    case when count(nullif(language, ''))    > 0 then 'language'    end,
    case when count(nullif(url, ''))         > 0 then 'url'         end,
    case when count(nullif(category, ''))    > 0 then 'category'    end
) into @sql
from imported_data;

set @sql = concat('select ', @sql, ' from imported_data');  
prepare stmt from @sql;
execute stmt;
deallocate prepare stmt;