The problems of this kind are frequently discussed on the web sites of different forums. By the way, it is still not clear for me, why in such cases it is additionally required that UNION and/or JOIN clauses should not be used. I can only suppose that these are the problems, which are usually asked at an interview for a job.
Let me sidetrack a little bit. As far as I can guess, this question would be answered by Joe Celko in the following way: error of design is quite evident, as one attribute is split into two. But let us put aside the issues of designing and further consider the solution of such problem.
Let's create a test table and populate it with some data:
CREATE TABLE T (
col1 INT
, col2 INT
)
GO
INSERT INTO T
SELECT 1, 1
UNION ALL SELECT 1, 3
UNION ALL SELECT NULL, NULL
UNION ALL SELECT NULL, 2
GO
Thus, there is T table, which contains two columns of data of the same type:
SELECT col1, col2
FROM T
col1 col2 1 1 1 3 NULL NULL NULL 2
It is necessary to get the following result:
col 1 1 NULL NULL 1 3 NULL 2
I know of three methods of solution employing standard means of interactive SQL language.
The solution is quite evident and not calling for any comments. It should be noted however that UNION cannot be used for solving of such a problem as it eliminates the duplicates.
SELECT col1 col FROM T
UNION ALL
SELECT col2 FROM T
In order to preserve the duplicates from various columns, let us make FULL JOIN using the wittingly false predicate, say, 1 = 2:
SELECT T.col1,T1.col2
FROM T FULL JOIN T AS T1 ON 1=2
Results:
col1 col2 1 NULL 1 NULL NULL NULL NULL NULL NULL 1 NULL 3 NULL NULL NULL 2
Then we use COALESCE function, which will produce the desirable results:
SELECT COALESCE(T.col1,T1.col2) col
FROM T FULL JOIN T AS T1 ON 1=2
Constructions PIVOT and UNPIVOT were presented in the last releases of SQL standard and have been implemented in SQL Server beginning with 2005 release. The first of them enables us to present in a row the values of the column, and the second one will enable us to make the reverse operation:
SELECT col
FROM
(SELECT col1, col2
FROM T) p
UNPIVOT
(col FOR xxx IN
(col1, col2)
)AS unpvt
The values in columns col1 and col2 are grouped into one column (col) of the supplementary table unpvt. However there is a certain peculiar feature in the use of PIVOT and UNPIVOT clauses - they do not take into account NULL values. The result of the last query will be the following:
col 1 1 1 3 2
One can overcome such a difficulty in solution of this problem by replacing the NULL value by a pseudo-value on the entrance of UNPIVOT operator, i.e. by a value that could not be found in the initial data, and further on the reverse transformation should be made:
SELECT NULLIF(col,777)
FROM
(SELECT COALESCE(col1,777) col1, COALESCE(col2,777) col2
FROM T) p
UNPIVOT
(col FOR xxx IN
(col1, col2)
)AS unpvt
In this case COALESCE(col1,777) replaces NULL values in the column col1 by 777, while the NULLIF(col,777) function makes a reverse transformation.
The last solution provides us with the necessary result, however it is faulty for the following reason - the value 777 may sooner or later come up in the data, and it will lead to false results. In order to eliminate this error, one can use the value of another type, which cannot be found in the integer column, i.e., 'x' symbol. Naturally to employ such a method, one should convert integer columns into character string ones for the sake of compatibility, when needed making the reverse transformation of the final result:
SELECT CAST(NULLIF(col,'x') AS INT)
FROM
(SELECT COALESCE(CAST(col1 AS VARCHAR),'x') col1,
COALESCE(CAST(col2 AS VARCHAR),'x') col2
FROM T) p
UNPIVOT
(col FOR xxx IN
(col1, col2)
)AS unpvt
Let me say a few words about efficiency of the presented solutions. According to the query execution plan, the major costs are spent on reading of data (Table scan operation). For the first two solutions (with use of UNION and FULL JOIN) the scanning procedure is carried out twice, while for the last one (UNPIVOT) only once, that accounts for double advantage in terms of performance.
DROP TABLE T
There is one more variant which I use:
SELECT
CASE a WHEN 1 THEN col1 ELSE col2 END col
FROM T, (SELECT 1 a UNION ALL SELECT 2) B
The Cartesian product of the T table with derived table of 2 rows gives "doubling" (each row of T is presented 2 times, for =1 and =2). For the first case we take value from col1, and for the second - from col2.
Here, certainly, there is both union and join but, in my opinion, the title's question means single scanning the table.
Home | SELECT exercises (rating stages) | DML exercises | Developers |