反连接
反连接(Anti Join)是一种特殊的连接类型,与内连接和外连接不同,Oracle数据库里并没有相关的
关键字可以在SQL文本中专门表示反连接,所以这里把它单独拿出来说明。
为了方便说明反连接的含义,我们用“t1.x.anti=t2.y”来表示表T1和表T2做反连接,且T1是驱动表,T2
是被驱动表,反连接条件为t1.x=t2.y,这里“t1.x.anti=t2.y”的含义是只要表T2中有满足条件t1.x=t2.y
的记录存在,则表T1中满足条件t1.x=t2.y的记录就会被丢弃,最后返回的记录就是表T1中那些不满足
条件t1.x=t2.y的记录。
当做子查询展开时,Oracle经常会把那些外部where条件为NOT EXISTS,NOT IN或<>ALL的子查询转换成
对应的反连接。
我们来看如下的范例SQL1、2和3
SQL> select * from t1 where col2 not in (select col2 from t2);
SQL> select * from t1 where col2<>all (select col2 from t2);
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);
现在表T1和t2在各自的连接列COL2上均没有NULL值,在这种情况下范例SQL1、2、3实际上是等价的。
SQL> select * from t1 where col2 not in (select col2 from t2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------1 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
SQL> select * from t1 where col2<>all (select col2 from t2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------1 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 2706079091---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="T1"."COL2")Statistics
----------------------------------------------------------1 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
上述3个范例SQL的执行结果是一样的,范例SQL1、2和范例SQL3的执行计划中,ID=1的执行步骤的列Operation的值分别为“HASH JOIN ANTI NA”和“HASH JOIN ANTI”,虽然不是完全一样,但它们都有关键字“ANTI”,这就说明Oracle在执行上述三个范例SQL时确实是在用反连接,即Oracle在执行时实际上是将他们转换成了如下的等价反连接形式:
select t1.* from t1,t2 where t1.col2 anti=t2.col2;
这里表T1、T2在各自的连接列COL2上没有NULL值,所以此时这三个范例SQL是等价的,但如果连接列COL2上有null值,则它们就不完全等价了。这种null值所带来的影响又细分为两种情况:
1、表T1的连接列COL2上出现了NULL值
SQL> insert into t1 values(4,null);1 row created.SQL> commit;Commit complete.
表T1中的记录如下:
SQL> select * from t1;COL1 C
---------- -1 A2 B3 C4
分别执行SQL1、2、3
SQL> select * from t1 where col2 not in (select col2 from t2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------1 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
SQL> select * from t1 where col2 <>all (select col2 from t2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------1 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);COL1 C
---------- -43 CExecution Plan
----------------------------------------------------------
Plan hash value: 2706079091---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="T1"."COL2")Statistics
----------------------------------------------------------0 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size645 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)2 rows processed
SQL> delete from t1 where col1=4;1 row deleted.SQL> commit;Commit complete.
2、表T2的连接列COL2上出现了NULL值
SQL> insert into t2 values(null,'E2');1 row created.SQL> COMMIT;Commit complete. SQL> select * from t1;COL1 C
---------- -1 A2 B3 CSQL> select * from t2;C CO
- --
A A2
B B2
D D2E2
再次执行上述三个SQL
SQL> select * from t1 where col2 not in (select col2 from t2);no rows selectedExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------0 recursive calls0 db block gets10 consistent gets0 physical reads0 redo size398 bytes sent via SQL*Net to client512 bytes received via SQL*Net from client1 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)0 rows processed
SQL> select * from t1 where col2 <>all (select col2 from t2);no rows selectedExecution Plan
----------------------------------------------------------
Plan hash value: 1275484728---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI NA | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="COL2")Statistics
----------------------------------------------------------0 recursive calls0 db block gets10 consistent gets0 physical reads0 redo size398 bytes sent via SQL*Net to client512 bytes received via SQL*Net from client1 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)0 rows processed
SQL> select * from t1 where not exists (select 1 from t2 where col2=t1.col2);COL1 C
---------- -3 CExecution Plan
----------------------------------------------------------
Plan hash value: 2706079091---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 1 | 7 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 6 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------Predicate Information (identified by operation id):
---------------------------------------------------1 - access("COL2"="T1"."COL2")Statistics
----------------------------------------------------------0 recursive calls0 db block gets12 consistent gets0 physical reads0 redo size590 bytes sent via SQL*Net to client523 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processed
SQL> delete from t2 where col3='E2';1 row deleted.SQL> COMMIT;Commit complete.
从上述测试中我们可以得出以下结论:
1)表T1、T2在各自的连接列COL2上一但有了NULL值,则范例SQL1、2、3就不完全等价了。
2)NOT IN和<>ALL对NULL值敏感,这意味着NOT IN后面的子查询或者常量集合一旦有NULL值出现,则整个SQL的执行结果就会为NULL,即此时的执行结果将不包含任何记录。
3)NOT EXISTS对NULL值不敏感,这意味着NULL值对NOT EXISTS的执行结果不会有什么影响。正是因为NOT IN和<>ALL对NULL值敏感,所以一旦相关的连接列上出现NULL值,此时Oracle如果还按照通常的反连接的处理逻辑来处理,得到的结果就不对了。
为了解决NOT IN和<>ALL对NULL值敏感的问题,Oracle推出了改良的反连接,这种反连接能够处理NULL值,Oracle称其为Null-Aware Anti Join。上述范例SQL1、2的执行计划中,ID=1的执行步骤的列Operation的值为“HASH JOIN ANTI NA”,关键字NA就是Null-Aware的缩写。Oracle就是想告诉我们,这里采用的不是普通的哈希反连接,而是改良后的,能够处理NULL值的哈希反连接。
在Oracle 11g R2中,Oracle是否启用Null-Aware Anti Join受隐含参数_OPTIMIZER_NULL_AWARE_ANTIJOIN控制,其默认值为TRUE,表示启用Null-Aware Anti join。
如果我们把_OPTIMIZER_NULL_AWARE_ANTIJOIN的值修改为FALSE,则Oracle就不能再用Null-Aware Anti Join了,而又因为NOT IN对NULL值敏感,所以Oracle此时也不能用普通的反连接。
设置当前session中的_OPTIMIZER_NULL_AWARE_ANTIJOIN值为FALSE
alter session set "_OPTIMIZER_NULL_AWARE_ANTIJOIN"=false;
修改后执行范例SQL1:
从上述显示内容可以看出,当我们把_OPTIMIZER_NULL_AWARE_ANTIJOIN的值修改为FALSE后,Oracle果然没有走反连接(当然也不能走)。
这里Oracle选择了走FILTER类型的执行计划,FILTER类型的执行计划实际上是一种改良的嵌套循环连接。