原文:http://optimizermagic.blogspot.com/2008/06/why-are-some-of-tables-in-my-query.html
Oracle 10gR2時,我們引入了一個新的轉換技術,表消除(也稱之為“連接消除”),它會消除查詢中多余的表。如果表中的一個列只在連接謂詞中使用,而且保證這些連接既不會減少結果集,也不會擴展結果集中的記錄,那么這個表就是多余的。有若干種情況下,Oracle會消除多余的表,接下來我們會逐一討論。
主鍵 – 外鍵表消除
從10gR2開始,優化器消除因主鍵–外鍵約束而多余的表。考慮以下示例中的表:
create table jobs
(
job_id NUMBER PRIMARY KEY,
job_title VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments
(
department_id NUMBER PRIMARY KEY,
department_name VARCHAR2(50)
);
create table employees
(
employee_id NUMBER PRIMARY KEY,
employee_name VARCHAR2(50),
department_id NUMBER REFERENCES departments(department_id),
job_id NUMBER REFERENCES jobs(job_id)
);
以及查詢:
select e.employee_name
from employees e, departments d
where e.department_id = d.department_id;
在這個查詢中,對departments的連接就是多余的。departments表中的列只出現在連接謂詞中,并且主鍵-外鍵約束保證了employees中的每一行,最多只能在departments中匹配到一行。因此,查詢是等價于:
select e.employee_name
from employees e
where e.department_id is not null;
優化器會為該查詢生成如下的執行計劃:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)
注:如果列上有非空約束,那么謂詞IS NOT NULL不是必須的。
自11gR1起,優化器還可以消除半連接或反連接中的表。考慮如下查詢:
select e.employee_id, e.employee_name
from employees e
where not exists (select 1
from jobs j
where j.job_id = e.job_id);
由于employees.job_id是jobs.job_id的外鍵,任何employees.job_id中的非空值,必須在jobs表中有一個對應。因此,只有employees表中的job_id列上存在NULL值的,才會出現在結果中。因此,該查詢等價于:
select e.employee_id, e.employee_name
from employees e
where job_id is null;
且優化器可以選擇這樣的執行計劃:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)
假定employees.job_id有一個非空約束
alter table employees modify job_id not null;
在這種情況下,EMPLOYEES中不可能有任何滿足條件的行,優化器會選擇下面的執行計劃:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 FILTER
2 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NULL IS NOT NULL)
過濾條件"NULL IS NOT NULL" 是一個常為false的謂詞,這最終會阻止表掃描的發生。
還是在11gR1中,優化器支持對符合ANSI的連接做表消除,比如下面的查詢:
select employee_name
from employees e inner join jobs j
on e.job_id = j.job_id;
優化器會消除JOBS表,并產生這樣的執行計劃:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
外連接表消除
11gR1中,對外連接引入了一種新風格的表消除。它并不需要主鍵-外鍵約束。為了演示,我們需要一個新表并在EMPLOYEES中附加新列。
create table projects
(
project_id NUMBER UNIQUE,
deadline DATE,
priority NUMBER
);
alter table employees add project_id number;
現在,考慮下面外連接employees和projects表的查詢:
select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);
外連接保證了employees中的每一行至少在結果中出現一次。在projects.project_id列上在的唯一約束,保證了employees中的每一行最多在projects表中匹配到一行。綜上,這兩個條件保證了employees表中的每一行,只會在結果集中出現一次。由于沒有projects表上的其它列被引用,projects表可以被消除。所以,優化器會選擇下面的執行計劃:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
為什么我會寫下這樣的查詢?
本文中的所有樣例查詢都是非常簡單的,一個人不太可能寫出明顯不需要連接的查詢。在現實中,有許多場景下,表消除是有幫助的,包括機器生成的SQL和消除視圖中的表。比如,由一堆表構成的視圖,是有連接的。為了檢索視圖中出現的所有列,連接也許是需要的。但是,一些視圖的使用者,可能只是訪問視圖列的一個子集。在這種情況下,連接的表可以被消除。
比如,考慮該視圖:
create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;
該視圖可以供一個簡單的員工花名冊的應用程序,用于通過job_title查找員工姓名。應用發出如下查詢:
select employee_name
from employee_directory_v
where department = 'ACCOUNTING';
由于job_title未被引用,jobs表可以從查詢中被消除,優化器會選擇如下的執行計劃:
--------------------------------------------
Id Operation Name
--------------------------------------------
0 SELECT STATEMENT
* 1 HASH JOIN
2 TABLE ACCESS FULL EMPLOYEES
* 3 TABLE ACCESS FULL DEPARTMENTS
--------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')
已知限制
目前有以下幾個表消除的限制:
多列主鍵-外鍵約束不支持
在查詢的其它位置引用了連接鍵,將阻止表消除。對于一個內連接,連接兩側的列是等價的,如果在查詢的其它位置引用了表的連接列,則該表不能被消除。變通方法是改寫查詢,引用另一個表的連接列。(但我們確認這不總是有效的)
原文鏈接:http://optimizermagic.blogspot.com/2008/06/why-are-some-of-tables-in-my-query.html
原文內容(注,因原文鏈接已不可訪問,故以下內容轉載自其它網址中的內容,故不能保證仍是“原文”):
In 10gR2, we introduced a new transformation, table elimination (alternately called “join elimination”), which removes redundant tables from a query. A table is redundant if its columns are only referenced to in join predicates, and it is guaranteed that those joins neither filter nor expand the resulting rows. There are several cases where Oracle will eliminate a redundant table. We will discuss each case in turn.
Primary Key-Foreign Key Table Elimination
Starting in 10gR2, the optimizer eliminates tables that are redundant due to primary key-foreign key constraints. Consider the following example tables:
create table jobs
(
job_id NUMBER PRIMARY KEY,
job_title VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments
(
department_id NUMBER PRIMARY KEY,
department_name VARCHAR2(50)
);
create table employees
(
employee_id NUMBER PRIMARY KEY,
employee_name VARCHAR2(50),
department_id NUMBER REFERENCES departments(department_id),
job_id NUMBER REFERENCES jobs(job_id)
);
and the query:
from employees e, departments d
where e.department_id = d.department_id;
In this query, the join to departments is redundant. The only column referenced in the query appears in the join predicate, and the primary key-foreign key constraint guarantees that there is at most one match in departments for each row in employees. Hence, the query is equivalent to:
select e.employee_name
from employees e
where e.department_id is not null;
The optimizer will generate this plan for the query:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)
Note that the IS NOT NULL predicate is not necessary if the column has a NOT NULL constraint on it.
Starting in 11gR1, the optimizer will also eliminate tables that are semi-joined or anti-joined. Consider the following query:
select e.employee_id, e.employee_name
from employees e
where not exists (select 1
from jobs j
where j.job_id = e.job_id);
Since employees.job_id is a foreign key to jobs.job_id, any non-null value inemployees.job_id must have a match in jobs. So only employees with null values foremployees.job_id will appear in the result. Hence, this query is equivalent to:
select e.employee_id, e.employee_name
from employees e
where job_id is null;
and the optimizer can choose this plan:
-------------------------------------------
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)
Suppose employees.job_id has a NOT NULL constraint:
alter table employees modify job_id not null;
In this case, there could not possibly be any rows in EMPLOYEES, and the optimizer could choose this plan:
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
* 1 FILTER
2 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NULL IS NOT NULL)
The “NULL IS NOT NULL” filter is a false constant predicate, that will prevent the table scan from even taking place.
“NULL IS NOT NULL”
Also in 11gR1, the optimization became available for ANSI compliant joins. For this query:
select employee_name
from employees e inner join jobs j
on e.job_id = j.job_id;
the optimizer can eliminate JOBS and produce this plan:
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Outer Join Table Elimination
In 11gR1, a new form of table elimination was introduced for outer joins, which does not require PK-FK constraints. For the example, we require a new table and an addition to EMPLOYEES:
create table projects
(
project_id NUMBER UNIQUE,
deadline DATE,
priority NUMBER
);
alter table employees add project_id number;
Now consider a query that outer joins employees and projects:
select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);
The outer join guarantees that every row in employees will appear at least once in the result. The unique constraint on projects.project_id guarantees that every row in employees will match at most one row in projects. Together, these two properties guarantee that every row inemployees will appear in the result exactly once. Since no other columns from projects are referenced, projects can be eliminated, and the optimizer can choose this plan:
Id Operation Name
-------------------------------------------
0 SELECT STATEMENT
1 TABLE ACCESS FULL EMPLOYEES
-------------------------------------------
Why Would I Ever Write Such a Query?
All of the example queries in this post are very simple, and one would be unlikely to write a query where the join is so obviously unnecessary. There are many real world scenarios where table elimination may be helpful, including machine-generated queries and elimination of tables in views. For example, a set of tables might be exposed as a view, which contains a join. The join may be necessary to retrieve all of the columns exposed by the view. But some users of the view may only access a subset of the columns, and in this case, the joined table can be eliminated.
For example, consider the view:
create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;
This view might be exposed to a simple employee directory application. To lookup employee names by job title, the application issues a query:
select employee_name
from employee_directory_v
where department = ‘ACCOUNTING’;
Since the job_title column is not referenced, jobs can be eliminated from the query, and the optimizer can choose this plan:
--------------------------------------------
Id Operation Name
--------------------------------------------
0 SELECT STATEMENT
* 1 HASH JOIN
2 TABLE ACCESS FULL EMPLOYEES
* 3 TABLE ACCESS FULL DEPARTMENTS
--------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')
Known Limitations
There are currently a few limitations of table elimination:
Multi-column primary key-foreign key constraints are not supported.
Referring to the join key elsewhere in the query will prevent table elimination. For an inner join, the join keys on each side of the join are equivalent, but if the query contains other references to the join key from the table that could otherwise be eliminated, this prevents elimination. A workaround is to rewrite the query to refer to the join key from the other table (we realize this is not always possible).




