索引統(tǒng)計(jì)信息中需要我們最為重點(diǎn)關(guān)注的是CLUSTERING_FACTOR(聚簇因子)。
創(chuàng)新互聯(lián)公司從2013年創(chuàng)立,先為準(zhǔn)格爾等服務(wù)建站,準(zhǔn)格爾等地企業(yè),進(jìn)行企業(yè)商務(wù)咨詢服務(wù)。為準(zhǔn)格爾企業(yè)網(wǎng)站制作PC+手機(jī)+微官網(wǎng)三網(wǎng)同步一站式服務(wù)解決您的所有建站問題。
在Oracle數(shù)據(jù)庫中,聚簇因子是指按照索引鍵值排序的索引行和存儲(chǔ)于對應(yīng)表中數(shù)據(jù)行的存儲(chǔ)順序和相似度。Oracle是按照如下的算法來計(jì)算聚簇因子的值:
聚簇因子的初始值為1。
Oracle首先定位到目標(biāo)索引處于最左邊的葉子塊。
從最左邊的葉子塊的第一個(gè)索引鍵值所在的索引行開始順序掃描,在順序掃描的過程中,Oracle會(huì)比對當(dāng)前索引行的rowid和它之前的那個(gè)索引行(它們是相鄰的關(guān)系)的rowid,如果這兩個(gè)rowid并不是指向同一個(gè)表塊,那么Oracle就將聚簇因子的當(dāng)前值遞增1;如果這兩個(gè)rowid是指向同一個(gè)表塊,Oracle就不改變聚簇因子的當(dāng)前值。注意,這里Oracle在比對rowid時(shí)不需要回表去訪問相應(yīng)的表塊。
上述比對rowid的過程會(huì)一直持續(xù)下去,直到順序掃描完目標(biāo)索引所有葉子塊里的所有索引行。
上述順序掃描操作完成后,聚簇因子的當(dāng)前值就是索引統(tǒng)計(jì)信息中的CLUSTERING_FACTOR,Oracle會(huì)將其存儲(chǔ)在數(shù)據(jù)字典里。
從上述聚簇因子的算法可以知道,如果聚簇因子的值接近對應(yīng)表的表塊的數(shù)量,則說明目標(biāo)索引索引行和存儲(chǔ)于對應(yīng)表中數(shù)據(jù)行的存儲(chǔ)順序相似程度非常高。這也就意味著Oracle走索引范圍掃描后取得目標(biāo)rowid再回表去訪問對應(yīng)表塊的數(shù)據(jù)時(shí),相鄰的索引行所對應(yīng)的rowid極有可能處于同一個(gè)表塊中,即Oracle在通過索引行記錄的rowid回表第一次讀取對應(yīng)的表塊并將該表塊緩存在buffer cache中后,當(dāng)再通過相鄰索引行記錄的rowid回表第二次讀取對應(yīng)的表塊時(shí),就不需要再產(chǎn)生物理I/O了,因?yàn)檫@次要訪問的和上次已經(jīng)訪問過的表塊是同一個(gè)塊,Oracle已經(jīng)將其緩存在了buffer cache中。而如果聚簇因子的值接近對應(yīng)表的記錄數(shù),則說明目標(biāo)索引索引行和存儲(chǔ)于對應(yīng)表中數(shù)據(jù)行的存儲(chǔ)順序和相似程度非常低,這也就意味著Oracle走索引范圍掃描取得目標(biāo)rowid再回表去訪問對應(yīng)表塊的數(shù)據(jù)時(shí),相鄰的索引行所對應(yīng)的rowid極有可能不處于同一個(gè)表塊中,即Oracle在通過索引行記錄的rowid回表第一次去讀取對應(yīng)的表塊并將表塊緩存在buffer cache中后,當(dāng)再通過相鄰索引行記錄的rowid回表第二次讀取對應(yīng)的表塊時(shí),還需要再產(chǎn)生物理I/O,因?yàn)檫@次要訪問的和上次已經(jīng)訪問過的表塊并不是同一個(gè)塊。
換句話說,聚簇因子高的索引走索引范圍掃描時(shí)比相同條件下聚簇因子低的索引要耗費(fèi)更多的物理I/O,所以聚簇因子高的索引走索引范圍掃描的成本會(huì)比相同條件下聚簇因子低的索引走索引范圍掃描的成本高。
這里構(gòu)造一個(gè)非常極端的例子,全索引中沒有任何相鄰的索引行記錄的rowid指向表中相同的數(shù)據(jù)塊:
根據(jù)上述聚簇因子的算法,我們可以算出此索引IDX_T1的聚簇因子的值應(yīng)是20。
zx@MYDB>create table t1 (id number,name char(1200)); Table created. zx@MYDB>insert into t1 values(1,'1'); 1 row created. zx@MYDB>insert into t1 values(3,'3'); 1 row created. zx@MYDB>insert into t1 values(5,'5'); 1 row created. zx@MYDB>insert into t1 values(7,'7'); 1 row created. zx@MYDB>insert into t1 values(9,'9'); 1 row created. zx@MYDB>insert into t1 values(11,'11'); 1 row created. zx@MYDB>insert into t1 values(13,'13'); 1 row created. zx@MYDB>insert into t1 values(15,'15'); 1 row created. zx@MYDB>insert into t1 values(17,'17'); 1 row created. zx@MYDB>insert into t1 values(19,'19'); 1 row created. zx@MYDB>insert into t1 values(2,'2'); 1 row created. zx@MYDB>insert into t1 values(4,'4'); 1 row created. zx@MYDB>insert into t1 values(6,'6'); 1 row created. zx@MYDB>insert into t1 values(8,'8'); 1 row created. zx@MYDB>insert into t1 values(10,'10'); 1 row created. zx@MYDB>insert into t1 values(12,'12'); 1 row created. zx@MYDB>insert into t1 values(14,'14'); 1 row created. zx@MYDB>insert into t1 values(16,'16'); 1 row created. zx@MYDB>insert into t1 values(18,'18'); 1 row created. zx@MYDB>insert into t1 values(20,'20'); 1 row created. zx@MYDB>commit; Commit complete. zx@MYDB>create index idx_t1 on t1(id); Index created. zx@MYDB>col location for a10 zx@MYDB>select id,dbms_rowid.rowid_relative_fno(rowid)||'_'||dbms_rowid.rowid_block_number(rowid) location from t1 order by location,id; ID LOCATION ---------- ---------- 1 4_300 3 4_300 5 4_300 7 4_300 9 4_300 11 4_301 13 4_301 15 4_301 17 4_301 19 4_301 2 4_302 4 4_302 6 4_302 8 4_302 10 4_302 12 4_303 14 4_303 16 4_303 18 4_303 20 4_303 20 rows selected.
從上述顯示結(jié)果可以看出1、3、5、7、9在4號(hào)文件的300號(hào)數(shù)據(jù)塊內(nèi),11、13、15、17、19在4號(hào)文件的301號(hào)數(shù)據(jù)塊內(nèi),2、4、6、8、10在4號(hào)文件的第302號(hào)數(shù)據(jù)塊內(nèi),12、14、16、18、20在4號(hào)文件的第303號(hào)數(shù)據(jù)塊內(nèi)。
收集統(tǒng)計(jì)信息并查看聚簇因子的值
#收集統(tǒng)計(jì)信息并查看聚簇因子的值 zx@MYDB>exec dbms_stats.gather_table_stats(ownname=>'ZX',tabname=>'T1',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100); PL/SQL procedure successfully completed. zx@MYDB>select index_name,clustering_factor from dba_indexes where index_name='IDX_T1'; INDEX_NAME CLUSTERING_FACTOR ------------------------------------------------------------------------------------------ ----------------- IDX_T1 20
在Oracle數(shù)據(jù)庫中,能夠降低目標(biāo)索引的聚簇因子的唯一方法就是對表中的數(shù)據(jù)按照目標(biāo)索引的索引鍵值排序后重新存儲(chǔ)。需要注意的是,這種按某一個(gè)目標(biāo)索引的索引鍵值排序后重新存儲(chǔ)表中數(shù)據(jù)的方法確實(shí)可以降低該目標(biāo)索引聚簇因子的值,但可能會(huì)同時(shí)增加該表上存在的其他索引值的聚簇因子的值。
將表T1的數(shù)據(jù)原封不動(dòng)的照搬到表T2中,只不過表T2的數(shù)據(jù)在存儲(chǔ)時(shí)已經(jīng)按id列排好序了
zx@MYDB>create table t2 as select * from t1 order by id; Table created. zx@MYDB>create index idx_t2 on t2(id); Index created. zx@MYDB>select id,dbms_rowid.rowid_relative_fno(rowid)||'_'||dbms_rowid.rowid_block_number(rowid) location from t2 order by location,id; ID LOCATION ---------- ---------- 1 4_171 2 4_171 3 4_171 4 4_171 5 4_171 6 4_172 7 4_172 8 4_172 9 4_172 10 4_172 11 4_173 12 4_173 13 4_173 14 4_173 15 4_173 16 4_174 17 4_174 18 4_174 19 4_174 20 4_174 20 rows selected. zx@MYDB>exec dbms_stats.gather_table_stats(ownname=>'ZX',tabname=>'T1',method_opt=>'for all columns size auto',cascade=>true,estimate_percent=>100); PL/SQL procedure successfully completed. zx@MYDB>select index_name,clustering_factor from dba_indexes where index_name='IDX_T2'; INDEX_NAME CLUSTERING_FACTOR ------------------------------------------------------------------------------------------ ----------------- IDX_T2 4
重復(fù)與表T1相同的一系列的操作,從結(jié)果可以看出索引IDX_T2的聚簇因子降為了4。而相鄰的數(shù)據(jù)也都在同一數(shù)據(jù)塊中。
在Oracle數(shù)據(jù)庫里,CBO在計(jì)算索引范圍掃描(Index Range Scan)的成本時(shí)會(huì)使用如下公式:
IRS Cost = I/O Cost + CPU Cost
而I/O Cost的計(jì)算公式為:
I/O Cost = Index Access I/O Cost + Table Access I/O Cost
Index Access I/O Cost = BLEVEL + CEIL(#LEAF_BLOCKS * IX_SEL)
Table Access I/O Cost = CEIL(CLUSTERING_FACTOR * IX_SEL_WITH_FILTERS)
從這個(gè)公式可以推斷走索引范圍掃描的成本可以近似看作是和聚簇因子成正比,因此,聚簇因子值的大小實(shí)際上對CBO判斷是否走相關(guān)的索引起著至關(guān)重要的作用。
演示一個(gè)例子,通過修改聚簇索引的值就讓原本走索引范圍掃描的執(zhí)行計(jì)劃變成了走全表掃描:
zx@MYDB>create table t1 as select * from dba_objects; Table created. zx@MYDB>create index idx_t1 on t1(object_id); Index created. zx@MYDB>select clustering_factor from dba_indexes where index_name='IDX_T1'; CLUSTERING_FACTOR ----------------- 1063 zx@MYDB>select /*+ cluster_factor_expmple_1 */ object_id,object_name from t1 where object_id between 103 and 108; OBJECT_ID OBJECT_NAME ---------- ------------------------------ 103 MIGRATE$ 104 DEPENDENCY$ 105 ACCESS$ 106 I_DEPENDENCY1 107 I_DEPENDENCY2 108 I_ACCESS1 6 rows selected. zx@MYDB>select * from table(dbms_xplan.display_cursor(null,null,'all')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID ga3jv3kwwwmx5, child number 0 ------------------------------------- select /*+ cluster_factor_expmple_1 */ object_id,object_name from t1 where object_id between 103 and 108 Plan hash value: 50753647 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 6 | 474 | 3 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_T1 | 6 | | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------- ......省略部分輸出
SQL走了索引范圍掃描,成本值為3
使用Hint強(qiáng)制SQL走全表掃描:
zx@MYDB>select /*+ full(t1) */ object_id,object_name from t1 where object_id between 103 and 108; OBJECT_ID OBJECT_NAME ---------- ------------------------------ 103 MIGRATE$ 104 DEPENDENCY$ 105 ACCESS$ 106 I_DEPENDENCY1 107 I_DEPENDENCY2 108 I_ACCESS1 6 rows selected. zx@MYDB>select * from table(dbms_xplan.display_cursor(null,null,'all')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID b7hjwuvmg2ncy, child number 0 ------------------------------------- select /*+ full(t1) */ object_id,object_name from t1 where object_id between 103 and 108 Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 287 (100)| | |* 1 | TABLE ACCESS FULL| T1 | 6 | 474 | 287 (1)| 00:00:04 | -------------------------------------------------------------------------- ......省略部分輸出
現(xiàn)在SQL走全表掃描,成本值為287。
我們已經(jīng)知道走索引范圍掃描的成本可以近似看作是和聚簇因子成正比,所以如果想讓上述SQL的執(zhí)行計(jì)劃從索引范圍掃描變?yōu)槿頀呙瑁敲粗恍枰{(diào)整聚簇因子的值,使走索引范圍掃描的成本值大于走全表掃描的成本值346即可達(dá)到目的。
先將索引IDX_T1的聚簇因子的值手工調(diào)整為100萬:
zx@MYDB>exec dbms_stats.set_index_stats(ownname=>'ZX',indname=>'IDX_T1',clstfct=>1000000,no_invalidate=>false); PL/SQL procedure successfully completed. zx@MYDB>select clustering_factor from dba_indexes where index_name='IDX_T1'; CLUSTERING_FACTOR ----------------- 1000000 zx@MYDB>select /*+ cluster_factor_expmple_2 */ object_id,object_name from t1 where object_id between 103 and 108; OBJECT_ID OBJECT_NAME ---------- ------------------------------ 103 MIGRATE$ 104 DEPENDENCY$ 105 ACCESS$ 106 I_DEPENDENCY1 107 I_DEPENDENCY2 108 I_ACCESS1 6 rows selected. zx@MYDB>select * from table(dbms_xplan.display_cursor(null,null,'all')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 1ucqsj4j0j432, child number 0 ------------------------------------- select /*+ cluster_factor_expmple_2 */ object_id,object_name from t1 where object_id between 103 and 108 Plan hash value: 50753647 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 105 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 6 | 474 | 105 (0)| 00:00:02 | |* 2 | INDEX RANGE SCAN | IDX_T1 | 6 | | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------- ......省略部分輸出
從執(zhí)行計(jì)劃中可以看出,成本已經(jīng)由3變?yōu)?05(即增加了102),這說明我們對索引IDX_T1的聚簇因子的調(diào)整生效了。
要使成本值大于287,只需要把聚簇因子的值調(diào)整到400萬。
zx@MYDB>exec dbms_stats.set_index_stats(ownname=>'ZX',indname=>'IDX_T1',clstfct=>4000000,no_invalidate=>false); PL/SQL procedure successfully completed. zx@MYDB>select clustering_factor from dba_indexes where index_name='IDX_T1'; CLUSTERING_FACTOR ----------------- 4000000 zx@MYDB>select /*+ cluster_factor_expmple_3 */ object_id,object_name from t1 where object_id between 103 and 108; OBJECT_ID OBJECT_NAME ---------- ------------------------------ 103 MIGRATE$ 104 DEPENDENCY$ 105 ACCESS$ 106 I_DEPENDENCY1 107 I_DEPENDENCY2 108 I_ACCESS1 6 rows selected. zx@MYDB>select * from table(dbms_xplan.display_cursor(null,null,'all')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID cwkc8q61bypa6, child number 0 ------------------------------------- select /*+ cluster_factor_expmple_3 */ object_id,object_name from t1 where object_id between 103 and 108 Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 287 (100)| | |* 1 | TABLE ACCESS FULL| T1 | 6 | 474 | 287 (1)| 00:00:04 | --------------------------------------------------------------------------
從上面顯示的內(nèi)容可以看出執(zhí)行計(jì)劃從索引范圍掃描變成了全表掃描。
參考《基于Oracle的SQL優(yōu)化》
新聞名稱:Oracle索引聚簇因子的含義及重要性
本文來源:http://vcdvsql.cn/article42/podiec.html
成都網(wǎng)站建設(shè)公司_創(chuàng)新互聯(lián),為您提供小程序開發(fā)、做網(wǎng)站、網(wǎng)頁設(shè)計(jì)公司、網(wǎng)站策劃、品牌網(wǎng)站設(shè)計(jì)、網(wǎng)站收錄
聲明:本網(wǎng)站發(fā)布的內(nèi)容(圖片、視頻和文字)以用戶投稿、用戶轉(zhuǎn)載內(nèi)容為主,如果涉及侵權(quán)請盡快告知,我們將會(huì)在第一時(shí)間刪除。文章觀點(diǎn)不代表本網(wǎng)站立場,如需處理請聯(lián)系客服。電話:028-86922220;郵箱:631063699@qq.com。內(nèi)容未經(jīng)允許不得轉(zhuǎn)載,或轉(zhuǎn)載時(shí)需注明來源: 創(chuàng)新互聯(lián)