Vertica使用partitions表對分區(qū)大小進行匯總出現(xiàn)差異的原因
使用partitions表對分區(qū)表的分區(qū)表數(shù)量進行匯總,部分分區(qū)表匯總結(jié)果和projection_storage結(jié)果差異巨大
table_name|acc_sum_MB| p_sum_MB |a
----------+----------+----------+----------------
s1.T1 |119,158,775 |1,768,641,038 |-1,649,482,263
s1.T2 |105,580,146 |105,580,146 |0
s1.T3 |43,350,583 |43,350,583 |0
s1.T4 |21,088 |5,032,638 |-5,011,550
原因在于partitions中的ros_id指定的ros中可能存放有多個分區(qū)的數(shù)據(jù),在按分區(qū)匯總分析的過程中重復計算了大量的ros大小。
- 正常情況下每個分區(qū)的數(shù)據(jù)一定是存放在不同的ROS中,但如果對該表使用mergeoutsingleros函數(shù)合并過文件,則會出現(xiàn)不同分區(qū)數(shù)據(jù)在一個ros的情況。
下面這個表既是如此:
dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10
dbadmin-> ;
ros_id | count
--------------------±------
45035998330997143 | 124
49539597958367061 | 124
54043197585737455 | 124
58546797213107695 | 124
63050396840478331 | 124
76561195722589955 | 124
81064795349960353 | 124
90071994604701407 | 124
99079193859442525 | 124
108086393114183323 | 124
(10 rows)
這個表的一個ros_id只有一個分區(qū)的數(shù)據(jù),所以匯總出來的結(jié)果和projection_storage數(shù)據(jù)一致
dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10;
ros_id | count
-------------------±------
49539597991809079 | 1
72057596095218277 | 1
72057596206148613 | 1
81064795614503925 | 1
85568395111826421 | 1
90071994604700539 | 1
94575594232070745 | 1
94575594232070765 | 1
94575594232071131 | 1
94575594232071157 | 1
(10 rows)
可以通過 alter table … reorganize; 分開再匯總。
alter table s1.t1 reorganize;
NOTICE 4785: Started background repartition table task
ALTER TABLE
dbadmin=>
dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10;
ros_id | count
-------------------±------
45035998686463575 | 240
58546797569190371 | 240
63050397177898613 | 240
67553996804240455 | 240
81064795686208547 | 240
81064795686208721 | 240
90071994946351151 | 240
90071994946351323 | 240
94575594570276897 | 240
94575594570277079 | 240
(10 rows)
dbadmin=> select ros_id,count(distinct partition_key) from partitions where table_schema = ‘s1’ and projection_name ilike ‘T1_b%’ group by 1 order by 2 desc limit 10;
ros_id | count
-------------------±------
45035998711548941 | 1
54043197942002701 | 1
58546797589367811 | 1
58546797595422147 | 1
63050397202424841 | 1
67553996828519431 | 1
67553996828549109 | 1
72057596451331077 | 1
76561196059593735 | 1
76561196079090683 | 1
(10 rows)
select foo.table_name,foo.sum_MB as acc_sum_MB,lish.sum_MB as p_sum_MB from
(select ps.anchor_table_schema||’.’||ps.anchor_table_name table_name,sum(ps.used_bytes)//1024^2 sum_MB from projection_storage ps group by 1) foo
left join
(SELECT
prj.projection_schema||’.’||prj.anchor_table_name table_name,
SUM(par.ros_size_bytes)//1024^2 sum_MB
FROM partitions par JOIN projections prj USING (projection_id) GROUP BY 1
) lish
on foo.table_name = lish.table_name
WHERE foo.table_name = ‘s1.T1’
order by 2 desc;
table_name | acc_sum_MB | p_sum_MB
-------------------------------±-----------±---------
s1.T1 | 20739 | 20739
(1 row)

最后修改時間:2024-08-29 11:32:18
「喜歡這篇文章,您的關(guān)注和贊賞是給作者最好的鼓勵」
關(guān)注作者
【版權(quán)聲明】本文為墨天輪用戶原創(chuàng)內(nèi)容,轉(zhuǎn)載時必須標注文章的來源(墨天輪),文章鏈接,文章作者等基本信息,否則作者和墨天輪有權(quán)追究責任。如果您發(fā)現(xiàn)墨天輪中有涉嫌抄襲或者侵權(quán)的內(nèi)容,歡迎發(fā)送郵件至:contact@modb.pro進行舉報,并提供相關(guān)證據(jù),一經(jīng)查實,墨天輪將立刻刪除相關(guān)內(nèi)容。




