CMDBuild Forum

clean up history records?

We have a great many history entries for our records, resulting in query performance like this:

Sort (cost=5409733.20…5409733.70 rows=200 width=17) (actual time=25463.954…25463.964 rows=195 loops=1)
Sort Key: “Company”.“CompanyTitle”
Sort Method: quicksort Memory: 37kB
→ HashAggregate (cost=5409723.56…5409725.56 rows=200 width=17) (actual time=25463.676…25463.710 rows=195 loops=1)
Group Key: “Company”.“CompanyTitle”
→ Nested Loop (cost=0.00…4690935.24 rows=287515326 width=17) (actual time=0.089…24714.027 rows=2206394 loops=1)
→ Append (cost=0.00…362014.59 rows=2244460 width=4) (actual time=0.007…2251.179 rows=2224060 loops=1)
→ Seq Scan on “SFCircuitV1” (cost=0.00…0.00 rows=1 width=4) (actual time=0.001…0.001 rows=0 loops=1)
→ Seq Scan on “CustomerCircuitV1” (cost=0.00…1934.62 rows=5162 width=4) (actual time=0.006…4.807 rows=5162 loops=1)
→ Seq Scan on “CarrierCircuitV1” (cost=0.00…830.62 rows=2762 width=4) (actual time=0.003…2.364 rows=2762 loops=1)
→ Seq Scan on “BGPPeerCircuitV1” (cost=0.00…10.12 rows=12 width=4) (actual time=0.002…0.019 rows=12 loops=1)
→ Seq Scan on “CarrierForecastCircuitV1” (cost=0.00…2.02 rows=2 width=4) (actual time=0.002…0.005 rows=2 loops=1)
→ Seq Scan on “ColoCircuitV1” (cost=0.00…22.90 rows=90 width=4) (actual time=0.005…0.067 rows=90 loops=1)
→ Seq Scan on “CrossConnectCircuitV1” (cost=0.00…331.30 rows=1530 width=4) (actual time=0.002…1.012 rows=1447 loops=1)
→ Seq Scan on “CustomerMicrowaveCircuitV1” (cost=0.00…13.08 rows=8 width=4) (actual time=0.006…0.023 rows=7 loops=1)
→ Seq Scan on “DeviceManagementCircuitV1” (cost=0.00…567.89 rows=1189 width=4) (actual time=0.003…1.340 rows=1189 loops=1)
→ Seq Scan on “StandardCircuitV1” (cost=0.00…53.80 rows=180 width=4) (actual time=0.015…0.144 rows=180 loops=1)
→ Seq Scan on “SiteLeaseCircuitV1” (cost=0.00…11.34 rows=34 width=4) (actual time=0.009…0.035 rows=34 loops=1)
→ Seq Scan on “CustomerCircuitV1_history” (cost=0.00…198849.52 rows=1147152 width=4) (actual time=0.013…1070.516 rows=1136290 loops=1)
→ Seq Scan on “CarrierCircuitV1_history” (cost=0.00…92353.72 rows=590572 width=4) (actual time=0.040…546.452 rows=583271 loops=1)
→ Seq Scan on “BGPPeerCircuitV1_history” (cost=0.00…391.17 rows=2817 width=4) (actual time=0.003…2.038 rows=2734 loops=1)
→ Seq Scan on “CarrierForecastCircuitV1_history” (cost=0.00…37.80 rows=280 width=4) (actual time=0.003…0.169 rows=286 loops=1)
→ Seq Scan on “ColoCircuitV1_history” (cost=0.00…1558.94 rows=11094 width=4) (actual time=0.003…6.089 rows=10835 loops=1)
→ Seq Scan on “CrossConnectCircuitV1_history” (cost=0.00…31565.07 rows=236907 width=4) (actual time=0.034…165.453 rows=235361 loops=1)
→ Seq Scan on “CustomerMicrowaveCircuitV1_history” (cost=0.00…314.74 rows=1374 width=4) (actual time=0.003…0.767 rows=1379 loops=1)
→ Seq Scan on “DeviceManagementCircuitV1_history” (cost=0.00…27487.81 rows=195981 width=4) (actual time=0.040…145.624 rows=195919 loops=1)
→ Seq Scan on “StandardCircuitV1_history” (cost=0.00…4911.61 rows=40661 width=4) (actual time=0.069…23.456 rows=40547 loops=1)
→ Seq Scan on “SiteLeaseCircuitV1_history” (cost=0.00…766.52 rows=6652 width=4) (actual time=0.003…3.222 rows=6553 loops=1)
→ Append (cost=0.00…1.82 rows=11 width=111) (actual time=0.004…0.009 rows=1 loops=2224060)
→ Seq Scan on “Company” (cost=0.00…0.00 rows=1 width=222) (actual time=0.000…0.000 rows=0 loops=2224060)
Filter: (“SFCircuitV1”.buyer_customer_account__c = “Id”)
→ Index Scan using “Supplier_pkey” on “Supplier” (cost=0.12…0.15 rows=1 width=222) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “CorporateGroup_pkey” on “CorporateGroup” (cost=0.12…0.15 rows=1 width=222) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “Customer_pkey” on “Customer” (cost=0.27…0.29 rows=1 width=21) (actual time=0.001…0.002 rows=1 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “DataCenterOperator_pkey” on “DataCenterOperator” (cost=0.13…0.15 rows=1 width=12) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “Carrier_pkey” on “Carrier” (cost=0.14…0.16 rows=1 width=19) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “Supplier_history_pkey” on “Supplier_history” (cost=0.13…0.15 rows=1 width=222) (actual time=0.000…0.000 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “CorporateGroup_history_pkey” on “CorporateGroup_history” (cost=0.13…0.15 rows=1 width=222) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “Customer_history_pkey” on “Customer_history” (cost=0.29…0.31 rows=1 width=21) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “DataCenterOperator_history_pkey” on “DataCenterOperator_history” (cost=0.14…0.16 rows=1 width=12) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
→ Index Scan using “Carrier_history_pkey” on “Carrier_history” (cost=0.15…0.17 rows=1 width=21) (actual time=0.001…0.001 rows=0 loops=2224060)
Index Cond: (“Id” = “SFCircuitV1”.buyer_customer_account__c)
Planning time: 14.052 ms
Execution time: 25464.230 ms
(53 rows)

As you can see 90% of the query time is spent on history records, many of which go back several years. Also many of those records were triggered by “null updates” by automated scripts and are literally useless.

I’m curious how we can either:
query a table but not tables that inherit from it
or
clean up old unnecessary history records from the database