Sunday 17 February 2013

Some Useful System Queries

1. To see all the queries currently running:

vsql=> select user_name, session_id, transaction_id, statement_id, request, request_duration_ms, start_timestamp from query_requests where is_executing='t'  order by start_timestamp desc;


"query_requests" is a system view built over few system tables, but this is only available in Vertica 6.0 and above. To get this output in older versions:

vsql=> select user_name, session_id, transaction_id, statement_id, query,query_duration_us, query_start from query_profiles where is_executing='t';




2. To get the run time for a particular query:

vsql=> select user_name, session_id, transaction_id, statement_id, request, request_duration_ms, start_timestamp from query_requests where transaction_id=< transaction id > and statement_id= < statement id >;

Or we can filter on particular request(s):

vsql=> select user_name, session_id, transaction_id, statement_id, request, request_duration_ms, start_timestamp from query_requests where request ilike '%my_table%' order by start_timestamp desc limit 10;



Corresponding queries for version 5.x:

vsql=> select ri.time as Start_Time, rc.time as End_Time,rc.time - ri.time as Query_Duration, ri.request as Query from dc_requests_issued ri, dc_requests_completed rc where ri.session_id = rc.session_id and ri.request_id = rc.request_id and ri.transaction_id=< transaction id > and ri.statement_id=< statement id >;

vsql=> select ri.time as Start_Time, rc.time as End_Time,rc.time - ri.time as Query_Duration, ri.request as Query from dc_requests_issued ri, dc_requests_completed rc where ri.session_id = rc.session_id and ri.request_id = rc.request_id and ri.request like '%my_table%' order by ri.time desc limit 10;



3. To see the resources acquired by running queries:

vsql=> select transaction_id, statement_id, thread_count, open_file_handle_count, memory_inuse_kb/1024 memory_inuse_mb from resource_acquisitions where is_executing='t' limit 10;



4. To see the run time of various execution engine operators:

vsql=> select operator_name, avg(counter_value) as Avg_Time_Per_Counter from dc_execution_engine_profiles where transaction_id= < transaction id > and statement_id= < statement id > and counter_name='execution time (us)' group by operator_name order by avg(counter_value) desc;

Once we have the run time values for all the operators, we can drill down a particular operator to see the individual counter values.

vsql=> select counter_name, avg(counter_value) as Avg_Counter_Value from dc_execution_engine_profiles where transaction_id= < transaction id > and statement_id= <statement id > and operator_name='Join' group by counter_name order by avg(counter_value) desc;



5. To see the delete row counts schema wise and projection wise:

vsql=> select schema_name, sum(total_row_count) AS Total_Row_Count, sum(deleted_row_count) AS Deleted_Row_Count FROM storage_containers group by schema_name order by sum(deleted_row_count) desc;

vsql=> select schema_name,projection_name,storage_type,sum(deleted_row_count) as Deleted_Row_Count  from delete_vectors group by schema_name,projection_name,storage_type order by rows desc;



6. To see all the locks in the database and the associated session.

vsql=> select object_name as Object, lock_mode as Lock_Mode, lock_scope as Lock_Scope, session_id, substring(current_statement, 1, 30) || '...' as Statement from v_monitor.sessions s join v_monitor.locks l on to_hex(s.transaction_id) = substring(l.transaction_description, 6, 14);



7. To find the ROS container count for projection:

vsql=> select projection_name, node_name, count(distinct storage_oid) as ROS_Container_Count from storage_containers where projection_name= < projection name > group by 1,2 order by 1,2;

Saturday 9 February 2013

Tuple Mover Mergeout, Replay Delete & Locks

Tuple Mover operation Mergeout is the process of consolidating ROS containers and purging deleted records. These ROS containers (called ROSs are actually files that are subset of Vertica's Read Optimized Store) are created by bulk loads and DML operations to the data stored within a projection. The Tuple Mover periodically merges ROS containers to maximize performance.

Mergeout works in two phases. First phase takes a U (Usage) lock that doesn't conflict with any other lock type request. Second phase takes a T (Tuple Mover) lock, which conflicts with transactions that require an X (Exclusive lock needed for delete operations) or O (Owner lock needed for Drop/Truncate) lock.
X & O locks are not allowed if the object already has a T lock (as a matter of fact O lock is not compatible with any lock and X lock is compatible only with U lock)

In such scenario, X & O lock requests would fail with an error like below:

ERROR: Locking failure: Timed out X locking Table:<<TABLE NAME>>. 
T held by [user <user name> (Mergeout: (Table: <<SCHEMA NAME.TABLE NAME>>) 
(Projection: <<PROJECTION NAME>>))]. Your current transaction isolation level is SERIALIZABLE

One common reason for an extended T lock duration is when Mergeout is doing the Replay Delete operation. Replay delete is the process of reconciling deleted records to determine if they should or shouldn't be included in to the new ROS containers (resulting from merging). This reconciliation is based on configuration parameter "HistoryRetentionTime", and it determines how long deleted data is saved (in seconds) as a historical reference. TM purges the records deleted prior to HistoryRetentionTime and records post this time are reorganized in new ROS containers. Replay delete is a rather slow operation in Vertica, since Vertica primarily is optimized for query intensive workloads.

But if a TM Mergeout is stuck in replay delete, you either have the option to wait for it to complete, or you will have to stop it to allow other operations to complete. To cancel a mergeout, we need to take the following steps:

1. Query table TUPLE_MOVER_OPERATIONS to get the associated SESSION_ID:

vsql=> select SESSION_ID from TUPLE_MOVER_OPERATIONS where IS_EXECUTING = True and PLAN_TYPE = 'Replay Delete';

2. Once we have the session id, we can use function CLOSE_SESSION to close the session and hence mergeout. e.g.

vsql=> select CLOSE_SESSION('myhost-12990:0x51c31');

(PS: Long running Replay delete can be avoided if we advance the AHM (function MAKE_AHM_NOW()) after any bulk deletes, this will make sure that Tuple Mover purges the deleted records rather than having to reorganize it.)