Friday, November 17, 2017

Teradata tricks

select * from dbc.SessionInfo;

SELECT   SYSLIB. AbortSessions(1 ,'dbc', 0,'Y' ,'Y') ; -- All sessions for a user
SELECT   SYSLIB. AbortSessions(1 ,'DBC', 123340,'Y' ,'Y') ; -- Specific session

Param1: Hostno
Param2:UserName
Param3: SessionNo
Param4: LogoffSessions
Param5: UserOverride

SELECT * FROM TABLE (MonitorSession (1, '*',0 )) AS dt WHERE PEstate <> 'Idle' or AMPstate <> 'Idle' ;

---
SELECT      TOP 10 username , clientAddr, defaultDatabase , CacheFlag ,
collectTimeStamp ,
StartTime, ElapsedTime , NumResultRows, AMPCPUTime, ERRORCODE ,
StatementType , QueryText , NumOfActiveAmps, SpoolUsage , ReqIoKB ,
ReqPhysIO
FROM dbc. qrylog
WHERE     defaultDatabase LIKE 'TPCDS%'
ORDER    BY StartTime DESC;

BEGIN QUERY LOGGING
WITH      ALL LIMIT SQLTEXT= 0
    ON   ALL;

SELECT TOP 10 *
FROM dbc. QryLogTDWM
ORDER BY collectTimeStamp DESC

SELECT TOP 100 *
FROM dbc. QryLogV
WHERE StatementType IN ('Insert' ,'Update', 'Delete')
ORDER BY collectTimeStamp DESC

SELECT TOP 100 *
FROM dbc. QryLogEvents
ORDER BY collectTimeStamp DESC
---
$ sudo pdestate -a
root's password:
PDE state: DOWN/TDMAINT

$ sudo /etc/init.d/tpa start
Teradata Database Initiator service is starting...
Teradata Database Initiator service started successfully.

$ sudo pdestate -a
PDE state is RUN/STARTED.
DBS state is 1/3: DBS Startup - Starting AMP Partitions

$ sudo pdestate -a
PDE state is RUN/STARTED.
DBS state is 4: Logons are enabled - Users are logged on
---

COLLECT STATISTICS ON tablename COLUMN columnname;
COLLECT STATISTICS ON tablename INDEX ( columnname);
COLLECT STATISTICS ON tablename INDEX ( col1, col2 , ...);
HELP STATISTICS tablename;
COLLECT STATISTICS tablename; -- refresh table statistics
DROP STATISTICS ON tablename ;

DIAGNOSTIC HELPSTATS ON FOR SESSION;

Then run query EXPLAIN. Optimizer will return something like the following at the end of the query plan:

BEGIN RECOMMENDED STATS FOR FINAL PLAN->
     -- "COLLECT STATISTICS COLUMN (I_ITEM_ID) ON TPCDS1000G.item"
     (High Confidence)
     -- "COLLECT STATISTICS COLUMN (D_DATE) ON TPCDS1000G.date_dim"
     (High Confidence)
     -- "COLLECT STATISTICS COLUMN (S_STORE_ID) ON TPCDS1000G.store"
     (High Confidence)
     TPCDS1000G.store_sales" (High Confidence)
     -- "COLLECT STATISTICS COLUMN (P_PROMO_SK) ON TPCDS1000G.promotion"
     (High Confidence)
     -- "COLLECT STATISTICS COLUMN (D_DATE_SK) ON TPCDS1000G.date_dim"
     (High Confidence)
     <- END RECOMMENDED STATS FOR FINAL PLAN
     BEGIN RECOMMENDED STATS FOR OTHER PLANS ->
     -- "COLLECT STATISTICS COLUMN (PARTITION) ON
     TPCDS1000G.store_sales" (High Confidence)
     <- END RECOMMENDED STATS FOR OTHER PLANS


Collect Full Statistics
  • Non-indexed columns used in predicates
  • All NUSIs with an uneven distribution of values * 
  • NUSIs used in join steps
  • USIs/UPIs if used in non-equality predicates (range constraints)
  • Most NUPIs (see below for a fuller discussion of NUPI statistic collection)
  • Full statistics always need to be collected on relevant columns and indexes on small tables (less than 100 rows per AMP)


Can Rely on Random AMP Sampling   

  • USIs or UPIs if only used with equality predicates
  • NUSIs with an even distribution of values
  • NUPIs that display even distribution, and if used for joining, conform to assumed uniqueness (see Point #2 under “Other Considerations” below)
  • See “Other Considerations” for additional points related to random AMP sampling


Option to use USING SAMPLE

  • Unique index columns
  • Nearly-unique columns or indexes**


Collect Multicolumn Statistics

  • Groups of columns that often appear together in conditions with equality predicates, if the first 16 bytes of the concatenated column values are sufficiently distinct. These statistics will be used for single-table estimates.
  • Groups of columns used for joins or aggregations, where there is either a dependency or some degree of correlation among them.***  With no multicolumn statistics collected, the optimizer assumes complete independence among the column values. The more that the combination of actual values are correlated, the greater the value of collecting multicolumn statistics will be.

No comments: