name: hologres-slow-query-analysis description: | Hologres slow query log analysis and diagnosis skill. Use for analyzing slow queries, failed queries, query performance diagnosis, and log management in Alibaba Cloud Hologres. Triggers: "hologres slow query", "hg_query_log", "query diagnosis", "慢Query分析", "Hologres性能诊断"
Prerequisites
This skill requires hologres-cli to be installed first:
pip install hologres-cli
All SQL execution and GUC parameter operations depend on hologres-cli commands (hologres sql run, hologres guc set).
Hologres Slow Query Analysis
Diagnose and analyze slow/failed queries in Alibaba Cloud Hologres using the hologres.hg_query_log system table.
Version Requirements
| Hologres Version | Feature |
|---|---|
| V0.10+ | Basic slow query log |
| V2.2+ | SQL fingerprint (digest) |
| V2.2.7+ | Default threshold 100ms |
| V3.0.2+ | Aggregated records for <100ms queries |
Quick Start
1. Check Permissions
-- Superuser: view all DB logs
ALTER USER "cloud_account_id" SUPERUSER;
-- Or join pg_read_all_stats group
GRANT pg_read_all_stats TO "cloud_account_id";
-- For current DB only (SPM model)
CALL spm_grant('<db_name>_admin', 'cloud_account_id');
2. Basic Query Count
SELECT count(*) FROM hologres.hg_query_log;
3. Recent Slow Queries (10 min)
SELECT status AS "Status",
duration AS "Duration(ms)",
query_start AS "Start Time",
(read_bytes/1048576)::text || ' MB' AS "Read",
(memory_bytes/1048576)::text || ' MB' AS "Memory",
(cpu_time_ms/1000)::text || ' s' AS "CPU",
query_id AS "QueryID",
query::char(50) AS "Query"
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '10 min'
ORDER BY duration DESC
LIMIT 100;
Core Diagnostic Workflows
Workflow 1: Find Resource-Heavy Queries
Use when CPU/memory usage is high.
-- Top 10 CPU-consuming queries (past day)
SELECT digest, avg(cpu_time_ms), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= CURRENT_DATE - INTERVAL '1 day'
AND digest IS NOT NULL AND usename != 'system'
GROUP BY 1 ORDER BY 3 DESC LIMIT 10;
Workflow 2: Find Failed Queries
SELECT status, message::char(100), duration, query_start, query_id, query::char(80)
FROM hologres.hg_query_log
WHERE query_start BETWEEN '2024-01-01 00:00:00'::timestamptz
AND '2024-01-01 01:00:00'::timestamptz
AND status = 'FAILED'
ORDER BY query_start ASC LIMIT 100;
Workflow 3: Query Phase Analysis
Identify bottleneck phase (optimization/startup/execution).
SELECT status, duration AS "Total(ms)",
optimization_cost AS "Optimize(ms)",
start_query_cost AS "Startup(ms)",
get_next_cost AS "Execute(ms)",
duration - optimization_cost - start_query_cost - get_next_cost AS "Other(ms)",
query_id, query::char(50)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '10 min'
ORDER BY duration DESC LIMIT 100;
Workflow 4: Compare with Yesterday
SELECT query_date, count(1), sum(read_bytes), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '3 h'
GROUP BY query_date
UNION ALL
SELECT query_date, count(1), sum(read_bytes), sum(cpu_time_ms)
FROM hologres.hg_query_log
WHERE query_start >= now() - interval '1d 3h' AND query_start <= now() - interval '1d'
GROUP BY query_date;
Key Fields Reference
| Field | Description |
|---|---|
query_id | Unique query identifier |
digest | SQL fingerprint (MD5 hash) |
duration | Total query time (ms) |
cpu_time_ms | CPU time consumed |
memory_bytes | Peak memory usage |
read_bytes | Data read volume |
engine_type | Query engine (HQE/PQE/SDK/PG) |
optimization_cost | Plan generation time |
start_query_cost | Query startup time |
get_next_cost | Execution time |
Configuration
-- Set slow query threshold (DB level, superuser only)
ALTER DATABASE dbname SET log_min_duration_statement = '250ms';
-- Session level
SET log_min_duration_statement = '250ms';
-- Set log retention (V3.0.27+, 3-30 days)
ALTER DATABASE dbname SET hg_query_log_retention_time_sec = 2592000;
Or use the CLI for database-level settings:
hologres guc set log_min_duration_statement '250ms'
hologres guc set hg_query_log_retention_time_sec 2592000
References
| Document | Content |
|---|---|
| diagnostic-queries.md | Complete diagnostic SQL collection |
| log-export.md | Export logs to internal/external tables |
| configuration.md | Configuration parameters |
Best Practices
- Always filter by
query_startfor better performance - Use
digestto group similar queries for pattern analysis - Check
engine_type- PQE queries may need optimization - For
start_query_costhigh: check locks or resource contention - For
get_next_costhigh: optimize SQL or add indexes - Regular cleanup: set appropriate retention period