name: grafana-tst
description: >
  Grafana test environment specialist for metrics, logs, traces, dashboards, and alerting. Use when asked to: query Prometheus metrics, search Loki logs, analyze Tempo traces, manage dashboards, configure alert rules, or investigate issues in the test environment. Trigger keywords: grafana-tst. Do NOT use for production environment (use grafana-prd instead).
compatibility: Requires grafana-tst agent (.claude/agents/grafana-tst.agent.md) and grafana-tst MCP server.
Grafana Test Environment Skill
Agent Delegation
Delegate ALL operations to the grafana-tst agent.
"Use the grafana-tst agent to complete this task."
The grafana-tst agent provisions the grafana-tst MCP server and has the required
tools (list_datasources, query_prometheus, query_loki_logs, search_traces, etc.).
Description
Access the Grafana test environment for observability data. Covers metrics (Prometheus), logs (Loki), traces (Tempo), profiling (Pyroscope), incident management, OnCall schedules, Sift investigations, alerting, and dashboard management. Use for development, testing, and staging workflows.
Capabilities
Metrics (Prometheus)
- Query PromQL against test Prometheus datasources
- List metric names, label names, label values, and metadata
- Query histograms and percentiles
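As an illustration of a percentile query over a histogram, the sketch below builds a PromQL expression around histogram_quantile. The metric name http_request_duration_seconds_bucket and the service label are hypothetical; real names should come from the metric and label listings in the test environment.

```python
def percentile_query(metric_bucket: str, quantile: float, service: str, window: str = "5m") -> str:
    """Build a PromQL histogram_quantile expression for one service.

    metric_bucket must be the *_bucket series of a Prometheus histogram.
    The "service" label name is an assumption made for this example.
    """
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    selector = f'{metric_bucket}{{service="{service}"}}'
    # Aggregate per-bucket rates by the "le" label before taking the quantile.
    return f"histogram_quantile({quantile}, sum by (le) (rate({selector}[{window}])))"


# Hypothetical metric and service names, for illustration only.
print(percentile_query("http_request_duration_seconds_bucket", 0.95, "test-api"))
```

The resulting string could then be handed to the query_prometheus tool along with the resolved datasource UID.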
Logs (Loki)
- Execute LogQL queries for test log retrieval
- List log label names and values
- Query log patterns and statistics
- Stream log data with filtering and parsing
Traces (Tempo)
- Find slow requests in test environment
- Analyze distributed traces
Profiling (Pyroscope)
- Fetch CPU, memory, goroutine profiles for test services
- List profile types and label values per service
Incidents & OnCall
- Create, list, and get incident details
- Add timeline notes to incidents
- List OnCall schedules, teams, users
- Get current on-call users for a schedule
Sift Investigations
- Create and retrieve automated investigations
- Find error patterns in logs
- Find slow requests across services
Alerting
- List, get, create, update, delete alert rules
- List contact points and notification policies
- List alert groups from OnCall
Dashboards
- Search, get, create, update dashboards
- Get panel queries and dashboard summaries
- Generate deeplinks to dashboards or panels
- Render panel/dashboard images (PNG)
- Create and manage annotations
Datasources
- List, get datasources by name or UID
Key Concepts
- Datasources: Named connections to backends (Prometheus, Loki, Tempo, Pyroscope). Always resolve UIDs via list_datasources before querying.
- Prometheus / PromQL: Metrics backend. Use list_prometheus_metric_names before writing PromQL to ensure valid metric names.
- Loki / LogQL: Log aggregation backend. Always call list_loki_label_names / list_loki_label_values before building LogQL queries.
- Tempo: Distributed tracing backend. Use to find slow spans and trace request paths across services.
- Dashboards: JSON-defined panels; use get_dashboard_summary for large dashboards to avoid token limits.
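To show why label discovery comes before query building, here is a minimal sketch that assembles a LogQL selector only from labels already known to exist. The helper build_logql and the label names app and namespace are hypothetical; the known label set stands in for whatever list_loki_label_names returns.

```python
from typing import Dict, Optional, Set


def build_logql(labels: Dict[str, str], known_labels: Set[str], contains: Optional[str] = None) -> str:
    """Build a LogQL stream selector, rejecting labels that were not discovered."""
    unknown = sorted(set(labels) - known_labels)
    if unknown:
        raise ValueError(f"unknown Loki labels: {unknown}")
    matchers = ", ".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    query = "{" + matchers + "}"
    if contains:
        # |= is the LogQL line filter for "line contains this text".
        query += f' |= "{contains}"'
    return query


# known_labels is assumed to come from a prior label listing call.
print(build_logql({"app": "auth"}, {"app", "namespace"}, contains="error"))
```

Failing early on an unknown label avoids sending a query that silently matches no streams.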
Workflow
- Connectivity check: call list_datasources (with limit=1) as a lightweight probe. If it fails, stop and report: "grafana-tst MCP server is unavailable. Cannot proceed."
- Discover datasources: resolve correct datasource UIDs using list_datasources.
- Discover schema: list metric names, label names/values, or log labels before querying.
- Query or manage: use query_prometheus, query_loki_logs, search_traces, dashboard/alert tools as appropriate.
- Report: structure the response as: datasource → time range → results summary → anomalies/patterns → recommended actions.
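The workflow steps can be sketched roughly as follows. This is not the agent's actual implementation: call_tool is a hypothetical callable standing in for an MCP tool invocation, and the argument names (limit, datasourceUid, expr) and response shapes are assumptions made for the example.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> tool result.
ToolCall = Callable[[str, Dict[str, Any]], Any]

UNAVAILABLE = "grafana-tst MCP server is unavailable. Cannot proceed."


def run_workflow(call_tool: ToolCall, promql: str, time_range: str) -> Dict[str, Any]:
    # 1. Connectivity check: a lightweight probe; stop on any failure.
    try:
        call_tool("list_datasources", {"limit": 1})
    except Exception:
        return {"error": UNAVAILABLE}
    # 2. Discover datasources and resolve a Prometheus UID.
    datasources = call_tool("list_datasources", {})
    prom = next((d for d in datasources if d.get("type") == "prometheus"), None)
    if prom is None:
        return {"error": "No Prometheus datasource found."}
    # 3. Discover schema before writing or running PromQL.
    names = call_tool("list_prometheus_metric_names", {"datasourceUid": prom["uid"]})
    # 4. Query.
    results = call_tool("query_prometheus", {"datasourceUid": prom["uid"], "expr": promql})
    # 5. Report in the prescribed order; anomalies and actions are left for analysis.
    return {
        "datasource": prom["uid"],
        "time_range": time_range,
        "results_summary": f"{len(results)} series from {len(names)} known metrics",
        "anomalies": [],
        "recommended_actions": [],
    }


def fake_call_tool(name: str, args: Dict[str, Any]) -> Any:
    """Stand-in for the MCP server so the sketch runs without Grafana."""
    if name == "list_datasources":
        return [{"uid": "prom-tst", "type": "prometheus"}]
    if name == "list_prometheus_metric_names":
        return ["up"]
    if name == "query_prometheus":
        return [{"metric": {"job": "test-api"}, "value": 1}]
    raise KeyError(name)


print(run_workflow(fake_call_tool, "up", "last 1h"))
```

Passing the tool caller in as a parameter keeps the sketch testable with a stub and makes the stop-on-failure behavior of the connectivity check explicit.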
Usage Examples
Test Metrics
grafana-tst Show CPU usage for test-api service in test environment
Log Investigation
grafana-tst Search Loki logs for errors in auth service last 1 hour
Dashboard Editing
grafana-tst Add a new panel to the test environment overview dashboard
Alert Testing
grafana-tst Create a test alert rule for high memory usage
Trace Analysis
grafana-tst Find slow requests for checkout service in test environment
Best Practices
- Always call list_datasources first to get datasource UIDs before any query
- Always call list_prometheus_metric_names before writing PromQL
- Always call list_loki_label_names / list_loki_label_values before writing LogQL
- Use query_loki_stats to check log volume before fetching raw entries
- Use get_dashboard_summary instead of get_dashboard_by_uid for large dashboards
- Do NOT search the filesystem for mcp-config.json; always route through the agent file
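The volume check before fetching raw log entries could look something like the sketch below. The helper plan_log_fetch, the 5000-entry threshold, and the shape of the stats dictionary (an entries count alongside streams) are assumptions for illustration, not the documented output of query_loki_stats.

```python
from typing import Any, Dict


def plan_log_fetch(stats: Dict[str, Any], max_entries: int = 5000) -> str:
    """Return "fetch" when the matched volume is small enough, else "narrow".

    stats is assumed to carry an "entries" count for the selector and time range.
    """
    entries = int(stats.get("entries", 0))
    return "fetch" if entries <= max_entries else "narrow"


# Hypothetical stats payloads.
print(plan_log_fetch({"streams": 3, "entries": 1200}))
print(plan_log_fetch({"streams": 40, "entries": 900000}))
```

A "narrow" outcome would suggest tightening the label selector or shortening the time range before calling query_loki_logs.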
Limitations
- Test environment only — data does not reflect production
- Different data retention policies than production
- Write operations (create/update alerts, incidents) require appropriate permissions
- Image rendering requires Grafana Image Renderer service
Environment Isolation
CRITICAL: Never use grafana-tst and grafana-prd in the same request. Choose only one environment per query.
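One way to enforce this rule is a small guard that inspects the request text before any delegation. The sketch below is an assumption about how such a check might be written; select_environment is a hypothetical helper, not part of either agent.

```python
import re

ENVIRONMENTS = ("grafana-tst", "grafana-prd")


def select_environment(request: str) -> str:
    """Return the single environment named in the request, or raise ValueError."""
    mentioned = [
        env for env in ENVIRONMENTS
        if re.search(rf"\b{re.escape(env)}\b", request)
    ]
    if len(mentioned) != 1:
        raise ValueError(
            f"expected exactly one environment per request, found: {mentioned}"
        )
    return mentioned[0]


print(select_environment("grafana-tst Show CPU usage for test-api service"))
```

A request that names both environments, or neither, raises instead of guessing, which keeps test and production data from being mixed in one response.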