---
name: carbon.data.qa
description: Answer analytical questions about carbon accounting data using internal datasets, APIs, and emission factor calculations.
---

# carbon.data.qa
## Purpose
This skill enables Claude to answer factual, analytical questions about carbon accounting data by querying Carbon ACX's internal datasets (CSV files in data/ directory), derived artifacts, and the local API when running. It encodes domain knowledge about:
- Carbon accounting terminology and units (tCO2e, kWh, pkm, etc.)
- Emission factor structures and relationships
- Activity-to-emissions calculations
- Temporal data queries (Q1 2024, monthly totals, etc.)
- Layer, sector, and profile hierarchies
## When to Use
**Trigger Patterns:**
- User asks about emissions data: "What were total CO2 emissions for Q1 2024?"
- Queries about specific activities: "What's the emission factor for streaming video?"
- Comparative questions: "Compare emissions from cloud storage vs local storage"
- Data exploration: "Show me all activities in the professional services layer"
- Unit conversions: "Convert 500 kWh to tCO2e"
- Source/provenance queries: "Where does the video streaming data come from?"
**Do NOT Use When:**
- User wants to generate reports (use `carbon.report.gen` instead)
- User wants to write code (use `acx.code.assistant` instead)
- Questions about repo structure or development setup
- Non-carbon-accounting questions
## Allowed Tools
- `read_file` - Read CSV data files, JSON artifacts, and schemas
- `python` - Process data, perform calculations, query APIs
- `grep` - Search for specific activities or emission factors
- `bash` - Run simple data queries via the command line (read-only)
**Access Level:** 1 (Local Execution - read-only, no file writes, no external network)
**Tool Rationale:**
- `read_file`: Required to access canonical CSV data in the `data/` directory
- `python`: Needed for parsing CSVs and JSON artifacts, and for performing unit conversions and emission calculations
- `grep`: Efficient searching through data files for specific patterns
- `bash`: Helpful for quick file inspection and data exploration
**Explicitly Denied:**
- `write_file`, `edit_file` - This is a read-only analytical skill
- `web_fetch` with external URLs - Only internal localhost API endpoints are allowed
## Expected I/O
**Input:**
- Type: Natural language question (string)
- Format: Free-form query about carbon data
- Constraints: Must relate to carbon accounting, emissions, or activities in the dataset
- Examples:
- "What is the emission factor for coffee?"
- "Total emissions from video streaming in 2024"
- "List all military operations activities"
- "What units are used for grid intensity?"
**Output:**
- Type: Structured answer with data, units, and citations
- Format: Markdown with tables, bullet lists, and inline values
- Requirements:
  - MUST include units (tCO2e, kWh, etc.) with all numeric answers
  - MUST cite data sources - reference `source_id` from `data/sources.csv`
  - MUST include a timestamp - data vintage or "as of" date
  - Handle ambiguity by asking clarifying questions
- Example:

  **Emission Factor for HD Video Streaming:**
  - Activity: `MEDIA.STREAM.HD.HOUR` (HD video streaming per hour)
  - Emission Factor: 0.055 kgCO2e/hour
  - Unit: kgCO2e per hour of streaming
  - Source: [SOURCE_ID_123] - "Streaming Energy Report 2023"
  - Vintage: 2023
  - Notes: Includes device playback + network delivery
**Validation:**
- Every numeric value has explicit units
- Sources are referenced by `source_id`
- "Unknown" or "Data not available" for missing data (never guess)
- Calculations show methodology
## Dependencies
**Required:**
- Access to the Carbon ACX data directory (`data/`)
- Python 3.11+ with pandas and PyYAML
- Understanding of the data schema (see `reference/data_schema.md`)
- Carbon accounting units glossary (see `reference/units_glossary.md`)
**Data Files:**
- `data/activities.csv` - Activity catalog
- `data/emission_factors.csv` - Emission factors
- `data/layers.csv` - Layer definitions
- `data/sectors.csv` - Sector taxonomy
- `data/units.csv` - Unit definitions and conversions
- `data/sources.csv` - Data provenance
- `data/profiles.csv` - Activity profiles
- `calc/outputs/` - Derived artifacts (if available)
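When several of these files are in play, a quick sanity check is to confirm they join cleanly. A minimal sketch using the stdlib `csv` module, with hypothetical in-memory stand-ins (the column names are assumptions; the real schema lives in `reference/data_schema.md`):

```python
import csv
import io

# Hypothetical miniature stand-ins for two of the data/ CSVs; the
# column names are assumptions, not the canonical schema.
activities_csv = io.StringIO(
    "activity_id,name\n"
    "FOOD.COFFEE.CUP.HOT,12 oz hot coffee cup\n"
)
factors_csv = io.StringIO(
    "activity_id,value,unit,source_id\n"
    "FOOD.COFFEE.CUP.HOT,0.021,kgCO2e/cup,SRC_FOOD_2023\n"
)

activity_ids = {row["activity_id"] for row in csv.DictReader(activities_csv)}

# Emission factors whose activity_id is missing from the activity catalog.
orphans = [row["activity_id"] for row in csv.DictReader(factors_csv)
           if row["activity_id"] not in activity_ids]
```

An empty `orphans` list means every emission factor maps back to a cataloged activity; anything else should be reported explicitly rather than silently dropped.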
**Optional:**
- Local API at `http://localhost:8787/api` (when the Worker is running)
- Derived JSON manifests in `dist/artifacts/`
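When the Worker is running, queries can go through the local API instead of raw CSV reads. The base URL comes from this spec, but the `/activities` route and `id` query parameter below are illustrative assumptions, not documented endpoints:

```python
from urllib.parse import urlencode

API_BASE = "http://localhost:8787/api"  # local Worker API (from this spec)

def activity_url(activity_id: str) -> str:
    """Build a localhost-only query URL. The /activities route and the
    `id` parameter are hypothetical, shown only for illustration."""
    return f"{API_BASE}/activities?" + urlencode({"id": activity_id})

url = activity_url("FOOD.COFFEE.CUP.HOT")
```

Keeping URL construction in one helper makes it easy to enforce the localhost-only boundary from the Security Boundaries section.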
## Examples
### Example 1: Basic Emission Factor Query
**User:** "What's the emission factor for a cup of coffee?"
**Claude Process:**
- Recognize this is a carbon data query → invoke the `carbon.data.qa` skill
- Search `data/activities.csv` for coffee-related activities
- Find the matching emission factor in `data/emission_factors.csv`
- Look up source details in `data/sources.csv`
- Format the answer with units and a citation
**Output:**
**Coffee Emission Factor:**
- Activity: `FOOD.COFFEE.CUP.HOT` (12 oz hot coffee cup)
- Emission Factor: 0.021 kgCO2e/cup
- Source: [SRC_FOOD_2023] Coffee supply chain LCA
- Vintage: 2023
- Scope: Includes bean production, roasting, brewing, waste
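The lookup chain above can be sketched as follows; the miniature CSV contents and column names are hypothetical stand-ins for the real files:

```python
import csv
import io

# Hypothetical miniature versions of the three files used in this example.
activities = {r["activity_id"]: r for r in csv.DictReader(io.StringIO(
    "activity_id,name\nFOOD.COFFEE.CUP.HOT,12 oz hot coffee cup\n"))}
factors = {r["activity_id"]: r for r in csv.DictReader(io.StringIO(
    "activity_id,value,unit,source_id,vintage\n"
    "FOOD.COFFEE.CUP.HOT,0.021,kgCO2e/cup,SRC_FOOD_2023,2023\n"))}
sources = {r["source_id"]: r for r in csv.DictReader(io.StringIO(
    "source_id,title\nSRC_FOOD_2023,Coffee supply chain LCA\n"))}

def lookup(activity_id: str) -> dict:
    """Join activity -> emission factor -> source, keeping units and citation."""
    factor = factors[activity_id]
    return {
        "activity": activities[activity_id]["name"],
        "value": float(factor["value"]),
        "unit": factor["unit"],
        "source": sources[factor["source_id"]]["title"],
        "vintage": factor["vintage"],
    }

answer = lookup("FOOD.COFFEE.CUP.HOT")
```

Every field the output template needs (value, unit, source, vintage) comes out of the join, so the formatting step never has to invent anything.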
### Example 2: Aggregation Query
**User:** "Show me total emissions from all social media activities"
**Claude Process:**
- Query `data/activities.csv` for activities with `category=social`
- Load the corresponding emission factors
- Aggregate by activity type
- Present as a table with sources
**Output:**
**Social Media Activity Emissions (per hour):**
| Activity | Emission Factor | Source |
|----------|----------------|--------|
| YouTube streaming | 0.046 kgCO2e/hr | SRC_MEDIA_2024 |
| Facebook browsing | 0.038 kgCO2e/hr | SRC_SOCIAL_2024 |
| Instagram scrolling | 0.041 kgCO2e/hr | SRC_SOCIAL_2024 |
| TikTok watching | 0.043 kgCO2e/hr | SRC_SOCIAL_2024 |
*All values include device use + network transfer*
*Data vintage: 2024*
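The filter-then-join behind this table reduces to a few lines; the `category=social` filter comes from the process above, while the miniature data and its values are hypothetical:

```python
import csv
import io

# Hypothetical miniature dataset; real columns live in reference/data_schema.md.
activities_csv = io.StringIO(
    "activity_id,name,category\n"
    "MEDIA.SOCIAL.FB.HOUR,Facebook browsing,social\n"
    "MEDIA.SOCIAL.IG.HOUR,Instagram scrolling,social\n"
    "FOOD.COFFEE.CUP.HOT,12 oz hot coffee cup,food\n"
)
factors_csv = io.StringIO(
    "activity_id,value,unit,source_id\n"
    "MEDIA.SOCIAL.FB.HOUR,0.038,kgCO2e/hr,SRC_SOCIAL_2024\n"
    "MEDIA.SOCIAL.IG.HOUR,0.041,kgCO2e/hr,SRC_SOCIAL_2024\n"
    "FOOD.COFFEE.CUP.HOT,0.021,kgCO2e/cup,SRC_FOOD_2023\n"
)

factors = {r["activity_id"]: r for r in csv.DictReader(factors_csv)}

# One table row per matching activity: name, factor, unit, source.
rows = [
    (a["name"], float(factors[a["activity_id"]]["value"]),
     factors[a["activity_id"]]["unit"], factors[a["activity_id"]]["source_id"])
    for a in csv.DictReader(activities_csv)
    if a["category"] == "social"
]
```

Each tuple carries the unit and `source_id` alongside the value, so the rendered table can satisfy the citation requirements without a second pass.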
### Example 3: Ambiguous Query Handling
**User:** "What are the emissions for streaming?"
**Claude Process:**
- Recognize ambiguity - multiple streaming types exist
- Ask a clarifying question before searching
**Output:**
I found multiple streaming activities in the dataset:
1. **Video streaming:**
- HD video (1080p)
- UHD/4K video
- SD video (mobile)
2. **Audio streaming:**
- Standard quality music
3. **Platform-specific:**
- YouTube
- Social media video (TikTok, Instagram, etc.)
Which type of streaming would you like emission data for?
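The ambiguity check itself is a simple match count over the catalog; the activity subset below is hypothetical:

```python
# Hypothetical subset of data/activities.csv.
catalog = {
    "MEDIA.STREAM.HD.HOUR": "HD video (1080p)",
    "MEDIA.STREAM.UHD.HOUR": "UHD/4K video",
    "MEDIA.STREAM.AUDIO.HOUR": "Standard quality music",
}

def resolve(query: str):
    """Return a single activity id, or the candidate list when ambiguous."""
    q = query.lower()
    matches = [aid for aid, name in catalog.items()
               if q in aid.lower() or q in name.lower()]
    if len(matches) == 1:
        return matches[0]
    return {"clarify": matches}  # ask the user to pick before answering

ambiguous = resolve("stream")  # several candidates -> clarification needed
specific = resolve("4K")       # one candidate -> answer directly
```

Returning the candidate list (instead of picking one) is what lets the skill present the numbered clarification menu shown above.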
### Example 4: Unit Conversion Query
**User:** "How much CO2 does 500 kWh of Ontario grid electricity produce?"
**Claude Process:**
- Look up the Ontario grid emission factor in `data/emission_factors.csv`
- Find activity `ENERGY.CA-ON.GRID.KWH`
- Perform the calculation: 500 kWh × emission factor
- Show the calculation with units
**Output:**
**Ontario Grid Electricity Emissions:**
- Activity: `ENERGY.CA-ON.GRID.KWH`
- Consumption: 500 kWh
- Emission Factor: 0.025 kgCO2e/kWh (Ontario IESO grid average)
- **Total Emissions: 12.5 kgCO2e** (0.0125 tCO2e)
*Calculation: 500 kWh × 0.025 kgCO2e/kWh = 12.5 kgCO2e*
*Source: [SRC_GRID_ON_2024]*
*Vintage: 2024 annual average*
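The calculation itself, with the example's factor and the kg-to-tonne conversion made explicit:

```python
ONTARIO_FACTOR = 0.025  # kgCO2e/kWh, taken from the example above

def grid_emissions_kg(kwh: float, factor_kg_per_kwh: float) -> float:
    """consumption (kWh) x factor (kgCO2e/kWh) -> kgCO2e."""
    return kwh * factor_kg_per_kwh

kg = grid_emissions_kg(500, ONTARIO_FACTOR)
tonnes = kg / 1000.0  # 1 tCO2e = 1000 kgCO2e
```

Keeping the unit algebra in the function docstring mirrors the "calculations show methodology" requirement: kWh × kgCO2e/kWh leaves kgCO2e.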
### Example 5: Edge Case - Missing Data
**User:** "What's the emission factor for blockchain mining?"
**Claude Process:**
- Search `data/activities.csv` for blockchain/crypto activities
- No matches found
- Return a graceful "not found" response
**Output:**
**Data Not Available**
I could not find emission factors for blockchain or cryptocurrency mining in the current Carbon ACX dataset (v1.2).
The dataset currently focuses on:
- Professional services and food
- Digital infrastructure (streaming, cloud, AI)
- Industrial operations
- Transportation and logistics
- Military and defense operations
- Earth system feedbacks
You might be interested in related activities:
- Cloud server operations (`ONLINE.DC.CLOUD.SERVER.HOUR`)
- Data center rack usage (`ONLINE.DC.COLOCATION.RACK.MONTH`)
Would you like information on any of these instead?
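The graceful-miss behavior amounts to returning an explicit sentinel instead of a guess. In this sketch, the two activity ids mirror the suggestions above, but their factor values are made up for illustration:

```python
# Hypothetical factors table; the numeric values are illustrative only,
# not taken from the dataset.
factors = {
    "ONLINE.DC.CLOUD.SERVER.HOUR": 0.9,
    "ONLINE.DC.COLOCATION.RACK.MONTH": 250.0,
}

def lookup(activity_id: str) -> dict:
    """Never fabricate a number: misses return an explicit marker."""
    if activity_id in factors:
        return {"status": "ok", "value": factors[activity_id]}
    return {"status": "Data not available", "query": activity_id}

miss = lookup("CRYPTO.MINING.HOUR")
```

The caller can branch on `status` to produce the "Data Not Available" response with related-activity suggestions, rather than ever surfacing an invented value.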
## Limitations
**Known Edge Cases:**
- Cannot answer questions requiring data not in the CSV files
- Temporal queries limited to vintage years present in dataset
- Cannot perform predictive modeling or forecasting
- Regional data limited to what's explicitly coded (e.g., Ontario grid)
- Some activities have emission factors marked as "to be added"
**Performance Constraints:**
- Large aggregations across all activities may take 5-10 seconds
- Complex cross-layer queries require multiple file reads
- Derived artifacts may not always be up-to-date with source CSVs
**Security Boundaries:**
- Read-only access to data files
- No external API calls (except localhost Worker API)
- Cannot modify source data
- Cannot access files outside the `data/` or `calc/outputs/` directories
**Scope Limitations:**
- Answers based solely on Carbon ACX dataset - no external knowledge
- Does not perform lifecycle assessments beyond what's in emission factors
- Does not provide regulatory compliance advice
- Does not make emission reduction recommendations (analytical only)
## Validation Criteria
**Success Metrics:**
- ✅ All numeric answers include explicit units (kgCO2e, tCO2e, etc.)
- ✅ Every emission factor cites a `source_id` or notes when the source is missing
- ✅ Data vintage/timestamp included in responses
- ✅ Ambiguous queries prompt for clarification before answering
- ✅ Missing data returns graceful "not found" rather than guessing
- ✅ Calculations show methodology (formula with units)
- ✅ Responses match data files exactly (no hallucination)
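The first metric can be enforced mechanically. A rough sketch of such a check, where the unit list is a small sample rather than the full glossary in `reference/units_glossary.md`:

```python
import re

# Sample of recognized units; not the full glossary.
KNOWN_UNITS = ("kgCO2e", "tCO2e", "kWh", "pkm")

def numbers_have_units(answer: str) -> bool:
    """Every standalone number must be followed closely by a known unit.
    The lookbehind skips digits embedded in words like 'CO2e'."""
    for match in re.finditer(r"(?<![A-Za-z])\d[\d.,]*", answer):
        tail = answer[match.end():match.end() + 12]
        if not any(unit in tail for unit in KNOWN_UNITS):
            return False
    return True

ok = numbers_have_units("Total Emissions: 12.5 kgCO2e")
bad = numbers_have_units("Total Emissions: 12.5")
```

A check like this would flag the first failure mode below (values without units) before a response is emitted.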
**Failure Modes:**
- ❌ Returns emission values without units → REJECT
- ❌ Makes up data not in CSV files → REJECT
- ❌ Provides answers without source attribution → WARN
- ❌ Performs calculations with wrong units → REJECT
- ❌ Answers ambiguous questions without clarification → WARN
**Recovery:**
- If uncertain about data interpretation: Ask user for clarification
- If data missing: Explicitly state "Data not available" and suggest alternatives
- If calculation complex: Show step-by-step methodology
- If source missing: Note "Source not specified in dataset"
## Related Skills
**Dependencies:**
- None - this is a foundational skill
**Composes With:**
- `carbon.report.gen` - Use this skill to gather data, then generate reports
- `acx.code.assistant` - This skill informs what data structures exist for code generation
**Alternative Skills:**
- For report generation: `carbon.report.gen`
- For code generation: `acx.code.assistant`
- For schema validation: `schema.linter`
## Maintenance
**Owner:** ACX Team
**Review Cycle:** Monthly (align with dataset releases)
**Last Updated:** 2025-10-18
**Version:** 1.0.0
**Maintenance Notes:**
- Update when new CSV files are added to `data/`
- Review when the emission factor schema changes
- Validate examples against the current dataset version
- Keep `reference/data_schema.md` synchronized with the actual schema