pm-claude-skills/exports/aider/pm-data/chart-data-extractor/chart-data-extractor.md

Chart Data Extractor Skill

Extracts data from images of charts and graphs (bar charts, line charts, pie charts, scatter plots, and tables embedded in images) and produces a structured data table that can be used in spreadsheets or rebuilt in any charting tool. Built to leverage Opus 4.7's pixel-level image analysis capabilities.

Required Inputs

Ask the user for these if not provided:

  • The chart image (upload a screenshot or image file)
  • Chart type (if ambiguous — bar / line / pie / scatter / other)
  • What matters most (approximate trends / precise values / specific data points / categorisation)
  • Known axis values (optional — if the user knows the max/min values to anchor the extraction)

Output Structure

1. Chart Identification

| Attribute | Value |
| --- | --- |
| Chart type | [Bar / Line / Pie / Scatter / Area / Other] |
| Chart title (if visible) | [Title text] |
| X-axis label | [Label + unit] |
| Y-axis label | [Label + unit] |
| Number of series | N |
| Legend categories | [List] |
| Data period (if time-based) | [Start – End] |

2. Extracted Data Table

| [X axis] | [Series 1] | [Series 2] | ... |
| --- | --- | --- | --- |
| [Value] | [Value] | [Value] | ... |

3. Confidence Levels

For each data point or series, flag confidence:

  • High confidence: data points where the value is clearly readable against gridlines or labels
  • Medium confidence: data points where the value is interpolated between gridlines
  • Low confidence: data points where the value is ambiguous or overlaps with other elements

Low-confidence points should be explicitly listed — not silently included in the main table.
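One way to keep low-confidence points out of the main table is to tag every extracted value and split the list before rendering. The sketch below is illustrative only: the `DataPoint` class, its field names, and `split_by_confidence` are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    # Hypothetical record for one extracted value.
    x: str
    series: str
    value: float
    confidence: str  # expected: "high", "medium", or "low"


def split_by_confidence(points):
    """Return (main_table_points, flagged_points).

    Anything that is not explicitly high or medium confidence is flagged,
    so an unlabelled or low-confidence value never slips into the main table.
    """
    main = [p for p in points if p.confidence in ("high", "medium")]
    flagged = [p for p in points if p.confidence not in ("high", "medium")]
    return main, flagged


points = [
    DataPoint("Q1", "Revenue", 40.0, "high"),
    DataPoint("Q2", "Revenue", 45.0, "medium"),
    DataPoint("Q3", "Revenue", 52.0, "low"),
]
main, flagged = split_by_confidence(points)
for p in flagged:
    print(f"Verify manually: {p.series} at {p.x} (read as ~{p.value})")
```

The flagged list can then be presented under its own heading, next to the main table rather than inside it.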

4. Notable Observations

Observations that the data itself reveals:

  • Peak value: [Value, when, in which series]
  • Lowest value: [Value, when, in which series]
  • Largest delta between series: [Details]
  • Any anomalies or outliers visible in the chart
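Once the table is extracted, the first three observations can be computed instead of eyeballed. This is a minimal sketch assuming the table is held as a dictionary of series name to `{x label: value}`; that layout and the `summarize` helper are assumptions made for illustration.

```python
from itertools import combinations


def summarize(table):
    """table maps series name -> {x label: value}."""
    cells = [
        (value, x, series)
        for series, row in table.items()
        for x, value in row.items()
    ]
    peak = max(cells, key=lambda c: c[0])
    lowest = min(cells, key=lambda c: c[0])

    # Compare every pair of series at each x label they share.
    deltas = [
        (abs(table[a][x] - table[b][x]), x, a, b)
        for a, b in combinations(table, 2)
        for x in table[a]
        if x in table[b]
    ]
    largest_delta = max(deltas, key=lambda d: d[0]) if deltas else None
    return {"peak": peak, "lowest": lowest, "largest_delta": largest_delta}


table = {
    "Series 1": {"2021": 10, "2022": 30, "2023": 20},
    "Series 2": {"2021": 15, "2022": 12, "2023": 25},
}
summary = summarize(table)
print(summary)
```

Anomalies and outliers still need a visual check against the chart, since a value can be unremarkable numerically and still look odd in context.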

5. Reconstructed Source

CSV format for direct use:

[x_axis],[series_1],[series_2]
[value],[value],[value]
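For anyone post-processing the extraction in code, the same layout can be produced with Python's standard `csv` module, which also takes care of quoting labels that contain commas. The `to_csv` helper and the sample values below are hypothetical.

```python
import csv
import io


def to_csv(x_label, series_names, rows):
    """rows is a list of (x_value, [value_per_series]) tuples."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([x_label, *series_names])
    for x_value, values in rows:
        writer.writerow([x_value, *values])
    return buffer.getvalue()


csv_text = to_csv(
    "quarter",
    ["revenue", "cost"],
    [("Q1", [40, 25]), ("Q2", [45, 30])],
)
print(csv_text)
```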

6. Assumptions and Caveats

  • Grid resolution: [How precisely values could be read — e.g. "Y-axis has major gridlines every 10 units, minor every 2"]
  • Interpolation used: [Any values that required estimating between gridlines]
  • Unclear data: [Anything in the chart that could not be read reliably]
  • Axis scale: [Linear/logarithmic/etc — note if not obvious]
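To see why the axis scale has to be stated, consider how a pixel position turns into a value. The sketch below interpolates between two labelled gridlines, which are the kind of known axis values a user might supply as anchors. The function name, the reference-point format, and the pixel coordinates are all assumptions chosen for illustration.

```python
import math


def pixel_to_value(pixel_y, ref_a, ref_b, scale="linear"):
    """Convert a pixel row to an axis value.

    ref_a and ref_b are (pixel_y, axis_value) pairs for two labelled gridlines.
    """
    (pa, va), (pb, vb) = ref_a, ref_b
    t = (pixel_y - pa) / (pb - pa)  # fractional position between the two gridlines
    if scale == "linear":
        return va + t * (vb - va)
    if scale == "log":
        # Interpolate in log10 space; both reference values must be positive.
        return 10 ** (math.log10(va) + t * (math.log10(vb) - math.log10(va)))
    raise ValueError(f"unknown scale: {scale}")


# Gridline labelled 1 at pixel row 400, gridline labelled 1000 at pixel row 100.
low_ref, high_ref = (400, 1), (100, 1000)
as_linear = pixel_to_value(250, low_ref, high_ref, scale="linear")
as_log = pixel_to_value(250, low_ref, high_ref, scale="log")
print(as_linear, as_log)
```

The same pixel row reads as about 500 on a linear axis and about 32 on a logarithmic one, so recording the scale alongside the data is what makes the numbers checkable.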

7. Follow-up Options

Ask the user which of these they want:

  • Rebuild the chart in a specified format (Excel formula, Python matplotlib, D3, etc.)
  • Produce a narrative description of what the chart shows
  • Compare this data against another chart or source
  • Flag potentially misleading visual choices in the original (truncated axes, misleading scales, etc.)

Quality Checks

  • Every extracted number specifies which series it belongs to
  • Confidence levels are explicit for ambiguous points
  • Low-confidence values are flagged separately, not silently included
  • Assumptions about axis scale and interpolation are stated
  • CSV output is clean and directly usable

Anti-Patterns

  • Do not silently include low-confidence data points in the main table — flag them separately so the user knows which values to verify
  • Do not assume a linear scale without confirming it — logarithmic axes make extracted values incorrect by orders of magnitude if misread
  • Do not report extracted values with false precision — if the chart's Y-axis only shows gridlines every 10 units, a reported value of 37 is invented, not extracted
  • Do not omit the assumptions and caveats section — partial image quality, overlapping bars, or unlabelled axes must be disclosed
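The false-precision rule can be enforced mechanically by snapping each reading to the finest step the chart supports and reporting the uncertainty with it. In this sketch the readable step is assumed to be half the gridline spacing; that policy is an illustrative assumption rather than something the skill prescribes, and `report_with_precision` is a hypothetical helper.

```python
def report_with_precision(raw_value, gridline_spacing):
    """Snap a raw reading to the readable step and return (value, plus_minus).

    Assumption: values can be read to half the gridline spacing at best.
    """
    step = gridline_spacing / 2
    snapped = round(raw_value / step) * step
    return snapped, step / 2


value, plus_minus = report_with_precision(37, gridline_spacing=10)
print(f"approx. {value:g} (±{plus_minus:g})")
```

With gridlines every 10 units, a raw reading of 37 is reported as roughly 35 with a stated margin, which is closer to what the image actually supports.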

Example Trigger Phrases

  • "Extract the data from this chart"
  • "Transcribe the numbers in this graph"
  • "Turn this chart image into a spreadsheet"
  • "Digitise this chart so I can rebuild it"
  • "What are the exact values in this bar chart?"

Why This Works Better on Opus 4.7

Earlier models struggled with pixel-level data transcription from charts, often hallucinating values or misreading gridline positions. Opus 4.7 accepts images at a higher resolution (2576px vs 1568px for earlier models), with coordinates mapping 1:1 to pixels, which makes chart data extraction reliable enough for practical use.