v1.0

2026-03-02 00:36:23 +01:00
parent 61004d0c4e
commit 77dbdfa1b9
118 changed files with 11087 additions and 0 deletions
@@ -0,0 +1,82 @@
+---
+description: Generate realistic dummy datasets for testing — CSV, JSON, SQL inserts, or Python scripts
+argument-hint: "<description of the data you need>"
+---
+
+# /generate-data -- Test Data Generator
+
+Create realistic dummy datasets for development, testing, demos, or prototyping. Outputs as ready-to-use files in your preferred format.
+
+## Invocation
+
+```
+/generate-data 1000 users with names, emails, plan tier, signup date, and activity score
+/generate-data E-commerce orders dataset: products, customers, timestamps, amounts
+/generate-data Sample data matching this schema: [paste table definition]
+```
+
+## Workflow
+
+### Step 1: Define the Dataset
+
+Understand:
+- What entities? (users, orders, events, products, etc.)
+- What columns? (with data types and constraints)
+- How many rows?
+- Any relationships between tables?
+- Any specific distributions? (e.g., "80% should be on the free plan")
+- Any realistic constraints? (emails should be unique, dates should be chronological)
+
+### Step 2: Generate the Data
+
+Apply the **dummy-dataset** skill:
+
+- Create a Python script that generates the dataset
+- Use realistic-looking data (not random strings): proper names, valid email formats, real-seeming dates
+- Respect constraints: unique IDs, foreign key relationships, chronological ordering
+- Apply specified distributions
+- Execute the script and produce the output file
+
+### Step 3: Deliver
+
+Output in the requested format (or ask):
+- **CSV**: Most common, works everywhere
+- **JSON**: For API testing or frontend development
+- **SQL INSERT**: For populating test databases
+- **Python script**: For reproducible generation (user can tweak and re-run)
+
+```
+## Generated Dataset: [Description]
+
+**Rows**: [count]
+**Columns**: [list]
+**Format**: [CSV / JSON / SQL / Python]
+
+### Schema
+| Column | Type | Constraints | Distribution |
+|--------|------|-----------|-------------|
+
+### Sample (first 5 rows)
+[Preview of the data]
+
+### Files
+- [data file]
+- [generator script, if applicable]
+```
+
+Save data file and generator script to the user's workspace.
+
+### Step 4: Offer Follow-ups
+
+- "Want me to **add more columns** or **increase the dataset size**?"
+- "Should I **create related tables** (e.g., orders for these users)?"
+- "Want me to **write test scenarios** that use this data?"
+- "Should I **create SQL queries** to analyze this dataset?"
+
+## Notes
+
+- Always provide the generator script so the user can regenerate with different parameters
+- For demo datasets, make the data tell a story (e.g., seasonal trends, a retention problem, a power user segment)
+- Respect realistic cardinality: 1000 users don't have 1000 unique cities
+- For financial data, use realistic price distributions — not uniform random
+- Never include real personal data — all names, emails, and identifiers must be fake