ClawGear

Posted on May 11

35 ChatGPT Prompts for Data Analysts: SQL, Dashboards, and Insight Reports That Drive Decisions

#ai #chatgpt #productivity #career

Data analysts don't have a shortage of data. They have a shortage of time between raw numbers and the answer the business actually needs.

ChatGPT doesn't replace analytical judgment. But it writes boilerplate SQL faster, generates dashboard specs in minutes, and turns data findings into executive-ready narratives — so you can spend your hours on the analysis that requires your expertise.

These 35 prompts cover the full analyst workflow from data preparation to executive presentation. Each uses bracket placeholders. Replace them with your actual context before running.

1. Data Cleaning and Preparation

Bad data in, bad analysis out. These prompts handle the messy prep work that, by commonly cited estimates, consumes 60–80% of analyst time.

Prompt 1 — Data quality audit checklist

Create a data quality audit checklist for a dataset called [DATASET NAME] with the following fields:
[LIST FIELD NAMES AND DATA TYPES]

The checklist should cover:
- Completeness (null/missing value checks per field)
- Uniqueness (duplicate row detection rules)
- Validity (format and range checks per field type)
- Consistency (cross-field checks that should always be true)
- Timeliness (freshness expectations for each data source)

For each check: describe what to look for, the SQL or Python logic to identify violations, and the business impact if that check fails.

Prompt 2 — Data cleaning SQL

Write SQL to clean the following table: [TABLE NAME] in [DATABASE TYPE — e.g., PostgreSQL, BigQuery, Snowflake].

Known issues to fix:
1. [ISSUE 1 — e.g., email addresses with trailing spaces]
2. [ISSUE 2 — e.g., dates stored as strings in MM/DD/YYYY format]
3. [ISSUE 3 — e.g., duplicate rows based on customer_id and event_date]
4. [ISSUE 4 — e.g., negative values in revenue column]

For each fix: write the SQL, explain what it does, and flag any edge cases I should test.
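To sanity-check whatever SQL comes back, it can help to have the same four fixes written out in plain Python. The following is only a minimal sketch under assumptions: the sample rows, field names, and the choice to set negative revenue aside rather than correct it are all hypothetical, not rules from your data.

```python
from datetime import datetime

# Hypothetical sample rows; field names are illustrative only.
rows = [
    {"customer_id": 1, "email": "ana@example.com  ", "event_date": "03/15/2024", "revenue": 120.0},
    {"customer_id": 1, "email": "ana@example.com", "event_date": "03/15/2024", "revenue": 120.0},
    {"customer_id": 2, "email": " bo@example.com", "event_date": "04/01/2024", "revenue": -45.0},
]


def clean(rows):
    """Apply the four example fixes; return (cleaned_rows, flagged_rows)."""
    seen = set()
    cleaned, flagged = [], []
    for row in rows:
        fixed = dict(row)
        # Issue 1: strip leading and trailing whitespace from emails.
        fixed["email"] = row["email"].strip()
        # Issue 2: parse MM/DD/YYYY strings into real date objects.
        fixed["event_date"] = datetime.strptime(row["event_date"], "%m/%d/%Y").date()
        # Issue 3: keep only the first row per (customer_id, event_date).
        key = (fixed["customer_id"], fixed["event_date"])
        if key in seen:
            continue
        seen.add(key)
        # Issue 4: set negative revenue aside for review (an assumption;
        # your business rule may be to null it, zero it, or keep refunds).
        if fixed["revenue"] < 0:
            flagged.append(fixed)
        else:
            cleaned.append(fixed)
    return cleaned, flagged


cleaned, flagged = clean(rows)
```

Comparing row counts from a sketch like this against the generated SQL on a small sample is a quick way to catch edge cases the prompt asks ChatGPT to flag.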

Prompt 3 — Column profiling query

Write SQL to profile the column [COLUMN NAME] in the table [TABLE NAME] (database: [DATABASE TYPE]).

The profile should return:
- Total row count
- Null count and null percentage
- Distinct value count
- Min, max, and average (if numeric)
- Top 10 most frequent values and their counts
- Count of values that don't match expected format: [EXPECTED FORMAT OR PATTERN]

Format as a single query or a short series of queries I can run together.
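For a sense of what a reasonable answer looks like, here is a small sketch that runs a profiling query against an in-memory SQLite database from Python. The `orders` table and `amount` column are hypothetical stand-ins, and the SQL is SQLite-flavored, so expect syntax differences on PostgreSQL, BigQuery, or Snowflake. The format-pattern check is left out because it depends on your expected pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (20.0,), (20.0,), (None,), (50.0,)],
)

# COUNT(*) counts every row; COUNT(amount) skips NULLs,
# so the difference is the null count.
profile_sql = """
SELECT
    COUNT(*)                                                AS total_rows,
    COUNT(*) - COUNT(amount)                                AS null_count,
    ROUND(100.0 * (COUNT(*) - COUNT(amount)) / COUNT(*), 1) AS null_pct,
    COUNT(DISTINCT amount)                                  AS distinct_values,
    MIN(amount)                                             AS min_value,
    MAX(amount)                                             AS max_value,
    AVG(amount)                                             AS avg_value
FROM orders
"""
profile = conn.execute(profile_sql).fetchone()

# Most frequent non-null values and their counts.
top_values = conn.execute(
    """
    SELECT amount, COUNT(*) AS n
    FROM orders
    WHERE amount IS NOT NULL
    GROUP BY amount
    ORDER BY n DESC, amount
    LIMIT 10
    """
).fetchall()
```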

Prompt 4 — ETL pipeline documentation

Write technical documentation for an ETL pipeline that:

Source: [SOURCE SYSTEM AND TABLE]
Transformations: [LIST WHAT IT DOES — e.g., deduplication, aggregation, join with X table]
Destination: [TARGET TABLE]
Refresh schedule: [FREQUENCY]
Dependencies: [OTHER PIPELINES OR TABLES IT DEPENDS ON]

Documentation should include:
- Purpose (one paragraph)
- Data lineage diagram in text form
- Field mapping table (source → target)
- Business rules applied
- Known limitations or edge cases
- Monitoring and alerting approach

Prompt 5 — Outlier detection query

Write SQL to identify outliers in the [COLUMN NAME] column of [TABLE NAME].

Business context: [WHAT THIS COLUMN REPRESENTS AND WHY OUTLIERS MATTER]
Expected range: [WHAT NORMAL LOOKS LIKE]
Dimension to segment by: [e.g., by product, by region, by month]

Use at least two methods:
1. IQR-based (1.5× IQR fence)
2. Z-score based (flag values beyond ±3 standard deviations)

Output a query that returns each outlier row with: the outlier value, the method that flagged it, and the expected range for that segment.
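The two methods named in this prompt are easy to prototype outside the database before trusting generated SQL. Below is a minimal Python sketch on made-up values; it handles a single segment only, and the quartile convention (`statistics.quantiles` with its default method) is an assumption that may differ slightly from your database's percentile function.

```python
import statistics

# Illustrative values: 19 readings near 100 and one obvious outlier.
values = [98, 102, 100, 97, 103, 101, 99, 100, 104, 96,
          100, 102, 98, 101, 99, 100, 103, 97, 100, 500]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; also return the fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high], (low, high)


def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # would be 0 if all values were identical
    return [v for v in values if abs(v - mean) / stdev > threshold]


iqr_flagged, fences = iqr_outliers(values)
z_flagged = zscore_outliers(values)
```

One design note: with very small samples the z-score method cannot flag anything at ±3, because a single extreme value inflates the standard deviation it is measured against. That is one reason the prompt asks for both methods.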

2. SQL Query Building

SQL is the analyst's primary tool. These prompts help you generate, explain, and optimize queries faster.

Prompt 6 — Complex query from plain English

Write a SQL query for [DATABASE TYPE — e.g., PostgreSQL, BigQuery] to answer this business question:

"[BUSINESS QUESTION IN PLAIN ENGLISH]"

Available tables:
- [TABLE 1]: [BRIEF DESCRIPTION OF FIELDS]
- [TABLE 2]: [BRIEF DESCRIPTION OF FIELDS]
- [TABLE 3]: [BRIEF DESCRIPTION OF FIELDS]

Requirements:
- Filter to [DATE RANGE OR SEGMENT]
- Group by [DIMENSIONS]
- Sort by [COLUMN, ORDER]
- Limit to [NUMBER OF ROWS] if applicable

Add comments explaining what each CTE or subquery does. Prefer CTEs over nested subqueries.

Prompt 7 — Window function query

Write a SQL query using window functions to calculate [METRIC] for each [DIMENSION] in the table [TABLE NAME].

Specific window function needed: [e.g., running total, 7-day rolling average, rank within group, lag/lead comparison]
Partition by: [COLUMN]
Order by: [COLUMN]
Date range: [IF APPLICABLE]

Explain what each window function does and why it's the right choice here. Include a sample of what the output should look like for 3-5 rows.
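As an illustration of the kind of query and sample output to expect, here is a sketch of a running total and a lag comparison, run through Python's built-in `sqlite3` module. The `daily_sales` table and its columns are hypothetical, and window functions require SQLite 3.25 or newer, which most current Python builds bundle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (region TEXT, sale_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [
        ("east", "2024-01-01", 100.0),
        ("east", "2024-01-02", 200.0),
        ("east", "2024-01-03", 300.0),
        ("west", "2024-01-01", 50.0),
        ("west", "2024-01-02", 150.0),
    ],
)

# SUM(...) OVER with ORDER BY gives a running total within each region.
# LAG(...) fetches the previous row's revenue in the same region,
# which is NULL for the first row of each partition.
window_sql = """
SELECT region,
       sale_date,
       revenue,
       SUM(revenue) OVER (PARTITION BY region ORDER BY sale_date) AS running_total,
       revenue - LAG(revenue) OVER (PARTITION BY region ORDER BY sale_date) AS change_vs_prev
FROM daily_sales
ORDER BY region, sale_date
"""
rows = conn.execute(window_sql).fetchall()
for row in rows:
    print(row)
```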

Prompt 8 — Query optimization

Here is a SQL query that is running slowly on [DATABASE TYPE]:

[PASTE QUERY]

Table sizes:
- [TABLE 1]: [ROW COUNT]
- [TABLE 2]: [ROW COUNT]

Analyze the query and suggest optimizations. Consider:
- Missing or unused indexes
- Expensive operations (DISTINCT, CROSS JOIN, subquery in SELECT)
- Filter pushdown opportunities
- CTE vs. temp table vs. subquery trade-offs
- Partitioning or clustering benefits if applicable

Show the optimized version with comments on what changed and why.

Prompt 9 — Cohort analysis query

Write a SQL query to perform cohort retention analysis on the following data.

Users table: [TABLE NAME AND FIELDS; at minimum user_id, signup_date]
Events table: [TABLE NAME AND FIELDS; at minimum user_id, event_date, event_type]
Cohort definition: Users grouped by [SIGNUP MONTH / SIGNUP WEEK / ACQUISITION CHANNEL]
Retention metric: [WHAT "RETAINED" MEANS — e.g., logged in, made purchase, used feature]
Time periods: [e.g., Day 1, Day 7, Day 30, Day 90]

Output: a cohort retention matrix showing % of users from each cohort who performed the retention event in each time period.
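To make the target output concrete, here is one possible shape for a retention query, sketched against an in-memory SQLite database. Everything specific is an assumption for illustration: the table contents, cohorts by signup month, "retained" meaning any event, and the two windows (days 1 to 7 and days 8 to 30 after signup). Date arithmetic uses SQLite's `julianday`, which other databases replace with their own date-difference functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, signup_date TEXT)")
conn.execute("CREATE TABLE events (user_id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "2024-01-05"), (2, "2024-01-20"), (3, "2024-02-02"), (4, "2024-02-10")],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "2024-01-08"),  # user 1, day 3
        (1, "2024-01-30"),  # user 1, day 25
        (2, "2024-01-22"),  # user 2, day 2
        (3, "2024-02-20"),  # user 3, day 18
    ],
)

retention_sql = """
WITH activity AS (
    -- One row per user event, with days elapsed since signup.
    -- LEFT JOIN keeps users who never came back (day_n is NULL).
    SELECT u.user_id,
           strftime('%Y-%m', u.signup_date) AS cohort,
           CAST(julianday(e.event_date) - julianday(u.signup_date) AS INTEGER) AS day_n
    FROM users u
    LEFT JOIN events e ON e.user_id = u.user_id
),
per_user AS (
    -- Collapse to one row per user with a 0/1 flag per window.
    SELECT user_id,
           cohort,
           MAX(CASE WHEN day_n BETWEEN 1 AND 7 THEN 1 ELSE 0 END)  AS retained_d7,
           MAX(CASE WHEN day_n BETWEEN 8 AND 30 THEN 1 ELSE 0 END) AS retained_d30
    FROM activity
    GROUP BY user_id, cohort
)
SELECT cohort,
       COUNT(*)                                      AS cohort_size,
       ROUND(100.0 * SUM(retained_d7) / COUNT(*), 1)  AS pct_d7,
       ROUND(100.0 * SUM(retained_d30) / COUNT(*), 1) AS pct_d30
FROM per_user
GROUP BY cohort
ORDER BY cohort
"""
matrix = conn.execute(retention_sql).fetchall()
```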

Prompt 10 — Funnel analysis query

Write a SQL query to analyze a conversion funnel for [PRODUCT/PROCESS].

Funnel steps (in order):
1. [STEP 1 — event name or condition]
2. [STEP 2]
3. [STEP 3]
4. [STEP 4]
5. [STEP 5 — CONVERSION]

Data source: [TABLE NAME; at minimum user_id, event_name, event_timestamp]
Time window: Users who started the funnel within [DATE RANGE]
Attribution rule: [FIRST TOUCH / LAST TOUCH / ANY TOUCH within X days]

Output per funnel step: users who reached this step, conversion rate from previous step, and drop-off rate. Flag where the biggest drop occurs.
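The per-step arithmetic is worth checking by hand on a small sample. The sketch below computes it in Python for a strictly ordered funnel; the step names and events are hypothetical, and it ignores the time window and attribution rule, which your real query would need to add.

```python
from collections import defaultdict

# Hypothetical funnel steps and events: (user_id, event_name, ISO timestamp).
FUNNEL_STEPS = ["visit", "signup", "add_to_cart", "purchase"]
events = [
    (1, "visit", "2024-03-01T10:00"), (1, "signup", "2024-03-01T10:05"),
    (1, "add_to_cart", "2024-03-01T10:09"), (1, "purchase", "2024-03-01T10:15"),
    (2, "visit", "2024-03-01T11:00"), (2, "signup", "2024-03-01T11:02"),
    (3, "visit", "2024-03-02T09:00"),
    (4, "visit", "2024-03-02T12:00"), (4, "signup", "2024-03-02T12:30"),
    (4, "add_to_cart", "2024-03-02T12:40"),
]


def funnel_report(events, steps):
    """Count users reaching each step in order, with conversion and drop-off."""
    by_user = defaultdict(list)
    for user_id, name, ts in events:
        by_user[user_id].append((ts, name))

    reached = [0] * len(steps)
    for user_events in by_user.values():
        next_step = 0
        # ISO timestamps sort chronologically as plain strings.
        for _, name in sorted(user_events):
            if next_step < len(steps) and name == steps[next_step]:
                reached[next_step] += 1
                next_step += 1

    report = []
    for i, step in enumerate(steps):
        prev = reached[i - 1] if i > 0 else reached[0]
        rate = reached[i] / prev if prev else 0.0
        report.append({
            "step": step,
            "users": reached[i],
            "conversion_from_prev": round(rate, 3),
            "drop_off": round(1 - rate, 3),
        })
    return report


report = funnel_report(events, FUNNEL_STEPS)
# The first step has no previous step, so it is excluded from the comparison.
biggest_drop = max(report[1:], key=lambda r: r["drop_off"])
```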

3. Dashboard and Visualization Planning

A good dashboard drives decisions. A bad one collects dust. These prompts design dashboards that get used.

Prompt 11 — Dashboard requirements doc

Write a dashboard requirements document for a [DASHBOARD TYPE — e.g., executive KPI, marketing performance, product usage] dashboard.

Primary audience: [WHO WILL USE IT — roles, seniority]
Decisions this dashboard should enable: [LIST 3-5 DECISIONS]
Key metrics to include: [LIST]
Data sources: [LIST]
Refresh frequency: [REAL-TIME / HOURLY / DAILY / WEEKLY]

For each metric: describe it, define the calculation, specify the ideal visualization type, and identify the drill-down the user should be able to do. End with a priority rank for the 5 most important charts.

Prompt 12 — Chart type recommendation

I need to visualize the following data for a [AUDIENCE TYPE] audience:

Data description: [WHAT THE DATA IS]
Number of dimensions: [HOW MANY CATEGORIES OR GROUPS]
Type of comparison: [e.g., over time, part-to-whole, ranking, correlation, distribution]
Key question the chart should answer: [THE QUESTION]

Recommend:
1. Primary chart type (with rationale)
2. Alternative chart type (with trade-offs)
3. What NOT to use here and why
4. Any formatting guidelines specific to this audience (e.g., executives may prefer simple bar charts over multi-line trend charts)

Prompt 13 — KPI definition document

Write formal definitions for the following business KPIs for [COMPANY/TEAM]:

KPIs to define:
- [KPI 1]
- [KPI 2]
- [KPI 3]
- [KPI 4]
- [KPI 5]

For each KPI:
- Plain-English definition (what it measures and why it matters)
- Calculation formula
- Data source(s) and table(s)
- Reporting period (weekly, monthly, quarterly)
- Benchmark or target
- Known data quality issues or caveats
- What a sudden spike or drop in this metric should prompt you to investigate

Format as a reference document the business can use to resolve disagreements about metric definitions.

Prompt 14 — Dashboard critique

Critique the following dashboard based on data visualization best practices.

Dashboard description: [DESCRIBE WHAT'S ON IT — charts, metrics, layout]
Audience: [WHO USES IT]
Primary use case: [WHAT DECISIONS IT SUPPORTS]

Evaluate:
1. Does each chart match its intended comparison type?
2. Are there charts that could be removed without losing insight?
3. Is the layout organized from highest to lowest priority?
4. Are any metrics likely to be misread or misleading?
5. What one addition would most improve decision-making?

Flag the three most important changes and rate the current state: not usable / needs work / good but improvable / strong.

Prompt 15 — Self-serve analytics documentation

Write end-user documentation for the [DASHBOARD NAME] dashboard built in [TOOL — e.g., Tableau, Looker, Power BI, Metabase].

Audience: Non-technical business users in [DEPARTMENT]
Dashboard purpose: [BRIEF DESCRIPTION]
Key sections of the dashboard: [LIST CHARTS OR SECTIONS]

Documentation should cover:
- How to interpret each key metric
- How to use filters and date controls
- What to do if a number looks wrong
- 3 common questions this dashboard answers with step-by-step instructions
- Glossary of terms that might be confusing

4. Insight Generation and Storytelling

Data without narrative is noise. These prompts turn findings into decisions.

Prompt 16 — Insight extraction

I ran an analysis and have the following raw results:

[PASTE YOUR DATA, TABLE, OR SUMMARY]

Business context: [WHAT THIS ANALYSIS WAS FOR AND WHAT QUESTION IT WAS ANSWERING]
Target audience for the insights: [WHO WILL READ THIS]

Extract:
1. The 3 most important findings (ranked by business impact)
2. One surprising or counter-intuitive finding
3. What these findings suggest we should do differently
4. The most important caveat or limitation of this analysis
5. The single most important follow-up question this analysis raises

Write in plain English. No statistical jargon. If you need to explain a finding, use an analogy.

Prompt 17 — Executive summary for analysis

Write a one-page executive summary for the following analysis:

Analysis topic: [WHAT YOU ANALYZED]
Key findings: [LIST YOUR TOP 5 FINDINGS]
Methodology note: [ONE SENTENCE ON HOW]
Limitations: [WHAT THE DATA CAN'T TELL US]
Recommended action: [WHAT YOU THINK SHOULD HAPPEN]

The summary is for [EXECUTIVE ROLE] who has [HIGH / LOW] data literacy.

Format: headline finding, 3-bullet key takeaways, recommendation, and caveats. No tables, no charts — just clear prose under 300 words.

Prompt 18 — "So what" translation

Here is a finding from my analysis:

"[FINDING IN TECHNICAL OR DATA LANGUAGE]"

The audience is [ROLE — e.g., VP of Marketing, CFO, non-technical CEO].

Translate this finding into:
1. A headline that states the implication, not the statistic
2. A 2-3 sentence explanation in plain English
3. The business decision this should inform
4. A visual description (how would you show this in a single chart?)

Make the finding impossible to ignore without making it alarmist.

Prompt 19 — Data story structure

Help me structure a data story about [TOPIC] for a presentation to [AUDIENCE].

The central finding: [YOUR MAIN INSIGHT]
Supporting findings: [LIST 3-5 SUPPORTING POINTS]
The recommended action: [WHAT YOU WANT THE AUDIENCE TO DO]

Build a narrative arc:
1. Opening hook (what's at stake)
2. Context setting (what we expected and why)
3. The finding (what the data actually shows)
4. Implication (what this means for the business)
5. Recommendation (what to do)
6. The ask (specific decision needed from this audience)

For each section: suggested slide title and 2-3 bullet points of content.

Prompt 20 — Anomaly investigation script

A key metric, [METRIC NAME], showed an unusual [INCREASE / DECREASE] of [MAGNITUDE] on [DATE].

Business context: [WHAT THIS METRIC MEASURES AND WHY IT MATTERS]
Current value: [VALUE]
Expected value: [BENCHMARK OR RECENT AVERAGE]

Build an investigation checklist:
1. Data integrity checks (is the anomaly real?)
2. Segmentation cuts to isolate the source (by product, region, user type, device, etc.)
3. External events to check (campaign launches, outages, seasonality, etc.)
4. Cross-metric checks (which related metrics should have moved if X caused this?)
5. Queries to run at each step

End with a template for the "anomaly report" I'll share with stakeholders.

5. Statistical Analysis and Interpretation

Sound statistical thinking separates analysts who get trusted from those who get ignored. These prompts support proper analysis.

Prompt 21 — A/B test design

Help me design an A/B test for the following experiment:

Hypothesis: [WHAT WE BELIEVE AND WHY]
Metric we're measuring: [PRIMARY METRIC]
Expected baseline rate: [CURRENT VALUE]
Minimum detectable effect we care about: [SMALLEST CHANGE WORTH DETECTING]
Target confidence level: [95% / 99%]
Target statistical power: [e.g., 80%]
Expected traffic per day: [VISITS OR USERS]

Calculate:
- Required sample size per variant
- Minimum test duration
- How to set up the control and treatment groups
- Secondary metrics to monitor (guardrail metrics)
- Risks that could invalidate the test
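It is worth verifying the sample size ChatGPT returns rather than taking it on faith. The sketch below uses the standard normal-approximation formula for comparing two proportions, with a two-sided test. The 80% power default is an assumption (a common convention), as are the example baseline, effect size, and traffic figures.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline: current conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect as an absolute change, e.g. 0.02
    """
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def min_duration_days(n_per_variant, daily_traffic, variants=2):
    """Days needed to fill every variant, assuming traffic is split evenly."""
    return math.ceil(n_per_variant * variants / daily_traffic)


# Example: 10% baseline, detect a 2-point absolute lift, 1,000 users per day.
n = sample_size_per_variant(baseline=0.10, mde_abs=0.02)
days = min_duration_days(n, daily_traffic=1000)
print(n, days)
```

With these example inputs the formula lands at roughly 3,800 users per variant. In practice, many teams round the duration up to whole weeks so that weekday and weekend behavior are both represented.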

Prompt 22 — A/B test results interpretation

Interpret the results of the following A/B test:

Experiment: [WHAT WAS TESTED]
Control: [METRIC] = [VALUE], N = [SAMPLE SIZE]
Treatment: [METRIC] = [VALUE], N = [SAMPLE SIZE]
Test duration: [DAYS]
P-value: [VALUE]
Confidence interval: [RANGE]

Explain:
1. Whether the result is statistically significant and what that means
2. Whether the result is practically significant
3. Common mistakes someone could make interpreting these numbers
4. Your recommendation: ship, don't ship, or run longer, with rationale
5. What to test next based on this result
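If you only have raw counts, the p-value and confidence interval this prompt asks for can be computed first. Below is a minimal sketch of a two-sided two-proportion z-test using only the standard library; the conversion counts are invented for illustration, and the normal approximation assumes reasonably large samples.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # The test statistic uses the pooled rate under the null of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The confidence interval uses the unpooled standard error.
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    margin = NormalDist().inv_cdf(1 - (1 - confidence) / 2) * se_diff
    return {"lift": diff, "z": z, "p_value": p_value,
            "ci": (diff - margin, diff + margin)}


# Hypothetical result: control 500/5000 (10.0%), treatment 570/5000 (11.4%).
result = two_proportion_ztest(conv_a=500, n_a=5000, conv_b=570, n_b=5000)
print(result)
```

A confidence interval that excludes zero agrees with a significant p-value, but whether the lower bound is large enough to matter is the practical-significance question in item 2.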

Prompt 23 — Correlation vs. causation explanation

I found a strong correlation between [VARIABLE A] and [VARIABLE B] in our data.

Context: [DESCRIBE THE BUSINESS CONTEXT]
Correlation coefficient: [IF KNOWN]

Help me:
1. List 3 alternative explanations for this correlation besides A causing B
2. Describe a confounding variable that could explain both
3. Suggest one practical experiment or analysis that would help establish causality
4. Write a one-paragraph summary I can share with a non-technical stakeholder that accurately represents what we know and don't know

I need to avoid overclaiming in my report.

Prompt 24 — Regression output explanation

Explain the following regression output to a non-technical business audience.

Model: [TYPE — e.g., linear regression, logistic regression]
Outcome variable: [WHAT WE'RE PREDICTING]
Key coefficients: [LIST VARIABLE NAME, COEFFICIENT, P-VALUE]
R-squared: [VALUE]
Sample size: [N]

Write a plain-English interpretation that covers:
- What the model is predicting and for whom
- Which variables are most predictive and what that means practically
- What the R-squared tells us (and what it doesn't)
- Three business actions that follow from these findings
- One limitation the audience should know before acting on this

Prompt 25 — Forecasting methodology brief

Write a brief explaining the forecasting methodology for [METRIC] at [COMPANY].

Current forecasting approach: [DESCRIBE WHAT YOU DO NOW]
Data available: [WHAT INPUTS THE FORECAST USES]
Forecast horizon: [HOW FAR OUT]
Accuracy requirement: [WHAT TOLERANCE IS ACCEPTABLE]

The brief should cover:
- Recommended method (time series, regression, ensemble — with rationale)
- Data requirements and preprocessing steps
- How to measure forecast accuracy (MAE, MAPE, RMSE — explain each briefly)
- Refresh cadence and ownership
- How to present uncertainty to non-technical stakeholders
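The three accuracy metrics named in this prompt are short enough to define directly, which makes it easier to check any explanation you get back. This is a minimal sketch with made-up numbers; skipping zero actuals in MAPE is one common convention, not the only one.

```python
import math


def mae(actual, forecast):
    """Mean absolute error: average miss, in the metric's own units."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is 0, so those points are skipped here.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def rmse(actual, forecast):
    """Root mean squared error: like MAE, but penalizes large misses more."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Illustrative values only.
actual = [100, 200, 400]
forecast = [110, 180, 400]
print(mae(actual, forecast), mape(actual, forecast), rmse(actual, forecast))
```

RMSE is never smaller than MAE on the same data, and a large gap between the two suggests a few big misses rather than many small ones.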

6. Stakeholder Reports and Presentations

Analysis that doesn't influence decisions is wasted analysis. These prompts build the communication layer.

Prompt 26 — Weekly metrics report

Write a weekly metrics report template for [TEAM/PRODUCT].

Metrics to include: [LIST — e.g., DAU, revenue, conversion rate, support volume]
Audience: [WHO RECEIVES IT AND THEIR DATA LITERACY LEVEL]
Format: email or Slack message (choose one)
Cadence: Monday morning

The template should:
- Lead with the most important change from last week
- Use a consistent format that builds a readable pattern over time
- Flag green (on track), yellow (watch), red (action needed)
- Include space for 1-2 sentences of interpretive commentary per metric
- Stay under 400 words total

Prompt 27 — Stakeholder data request response

A business stakeholder has asked me: "[VERBATIM REQUEST]"

This request is [CLEAR / AMBIGUOUS / TECHNICALLY INFEASIBLE AS STATED].

Help me:
1. Clarify what they actually want (restate the question in precise analytical terms)
2. Identify 2-3 ways to answer this question with available data
3. Recommend the best approach and explain why
4. Write a response email that confirms the approach, sets expectations on delivery time, and asks any clarifying questions I need before starting

Keep the email under 150 words. Don't use analyst jargon.

Prompt 28 — Analysis presentation structure

Structure a 20-minute analysis presentation for [AUDIENCE] on [TOPIC].

Findings: [SUMMARIZE YOUR TOP 5 FINDINGS]
Recommendation: [WHAT YOU WANT THEM TO DO]
Data source: [BRIEF DESCRIPTION]
Most important objection they'll raise: [ANTICIPATED PUSHBACK]

Slide-by-slide outline with:
- Slide title
- The one thing the audience should take from this slide
- Suggested visual element
- Presenter note for key talking points

The presentation should build to the recommendation, not bury it in findings. Allocate time to discussion.

Prompt 29 — Data-to-decision memo

Write a data-to-decision memo summarizing the following analysis for a business decision.

Decision to be made: [WHAT DECISION THIS INFORMS]
Analysis summary: [PASTE KEY FINDINGS]
Options on the table: [LIST 2-3 OPTIONS THE DECISION-MAKER IS CHOOSING BETWEEN]
Data supporting each option: [SUMMARIZE]
Risks and uncertainties: [WHAT THE DATA DOESN'T TELL US]

The memo format:
- Decision question (one sentence)
- Options (table)
- Data evidence for each option
- Recommendation with confidence level (high / medium / low)
- Key assumptions and risks
- Next steps if the recommended option is selected

No longer than one page.

Prompt 30 — Automated report commentary

Write a templated commentary script for an automated [DAILY / WEEKLY / MONTHLY] report on [METRIC].

The report contains: [LIST METRICS OR CHARTS]
Target audience: [ROLE AND DATA LITERACY]
Scenarios to script commentary for:
- Metric is up significantly (>10%)
- Metric is down significantly (>10%)
- Metric moved moderately (between 5% and 10% in either direction)
- Metric is flat (within ±5%)
- Metric is at an all-time high
- Metric is at an all-time low

For each scenario: a 2-3 sentence commentary template with [VARIABLE] placeholders for the actual numbers. The commentary should provide context, not just restate the number.
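Once you have the templates, something has to pick the right one for each report run. Here is a small sketch of that selection logic in Python. The function names, scenario labels, and template wording are all hypothetical; the sketch checks all-time highs and lows first and treats moves between 5% and 10% as a separate "moderate" case, both of which are assumptions you may want to change.

```python
def pick_scenario(current, previous, history):
    """Return a scenario label for the current value.

    history: earlier values of the metric, excluding the current one.
    Assumes previous is nonzero.
    """
    if current > max(history):
        return "all_time_high"
    if current < min(history):
        return "all_time_low"
    change_pct = 100 * (current - previous) / previous
    if change_pct > 10:
        return "up_significant"
    if change_pct < -10:
        return "down_significant"
    if abs(change_pct) <= 5:
        return "flat"
    return "moderate_move"


# Placeholder templates; [DRIVER] is left for a human or a later step to fill.
TEMPLATES = {
    "all_time_high": "{metric} hit a record {current}, up {change:.1f}% on the prior period. Likely driver: [DRIVER].",
    "all_time_low": "{metric} fell to a record low of {current}, down {change:.1f}%. Likely driver: [DRIVER].",
    "up_significant": "{metric} rose {change:.1f}% to {current}, above its usual range. Likely driver: [DRIVER].",
    "down_significant": "{metric} dropped {change:.1f}% to {current}, below its usual range. Likely driver: [DRIVER].",
    "flat": "{metric} held steady at {current} ({change:.1f}% change), in line with recent periods.",
    "moderate_move": "{metric} moved {change:.1f}% to {current}. Worth watching, no action needed yet.",
}


def render(metric, current, previous, history):
    """Fill the matching template with the actual numbers."""
    scenario = pick_scenario(current, previous, history)
    change = abs(100 * (current - previous) / previous)
    return TEMPLATES[scenario].format(metric=metric, current=current, change=change)


print(render("Revenue", 120, 100, [90, 130]))
```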

7. Documentation and Process Templates

Institutional knowledge lives in documentation. These prompts build the infrastructure that makes a team's work repeatable.

Prompt 31 — Analysis request intake form

Create an analysis request intake form for the data analytics team at [COMPANY].

The form should capture:
- Business question (not the metric — the decision it's for)
- Requestor and team
- Priority and deadline
- Data sources available or needed
- Known constraints or caveats
- Definition of done (what does "complete" look like?)
- Stakeholder who will act on the analysis

Add instructions for requestors on how to fill in each field well, with a good example and a bad example for the "business question" field specifically.

Prompt 32 — Data dictionary entry

Write a data dictionary entry for the following field:

Field name: [FIELD NAME]
Table: [TABLE NAME]
Data type: [TYPE]
Description: [RAW DESCRIPTION OR WHAT YOU KNOW ABOUT IT]
Business context: [WHAT IT'S USED FOR]

The entry should include:
- Plain-English definition
- Calculation or derivation logic (if it's a derived field)
- Valid values or ranges
- Null handling rule
- Last updated date and source system
- Known issues or caveats
- Example values

Write it so a new analyst could use this field correctly without asking anyone.

Prompt 33 — Runbook for recurring analysis

Write a runbook for the following recurring analysis that runs [FREQUENCY]:

Analysis name: [NAME]
Business purpose: [WHAT IT'S USED FOR AND BY WHOM]
Data sources: [TABLES, SYSTEMS]
Steps to run: [DESCRIBE THE PROCESS]
Output format: [WHAT IS DELIVERED AND HOW]
Common issues: [WHAT TYPICALLY GOES WRONG]

The runbook should be detailed enough for a new team member to run this analysis independently on their first try. Where a screenshot or example output would help, describe it in text.

Prompt 34 — SQL query library documentation

Write documentation for a shared SQL query library for the [TEAM NAME] analytics team.

Queries to document:
- [QUERY 1 NAME]: [BRIEF DESCRIPTION]
- [QUERY 2 NAME]: [BRIEF DESCRIPTION]
- [QUERY 3 NAME]: [BRIEF DESCRIPTION]

For each query:
- Purpose (one sentence)
- Parameters it accepts (inputs user must specify)
- Output columns and their definitions
- Example usage
- Performance notes (how long it typically takes, any known slow conditions)
- Last validated date

Also include: contribution guidelines (how to add new queries to the library) and a naming convention guide.

Prompt 35 — Post-analysis retrospective

Write a post-analysis retrospective for the following analysis project:

Analysis: [NAME AND BRIEF DESCRIPTION]
Duration: [HOW LONG IT TOOK]
Stakeholders: [WHO WAS INVOLVED]
What we set out to answer: [ORIGINAL QUESTION]
What we actually answered: [IF DIFFERENT]
Business decision it informed: [OUTCOME]

Retrospective structure:
1. What went well (process, data, communication)
2. What was harder than expected
3. Accuracy check — if we have outcome data, did our analysis predict correctly?
4. One thing we'd do differently next time
5. Any reusable artifacts (queries, templates, frameworks) worth adding to the team library

Keep it under one page. This is for the team, not the stakeholder.

Want 35 More Prompts for Advanced Analytics Work?

These 35 prompts handle the core analyst workflow. The full pack includes prompts for machine learning project scoping, advanced statistical testing, and building the analyst-as-consultant communication style.

Get the complete data analyst prompt library — Use LAUNCH30 for 30% off. Limited uses remaining.
