E-commerce automation guide
How to Automate Customer Service Email Triage for Shopify with n8n + Claude (2026)
A field-tested blueprint for routing every inbound ticket — refunds, shipping, sizing, fraud — through a Claude classifier, a Shopify Admin GraphQL lookup, and a Gorgias handoff queue. Code, prompts, and the failure modes nobody tells you about.
Updated May 2026
Intermediate-Advanced
The problem: 600 tickets a week, 80% are five questions on repeat
Every Shopify merchant past a few hundred orders per month hits the same wall. Your inbox, your Gorgias queue, your Shopify Inbox forwards, and your WhatsApp Business number all converge on the same overworked team. Roughly 80% of what they answer is the same five questions: where is my order, can I return this, do you have it in a medium, can I change the address, this arrived broken. The remaining 20% — chargebacks, fraud, escalations, B2B inquiries — actually needs a human brain.
The fix is not a chatbot bolted onto your storefront. The fix is a triage layer that intercepts every channel, looks up the order in Shopify, classifies the intent with a real LLM, drafts a reply that cites real shipment data, and only escalates the genuinely hard ones. This guide walks through that pipeline end-to-end with n8n, Claude, Shopify Admin GraphQL, Gorgias, and Klaviyo — including the privacy guardrails that keep PCI DSS auditors happy.
If you also run brick-and-mortar or showroom traffic, the companion retail automation playbook covers POS-side triage; for booking-driven brands the hospitality guide handles reservation logic.
Architecture: seven nodes, one queue, zero leaked PANs
Every inbound message becomes an n8n execution. The execution must be idempotent (a retried webhook can’t double-refund a customer) and side-effect-free until the final branch fires. Here is the canonical flow.
Flow: Inbox / Webhook (Gmail, Gorgias, WhatsApp) → Order Lookup (Shopify GraphQL) → Classify (Claude intent) → Auto-Reply (templated draft) → Sentiment Check (score < 0.3?) → Handoff / Resolve (Gorgias agent) → Klaviyo Event (post-purchase flow)
Unify the channels — one webhook to rule them all
Most stores have at least four inbound channels: a shared Gmail or Outlook support inbox, a Gorgias instance receiving Shopify Inbox forwards, a WhatsApp Business number, and the contact form on the storefront. The first job is normalising every one of them into a single n8n webhook payload so the rest of the pipeline doesn’t care where the message came from.
Use n8n’s native Gmail trigger for the inbox, a Gorgias webhook trigger (configure it inside Gorgias under Settings → REST API → Add HTTP Integration), and a Twilio WhatsApp webhook for messaging. Each one is transformed into the same canonical envelope, carrying at least the channel, the source system's external_id, the sender (from.email), and the message body.
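A minimal Python sketch of that normalisation step, for illustration only. The envelope keys `channel`, `external_id`, and `from.email` come from this guide; `from.phone` and `body` are assumed additions. The Twilio field names (`MessageSid`, `From`, `Body`) are the form parameters Twilio posts for inbound messages, while the email-side keys (`id`, `sender`, `text`) are hypothetical and should be mapped to whatever your trigger actually emits.

```python
def from_twilio_whatsapp(form: dict) -> dict:
    """Map a Twilio WhatsApp webhook form payload onto the canonical envelope."""
    return {
        "channel": "whatsapp",
        "external_id": form["MessageSid"],
        # Twilio prefixes WhatsApp senders with "whatsapp:".
        "from": {"email": None, "phone": form["From"].removeprefix("whatsapp:")},
        "body": form.get("Body", "").strip(),
    }


def from_email_message(msg: dict) -> dict:
    """Map an email trigger item onto the same envelope (input keys are hypothetical)."""
    return {
        "channel": "email",
        "external_id": msg["id"],
        "from": {"email": msg["sender"].strip().lower(), "phone": None},
        "body": msg.get("text", "").strip(),
    }
```

Because both functions return the same keys, every downstream node can read `envelope["from"]["email"]` or `envelope["body"]` without branching on the channel.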
De-duplicate aggressively. Buyers forward, reply-all, and CC themselves. Hash external_id + from.email in a Postgres table and skip anything you’ve seen in the last 60 seconds.
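The de-duplication check can be sketched as follows. This is an illustration under stated assumptions: `sqlite3` stands in for the Postgres table so the example runs on its own, and the table name `seen_messages`, the SHA-256 hash, and the function names are hypothetical choices rather than anything prescribed by n8n.

```python
import hashlib
import sqlite3
import time


def dedup_key(external_id: str, from_email: str) -> str:
    """Hash external_id + from.email into a fixed-length key."""
    raw = f"{external_id}|{from_email.strip().lower()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def open_store() -> sqlite3.Connection:
    """In-memory stand-in for the Postgres table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seen_messages (key TEXT PRIMARY KEY, seen_at REAL NOT NULL)"
    )
    return conn


def is_duplicate(conn, key, now=None, window_seconds=60.0) -> bool:
    """Return True if this key was seen within the window, then record the sighting."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT seen_at FROM seen_messages WHERE key = ?", (key,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO seen_messages (key, seen_at) VALUES (?, ?)",
        (key, now),
    )
    conn.commit()
    return row is not None and now - row[0] <= window_seconds
```

Each sighting refreshes the timestamp, so a burst of forwards and reply-alls keeps being suppressed until the sender has been quiet for a full window.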
Order lookup — Shopify Admin GraphQL, the right way
Before any classification, enrich the message with order context. Extract candidate identifiers from the body (an order number such as #1234 or SHOP-1234, email, phone) using a small regex node, then query Shopify Admin GraphQL. Always GraphQL, never REST: REST is rate-limited per request (2/sec), whereas GraphQL is rate-limited by calculated query cost (50 points/sec), and a single query can fetch the order, its fulfillments, and its refunds in one trip. Both limits vary by Shopify plan, so confirm the figures for yours.
Pass q as email:<customer email> or name:#1234. If no order matches, set order_context: null and let the classifier handle it as a pre-purchase question.
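A hedged sketch of the lookup step: a regex pass that turns the message into the `q` search string, plus a GraphQL document that requests the order, fulfillments, and refunds together. The field selection is an assumption about what the reply templates need and should be checked against the Admin API version you pin; the function names and the `SHOP-` prefix handling are illustrative.

```python
import re

# Candidate query; verify field names against your pinned Admin API version.
ORDER_LOOKUP_QUERY = """
query OrderLookup($q: String!) {
  orders(first: 1, query: $q, sortKey: CREATED_AT, reverse: true) {
    edges {
      node {
        id
        name
        createdAt
        displayFulfillmentStatus
        fulfillments(first: 5) {
          status
          trackingInfo { company number url }
        }
        refunds(first: 5) { id createdAt }
      }
    }
  }
}
"""

ORDER_RE = re.compile(r"(#\d{3,}|\bSHOP-\d{3,})")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_search_query(body, sender_email=None):
    """Prefer an order number found in the body; fall back to an email address."""
    order = ORDER_RE.search(body)
    if order:
        return f"name:{order.group(1)}"
    email = EMAIL_RE.search(body)
    address = email.group(0) if email else sender_email
    return f"email:{address}" if address else None


def build_request_body(q):
    """JSON body for the POST to the Admin GraphQL endpoint."""
    return {"query": ORDER_LOOKUP_QUERY, "variables": {"q": q}}
```

When `build_search_query` returns `None`, skip the Shopify call entirely and set `order_context` to null.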
Check extensions.cost.requestedQueryCost in dev to keep every query under 50.

For WooCommerce stores, swap the query for a GET /wp-json/wc/v3/orders?search=... call and merge the shape; for BigCommerce use GET /v2/orders?email=.... The classifier downstream doesn’t care which backend produced the context, only that the shape matches.
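One way to enforce that budget in development is to read the cost block that Shopify returns alongside `data`. The sketch below assumes the response has already been parsed into a dict; the constant and function names are illustrative.

```python
QUERY_COST_BUDGET = 50  # the per-query ceiling used in this guide


def requested_cost(response: dict):
    """Pull extensions.cost.requestedQueryCost out of a parsed GraphQL response."""
    return response.get("extensions", {}).get("cost", {}).get("requestedQueryCost")


def over_budget(response: dict, budget: int = QUERY_COST_BUDGET) -> bool:
    """True when the response reports a requested cost above the budget."""
    cost = requested_cost(response)
    return cost is not None and cost > budget
```

Failing a dev build when `over_budget` returns True catches an expensive nested selection before it reaches production traffic.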
Intent classifier — Claude with a tight schema
The classifier is the brain. We send Claude the message body plus a redacted order summary and ask for a single JSON object with a fixed enum of intents. No free-form output, no creative interpretation. The later branches route on three of its fields: intent, confidence, and requires_human.
Use Claude Sonnet for this — Haiku is too literal with sentiment, Opus is overkill at the volume. Pin the model version (claude-sonnet-4-7@20260301 or whichever is current) and treat it like a contract. Schema drift will silently break your downstream branches.
For payment data, pass only paymentDetails.creditCardLastDigits: that's all the LLM ever sees. PCI DSS scope creep is a multi-figure mistake.

Validate the JSON output with a strict schema before routing. If parsing fails (it will, occasionally), fall back to requires_human=true, intent=other and let a human triage it.
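The validation-and-fallback rule can be sketched in a few lines. This is a minimal illustration rather than the production schema: the intent names are the ones that appear in this guide, the enum should be extended to match your own, and `parse_classification` is a hypothetical helper name.

```python
import json

# Intents named in this guide; extend to match your own enum.
INTENTS = {"shipping_tracking", "refund_return", "defective_damaged", "other"}

FALLBACK = {"intent": "other", "confidence": 0.0, "requires_human": True}


def parse_classification(raw) -> dict:
    """Return a validated classification, or the human-triage fallback."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)

    intent = data.get("intent")
    confidence = data.get("confidence")
    requires_human = data.get("requires_human")

    valid = (
        isinstance(intent, str)
        and intent in INTENTS
        and isinstance(confidence, (int, float))
        and not isinstance(confidence, bool)
        and 0.0 <= confidence <= 1.0
        and isinstance(requires_human, bool)
    )
    if not valid:
        return dict(FALLBACK)
    return {
        "intent": intent,
        "confidence": float(confidence),
        "requires_human": requires_human,
    }
```

Any unknown intent, out-of-range confidence, or malformed JSON lands in the same fallback, so a schema change upstream degrades to human triage instead of a wrong branch.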
Auto-response branch — drafts that cite real data
For every intent with confidence ≥ 0.85 and requires_human = false, generate a personalized reply that cites the actual order. The trick is to compose the reply from two layers: a hand-written template per intent, and slot values that Claude fills in. Claude's job is to fill the slots, not to write English from scratch.
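A small sketch of the gate and the two-layer composition, assuming the slot values have already been produced upstream. The template wording, slot names, and function names are hypothetical; only the 0.85 threshold and the `requires_human` flag come from the text above.

```python
from string import Template

CONFIDENCE_THRESHOLD = 0.85

# One hand-written template per intent (wording is illustrative).
TEMPLATES = {
    "shipping_tracking": Template(
        "Hi $first_name, your order $order_name is $status. "
        "Track it here: $tracking_url"
    ),
}


def should_auto_reply(classification: dict) -> bool:
    """Gate: high confidence, no human flag, and a template exists for the intent."""
    return (
        classification["confidence"] >= CONFIDENCE_THRESHOLD
        and not classification["requires_human"]
        and classification["intent"] in TEMPLATES
    )


def render_reply(intent: str, slots: dict):
    """Fill the template; return None (escalate) if the intent or any slot is missing."""
    try:
        return TEMPLATES[intent].substitute(slots)
    except KeyError:
        return None
```

`Template.substitute` raises on a missing slot, which is the useful property here: an incomplete draft escalates instead of being sent with a hole in it.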
For shipping queries, do a parallel call to the carrier API (EasyPost wraps UPS, USPS, FedEx, DHL behind one endpoint) so the reply contains a live status, not a stale "shipped on Tuesday." For refunds, compute eligibility against the return window in n8n — never let the LLM decide policy.
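The refund rule is plain date arithmetic, which is why it belongs in deterministic code. A minimal sketch, assuming a 30-day window (a hypothetical value; substitute your own policy) measured from the order's `createdAt` timestamp in ISO 8601 form:

```python
from datetime import datetime, timedelta, timezone

RETURN_WINDOW_DAYS = 30  # hypothetical policy constant


def is_return_eligible(order_created_at: str, now=None,
                       window_days: int = RETURN_WINDOW_DAYS) -> bool:
    """Compare an ISO 8601 createdAt timestamp against the return window."""
    # Normalise a trailing "Z" so older Python versions can parse it.
    created = datetime.fromisoformat(order_created_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now - created <= timedelta(days=window_days)
```

The boolean result selects which template branch renders; the model never sees the policy constant and cannot talk its way around it. If your policy counts from delivery rather than order date, pass the delivery timestamp instead.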
Human handoff — context bundle, suggested reply, calm priority
Anything flagged requires_human = true goes to Gorgias as a new ticket with a structured internal note. The note is the difference between a 12-minute resolution and a 4-minute one — give the agent everything they would have looked up themselves, plus a draft reply they can accept, edit, or discard.
Do not grant tickets:delete for this workflow, only tickets:write.

Route by intent: fraud and chargebacks to a dedicated reviewer, sizing/fit to the merchandising team if you have one, the rest to the general queue. Use Gorgias' "rules" engine to enforce SLAs based on the urgency tag.
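The routing rule and the internal note can be sketched together. Everything named here is illustrative: the intent keys (`fraud`, `chargeback`, `sizing_fit`), the queue names, and the note layout are assumptions, and the code only builds the note text rather than calling the Gorgias API.

```python
from typing import Optional

# Hypothetical intent-to-queue mapping; anything unlisted goes to the general queue.
QUEUE_BY_INTENT = {
    "fraud": "fraud-review",
    "chargeback": "fraud-review",
    "sizing_fit": "merchandising",
}
DEFAULT_QUEUE = "general"


def route_queue(intent: str) -> str:
    return QUEUE_BY_INTENT.get(intent, DEFAULT_QUEUE)


def build_internal_note(classification: dict, order_context: Optional[dict],
                        draft_reply: str) -> str:
    """Bundle what the agent would otherwise look up, plus a suggested reply."""
    if order_context:
        order_line = f"Order: {order_context['name']} ({order_context['status']})"
    else:
        order_line = "Order: none matched (possible pre-purchase question)"
    return "\n".join([
        f"Intent: {classification['intent']} "
        f"(confidence {classification['confidence']:.2f})",
        f"Urgency: {classification.get('urgency', 'normal')}",
        order_line,
        "Suggested reply:",
        draft_reply,
    ])
```

Keeping the note as plain labelled lines makes it easy to scan in the ticket sidebar and easy to parse later if you want to audit how often agents accept the draft.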
Klaviyo event push — close the loop, fuel the next campaign
Every resolved ticket pushes a custom event into Klaviyo. This event is the trigger for win-back flows after refunds, post-purchase NPS asks after happy resolutions, and a "we've fixed it" check-in 7 days after a defective-product replacement. The event payload should be lean and queryable.
In Klaviyo, build flows that branch on the intent property: refund_return triggers a 14-day win-back; defective_damaged triggers a quality-assurance follow-up; shipping_tracking + resolved_by=ai goes into a "first-time positive AI experience" segment for the brand team to tag. Respect email_consent — Klaviyo enforces this, but n8n should pre-check before the POST to avoid wasted API calls.
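A sketch of the consent pre-check and a lean event body. The nested `data`/`attributes` layout follows the JSON:API shape of Klaviyo's event-creation endpoint as I understand it, so verify it against the API revision you pin. The metric name `CS Ticket Resolved`, the `ticket_id` property, and the function name are hypothetical; `intent`, `resolved_by`, and `email_consent` are the names used above.

```python
def build_klaviyo_event(email: str, email_consent: bool, intent: str,
                        resolved_by: str, ticket_id: str):
    """Return the event request body, or None when the profile has not consented."""
    if not email_consent:
        return None  # skip the POST entirely
    return {
        "data": {
            "type": "event",
            "attributes": {
                "metric": {
                    "data": {
                        "type": "metric",
                        "attributes": {"name": "CS Ticket Resolved"},
                    }
                },
                "profile": {
                    "data": {"type": "profile", "attributes": {"email": email}}
                },
                # Keep properties flat so flows can branch on them directly.
                "properties": {
                    "intent": intent,
                    "resolved_by": resolved_by,
                    "ticket_id": ticket_id,
                },
            },
        }
    }
```

Returning `None` before any network call is what saves the wasted API request for un-consented profiles.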
Common failure modes and how we caught them in production
Every team building this hits the same six bugs. Here is the short list, ordered by how much money they cost before we noticed.
Idempotency drift on retries
Webhook retries fired the refund branch twice. Fix: a Postgres processed_messages table keyed on (channel, external_id) with a 7-day TTL.
GraphQL cost spikes
A nested fulfillments(first: 50) blew the budget. Fix: cap to first 5 and paginate only on demand.
Sentiment false-positives in Hebrew
Polite Hebrew customer-service language scored as negative. Fix: pass language hint into the system prompt.
Auto-replies to no-reply addresses
Bounce loops. Fix: maintain a regex blocklist (noreply@, postmaster@, mailer-daemon@) before sending.
Stale carrier ETAs
Cached EasyPost responses showed 3-day-old ETAs. Fix: 15-minute cache TTL with explicit refresh=true on customer reply.
Schema drift after model upgrade
A pinned model deprecated mid-quarter; default rolled forward and broke our enum. Fix: regression suite of 200 historical tickets in CI.
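As one concrete example from the list above, the no-reply blocklist can be a single anchored regex checked before any send. The three local parts come from the list; the hyphenated and `donotreply` variants are an assumed extension, and the function name is illustrative.

```python
import re

# Anchored at the start so "dana.noreply@..." is not caught by accident.
NO_REPLY_RE = re.compile(
    r"^(no[-_.]?reply|do[-_.]?not[-_.]?reply|postmaster|mailer-daemon)@",
    re.IGNORECASE,
)


def is_blocked_sender(address: str) -> bool:
    """True when the sender's local part is a known automated mailbox."""
    return bool(NO_REPLY_RE.match(address.strip()))
```

Run the check on the envelope's sender before the auto-reply branch, and route blocked senders to a quiet log rather than to the human queue.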
Privacy & compliance — the non-negotiables
A customer-service pipeline touches PII on every execution. Which regimes apply depends on where your buyers live; of the five below, PCI DSS and Shopify's terms always apply.
| Regime | Applies when | What it forces you to do |
|---|---|---|
| PCI DSS | Always | Never pass full card PAN to the LLM. Last-4 only via Shopify. |
| GDPR | Any EU/UK buyer | DSAR endpoint, 30-day deletion, lawful basis logged per profile. |
| CCPA / CPRA | CA buyers | "Do Not Sell" honored; per-event opt-out flag in Klaviyo. |
| CAN-SPAM | Any US buyer | Physical address + working unsubscribe in every auto-reply. |
| Shopify ToS | Always | 2 req/sec REST, 50 cost-points/sec GraphQL, no scraping admin UI. |
Anthropic's API does not train on your inputs by default, but log retention is 30 days. If your DPA requires zero retention, request the zero-data-retention amendment before go-live. Klaviyo's consent rules are stricter than the law in most jurisdictions — never push events for un-consented profiles.
Results — what a 12-month rollout looked like
Numbers from a US-based apparel brand running roughly 800 inbound tickets per week across Shopify Plus, Gorgias, and WhatsApp Business. Twelve months from go-live.
| Metric | Before | After | Delta |
|---|---|---|---|
| Avg. first-response time | 3 h 42 min | 12 sec | −99.9% |
| Tickets resolved without human | 0% | 65% | +65 pts |
| Annual CS labor cost | $148K (2 FTE) | $58K (0.7 FTE) | −$90K / yr |
| Refund-related cost | 8.1% of revenue | 4.9% | −3.2 pts |
| NPS (post-CS interaction) | +24 | +42 | +18 |
The refund-cost number is worth dwelling on: proactive shipping updates (Step 4) intercepted enough "where is my order" panic to suppress a measurable share of speculative chargebacks.
Four-week implementation timeline
WEEK 1
Audit & access
Shopify Admin API token, Gorgias REST creds, sample 1000 historical tickets, classify by hand to build the gold set.
WEEK 2
Classifier + prompts
Wire n8n webhooks, prompt-engineer Claude against the 1000-ticket gold set, target ≥92% intent accuracy.
WEEK 3
Gorgias + reply templates
Build handoff bundle, draft templates per intent, sentiment fallback, internal QA on shadow traffic.
WEEK 4
Klaviyo + go-live
Wire Klaviyo events, launch behind a manual approval queue, watch metrics for 5 days, then unlock auto-send for high-confidence intents.
Frequently asked questions
For other platforms, swap the Shopify lookup for the WooCommerce REST API (/wp-json/wc/v3/orders) or BigCommerce v3 Catalog/Orders API. The classifier, reply templates, and Gorgias handoff are platform-agnostic. Watch out: WooCommerce rate limits depend on the host (often nginx-level), and BigCommerce enforces per-store hourly caps.

Refund eligibility is decided in n8n by comparing order.createdAt against your return-window constant. The reply template only renders the eligible/ineligible branch: Claude fills the personalisation slots, the deterministic code holds the policy line. This is the single most important guardrail in the whole pipeline.

Ready to ship this in your store?
We've built this exact pipeline for Shopify Plus brands, DTC apparel, beauty, and home-goods stores. Four weeks to live, fixed-price, your data stays in your stack.
