Article · March 5, 2026 · 12 min read
Tags: make, data-store, state-machine, idempotency, automation

Make.com Data Store State Machine: Eliminate Replay Errors

A Make.com Data Store tracks event state when retries, failures, and replays hit production. This guide shows how to build a rerun-safe state machine quickly.

Short on time

Start with the key sections below, then jump to FAQ for direct answers. If you need implementation help, use the contact button and I will map the shortest safe rollout path.


The problem: Make.com scenarios have no memory by default

Across dozens of production Make.com workflow fixes, I have seen duplicate writes and hidden retries damage CRM and finance operations. In almost every incident, the scenario logic looked reasonable in isolation, but there was no durable memory of what had already happened for each business event.

That gap causes three operational failures:

  • duplicate creates when the same webhook event is retried,
  • partial completion when one module fails after earlier side effects,
  • unsafe manual reruns that multiply damage instead of recovering state.

Make.com run history gives execution traces, but execution traces are not business state. If you need implementation context for how I run these fixes, start at About. If you already have retries creating real incidents, the relevant delivery lane is Make.com error handling.

Two recurring incident snapshots:

  • Typeform to HubSpot intake: webhook retry windows replayed the same submission, so contact writes duplicated; source-keyed state gating blocked replay writes.
  • Finance reconciliation lane: one side effect succeeded and the next failed; failed-state write plus owner alert made replay deterministic and safe.

The fix is to add explicit memory. In Make.com, the most practical way is a Data Store used as a state machine.

What "state machine" means in plain terms

You do not need academic computer science for this pattern.

A state machine here is just a strict record of where each business event currently sits. Every event gets one stable key. That key can move through allowed statuses only. When a replay or retry arrives, you do not guess. You look up state and route deterministically.

Minimal state model:

  • new: event accepted but not yet processed,
  • processing: in-flight and not confirmed,
  • completed: side effects confirmed,
  • failed: processing stopped with known reason,
  • dead_letter: retried and escalated for manual handling.

Allowed transitions are explicit:

  • new -> processing -> completed,
  • processing -> failed,
  • failed -> processing -> completed,
  • failed -> dead_letter.

Anything else is blocked and logged. This alone removes most ambiguity during incidents.
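The allowed transitions above can be encoded as a small whitelist that every update path checks before writing. This is a minimal sketch (the function and data names are illustrative, not Make.com built-ins):

```python
# Whitelisted transitions for the event state machine.
# Anything not in this set is blocked and should be logged.
ALLOWED = {
    ("new", "processing"),
    ("processing", "completed"),
    ("processing", "failed"),
    ("failed", "processing"),
    ("failed", "dead_letter"),
}

def is_allowed(current: str, target: str) -> bool:
    """Return True only for explicitly whitelisted transitions."""
    return (current, target) in ALLOWED

assert is_allowed("failed", "processing")         # retry path is legal
assert not is_allowed("completed", "processing")  # replays are blocked
assert not is_allowed("processing", "new")        # no rewinding in-flight work
```

In Make.com itself this becomes a router condition before each Update record module, but keeping the table explicit somewhere (even in a comment) removes arguments during incidents.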

Why Make.com Data Store is the right place for workflow state

Teams often ask if Google Sheets or Airtable can do the same job. They can store rows, but they are weak for state control inside high-frequency scenario execution.

| Option | Operational issue in production |
| --- | --- |
| Google Sheets | Slow under burst load, row-level race conditions, fragile concurrent writes. |
| Airtable | Extra API dependency, rate limits, external outage can block critical path. |
| Make.com Data Store | Native, low-latency in-scenario access, fewer moving parts. |

For production reliability, every extra API dependency increases incident surface area. A Data Store keeps state tracking in the same runtime as your scenario logic. That reduces failure modes and simplifies debugging.

State schema diagram for Make.com Data Store with core and incident fields

This schema diagram matches the exact field model used in this guide.

Architecture overview

Use this routing shape as baseline:

Trigger (webhook or schedule)
  -> Normalize payload
  -> Generate processing_id from source event
  -> Lookup processing_id in Data Store
      -> completed: skip + log duplicate-prevented
      -> processing: skip + lock-protection log
      -> failed: route to controlled retry path
      -> not found: create state row and continue
  -> Set status=processing
  -> Execute write actions (CRM, ERP, billing, alerts)
      -> success: set status=completed + updated_at
      -> failure: set status=failed + error + owner alert

Two design rules matter most:

  1. processing_id must come from source data, never Make execution ID.
  2. State update must happen before alert send on failure branch.

If alert send fails first and state is not written, you lose incident truth.

Step 1: Create a Data Store schema that supports operations

Name the store by workflow lane, for example hubspot_intake_state or invoice_sync_state. Avoid one giant global store without partitioning.

Minimum fields:

| Field | Type | Why it exists |
| --- | --- | --- |
| processing_id | text key | Deterministic idempotency key. |
| status | text | new, processing, completed, failed, dead_letter. |
| source | text | Source system and event type. |
| created_at | text | First-seen timestamp. |
| updated_at | text | Last transition timestamp. |
| error_code | text | Stable class for failure grouping. |
| error_message | text | Fast operator context. |
| execution_id | text | Trace link to Make run history. |
| owner | text | Incident ownership route. |

This schema is intentionally small. It supports both dedupe and incident operations without becoming a warehouse.

Step 2: Generate a stable processing_id

The key must represent business intent, not technical attempt.

Good key sources:

  • Typeform submission ID,
  • HubSpot event ID,
  • invoice number plus source system ID,
  • deterministic hash of normalized payload fields.

Bad key sources:

  • Make execution ID (changes on retry),
  • current timestamp alone,
  • random UUID generated per run.

In one finance reconciliation lane, I inherited keys based on execution timestamp. Same invoice event generated new keys on every retry, so dedupe never triggered. Converting key logic to source invoice ID removed duplicate write incidents in the first week.
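When the source system exposes no single stable ID, a deterministic hash of normalized payload fields works as the key. A sketch, assuming the relevant business fields are `source` and `invoice_number` (the field choices and normalization rules are illustrative):

```python
import hashlib
import json

def processing_id_from_payload(payload: dict) -> str:
    """Build a deterministic idempotency key from business fields only.
    Never include execution IDs, timestamps, or random values."""
    # Normalize: pick stable business fields, strip whitespace, lowercase.
    normalized = {
        "source": payload["source"].strip().lower(),
        "invoice_number": payload["invoice_number"].strip(),
    }
    # sort_keys=True gives a stable byte representation regardless of
    # the order fields arrive in the webhook payload.
    canonical = json.dumps(normalized, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The same business event yields the same key across retries:
a = processing_id_from_payload({"source": "ERP ", "invoice_number": "INV-1001"})
b = processing_id_from_payload({"invoice_number": "INV-1001", "source": "erp"})
assert a == b
```

In Make.com the equivalent is a `sha256()` function call over concatenated, trimmed source fields in a Set variable module before any router.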

Flow segment where processing_id is generated before any write module

Generate processing_id before routing so every branch uses the same key.

Step 3: Lookup state before every side effect

Do not write first and reconcile later. Always check state before each critical write.

Routing policy:

  • If record exists with completed: skip write, log duplicate_prevented=true.
  • If record exists with processing: skip or delay. This protects against overlap and lock contention.
  • If record exists with failed: route to retry branch, do not re-enter normal happy path blindly.
  • If record not found: create row, then continue.

This is the core idempotency gate. Without it, retries and manual reruns remain unsafe even if downstream APIs are stable.

Router branch based on Data Store search result and current status

Status-aware routing is the control point that prevents duplicate writes under retry pressure.

Step 4: Enforce explicit state transitions

Update state at three mandatory points:

  1. Before first write action: status=processing.
  2. After confirmed success: status=completed.
  3. On failure handler: status=failed plus error details.

If these updates are inconsistent, your ledger stops being trustworthy. Teams then fall back to manual interpretation of run logs, which is exactly what this pattern is meant to avoid.

Add transition guards:

  • reject completed -> processing unless manual replay flag is present,
  • reject processing -> new,
  • reject direct new -> completed without write confirmation.

You can enforce guards with branch conditions before Update record modules.

Scenario view showing processing, completed, and failed transitions as separate updates

Transition modules should be explicit and auditable, not implicit side effects.

Step 5: Put error handling on the write path, not only at scenario level

Scenario-level failure notifications are too broad. You need branch-level failure context tied to the same processing_id.

Failure branch sequence:

  1. Capture module error context.
  2. Update Data Store to failed with error_code and error_message.
  3. Send Slack or email alert with owner, key, source, and link to run.
  4. Stop branch.

Alert payload should answer operational questions immediately:

  • What failed?
  • Which record is affected?
  • Who owns it?
  • Is replay safe now?

If the answer is missing, alert quality is low and mean time to resolution will stay high.

Error handler sequence showing failed-state write before owner alert and branch stop

Required order is explicit: failed-state write, then alert, then stop.
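The failure-branch ordering can be sketched as follows. The store and alert functions here are stand-ins for the Data Store Update record module and a Slack/email module; the key point is that the state write happens before the alert, and an alert-channel outage cannot erase it:

```python
class FakeStore:
    """Stand-in for a Make.com Data Store (illustrative, not the real API)."""
    def __init__(self):
        self.rows = {}
    def update(self, key, fields):
        self.rows.setdefault(key, {}).update(fields)

def alert_owner(processing_id, owner, execution_id, err):
    raise RuntimeError("Slack down")  # simulate an alert-channel outage

def handle_failure(store, processing_id, err, owner, execution_id):
    """Failure branch: write failed state FIRST, then alert, then stop."""
    store.update(processing_id, {
        "status": "failed",
        "error_code": type(err).__name__,
        "error_message": str(err),
    })
    try:
        alert_owner(processing_id, owner, execution_id, err)
    except Exception:
        pass  # alert delivery failure must not erase the state write above

store = FakeStore()
handle_failure(store, "evt-1", ValueError("HubSpot 500"), "revops", "run-1")
assert store.rows["evt-1"]["status"] == "failed"  # truth survives the outage
```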

Step 6: Separate retry processing from main ingestion

Do not cram complex retry rules inside the main scenario. Keep main flow focused on first-pass processing and route failures to a dedicated retry scenario.

Retry scenario pattern:

  • schedule every hour,
  • search Data Store for status=failed and retry count below threshold,
  • replay deterministic business step using same processing_id,
  • on success set completed,
  • on repeated failure set dead_letter and escalate.

This keeps ingestion fast and makes retry behavior observable.
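The retry lane above can be sketched as a scheduled sweep over failed rows. This assumes a `retry_count` field on top of the minimum schema and a threshold of 3 (both illustrative; tune per lane):

```python
RETRY_THRESHOLD = 3  # assumption: escalate after 3 failed replays

def run_retry_lane(rows, replay_step, escalate):
    """Scheduled retry scenario: replay failed rows, escalate past threshold."""
    for key, row in rows.items():
        if row["status"] != "failed":
            continue
        if row.get("retry_count", 0) >= RETRY_THRESHOLD:
            row["status"] = "dead_letter"
            escalate(key)  # owner alert for manual handling
            continue
        try:
            replay_step(key)  # deterministic step keyed by processing_id
            row["status"] = "completed"
        except Exception as err:
            row["retry_count"] = row.get("retry_count", 0) + 1
            row["error_message"] = str(err)

# One recoverable row, one exhausted row:
rows = {
    "inv-1": {"status": "failed", "retry_count": 1},
    "inv-2": {"status": "failed", "retry_count": 3},
}
escalated = []
run_retry_lane(rows, replay_step=lambda key: None, escalate=escalated.append)
assert rows["inv-1"]["status"] == "completed"
assert rows["inv-2"]["status"] == "dead_letter"
```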

Dedicated retry lane flow for failed records with threshold and dead-letter escalation

Retry logic is isolated from ingestion and escalates to dead letter after threshold.

Step 7: Add lock protection for concurrent arrivals

State machines fail if concurrent runs both think they own the same event. Add lock-like semantics with processing state and short timeout windows.

Practical approach:

  • first run sets processing with timestamp,
  • second run encountering processing exits as duplicate or waits,
  • stale processing rows older than threshold are moved to failed and reviewed.

In a marketing intake lane, this removed duplicate task creation during campaign spikes where webhook arrivals overlapped heavily.
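The stale-lock sweep can be sketched as a scheduled pass over processing rows. The 15-minute threshold is an assumption; set it above your longest legitimate run time:

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=15)  # assumption: tune per lane

def sweep_stale_locks(rows, now):
    """Move processing rows older than the threshold to failed for review."""
    moved = []
    for key, row in rows.items():
        if row["status"] != "processing":
            continue
        locked_at = datetime.fromisoformat(row["updated_at"])
        if now - locked_at > STALE_AFTER:
            row["status"] = "failed"
            row["error_code"] = "stale_lock"
            moved.append(key)
    return moved

now = datetime(2026, 3, 5, 11, 0, tzinfo=timezone.utc)
rows = {
    "evt-old": {"status": "processing", "updated_at": "2026-03-05T10:00:00+00:00"},
    "evt-new": {"status": "processing", "updated_at": "2026-03-05T10:58:00+00:00"},
}
assert sweep_stale_locks(rows, now) == ["evt-old"]
assert rows["evt-new"]["status"] == "processing"  # fresh lock is untouched
```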

Rerun-safe routing with explicit handling for already processing records

Concurrent protection is required when webhook bursts hit the same key repeatedly.

Copy-paste blueprint (router + Data Store row)

Use this as a first-pass runbook baseline:

processing_id = build_from_source_event(payload)
record = data_store.get(processing_id)

if record exists and record.status == "completed":
  log("duplicate_prevented", processing_id)
  stop

if record exists and record.status == "processing":
  log("lock_protection", processing_id)
  stop

if record exists and record.status == "failed":
  route_to_retry_lane(processing_id)
  stop

if record not found:
  data_store.create({
    processing_id,
    status: "new",
    source,
    created_at: now(),
    updated_at: now(),
    execution_id,
    owner
  })

data_store.update(processing_id, { status: "processing", updated_at: now() })

try:
  run_external_writes(payload)
  data_store.update(processing_id, {
    status: "completed",
    error_code: "",
    error_message: "",
    updated_at: now()
  })
except err:
  data_store.update(processing_id, {
    status: "failed",
    error_code: normalize(err),
    error_message: operator_safe(err),
    updated_at: now()
  })
  alert_owner(processing_id, owner, execution_id, err)
  stop

Reference row shape:

{
  "processing_id": "source_event_key",
  "status": "new",
  "source": "typeform.submit",
  "created_at": "2026-03-05T10:20:00Z",
  "updated_at": "2026-03-05T10:20:00Z",
  "error_code": "",
  "error_message": "",
  "execution_id": "make_run_12345",
  "owner": "revops_oncall"
}

Real implementation: Typeform to HubSpot with state tracking

Here is a concrete runbook pattern based on a real lane similar to Typeform to HubSpot dedupe:

  1. Typeform webhook arrives.
  2. Normalize payload and compute processing_id=submission_id.
  3. Search Data Store.
  4. If not found, create state row with new and metadata.
  5. Set state to processing.
  6. Check HubSpot for existing contact by email plus external id.
  7. Create or update contact.
  8. On success set state completed with timestamp.
  9. On failure set state failed, push owner alert, stop.

Before state machine:

  • repeated submissions created duplicate contacts,
  • failed handoffs were discovered late,
  • manual replay caused more duplicates.

After state machine:

  • duplicate writes were blocked by key,
  • failed records were visible immediately,
  • replay became deterministic and low risk.

If your current intake and lifecycle flows already show these symptoms, this is usually the right time to review HubSpot workflow audit: 7 silent failures alongside Make.com retry logic.

When you do not need this pattern

Use this model where reliability and replay safety matter. Skip it when complexity does not pay back.

Usually not needed:

  • personal automations with no external writes,
  • one-off migration scripts,
  • low-value prototypes where duplicate side effects are acceptable.

Usually required:

  • scenarios writing to CRM, ERP, billing, or finance systems,
  • workflows with webhook retries or burst traffic,
  • processes with audit requirements,
  • any lane where duplicate writes have business cost.

A simple decision rule:

  • if duplicate write has near-zero cost, keep it simple,
  • if duplicate write creates cleanup and trust cost, add state machine now.

Common mistakes that break state-machine reliability

Mistake 1: Using execution_id as the primary key

Execution ID changes every retry. Deduplication fails by design. Use source-derived keys.

Mistake 2: Updating state only at the end

If a module fails before final update, records stay ambiguous. Update state before and after critical operations.

Mistake 3: No handling for processing status

Concurrent runs collide and both proceed. Always route processing to lock-safe path.

Mistake 4: Alerting before failed-state write

If notification fails, incident evidence disappears. Write state first, alert second.

Mistake 5: No retention policy

State tables grow forever, search performance degrades, and operations slow down. Archive completed rows by age on a schedule.
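A retention sweep can be sketched as follows, assuming a 30-day window for completed rows (illustrative; failed and dead-letter rows stay live until resolved):

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # assumption: keep completed rows 30 days

def archive_completed(rows, archive_sink, now):
    """Move aged completed rows out of the live Data Store into cold storage."""
    for key in list(rows):  # list() so we can delete while iterating
        row = rows[key]
        if row["status"] != "completed":
            continue
        if now - datetime.fromisoformat(row["updated_at"]) > RETENTION:
            archive_sink.append({"processing_id": key, **row})
            del rows[key]

now = datetime(2026, 3, 5, tzinfo=timezone.utc)
rows = {
    "old": {"status": "completed", "updated_at": "2026-01-01T00:00:00+00:00"},
    "recent": {"status": "completed", "updated_at": "2026-03-01T00:00:00+00:00"},
    "open": {"status": "failed", "updated_at": "2026-01-01T00:00:00+00:00"},
}
archive = []
archive_completed(rows, archive, now)
assert set(rows) == {"recent", "open"}  # failed rows are never archived
```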

Pre-release verification checklist

Run this before marking the scenario production-ready:

  • processing_id comes from source event and is deterministic.
  • Every external write has a state lookup gate.
  • processing, completed, and failed transitions are explicit.
  • Error handler writes failed state before sending alerts.
  • Retry lane exists for failed rows with threshold and escalation.
  • Concurrent arrivals with same key are tested and safe.
  • Duplicate webhook simulation does not create duplicate downstream writes.
  • Manual replay of a failed item completes missing side effect only once.
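The duplicate-webhook and replay checks above can be exercised against a miniature model of the gate before you test the live scenario. Everything here is a stand-in (a dict for the Data Store, a list for the downstream CRM write), but the routing logic is the same:

```python
def process_event(store, payload, writes_log):
    """Idempotent handler: the state lookup gates the external write."""
    pid = payload["submission_id"]  # source-derived key, stable across retries
    row = store.get(pid)
    if row and row["status"] == "completed":
        return "duplicate_prevented"
    store[pid] = {"status": "processing"}
    writes_log.append(pid)  # stands in for the CRM create/update
    store[pid]["status"] = "completed"
    return "processed"

# Simulate a webhook retry replaying the same submission:
store, writes = {}, []
assert process_event(store, {"submission_id": "sub-1"}, writes) == "processed"
assert process_event(store, {"submission_id": "sub-1"}, writes) == "duplicate_prevented"
assert len(writes) == 1  # downstream write happened exactly once
```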

For full cross-system checks, use the free 12-point checklist. If you want direct implementation help, use Contact.

Operational metrics that prove this works

Architecture is not enough. You need measurable outputs.

Track at minimum:

  • duplicate-created count,
  • duplicate-prevented count,
  • failed backlog older than SLA,
  • median owner response time,
  • replay success rate,
  • dead-letter volume per week.

If duplicate-created is not near zero after rollout, inspect key design and branch gates first. In most cases, key instability or missed write-path gating is the root cause.

For finance-critical lanes, compare with outcomes in the VAT automation case, where rerun safety and explicit ownership are non-negotiable.

FAQ

Is Make.com Data Store reliable enough for production state tracking?

Yes, for many B2B automation lanes it is reliable enough when schema, key design, and transition rules are explicit. The failures I see usually come from weak key logic or missing branch controls, not from Data Store itself. If your volume is extreme, partition by workflow and retention window.

Should I keep automatic retries enabled in Make.com modules?

Keep automatic retries only where operations are safely idempotent. For write-heavy branches, controlled retries through state-machine logic are safer because they preserve ownership and avoid hidden duplicate side effects. Do not assume "retry equals safe" without explicit state checks.

How many states do I actually need to start?

Start with new, processing, completed, and failed. Add dead_letter when you need escalation after repeated failures. More states are not automatically better. Clear transitions and ownership are more important than a complex status model.

How do I explain this pattern to non-technical ops stakeholders?

Use business language: every incoming event gets a tracking card, and that card can only move through approved statuses. This prevents duplicate actions and makes failures visible with owner accountability. Most stakeholders understand this model quickly when shown one real incident timeline.

Next steps

If your Make.com workflows are already creating retries and duplicate risk:

Free checklist: HubSpot workflow reliability audit.

Get the PDF immediately after submission. Use it to catch duplicate contacts, retries, routing gaps, and required-field misses before your next workflow change.

Free 30-minute discovery call available after review. Paid reliability audit from €500 if fit is confirmed.

Need this retry-safe implementation shipped in your stack?

Start with an implementation audit: I will map the current failure mode, replay risk, and the safest rollout sequence. Book a free 30-minute audit-scoping call; a paid reliability audit starts from €500 if fit is confirmed.