Article · March 5, 2026 · 10 min read · Tags: make, retry, idempotency, error-handling, automation

Make.com Retry Logic: Replay Failed Webhooks Without Duplicates

Make.com retry logic must survive timeouts, retries, and manual replay without creating duplicate CRM or finance writes. Learn the production-safe pattern.

How to replay failed webhooks in Make.com without writing twice

Most teams reach this page after searching for the safest way to replay failed webhooks in production without causing duplicates.

The most expensive Make.com failures are not red-error crashes. They are green runs that retried quietly, wrote twice, and looked fine until someone compared records on Monday morning.

The pattern repeats:

  • webhook fires from a form or source system,
  • Make.com starts execution,
  • one module times out or returns a transient error,
  • scenario retries,
  • external write has already succeeded once,
  • second write creates a duplicate contact, duplicate task, or conflicting status.

Most teams detect this late because default dashboards track run success, not business-state correctness. A scenario can show "completed" and still corrupt CRM state.

I run this type of stabilization work in production lanes where duplicate writes directly affect routing, reporting, and finance handoff. If you want context on how I operate these projects, start with About. If you are already seeing retries create business issues, the implementation scope is Make.com error handling.

Why default retry settings are dangerous

Make.com has retry behavior that is useful for resilience but risky for data integrity if you do not add idempotency controls.

Default retries are dangerous for one reason: the platform can retry a technical operation even when the business operation already happened.

Example:

  1. Module sends Create Contact to HubSpot.
  2. HubSpot creates the record.
  3. Response is delayed, dropped, or times out.
  4. Make.com treats the step as failed.
  5. Retry executes.
  6. Second Create Contact call inserts a duplicate.

From the automation side, this looks like recoverable retry logic. From the business side, this is duplicate state that someone has to clean manually.

The risk increases in multi-step scenarios where early steps already produce side effects. A retry at step 4 can still break consistency if step 3 made a write and no dedupe guard exists.

[Image: Make.com scenario showing branch-level retry paths and external writes — scenario-level view used during a reliability audit to identify unsafe retry branches.]

Three patterns that cause duplicates

Pattern 1: Webhook retry plus CRM create

Source platforms retry webhooks when they do not receive confirmation fast enough. If your scenario treats each webhook as new intent and directly calls "create", you get duplicate records.

Typical signs:

  • same payload hash appears multiple times,
  • records have near-identical timestamps,
  • operators notice duplicates only in reporting or assignment queues.

This pattern is common in intake flows and is already visible in Typeform to HubSpot dedupe.

Pattern 2: API timeout plus automatic retry

A downstream API can process your first request but return late. Make.com marks failure and retries. Both requests complete, and now you have two writes for one business event.

Typical signs:

  • provider-side logs show one successful call before retry,
  • Make execution history shows error then success,
  • no dedupe key links both writes to one source event.

Pattern 3: Partial failure in multi-step scenario

Steps 1 to 3 succeed. Step 4 fails. The retry path re-enters with incomplete state awareness. If any earlier step was non-idempotent, replay can multiply side effects.

Typical signs:

  • branch-level inconsistency between CRM and notification system,
  • one system updated, another missing,
  • team runs manual replay and creates further duplication.

For broader incident detection around this pattern, read Make.com monitoring in production.

Implementation path

Need one replay-safe retry path in Make.com?

Use Make.com error handling for processing IDs, replay-safe state, and owner-routed alerts. If retries already created duplicate CRM records, clean that backlog separately instead of rerunning the lane blind.

How to build retry-safe logic in Make.com

Step 1: Add a dedupe check before every write

Treat dedupe as a mandatory gate, not an optional branch.

Use a Data Store ledger that records one stable processing_id per business event. Before any create or update against external systems:

  1. lookup processing_id,
  2. if exists and state=completed, skip write,
  3. if exists and state=processing, defer or route to controlled queue,
  4. if missing, create ledger row and continue.

This one guard eliminates most duplicate writes from retries.
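
As a sketch, the gate above behaves like this. This is illustrative Python, not a Make.com API: a plain dict stands in for the Data Store, and the state names and function names are assumptions that mirror the steps listed.

```python
# Dedupe gate sketch. A dict stands in for the Make.com Data Store ledger;
# the states and return values are illustrative, not platform features.
ledger = {}  # processing_id -> "processing" | "completed" | "failed"

def dedupe_gate(processing_id):
    """Decide what the run should do before any external write."""
    state = ledger.get(processing_id)
    if state == "completed":
        return "skip"    # write already confirmed; replay becomes a no-op
    if state == "processing":
        return "defer"   # still in flight; route to a controlled queue
    ledger[processing_id] = "processing"  # claim the event, then continue
    return "write"
```

The first delivery of an event returns "write"; any redelivery while it is in flight returns "defer", and any redelivery after confirmation returns "skip".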

[Image: Data Store state table used as a dedupe gate before external writes, with per-record state for idempotent gating.]

Step 2: Use a processing_id for every run

Your processing ID must come from source intent, not execution context.

Good key sources:

  • webhook event ID,
  • submission ID,
  • deterministic hash of business object and event timestamp.

Bad key source:

  • Make execution ID (changes on rerun).

If key changes on retry, dedupe fails by design.
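
One way to sketch the keying rule. Field names like `event_id`, `email`, and `submitted_at` are assumptions about the payload shape, not a fixed schema:

```python
import hashlib
import json

def make_processing_id(payload):
    """Stable key derived from source intent, never from execution context.
    Prefers the source platform's event ID; falls back to a deterministic
    hash of the business object plus event timestamp."""
    if payload.get("event_id"):
        return payload["event_id"]
    canonical = json.dumps(
        {"email": payload.get("email"), "submitted_at": payload.get("submitted_at")},
        sort_keys=True,  # key order must not change the hash
    )
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Because the key is derived from the payload, a retried delivery of the same event produces the same processing_id, which is exactly what the dedupe gate needs.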

[Image: flow where the processing_id is generated before any routing branch or write module that can touch downstream systems.]

Step 3: Track state per record

Use a minimal state machine in Data Store:

  • processing: execution started, write not yet confirmed,
  • completed: write confirmed and safe against replay,
  • failed: exception caught and routed.

State transitions should be explicit and auditable. Do not rely on implied status from module colors in run history.

A practical rule I use:

  • transition to processing before first non-idempotent call,
  • transition to completed only after downstream confirmation,
  • transition to failed with reason code and module reference.

This gives operators fast incident triage and replay safety.
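
The three transition rules above can be enforced with a tiny transition table. This is a sketch, not Make.com functionality: the allowed transitions mirror the practical rule, and the audit list is the evidence operators would read during triage.

```python
from datetime import datetime, timezone

# Allowed transitions: new events enter "processing"; "processing" resolves
# to "completed" or "failed"; "failed" may re-enter "processing" only via
# controlled replay; "completed" is terminal.
ALLOWED = {
    None: {"processing"},
    "processing": {"completed", "failed"},
    "failed": {"processing"},
}

states = {}     # processing_id -> current state
audit_log = []  # append-only evidence for triage and replay decisions

def transition(processing_id, new_state, reason=None):
    current = states.get(processing_id)
    if new_state not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new_state}")
    states[processing_id] = new_state
    audit_log.append({
        "id": processing_id, "from": current, "to": new_state,
        "reason": reason, "at": datetime.now(timezone.utc).isoformat(),
    })
```

Making "completed" terminal is the replay-safety property: once a write is confirmed, no code path can move the record back into a state that allows another write.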

Step 4: Route errors to alerts, not silence

Every critical write module needs an error handler branch.

On failure:

  1. update ledger state to failed,
  2. include error class, processing_id, source object, and module name,
  3. send alert to Slack or email with owner and SLA target.

If your alert is only "scenario failed", operators still have to investigate from scratch.
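
A sketch of an alert body carrying that context. The field set follows the list above; it is an assumption about what operators need, not a Slack API requirement:

```python
def build_alert(processing_id, scenario, module, error_class, owner, sla_minutes):
    """One message with everything needed to decide replay vs. escalate."""
    return {
        "text": (
            f"FAILED: {scenario} at {module}\n"
            f"processing_id: {processing_id}\n"
            f"error: {error_class}\n"
            f"owner: {owner} | respond within {sla_minutes} min"
        )
    }
```

With the processing_id in the message, the operator can look up the exact ledger row and its state timeline instead of searching run history module by module.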

[Image: Slack alert payload with processing_id and module failure context — a format that supports fast ownership and replay decisions.]

Step 5: Disable automatic retry where unsafe

Automatic retries can stay enabled for read-only or idempotent-safe operations. For write modules with external side effects, disable auto-retry and use controlled retries via your state machine.

This gives you:

  • deterministic replay,
  • clear ownership,
  • no silent duplicate writes.

[Image: branch map separating safe automatic retries from controlled manual replay lanes.]

Complete setup: visual overview

A reliable production shape for Make.com retry logic:

Webhook Intake -> Normalize Payload -> Generate processing_id -> Lookup Data Store

  • If exists and completed -> skip write -> log duplicate-prevented
  • If exists and processing -> hold or route to operator queue
  • If missing -> create state processing -> run write modules
    • On success -> update state completed
    • On failure -> update state failed -> alert owner

The key point: the business event owns execution, not the other way around.

When teams adopt this model, retries become controlled recovery actions instead of random duplication risk.
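
The whole shape above condenses into one sketch. Everything here is a Python stand-in: the dict replaces the Data Store, and `write_fn` represents the non-idempotent external write (a CRM create, for example).

```python
ledger = {}  # processing_id -> state; stand-in for the Data Store

def handle_event(payload, write_fn):
    """Webhook intake -> key -> gate -> write -> state update, in one place."""
    pid = payload["event_id"]          # source-derived key (assumed present)
    state = ledger.get(pid)
    if state == "completed":
        return "duplicate-prevented"   # replay becomes a logged no-op
    if state == "processing":
        return "deferred"              # hold or route to operator queue
    ledger[pid] = "processing"
    try:
        write_fn(payload)              # the single external side effect
        ledger[pid] = "completed"
        return "written"
    except Exception:
        ledger[pid] = "failed"         # alert the owner with pid + reason here
        return "failed"
```

Note that the ledger, not the run, decides what happens: a second delivery of the same event returns a logged outcome instead of a second write.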

Pre-production test matrix (mandatory)

Before go-live, run simulation tests that reproduce real incident classes.

Test 1: Duplicate webhook delivery

  • Send identical webhook payload twice within 5 seconds.
  • Expected: first run writes; second run returns skip path.
  • Evidence: Data Store has one processing_id, one completed write, one duplicate-prevented log.

Test 2: Timeout after downstream success

  • Simulate delayed response from target API.
  • Expected: no second write if first succeeded.
  • Evidence: one business object created, retry path marked as prevented or controlled.

Test 3: Partial failure on step 4

  • Force module 4 to fail after modules 1 to 3 pass.
  • Expected: state changes to failed, alert sent, no duplicate side effects from replay.
  • Evidence: one failed state row with clear reason and owner route.

Test 4: Manual replay by operator

  • Replay one failed item using approved runbook.
  • Expected: only missing side effect is completed; existing side effects are not duplicated.
  • Evidence: state timeline shows deterministic transitions.

Test 5: Alert payload quality

  • Trigger controlled failure and inspect alert body.
  • Expected fields: processing_id, scenario name, module name, error class, owner, SLA window.
  • Evidence: operator can identify next action in under 2 minutes.

Test 6: High-volume burst

  • Fire 100+ events in short interval.
  • Expected: no duplicate writes under queue pressure.
  • Evidence: duplicate-created metric remains zero, duplicate-prevented metric increases but stays owned.

Until this matrix passes, do not call the scenario production-safe.

Operational metrics that prove retry safety is real

Do not stop at architecture. Track metrics that prove behavior under load.

Minimum daily set:

  • duplicate-created count,
  • duplicate-prevented count,
  • failed-state backlog age,
  • mean owner response time,
  • replay success rate,
  • unresolved failed records older than SLA.

This metric model pairs directly with HubSpot and Make.com error handling, where branch-level failures create the same ownership and replay pressure.

Common implementation mistakes

Mistake 1: One dedupe check at scenario start only

If you check once and then run multiple write branches, you can still duplicate on downstream retries. Every external write path needs its own gate.

Mistake 2: State updates after write only

If you never mark processing before write, timeouts create ambiguity: did the first call happen or not? You need pre-write state to reason about safe replay.

Mistake 3: Replay without scoped selection

Running full-scenario replay for one failed record often creates extra side effects. Replay should target one processing_id with controlled path.

Mistake 4: No service-level ownership

Engineering gets alerts but operations owns outcome, or vice versa. Define one owner per failure class and escalation target. Otherwise incidents age in queue.

Mistake 5: No audit evidence

If you cannot explain one event path end-to-end with timestamps and state transitions, you are not running production control yet.

For HubSpot-specific duplicate fallout and cleanup strategy, align this with how to prevent duplicate contacts in HubSpot workflows and CRM data cleanup.

Checklist before you ship

  • Every write module has a dedupe check before execution.
  • processing_id comes from source payload, not Make execution ID.
  • Data Store keeps processing, completed, and failed states.
  • Every critical module has an error handler and owner-routed alert.
  • Automatic retry is disabled on unsafe write modules.
  • Replay runbook is tested on intentionally duplicated input.
  • Operator can trace one event path in under 10 minutes.
  • Weekly review tracks duplicate-created versus duplicate-prevented trend.

Next steps

If your team runs Make.com in production and you want duplicate-safe retries:

I can map your highest-risk scenario first and scope the fastest safe rollout path.

FAQ

Should I disable retries everywhere in Make.com?

No. Keep automatic retries for read-only or idempotent-safe modules. Disable or tightly control retries for side-effect writes to CRM, finance, or messaging systems where duplicates have direct business cost.

Can I use Make execution ID as idempotency key?

No. Execution IDs change on rerun, so they cannot identify the same business intent across retries. Use source-level event identifiers or deterministic payload-derived keys instead.

Do I need Data Store, or can I use only module history?

Module history is not a reliable state ledger for replay decisions. You need explicit per-record state in Data Store or an equivalent persistent store to enforce dedupe and recovery logic.

What is the fastest way to audit existing scenarios for retry risk?

Start with one critical scenario, map every external write module, then verify whether each path has source-based keying, state checks, and owner-routed error handling. Missing any one of these is material risk.

Free checklist: HubSpot workflow reliability audit.

Get the PDF immediately after submission. Use it to catch duplicate contacts, retries, routing gaps, and required-field misses before your next workflow change.

Free 30-minute discovery call available after review. Paid reliability audit from €500 if fit is confirmed.

Next step

Need failed webhooks replayed without creating duplicates?

Start with Make.com error handling to stop timeouts, retries, and manual replay from creating second writes. If repeated delivery already polluted CRM records, pair the implementation fix with CRM data cleanup next.