Case Study

Payment orchestration with Temporal, Kafka, and Stripe

This case study describes a backend service for payment workflows. The hard part was not charging a card but keeping state correct when requests were retried, downstream systems were slow, and billing steps had to stay consistent across multiple services.

10K+ monthly payment transactions
Go service implementation and internal APIs
Temporal workflow engine for retries and state progression

Context

What problem needed to be solved

The platform needed a reliable payment workflow around billing and subscription operations. A single user action could trigger several dependent steps: validate intent, create or confirm a charge, persist local state, publish events, and reconcile the final outcome for internal systems.

The system could not rely on simple request-response logic because payment providers and internal consumers fail in different ways. Some failures are transient, some leave an operation partially completed, and some produce ambiguous outcomes that must be reconciled later.

Product-level context for this system is described on the MyAutoData technical context page.

MyAutoData Context

What kind of product this workflow lived in

MyAutoData is a multi-domain vehicle data platform combining user vehicle data, analytics, marketplace-style flows, and payment-related operations. That matters because the payment service was not isolated: it had to fit into a broader system with internal APIs, asynchronous processing, and user/account state that needed to stay consistent.

In practice, the payment workflows sat next to other business processes and had to handle retries, provider uncertainty, and downstream event consumers without breaking the rest of the platform.

Constraints

Why the naive approach was not enough

Solution

Architecture and engineering decisions

1. Temporal owned workflow progression

Instead of encoding retries and compensation logic across handlers and cron tasks, the payment flow was modeled as an explicit workflow. That made step ordering, retry policy, and timeout behavior visible in one place.

2. Idempotency was enforced at command boundaries

Each externally visible payment action used stable identifiers and repeat-safe commands so the same request could be replayed without creating duplicate charges. This mattered for both client retries and worker restarts.
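A minimal sketch of a repeat-safe command, assuming the key is derived from identifiers that stay the same across retries. IdempotencyKey, Charger, ChargeResult, and the identifier values are hypothetical. The in-memory map stands in for durable storage; a real service would more likely rely on a unique constraint in its database. Stripe's API also accepts an Idempotency-Key header on requests, though the source does not say whether this service used it.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// IdempotencyKey derives a stable key from identifiers that do not change
// when the same request is retried. The field choice is an assumption.
func IdempotencyKey(accountID, orderID, action string) string {
	sum := sha256.Sum256([]byte(accountID + "\x00" + orderID + "\x00" + action))
	return hex.EncodeToString(sum[:])
}

// ChargeResult is the recorded outcome of a charge command.
type ChargeResult struct {
	ChargeID    string
	AmountCents int64
}

// Charger runs a charge at most once per key and replays the stored result
// for repeated calls.
type Charger struct {
	mu     sync.Mutex
	seen   map[string]ChargeResult
	charge func(amountCents int64) (ChargeResult, error)
}

// NewCharger wraps a provider call (hypothetical) with idempotency tracking.
func NewCharger(charge func(amountCents int64) (ChargeResult, error)) *Charger {
	return &Charger{seen: make(map[string]ChargeResult), charge: charge}
}

// Charge returns the result and whether it was replayed from a previous call.
// Failed attempts are not recorded, so they can be retried with the same key.
func (c *Charger) Charge(key string, amountCents int64) (ChargeResult, bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if result, ok := c.seen[key]; ok {
		return result, true, nil
	}
	result, err := c.charge(amountCents)
	if err != nil {
		return ChargeResult{}, false, err
	}
	c.seen[key] = result
	return result, false, nil
}

func main() {
	providerCalls := 0
	charger := NewCharger(func(amountCents int64) (ChargeResult, error) {
		providerCalls++
		return ChargeResult{ChargeID: "ch_demo", AmountCents: amountCents}, nil
	})
	key := IdempotencyKey("acct_1", "order_42", "charge")
	first, replayedFirst, _ := charger.Charge(key, 1500)
	second, replayedSecond, _ := charger.Charge(key, 1500) // client retry or worker restart
	fmt.Println(first == second, replayedFirst, replayedSecond, providerCalls)
}
```

The second call returns the stored result without reaching the provider again, which is the property that matters for both client retries and worker restarts.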

3. Kafka publication went through Outbox

Instead of writing domain state and publishing events as separate, non-atomic operations, each state change was written to PostgreSQL together with an outbox record in the same transaction. A separate publisher then sent the events to Kafka asynchronously, reducing the risk of lost or premature messages.
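The outbox idea can be sketched with an in-memory store. This is an illustration under stated assumptions, not the service's code: the Store type stands in for PostgreSQL, its mutex plays the role of a database transaction, and the publish callback stands in for a Kafka producer. Store, OutboxRecord, Relay, ErrBrokerDown, and the "payments" topic name are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrBrokerDown simulates a failed publish attempt.
var ErrBrokerDown = errors.New("broker unavailable")

// OutboxRecord is an event waiting to be published.
type OutboxRecord struct {
	ID        int
	Topic     string
	Payload   string
	Published bool
}

// Store is an in-memory stand-in for the database.
type Store struct {
	mu     sync.Mutex
	status map[string]string // payment ID -> current status
	outbox []OutboxRecord
}

func NewStore() *Store {
	return &Store{status: make(map[string]string)}
}

// SetStatus changes payment state and appends the matching outbox record
// under one lock, so neither can exist without the other.
func (s *Store) SetStatus(paymentID, status string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.status[paymentID] = status
	s.outbox = append(s.outbox, OutboxRecord{
		ID:      len(s.outbox) + 1,
		Topic:   "payments", // hypothetical topic name
		Payload: paymentID + ":" + status,
	})
}

// Relay publishes pending records in insertion order and marks each one
// published only after publish succeeds. It stops at the first error so
// ordering is preserved, and returns how many records were published.
func (s *Store) Relay(publish func(OutboxRecord) error) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	published := 0
	for i := range s.outbox {
		if s.outbox[i].Published {
			continue
		}
		if err := publish(s.outbox[i]); err != nil {
			return published, err // record stays pending for the next run
		}
		s.outbox[i].Published = true
		published++
	}
	return published, nil
}

func main() {
	store := NewStore()
	store.SetStatus("pay_1", "succeeded")
	store.SetStatus("pay_2", "failed")

	brokerUp := false
	publish := func(r OutboxRecord) error {
		if !brokerUp {
			return ErrBrokerDown
		}
		fmt.Println("published", r.Topic, r.Payload)
		return nil
	}

	n, err := store.Relay(publish) // broker down: nothing is lost, records stay pending
	fmt.Println(n, err)
	brokerUp = true
	n, err = store.Relay(publish) // next run delivers both records in order
	fmt.Println(n, err)
}
```

A relay of this kind gives at-least-once delivery: if the process stops after publishing but before marking the record, the event is sent again on the next run, so consumers need to tolerate duplicates.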

4. Reconciliation handled ambiguous provider outcomes

Some failures were not true failures but unknown outcomes, such as a request that timed out after it may already have reached Stripe. Separate reconciliation logic checked the final state in Stripe and moved internal records to a consistent terminal state.
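A reconciliation pass might look roughly like the sketch below. It assumes the provider reports a status string per payment; "succeeded", "canceled", and "processing" are Stripe PaymentIntent status values, but the source does not say which Stripe objects were used, and the mapping to internal states is an assumption. Payment, State, Resolve, Reconcile, ErrNotFound, and the lookup callback are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound simulates a payment the provider does not know about.
var ErrNotFound = errors.New("payment not found at provider")

// State is the internal payment state.
type State string

const (
	StateUnknown   State = "unknown" // outcome of the provider call is not known
	StateSucceeded State = "succeeded"
	StateFailed    State = "failed"
)

// Payment is the internal record being reconciled.
type Payment struct {
	ID    string
	State State
}

// Resolve maps a provider status to an internal terminal state.
// ok is false when the provider has not reached a final status yet.
func Resolve(providerStatus string) (state State, ok bool) {
	switch providerStatus {
	case "succeeded":
		return StateSucceeded, true
	case "canceled":
		return StateFailed, true
	default:
		return StateUnknown, false // e.g. "processing": check again later
	}
}

// Reconcile checks every payment still in StateUnknown against the provider
// and updates it in place. It returns how many reached a terminal state.
func Reconcile(payments []Payment, lookup func(id string) (string, error)) int {
	resolved := 0
	for i := range payments {
		if payments[i].State != StateUnknown {
			continue // already terminal, never overwritten
		}
		status, err := lookup(payments[i].ID)
		if err != nil {
			continue // lookup failed: leave unknown and retry on the next pass
		}
		if state, ok := Resolve(status); ok {
			payments[i].State = state
			resolved++
		}
	}
	return resolved
}

func main() {
	provider := map[string]string{"pay_1": "succeeded", "pay_2": "processing", "pay_3": "canceled"}
	lookup := func(id string) (string, error) {
		status, found := provider[id]
		if !found {
			return "", ErrNotFound
		}
		return status, nil
	}
	payments := []Payment{
		{ID: "pay_1", State: StateUnknown},
		{ID: "pay_2", State: StateUnknown},
		{ID: "pay_3", State: StateUnknown},
	}
	fmt.Println(Reconcile(payments, lookup), payments)
}
```

Records whose provider status is not final are left untouched, so the pass is safe to run repeatedly until every payment reaches a terminal state.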

Client/API -> Go service -> Temporal workflow -> Stripe activity -> PostgreSQL state + outbox record -> Kafka publisher -> reconciliation flow on ambiguous outcomes

Trade-offs

Why this design was chosen

Temporal adds operational and conceptual weight compared with simple background jobs. That trade-off was worth it because the problem space was inherently stateful and failure-heavy. The team needed a deterministic place to reason about retries, timeouts, and compensation.

Kafka plus Outbox also adds moving parts, but it gives a clearer boundary between transactional state changes and asynchronous communication. For payment-related workflows, that boundary is more valuable than reducing the number of components.

Result

Outcome

Go
Temporal
Kafka
PostgreSQL
Stripe
Outbox