Stripe Integration in AI-Built Apps: What the AI Misses
Stripe is easy to wire up in a demo and hard to get right in production. The nine edge cases AI tools miss, from raw-body webhooks to dispute alerts, and how to fix each one.
Stripe is the happiest API in the world during a demo. Paste a publishable key, drop in a Checkout link, ship a Buy button, and your Lovable or Bolt prototype is accepting payments before lunch. The AI wired it up cleanly, the test card worked, and the founder is telling investors they have a monetised product.
Then real money starts moving. A customer gets charged twice because a React re-render fired two PaymentIntents. A webhook that logged fine in development starts returning 400 in production because a middleware parsed the body before signature verification. A subscription upgrade charges the full new amount without crediting the unused portion of the old plan. A chargeback lands three weeks after the sale and the founder has no evidence workflow. A European customer gets declined because 3DS was never handled, and another European customer is undercharged on VAT because the tax field is hardcoded to a flat 10 percent while their country's rate is 20.
None of this is exotic. These are the same nine categories of bug we see every time WitsCode is called in to harden a Stripe integration that an AI coding tool generated. The tools are getting better at the happy path, but the failure paths, the regulatory paths, and the retry paths are where production revenue actually lives, and those paths are still overwhelmingly wrong out of the box. This piece walks through each one, explains exactly why the AI misses it, and gives you the pattern to fix it.
Webhook signature verification and the raw body trap
Every Stripe webhook handler needs one thing before it does anything else: stripe.webhooks.constructEvent(rawBody, signature, secret). That rawBody argument is not a string the framework handed you after JSON parsing. It is the exact bytes Stripe sent, down to whitespace and key ordering. If anything has touched the body, even to re-serialise it, the signature will not match and verification will throw.
AI-generated handlers almost always fail here in a specific way. Next.js App Router code will do const body = await req.json() and pass that object to constructEvent, which is wrong. Next.js Pages Router code will accept the default bodyParser: true behaviour, which reads the stream and leaves you with req.body already parsed. Express scaffolds will mount express.json() globally before the webhook route, which silently consumes the raw stream.
The correct pattern depends on the framework. In the Next.js App Router, use const body = await req.text() inside the POST handler and pass that string to constructEvent. In the Pages Router, export const config = { api: { bodyParser: false } } from the route file and read the raw stream with a small helper like getRawBody(req). In Express, mount express.raw({ type: 'application/json' }) on the webhook route specifically, before any global JSON parser, then access req.body as a Buffer. In Supabase Edge Functions or Deno, use await req.text() and the async variant stripe.webhooks.constructEventAsync because the default constructEvent uses Node crypto that is not available.
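To see why a re-serialised body cannot pass, here is a minimal sketch of the scheme Stripe documents for manual verification: an HMAC-SHA256 over the timestamp, a dot, and the raw payload, sent in a header shaped like t=...,v1=.... The secret, event ID, and helper names are placeholders, and the sketch skips the timestamp tolerance check and multiple v1 entries, so treat it as an illustration and keep using constructEvent in real handlers.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Builds a header in the documented shape: t=<unix seconds>,v1=<hex HMAC>.
function signPayload(rawBody: string, secret: string, timestamp: number): string {
  const v1 = createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest("hex");
  return `t=${timestamp},v1=${v1}`;
}

// Returns true only if the HMAC over the exact raw body matches the header.
function verifySignature(rawBody: string, header: string, secret: string): boolean {
  const parts = new Map(
    header.split(",").map((kv): [string, string] => {
      const i = kv.indexOf("=");
      return [kv.slice(0, i), kv.slice(i + 1)];
    })
  );
  const t = parts.get("t");
  const v1 = parts.get("v1");
  if (!t || !v1) return false;
  const expected = createHmac("sha256", secret)
    .update(`${t}.${rawBody}`, "utf8")
    .digest("hex");
  const a = Buffer.from(expected, "hex");
  const b = Buffer.from(v1, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}

// Placeholder secret and payload. Note the irregular spacing in the raw JSON.
const webhookSecret = "whsec_placeholder";
const rawPayload = '{"id": "evt_hypothetical",  "type": "invoice.paid"}';
const signatureHeader = signPayload(rawPayload, webhookSecret, 1700000000);

// The untouched bytes verify; the parsed-then-stringified body does not.
const okRaw = verifySignature(rawPayload, signatureHeader, webhookSecret);
const okReserialised = verifySignature(
  JSON.stringify(JSON.parse(rawPayload)),
  signatureHeader,
  webhookSecret
);
```

The second call fails for the same reason a handler built on req.json() fails: JSON.stringify drops the original spacing, so the signed bytes no longer exist by the time verification runs.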
The failure mode is particularly nasty because it does not look like an auth error. The endpoint returns 400, Stripe retries for hours, your dashboard shows endless failed deliveries, and by the time someone checks the logs a chargeback has landed for an order that looked paid but never triggered fulfilment. Every AI-built Stripe integration we audit fails this check or passes it through sheer luck.
Idempotency on POSTs and deduplication on webhooks
Stripe guarantees at-least-once webhook delivery. It does not guarantee exactly-once. The same invoice.paid event can land twice, hours apart, if your endpoint ever returned a 500 or timed out. If your handler decrements inventory or emails a receipt without checking whether it has already processed that event ID, you will ship two widgets for one order or send a duplicate confirmation that makes you look sloppy.
The fix is a small processed-events table with a unique constraint on event.id. Before processing, try to insert the event ID. If the insert succeeds, process the event and return 200. If the insert fails with a unique violation, return 200 immediately without re-processing. Stripe sees success, retries stop, and your business logic runs exactly once regardless of how many copies of the event arrive. Do not confuse this with the timestamp tolerance in constructEvent, which is replay protection against attackers and defaults to five minutes. Legitimate Stripe retries can arrive a day or more later.
The matching problem on the outbound side is idempotency keys on POST calls. Every create operation, such as paymentIntents.create, refunds.create, and subscriptions.create, accepts an Idempotency-Key header. Stripe stores the response for at least 24 hours and replays it on any request with the same key. AI-generated code either omits the key entirely or, almost as bad, generates a fresh UUID inside the request handler. A fresh UUID per request makes the header useless, because each retry creates a new key and a new charge.
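The dedup check can be sketched roughly as follows. An in-memory Set stands in for the table with the unique constraint, and ProcessedEvents, handleWebhookEvent, and the event ID are hypothetical names. In a real database the insert and the business logic would normally share a transaction, so a crash mid-processing does not leave an event marked as done.

```typescript
// Minimal shape of the fields this sketch needs from a Stripe event.
interface StripeEventLike {
  id: string;
  type: string;
}

// Stand-in for a table with a UNIQUE constraint on event_id.
class ProcessedEvents {
  private readonly seen = new Set<string>();

  // Mirrors INSERT ... ON CONFLICT DO NOTHING: true if the row was new.
  insertIfNew(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }
}

// Returns the HTTP status to send back to Stripe.
function handleWebhookEvent(
  store: ProcessedEvents,
  event: StripeEventLike,
  process: (event: StripeEventLike) => void
): number {
  // Duplicate delivery: acknowledge it so retries stop, but do no work.
  if (!store.insertIfNew(event.id)) return 200;
  process(event);
  return 200;
}

const eventStore = new ProcessedEvents();
let receiptsSent = 0;
const invoicePaid: StripeEventLike = { id: "evt_hypothetical_1", type: "invoice.paid" };

// Stripe delivers the same event twice; the receipt goes out once.
const firstStatus = handleWebhookEvent(eventStore, invoicePaid, () => { receiptsSent += 1; });
const secondStatus = handleWebhookEvent(eventStore, invoicePaid, () => { receiptsSent += 1; });
```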
Use a stable business identifier as the key. For a checkout, that is the cart ID plus a short suffix. For a refund, it is the order ID plus the refund reason. For a subscription creation, it is the user ID plus the plan ID. When React Strict Mode double-invokes your effect in development, or Next.js prefetches a route that triggers your POST, or a network blip causes the client to retry, Stripe will return the same PaymentIntent instead of creating a second one. Stable keys are a two-line change that prevents some of the ugliest duplicate-charge bugs in the lifecycle of a startup.
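A stable key can be as simple as joining the operation name with the business identifiers. The sketch below assumes a hypothetical idempotencyKey helper and made-up cart IDs; the 255-character check reflects Stripe's documented limit on key length.

```typescript
// Derives the key from business identifiers only, so every retry of the
// same logical operation produces the same key.
function idempotencyKey(operation: string, ...businessIds: string[]): string {
  const key = [operation, ...businessIds].join(":");
  if (key.length > 255) {
    throw new Error("idempotency key exceeds 255 characters");
  }
  return key;
}

// Two attempts for the same cart share a key; a different cart does not.
const keyFirstAttempt = idempotencyKey("checkout", "cart_8f2c", "v1");
const keyRetry = idempotencyKey("checkout", "cart_8f2c", "v1");
const keyOtherCart = idempotencyKey("checkout", "cart_9a10", "v1");

// With the official Node SDK the key is passed as a request option:
//   stripe.paymentIntents.create(params, { idempotencyKey: keyFirstAttempt });
```

Nothing random and nothing time-based goes into the key, which is what lets a double-invoked effect or a client retry resolve to the same PaymentIntent.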
The refund webhook that lies about being a refund
Here is the edge case that has cost more WitsCode clients support tickets than any other: charge.refunded fires on every refund, partial or full, and the event type alone does not tell you which. The charge object inside the event carries amount, amount_refunded, and a refunded flag that turns true only once the charge is fully refunded. If amount and amount_refunded are equal, the refund was full. If amount_refunded is less than amount, a partial refund happened and more may come later.
AI-generated code frequently treats charge.refunded as a full-refund signal and marks the order refunded, closes the ticket, returns the inventory, and revokes the license. Then the customer, who only asked for fifty dollars back on a two-hundred-dollar order, loses access to a product they still paid for. You find out about it from a support thread, not your logs.
The fix is to read both fields and branch. If charge.amount_refunded === charge.amount, treat it as a full refund. Otherwise, create a partial refund record, keep the order active, and only adjust whatever portion was returned. Better still, in the current API version, listen to refund.created and refund.updated. Those events carry the refund object directly with its own amount, status, and reason, and make the partial-versus-full logic unambiguous. The older charge.refunded event remains for backwards compatibility, which is exactly why the AI training data keeps recommending it.
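The branch itself is small. Here is a sketch that assumes only the two charge fields named above; ChargeLike and classifyRefund are illustrative names, and amounts are in the smallest currency unit, as Stripe reports them.

```typescript
// The two fields from the charge object that the branch depends on.
interface ChargeLike {
  amount: number;          // original charge, in the smallest currency unit
  amount_refunded: number; // cumulative total refunded so far
}

type RefundKind = "none" | "partial" | "full";

function classifyRefund(charge: ChargeLike): RefundKind {
  if (charge.amount_refunded <= 0) return "none";
  // Only a refund of the entire charge should close out the order.
  return charge.amount_refunded >= charge.amount ? "full" : "partial";
}

// Fifty dollars back on a two-hundred-dollar order keeps the order active.
const fiftyBack = classifyRefund({ amount: 20000, amount_refunded: 5000 });
const everythingBack = classifyRefund({ amount: 20000, amount_refunded: 20000 });
```

Only the "full" result should revoke access or restock inventory; "partial" should record the refunded portion and leave the order alone.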
Tax, proration, and the subscription math vibe coders skip
Sales tax is where AI-generated pricing logic becomes a lawsuit waiting to happen. The Lovable checkout will add a 10 percent VAT line, hardcoded, because the training corpus is full of European examples. Apply that to a US customer and you are collecting sales tax you are not registered to remit, which creates legal exposure in most states, on top of the obligations that start once you cross nexus thresholds. Apply it to a UK B2B customer and you have missed the reverse-charge mechanism. Apply it to a SaaS product in the EU and you have missed the One Stop Shop (OSS) scheme, which replaced MOSS in 2021, entirely.
Stripe Tax solves this. Set automatic_tax: { enabled: true } on your Checkout Session, Subscription, or Invoice (a bare PaymentIntent has no automatic_tax parameter, so custom payment flows calculate tax through the separate Stripe Tax API), give Stripe the customer address, and it calculates the correct tax for that jurisdiction, including state and local variations in the US, EU VAT with reverse charge detection, UK VAT with thresholds, and a growing list of other regions. It monitors your nexus exposure and warns you when you are approaching registration thresholds. For SaaS in the EU, it applies the right digital-services rate. You still need to file and remit in most jurisdictions, but Stripe Tax integrates with TaxJar and Avalara for that. The alternative, rolling your own tax engine, is a multi-month project that no vibe-coded app should attempt.
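As a rough sketch, the parameters for a tax-enabled Checkout Session look something like this. The field names are real Checkout Session parameters, but the price ID, URLs, and the buildCheckoutParams helper are placeholders, and the local interface only types the fields used here, not the full API.

```typescript
// Only the Checkout Session fields this sketch uses, typed locally.
interface CheckoutParams {
  mode: "payment" | "subscription";
  line_items: { price: string; quantity: number }[];
  automatic_tax: { enabled: boolean };
  billing_address_collection: "auto" | "required";
  success_url: string;
  cancel_url: string;
}

function buildCheckoutParams(priceId: string, baseUrl: string): CheckoutParams {
  return {
    mode: "subscription",
    line_items: [{ price: priceId, quantity: 1 }],
    // Stripe Tax works out the rate, so no hardcoded percentage anywhere.
    automatic_tax: { enabled: true },
    // Stripe needs an address to determine the jurisdiction.
    billing_address_collection: "required",
    success_url: `${baseUrl}/billing/success`,
    cancel_url: `${baseUrl}/billing/cancelled`,
  };
}

const checkoutParams = buildCheckoutParams("price_placeholder", "https://example.com");
// With the official Node SDK: stripe.checkout.sessions.create(checkoutParams)
```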
Subscription proration is the other place the math gets skipped. When a customer upgrades mid-cycle, Stripe can credit the unused portion of the old plan and charge a prorated amount for the new plan, either as an immediate invoice or rolled into the next cycle. The behaviour is controlled by proration_behavior, which takes create_prorations, none, or always_invoice. AI-generated code often just calls subscriptions.update with a new price ID and no proration config, which leaves the outcome to the default (create_prorations, with the adjustment deferred to the next invoice) instead of a deliberate billing policy. The right answer is explicit: pick a billing policy, set the flag, and make sure billing_cycle_anchor is where you expect. For upgrades, proration_behavior: 'always_invoice' is the common choice because it creates the prorations and invoices them immediately, so the customer sees the difference paid right away and the next invoice is clean.
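The arithmetic behind a mid-cycle upgrade can be approximated like this. It is a simplified model of time-based proration, not Stripe's exact calculation (which also accounts for tax, discounts, and quantities), and prorateUpgrade and the plan prices are illustrative.

```typescript
interface Proration {
  credit: number; // unused portion of the old plan, in cents
  charge: number; // remaining portion of the new plan, in cents
  net: number;    // what the customer pays now under immediate invoicing
}

// All timestamps are Unix seconds; prices are per full billing period.
function prorateUpgrade(
  oldPriceCents: number,
  newPriceCents: number,
  periodStart: number,
  periodEnd: number,
  changeAt: number
): Proration {
  const remainingFraction = (periodEnd - changeAt) / (periodEnd - periodStart);
  const credit = Math.round(oldPriceCents * remainingFraction);
  const charge = Math.round(newPriceCents * remainingFraction);
  return { credit, charge, net: charge - credit };
}

// A $20 plan upgraded to a $50 plan exactly halfway through a 30-day cycle.
const DAY_SECONDS = 86400;
const upgrade = prorateUpgrade(2000, 5000, 0, 30 * DAY_SECONDS, 15 * DAY_SECONDS);
// upgrade.credit is 1000, upgrade.charge is 2500, upgrade.net is 1500
```

Charging the full 5000 cents here, which is what an update without any proration amounts to from the customer's point of view, overbills by 3500 cents relative to the prorated net.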
Failed payments, SCA, and disputes: the revenue-loss trio
Three categories of Stripe event get ignored by AI-built apps and each of them costs real money. The first is the failed payment. When a subscription renewal fails, Stripe fires invoice.payment_failed, moves the subscription to past_due, and begins retrying based on your dunning configuration. If you have Smart Retries enabled in the Stripe Dashboard, Stripe uses ML to pick retry times with the highest success probability. If you do not, you get the default schedule, which is worse. Either way, you need to listen for the status transition on customer.subscription.updated and decide when to revoke access, when to send a dunning email, and when to surface an in-app banner asking the customer to update their card. AI-built apps usually do none of this. The subscription silently lapses, the customer never hears about it, and when they notice they cancel out of frustration instead of fixing the card.
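One way to turn the subscription status into an access decision is sketched below. The status values are Stripe's subscription statuses; the seven-day grace period, the decision labels, and decideAccess are assumptions to adapt to your own billing policy.

```typescript
type SubscriptionStatus =
  | "incomplete" | "incomplete_expired" | "trialing" | "active"
  | "past_due" | "canceled" | "unpaid" | "paused";

type AccessDecision = "grant" | "grant_with_banner" | "revoke";

const SECONDS_PER_DAY = 86400;

// pastDueSince is when your app first saw the past_due status (Unix seconds),
// or null if it has not been recorded yet.
function decideAccess(
  status: SubscriptionStatus,
  pastDueSince: number | null,
  now: number,
  graceDays: number = 7
): AccessDecision {
  if (status === "active" || status === "trialing") return "grant";
  if (status === "past_due") {
    if (pastDueSince === null) return "grant_with_banner";
    const withinGrace = now - pastDueSince <= graceDays * SECONDS_PER_DAY;
    return withinGrace ? "grant_with_banner" : "revoke";
  }
  // incomplete, incomplete_expired, canceled, unpaid, paused
  return "revoke";
}

const nowSeconds = 1700000000;
const twoDaysLate = decideAccess("past_due", nowSeconds - 2 * SECONDS_PER_DAY, nowSeconds);
const tenDaysLate = decideAccess("past_due", nowSeconds - 10 * SECONDS_PER_DAY, nowSeconds);
```

The "grant_with_banner" state is where the dunning email and the update-your-card prompt belong, while the customer still has a reason to fix the card.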
The second is Strong Customer Authentication. SCA is EU and UK regulation, but the assumption that 3DS only triggers for European cards is wrong. Card issuers anywhere in the world can require a 3DS challenge, especially for high-value transactions, new customers, or cards with recent fraud flags. A US bank will throw a challenge at a US customer in 2026 more often than the AI training data suggests. The PaymentIntent can come back with status: 'requires_action', and if your frontend does not call stripe.confirmCardPayment to complete the challenge, the charge never completes. For off-session charges, such as a renewal when the customer is not there, you need off_session: true and a previously saved payment method with setup_future_usage configured. Without this, a meaningful percentage of recurring charges fail on challenges the customer never sees.
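On the client, the important part is that every PaymentIntent status has an explicit next step. A minimal sketch, assuming hypothetical step names that your UI would map to real behaviour:

```typescript
type PaymentIntentStatus =
  | "requires_payment_method" | "requires_confirmation" | "requires_action"
  | "processing" | "requires_capture" | "canceled" | "succeeded";

type NextStep =
  | "show_success" | "run_challenge" | "ask_for_new_card"
  | "confirm_payment" | "wait_for_webhook" | "capture_on_server" | "start_over";

function nextStepFor(status: PaymentIntentStatus): NextStep {
  switch (status) {
    case "succeeded":
      return "show_success";
    case "requires_action":
      // e.g. call stripe.confirmCardPayment(clientSecret) to run the 3DS flow
      return "run_challenge";
    case "requires_payment_method":
      // the previous attempt failed or no method is attached yet
      return "ask_for_new_card";
    case "requires_confirmation":
      return "confirm_payment";
    case "processing":
      // the final result arrives asynchronously
      return "wait_for_webhook";
    case "requires_capture":
      return "capture_on_server";
    case "canceled":
      return "start_over";
  }
}

const afterChallengeRequested = nextStepFor("requires_action");
```

Because the switch covers the whole union, the TypeScript compiler will flag this function if a status is ever added to the type without a matching branch.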
The third is disputes. When a customer calls their bank to chargeback, Stripe fires charge.dispute.created, holds the funds, and gives you a short window, typically 7 to 21 days depending on the card network, to submit evidence. Miss the deadline and you lose the dispute automatically. Win rates depend entirely on the quality of the evidence, and evidence generation is a workflow, not a button: shipping records, usage logs, authentication records, customer correspondence. There is also radar.early_fraud_warning.created, which fires before the dispute becomes official and gives you a chance to refund proactively, which usually heads off the chargeback and protects your dispute ratio. Most AI-built apps listen to neither event. The money just disappears.
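The evidence deadline is on the dispute object itself, in evidence_details.due_by, as a Unix timestamp. A small sketch of turning that into something an alert can act on; the three-day "urgent" threshold and the function names are assumptions:

```typescript
// The fields this sketch reads from a Stripe dispute object.
interface DisputeLike {
  id: string;
  evidence_details: { due_by: number | null }; // Unix seconds
}

type Urgency = "no_deadline" | "overdue" | "urgent" | "normal";

// Whole days left before the evidence deadline, or null if none is set.
function daysUntilEvidenceDue(dispute: DisputeLike, nowSeconds: number): number | null {
  const dueBy = dispute.evidence_details.due_by;
  if (dueBy === null) return null;
  return Math.floor((dueBy - nowSeconds) / 86400);
}

function evidenceUrgency(daysLeft: number | null): Urgency {
  if (daysLeft === null) return "no_deadline";
  if (daysLeft < 0) return "overdue";
  return daysLeft <= 3 ? "urgent" : "normal";
}

const disputeNow = 1700000000;
const freshDispute: DisputeLike = {
  id: "dp_hypothetical",
  evidence_details: { due_by: disputeNow + 10 * 86400 },
};
const daysLeft = daysUntilEvidenceDue(freshDispute, disputeNow);
const urgency = evidenceUrgency(daysLeft);
```

Running this from the charge.dispute.created handler, and again on a daily schedule, gives the evidence workflow a countdown instead of a surprise.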
Multi-currency and where the funds actually land
Stripe supports more than 135 presentment currencies. Your settlement currency, meaning what lands in your bank account, depends on your country and configuration. A US Stripe account can typically settle in USD only unless you add a foreign-currency bank account. A UK account can settle in GBP, EUR, and USD. When presentment and settlement differ, Stripe converts at a fee of roughly 1 percent for US accounts (often higher elsewhere) on top of the card processing fee, and the conversion rate is applied at capture, not at session creation, so mid-day currency swings can chip away at margins.
The AI failure here is subtle. It will create Price objects in USD because that is what the examples use, but render the checkout page in EUR with a client-side conversion for display. The customer sees 92 euros, clicks pay, gets charged 100 dollars at the card network's exchange rate, and your settlement arrives as USD minus fees. The accounting is a mess, the customer is annoyed when their statement shows a different number, and your refund has to go back in USD which may be more or less than what they originally paid in local currency.
The fix is to create Price objects in every presentment currency you support and let Stripe pick the right one based on the customer's location or an explicit choice. Stripe Connect adds another layer: each connected account has its own settlement currency, so if you are running a marketplace paying out to sellers in multiple countries, you need to think about conversion at the payout layer too. The UI work is small. The accounting work, foreign-exchange gain and loss on financial statements, is where this becomes a real project for a grown-up SaaS.
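Choosing the Price then becomes a lookup instead of a client-side conversion. A minimal sketch, assuming a hypothetical map of placeholder Price IDs and a USD fallback:

```typescript
// Hypothetical Price IDs, one per presentment currency you actually sell in.
const PRICE_IDS_BY_CURRENCY: Record<string, string> = {
  usd: "price_usd_placeholder",
  eur: "price_eur_placeholder",
  gbp: "price_gbp_placeholder",
};

interface SelectedPrice {
  currency: string;
  priceId: string;
}

// Picks the Price created in the customer's currency, falling back to a
// default when that currency is not supported.
function selectPrice(requestedCurrency: string, fallbackCurrency: string = "usd"): SelectedPrice {
  const currency = requestedCurrency.toLowerCase();
  const priceId = PRICE_IDS_BY_CURRENCY[currency];
  if (priceId !== undefined) return { currency, priceId };

  const fallbackId = PRICE_IDS_BY_CURRENCY[fallbackCurrency];
  if (fallbackId === undefined) {
    throw new Error(`no Price configured for fallback currency ${fallbackCurrency}`);
  }
  return { currency: fallbackCurrency, priceId: fallbackId };
}

const euroCustomer = selectPrice("EUR");
const unsupportedCustomer = selectPrice("jpy");
```

The currency returned here is also the one to display on the page, so the number the customer sees is the number Stripe charges.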
What a Stripe hardening pass looks like
When WitsCode does a Stripe hardening engagement on an AI-built app, we run the same checklist every time. We verify webhook signature verification is reading raw bodies in the framework's correct way. We add a processed-events table with a unique constraint on event.id and wrap the handler in a dedup check. We audit every POST to the Stripe API and add stable idempotency keys keyed on business identifiers. We fix the charge.refunded handler to branch on amount, or migrate it to the refund.* event family. We turn on Stripe Tax, register the business where it needs registering, and delete the hardcoded tax logic. We set explicit proration behaviour on subscription updates and test the upgrade and downgrade paths with real cards. We enable Smart Retries, wire up the past_due transitions to email and in-app banners, and build a grace period before revoking access. We handle requires_action on every PaymentIntent on the client. We subscribe to dispute webhooks and radar.early_fraud_warning.created, and we ship an evidence template tied to your order data. We create Price objects in every presentment currency you sell in and document the settlement math for the founder so they stop being surprised when the deposit looks smaller than the receipt.
That is maybe ten days of focused work. It is the difference between a Stripe integration that demos well and one that can carry a real business through its first hundred thousand in MRR without leaking money through partial refunds, failed 3DS challenges, missed disputes, or duplicated webhook handlers.
If you built your checkout in Lovable, Bolt, Cursor, or any AI tool and you are charging real cards, assume you have at least five of the nine bugs above. We have not yet audited an AI-generated Stripe integration that had fewer. Book a Stripe hardening engagement with WitsCode and we will ship the fixes in a sprint, with tests, webhook replays, and a runbook your future self can read at 2am when something breaks on a Saturday. Stripe is easy to wire up. Keeping it honest in production is the last-mile work that separates a vibe-coded demo from an actual business.