Why We Removed 9 Apps From a Shopify Store and Revenue Went Up

A controlled case study on removing nine Shopify apps across a 60-day window, including the sequencing, native replacements, attribution method, and measured revenue lift.

By WitsCode · 11 min read

The store that kept paying for things it did not use

The subject of this case study is a mid-market Shopify store doing roughly 180,000 sessions a month across two connected markets. The catalog sits around 600 SKUs, the theme is a lightly customised Dawn descendant, and paid social drives a little over half of revenue. Nothing about the store was unusual. It looked, from the outside, like a tidy operation. The admin told a different story. Twenty-two apps were installed. Eleven of them had been added in the previous fourteen months by three different freelancers, a growth manager, and one former agency. Nobody could explain with confidence what four of them did on the storefront, and two of the higher-priced subscriptions had been switched off inside the app but left installed, so their script tags still fired on every page.

We took the engagement with one brief. Strip the stack back without touching the creative, the pricing, or the paid media, then let a sixty-day window tell us whether the theory held. By day sixty, nine apps were gone, revenue per session was up 11.6 percent, mobile LCP had dropped from 3.1 seconds to 1.9, and INP had come down from 412 milliseconds to 178. What follows is the method, not the marketing.

How app bloat actually taxes a store in 2026

The folk wisdom that apps slow stores down has been repeated so often it has stopped meaning anything concrete. The mechanism matters because the mechanism is what tells you which apps to remove first. In 2026 a typical Shopify app harms conversion through three overlapping channels. It injects a script tag that parses on every page whether or not that page uses the feature. It registers asset preloads and font variants that compete with the hero image for early bandwidth. And it attaches event listeners that run JavaScript on every tap and scroll, which is exactly the work the INP metric measures. Shopify's own performance dashboard, since mid-2025, puts the median impact per installed app at roughly thirty to forty milliseconds on a standard Dawn-family theme, and the effect is not linear. Apps that listen for cart mutations, pageview events, or product impressions compound because each one re-triggers the others.

Deloitte and Google's "Milliseconds Make Millions" benchmark placed a one-hundred-millisecond mobile speed improvement at an 8.4 percent uplift in retail conversion. That was 2020 data on a slower mobile baseline, and more recent Chromium research on INP shows a similar pattern at the interaction level, where p75 INP under two hundred milliseconds correlates with meaningfully lower bounce on product detail pages. The merchant does not experience these numbers. The merchant experiences a dashboard that says conversion rate is 2.1 percent and has been for months. The apps quietly take their cut of that number and nobody attributes the loss because nobody can see the counterfactual.

The audit method, and why the inventory came before the decisions

Before a single app was touched, we built a single spreadsheet with one row per installed app. Each row had seven columns. Monthly cost. Feature delivered. Where that feature appears on the storefront. Whether a native Shopify equivalent now exists. Whether a metafield, metaobject, or theme section could cover it. Whether the app injects a script tag globally or only on specific templates. And the current Shopify Performance dashboard score contribution, if Shopify had measured one. That final column is the one most audits skip, and it is the one that makes the argument to the finance team defensible. The dashboard will not flag every bad app, but the ones it does flag are almost always the ones costing real money.
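The inventory can be sketched as a small data structure. This is an illustration rather than the spreadsheet we used: the field names, the `review_order` helper, and the sample rows are all hypothetical, and the final column is assumed to be a single number that is left empty when Shopify reported nothing for the app.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AppAuditRow:
    """One row per installed app; field names are illustrative."""
    name: str
    monthly_cost: float                # subscription cost per month
    feature: str                       # feature delivered
    storefront_location: str           # where the feature appears
    native_equivalent: bool            # a native Shopify equivalent now exists
    metafield_coverable: bool          # metafield, metaobject, or section could cover it
    global_script: bool                # True if the script loads on every template
    dashboard_impact: Optional[float]  # measured contribution, None if not measured


def review_order(rows: List[AppAuditRow]) -> List[AppAuditRow]:
    """Measured apps first (largest impact first), then unmeasured apps by cost."""
    return sorted(
        rows,
        key=lambda r: (
            r.dashboard_impact is None,
            -(r.dashboard_impact or 0.0),
            -r.monthly_cost,
        ),
    )


# Hypothetical rows, not the audited store's data.
rows = [
    AppAuditRow("tabs-app", 29.0, "product tabs", "product page", False, True, True, None),
    AppAuditRow("currency-app", 19.0, "currency converter", "site-wide", True, False, True, 35.0),
    AppAuditRow("chat-app", 59.0, "live chat", "site-wide", True, False, True, 80.0),
]

for row in review_order(rows):
    print(row.name, row.monthly_cost, row.dashboard_impact)
```

Sorting measured apps to the top reflects the point about the final column: an app with a number attached is the easiest one to argue about with a finance team.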

Three patterns emerged inside an afternoon. First, the store was paying for duplicate functionality in three places. Two review apps were installed, the primary one driving stars on listing pages and the second left over from a promotional campaign, still running a photo gallery block that nobody had rolled back. Second, four apps had been superseded by native features in Shopify releases during 2024 and 2025, and nobody had revisited the stack since. Third, the highest-cost app by subscription was delivering a feature that three metafields and twenty lines of Liquid could reproduce with better CLS behaviour. None of that was unusual. It is, in our portfolio of 250-plus store audits, the default state of any store that has been trading for more than two years under more than one operator.

The nine removals, by category, and what took their place

We will describe the nine by category rather than by name, because the purpose here is observational rather than editorial. A store may have good reasons to keep an app that another store is right to remove, and naming brand-level winners and losers distracts from the logic. What matters is the category, the replacement pattern, and the risk sequencing.

The first to go was a site-wide SEO auditor. It ran a crawl widget in the admin, advertised suggestions, and in theory injected nothing into the storefront, but an analytics pixel from an old integration was still firing on the checkout domain. It contributed nothing to revenue and had been paid for eighteen months without being opened. The second was a currency converter that had become redundant the moment Shopify Markets was configured with local domains, because Markets already served localised prices with correct formatting and tax behaviour. The third was a product-tabs app, a popular pattern for splitting long descriptions into Description, Ingredients, How To Use, and Shipping. The replacement was three metafield definitions, one metaobject for reusable shipping copy, and a server-rendered Liquid section that rendered the tabs as semantic details elements. It cost no JavaScript, improved Lighthouse accessibility scores, and was indexed more cleanly because the content was now inside the page source rather than hydrated on the client.

The fourth was a stock-countdown and urgency banner, which had been contributing to cumulative layout shift on the product detail page because it waited for a live inventory fetch before painting. We rebuilt the same behaviour as a server-rendered badge driven by a metafield threshold and a cached inventory value, painted in the initial HTML. The fifth was a duplicate photo-review carousel, removed outright because the primary reviews app already exposed photo reviews through its app-extension block and the carousel was rendering twice on mobile. The sixth was a wishlist app. Wishlist is a feature customers genuinely use, but the implementation was a script-tag approach that round-tripped to a third-party API on every PDP visit. We replaced it with a cookie-based list read and written in Liquid plus a small fetch to the Shopify Storefront API when the customer was logged in. No script tag, no third-party origin, no round trip for guests.

The seventh was an on-site email popup. Popups do move email capture rates, but the implementation was a full-bundle widget loading on every page, including the checkout thank-you route where it should never fire. Shopify Forms, combined with a metaobject-driven discount rule delivered by Shopify Functions, now handles the capture, and a lightweight theme section triggers it after scroll depth without a third-party origin. Email capture did not fall. It rose by a small margin, because the form was faster to paint and more likely to be seen before the user bounced. The eighth was a live-chat widget. The store's ticket volume did not justify a general-purpose chat app. Shopify Inbox, free and running through the Shop channel, handled the same volume with lower page weight and no iframe cost on product pages, because the Inbox launcher is injected only on templates that enable it. The ninth was a bundle-builder app. Shopify Bundles, now a native product type backed by Functions, replaced it with better inventory accuracy and no storefront script injection at all.

How we isolated the revenue effect from everything else happening

This is the section most case studies skip, and it is the reason most case studies do not deserve to be trusted. A sixty-day window after an intervention will contain weather effects, campaign effects, seasonality, competitor moves, and creative refresh. Simply stating that revenue rose after the change is not evidence of anything. We used a four-part method to isolate the signal.

First, the brand operated a sister store on the same theme and the same fulfilment stack, serving a related but distinct product line. That store became the control. No app changes were made to the control for the entire measurement window. Second, we worked in revenue per session rather than absolute revenue, because session volume was not held constant across the windows. Third, we ran a difference-in-differences calculation. We took the change in revenue per session on the test store between the thirty days before and the sixty days after, and subtracted the equivalent change on the control store across the same calendar dates. This removes any effect that was common to both stores, which is a reasonable proxy for seasonality, category demand, and shared paid media performance. Fourth, we held paid media spend constant in dollar terms on both stores and excluded any week that contained a brand promotion, a BFCM-adjacent date, or a product launch. What is left, after all of that subtraction, is the best available estimate of the intervention effect.
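The difference-in-differences step is simple enough to sketch in a few lines. The version below assumes the comparison is made on relative changes in revenue per session, since the result is reported as a percentage; the `Window` type, the function names, and every number in the example are illustrative and are not the store's data.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Revenue and sessions for one store over one measurement window."""
    revenue: float
    sessions: int

    @property
    def rps(self) -> float:
        # Revenue per session, the unit used instead of absolute revenue.
        return self.revenue / self.sessions


def relative_change(before: Window, after: Window) -> float:
    """Fractional change in revenue per session between two windows."""
    return after.rps / before.rps - 1.0


def diff_in_diff(test_before: Window, test_after: Window,
                 control_before: Window, control_after: Window) -> float:
    """Test-store change minus control-store change over the same dates."""
    return (relative_change(test_before, test_after)
            - relative_change(control_before, control_after))


# Made-up figures: the test store moves from 2.00 to 2.30 per session,
# the control store from 2.00 to 2.06 across the same calendar dates.
lift = diff_in_diff(
    Window(200_000, 100_000), Window(460_000, 200_000),
    Window(100_000, 50_000), Window(206_000, 100_000),
)
print(f"{lift:.1%}")
```

In this made-up example the test store rises about 15 percent and the control about 3 percent, so roughly 12 points are left for the intervention. Weeks excluded for promotions or launches would be dropped from both stores before the windows are summed.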

The estimate, in this case, was a lift of 11.6 percent on revenue per session attributable to the stack reduction and the accompanying performance improvement. Conversion rate carried most of the lift, at 9.2 percent. Average order value moved upward by a smaller 2.1 percent, which we attribute to the native bundle migration producing cleaner merchandising on collection pages and a slightly higher attach rate on bundle SKUs. The LCP and INP improvements were the mechanism. Revenue was the consequence.
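Because revenue per session is conversion rate multiplied by average order value, the two component lifts compound rather than add. A quick check, assuming both published components are rounded figures measured on the same basis (the `compound_lift` helper is ours, for illustration):

```python
def compound_lift(*lifts: float) -> float:
    """Combine fractional lifts multiplicatively: (1 + a)(1 + b)... - 1."""
    total = 1.0
    for lift in lifts:
        total *= 1.0 + lift
    return total - 1.0


# Conversion rate +9.2 percent, average order value +2.1 percent.
combined = compound_lift(0.092, 0.021)
print(f"{combined:.1%}")  # prints 11.5%
```

The components compound to about 11.5 percent, which sits within rounding distance of the 11.6 percent measured on revenue per session.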

The sequencing matters more than the list

One instinct is to pull all nine apps on the same afternoon. We counsel against it for a practical reason and a measurement reason. The practical reason is that some of these apps own data. Wishlist entries live inside the app. Review data, if the app is the reviews vendor, lives there too. Email capture consents and segmentation live in the popup app. Pulling apps without an export and a migration plan is how a store loses customer relationships. The measurement reason is that removing nine things at once makes it impossible to attribute the result to any one of them, and it prevents a rollback if one removal turns out to have mattered more than it seemed.

The sequence we use, and used on this store, runs from lowest to highest risk. Tier one is the risk-free removals: an SEO auditor that writes nothing to the storefront, a currency converter made redundant by Markets, a stock-countdown badge with a metafield replacement already staged, and a product-tabs app where the metafields are already populated and the section already tested on a staging theme. These come out first because there is no customer-facing data to migrate and no feature gap to bridge.

Tier two is the medium-risk group where a native or custom replacement has to be deployed in the same release. The popup comes out when Shopify Forms is live with the same discount logic. The wishlist comes out when the Liquid-and-Storefront-API replacement is shipped and the existing wishlist data has been exported and re-imported into customer metafields. The live chat comes out when Inbox has been configured with routing rules, canned responses, and staff access, and the operations team has run a full day on it.

Tier three is the reputation-critical group. The reviews app, even if duplicated, touches customer trust and organic CTR through rich snippets. The bundle-builder app, if it underpins a meaningful share of revenue, has to be migrated to native Bundles with SKU mapping and inventory reconciliation before it comes out. On this store, the photo-review duplication was a tier-three removal that took two weeks of preparation and one hour of execution. That hour was scheduled off-peak and watched live.
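The tiering can be expressed as a small gate: removals are ordered by tier, and an app is cleared only when every prerequisite is met. This is a sketch of the idea, not our tooling; the `Removal` type, the prerequisite names, and the sample plan are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Removal:
    app: str
    tier: int  # 1 = lowest risk, 3 = reputation-critical
    prerequisites: Dict[str, bool] = field(default_factory=dict)

    def ready(self) -> bool:
        # An app with no prerequisites is ready by definition.
        return all(self.prerequisites.values())


def split_plan(plan: List[Removal]) -> Tuple[List[str], List[str]]:
    """Return (ready, blocked) app names, each ordered lowest tier first."""
    ordered = sorted(plan, key=lambda r: r.tier)
    ready = [r.app for r in ordered if r.ready()]
    blocked = [r.app for r in ordered if not r.ready()]
    return ready, blocked


# Hypothetical plan; the prerequisite flags mirror the checks described above.
plan = [
    Removal("bundle-builder", 3, {"sku_mapping": True, "inventory_reconciled": False}),
    Removal("seo-auditor", 1),
    Removal("wishlist", 2, {"data_exported": True, "replacement_shipped": True}),
    Removal("popup", 2, {"forms_live": False}),
]

ready, blocked = split_plan(plan)
print("ready:", ready)
print("blocked:", blocked)
```

Keeping the blocked list visible is the useful part: it names the export or migration that has to finish before an app can come out.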

What the chart does not show

The bar-chart visualisation of apps installed falling from twenty-two to thirteen and revenue rising across the same timeline tells the story compactly, but it understates two effects that matter more than the revenue number.

The first is maintenance cost. Every installed app is a theme update risk, a Shopify platform upgrade risk, a support-ticket risk, and a line on a subscription renewal report. Removing nine apps reduced the store's recurring app spend by a little over two thousand dollars a month. The saved subscription cost compounded with the revenue lift and the reduced engineering overhead of theme updates. The second is the diagnostic capacity this creates. A leaner stack makes the next audit easier and the next performance regression easier to attribute. Every app you remove makes every remaining app more accountable, because the shared contribution model that hides underperforming apps has less cover.

What the chart also does not show is that three apps we considered removing stayed in. A store should not remove an app because an audit says it should. It should remove an app because a replacement plan exists, the replacement has been tested, and the numbers after removal can be read cleanly. The three apps that stayed in were apps the team judged worth their cost, a decision that was defensible after audit and not before it.

What this means for your store

If your Shopify admin shows more than fifteen installed apps and your revenue sits above half a million a year, the probability that four to nine of them can be removed is high. The highest-value removals in 2026 tend to be the categories where Shopify has shipped native equivalents since 2024, which now include bundles, subscriptions, gift cards, forms, markets, and functions-driven discounts and shipping. The second tier of high-value removals is any app that is duplicating a feature already delivered by another app in the stack. The third tier is any script-tag app whose feature could plausibly be rebuilt as a server-rendered Liquid section backed by metafields, because that rebuild removes a whole category of INP, LCP, and CLS risk for a fixed engineering cost.
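As a first pass, the three tiers can be turned into a rough triage rule. The function below is a simplification for illustration, with hypothetical parameter names, and it is no substitute for a replacement plan.

```python
from typing import Optional


def removal_priority(native_equivalent: bool,
                     duplicates_another_app: bool,
                     script_tag_rebuildable: bool) -> Optional[int]:
    """Return 1, 2, or 3 for the removal tier, or None if no tier applies."""
    if native_equivalent:
        return 1  # Shopify now ships the feature natively
    if duplicates_another_app:
        return 2  # another app in the stack already delivers it
    if script_tag_rebuildable:
        return 3  # could be rebuilt as a Liquid section backed by metafields
    return None   # no obvious case for removal; keep and re-audit later


# Hypothetical examples.
print(removal_priority(True, False, True))    # native equivalent wins: tier 1
print(removal_priority(False, False, True))   # rebuild candidate: tier 3
print(removal_priority(False, False, False))  # None: the app stays
```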

The WitsCode app-diet engagement is a two-week audit followed by a staged removal across four to six weeks, measured against a control store or a difference-in-differences calendar window where a control store is not available. We handle the exports, the metafield migrations, the Shopify Functions implementations, the Liquid replacements, and the sixty-day verification with a written report. If your stack has grown faster than your storefront, the case for pulling it back is not a hunch. It is a measurement waiting to be taken.
