Two Rules of AI Business, and the Startups That Ignore Them

These rules are not new, and they are not mine; I stole them from Andrew Ng and Benedict Evans, two men with huge followings. Still, most AI entrepreneurs and engineers don’t pay attention to them, perhaps because these rules show why their AI projects will fail.

AI’s Law of Diminishing Returns

To paraphrase Andrew’s words from Coursera’s Deep Learning Specialization course:

The effort to halve an AI system’s error rate is similar, regardless of the starting error rate. 

This is not very intuitive. If an AI system passes 90% of test cases and fails on the remaining 10%, then you are 90% done, right? Fix the remaining 10% of errors, and you will reach 100% accuracy? Absolutely not. If it took you six months to halve the error rate from 20% to 10%, it will take you approximately another six months to halve it from 10% to 5%. And another six months to halve 5% to 2.5%. Ad infinitum. You will never achieve a 0% error rate on a real-world AI system. For an illustrative example, see this typical chart of error rate vs. the number of training samples:

Notice that later in the training process, the training set must grow by a roughly constant multiple for each halving of the error rate, and the error rate never reaches zero. Sure, you will get more efficient at acquiring training data (e.g., by using low-quality sources or synthetic data). Still, it is hard to believe that acquiring 10X more data will be much easier than acquiring the initial set.
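
This relationship can be sketched with a toy power-law learning curve. Everything below is illustrative: the constants `a` and `b` are made up, and real curves depend on the task, model architecture, and data quality.

```python
# Toy power-law learning curve: error ~ a * n**(-b), where n is the
# number of training samples. The constants a and b are hypothetical.
a, b = 2.0, 0.3

def samples_needed(target_error):
    """Invert the toy curve: samples needed to reach a target error rate."""
    return (a / target_error) ** (1 / b)

# Each halving of the error rate multiplies the required data by 2**(1/b),
# roughly 10x with these made-up constants, no matter where you start.
for err in (0.20, 0.10, 0.05, 0.025):
    print(f"{err:.1%} error -> ~{samples_needed(err):,.0f} samples")
```

With this (hypothetical) exponent, every halving costs about ten times more data, which is why “90% done” is nowhere near done.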

This rule becomes more intuitive when dissecting what an AI system error rate represents: uncovered real-world special cases. There are an infinite number of them. For example, one of the easiest machine learning (ML) tasks is classifying images of dogs and cats. It is an introductory task with online tutorials that get 99% accuracy. But solving the last 1% is incredibly hard. For example, is the creature in the image below a dog or a cat?

It is Atchoum the cat, who rose to fame because half of humans recognized him as a dog. The human accuracy on dog/cat classification within 30 seconds is 99.6%. A dog/cat classifier with less than a 0.4% error rate would be superhuman. But it is possible. A training set with hundreds of thousands of strange-looking dogs and cats would teach a neural network to focus just on details encoded in dog or cat chromosomes (e.g. cat eyes). However, building such a dataset is orders of magnitude more complex than a tutorial with 99% accuracy. Other problems lurk in that 1% error rate: photos that are too dark, photos in low resolution, photo compression artifacts, photo post-processing by modern smartphones (adding of non-existing details), dogs and cats with medical conditions etc. The problem space is infinite. This is still considered a solved ML problem though, because a 1% error rate is low enough for all practical purposes. 

But for some problems, even a 0.01% error rate is not satisfactory. One example is full self-driving (FSD). In a 2015 Forbes interview, Elon Musk said:

“We’re going to end up with complete autonomy, and I think we will have complete autonomy in approximately two years.”

Tesla was so confident in that prediction that they started selling a full self-driving add-on package in 2016, and they weren’t the only ones. Kyle Vogt, CEO of Cruise, wrote a piece called How we built the first real self-driving car (really) in 2017, in which he claimed:

“the most critical requirement for deployment at scale is actually the ability to manufacture the cars that run that software”

So, the software and the working prototype are done; they just need to mass-produce “100,000 vehicles per year.” 

Fast forward to 2024. Elon Musk’s predictions for autonomous Tesla vehicles have earned a lengthy Wikipedia table, mostly in red.

What about Kyle Vogt? In October 2023, a Cruise car dragged a pedestrian for 20 feet, after which California’s DMV suspended Cruise’s self-driving taxi license. Kyle “resigned” as CEO in November 2023.

Don’t misunderstand me—I believe autonomous cars will have a significant market share, probably in the next decade. The failed predictions above illustrate what happens when entrepreneurs don’t respect the AI law of diminishing returns. Elon and Kyle probably saw a demo of a full self-driving car that could drive on its own, on a sunny day, on a marked road. Sure, a safety driver needed to intervene sometimes, but that was only 1% of the drive time. It is easy to conclude that “autonomous driving is a solved problem,” as Elon said in 2016. Notice how ML scientists and engineers didn’t make such bombastic claims. They were aware of many edge cases, some of which are described in crash reports.

Why so many companies promised a drastic reduction in self-driving error rates in such a short time without having a completely new ML architecture is an open question. Scaling laws for convolutional neural networks have been known for some time, and the new transformer architecture obeys a similar scaling law. 

AI’s Product vs Feature Rule

When is an AI system a good stand-alone product, and when is it just a feature? In the words of Benedict Evans from The AI Summer podcast: “Is this feature or a product? Well, if you can’t guarantee it is right, it’s a feature. It needs to be wrapped in something that manages or controls expectations.” I love that statement. The “it is right” part can be broken down using error rate:

If your AI system has a higher error rate than target users, you have an AI feature in an existing workflow, not a stand-alone AI product.

This rule is more intuitive than the law of diminishing returns. If target users are better at a task, they will not like stand-alone AI system results. They could still use AI to save them effort and time, but they will want to review and edit AI output. If AI completely fails at a task, humans will use the old workflow and the old software to finish the task.

Let’s take MidJourney, for example; it generates whole images based on a text prompt. When I used it for a hobby project last year, satisfying artistic images appeared instantly, like magic. But then I spent hours fixing creepy hands, similar to the ones below:

Each time MidJourney created a new image, one of the hands had strange artifacts. Finally, it generated an image with two normal hands—but then it destroyed the ears in another part of the image. The problem was less with wrong details and more with bad UI, which didn’t allow correction of the AI’s mistakes.

Adobe’s approach is different—it treats generative AI as just one feature in its product suite. You use an existing tool, select an area, and then do a generative fill:

You can use it for the smallest of tasks, like removing cigarette butts from grass in a wedding photoshoot. If you dislike AI grass, no problem—revert to the old designer joy of manually cloning grass. Also, Adobe Illustrator has generative Vector AI that generates vector shapes you can edit to your liking.

MidJourney makes more impressive demos, but Adobe’s approach is more useful to professional designers. That doesn’t mean MidJourney doesn’t make sense as a product; its target users are the ones who don’t care about details. For example, last Christmas, I got the following greetings image over WhatsApp:

Did you notice baby Jesus’ hands and eyes? Take another look:

That would never pass with a designer, but that is not the point. There is a whole army of users who don’t care about image composition and details; they just want images that go with their content. In other words, MidJourney is not a replacement for Adobe’s Creative Suite—it is a replacement for stock photo libraries like Shutterstock and Getty Images. And judging by the recent popularity of AI-generated images on social media and the web, people like artsy MidJourney images more than stock photos.

Low-hanging fruit in stand-alone AI products are use cases where a high error rate doesn’t matter or is still better than the human error rate. An unfortunate example is guided missiles; in the Gulf War, the accuracy of Tomahawk missiles was less than 60%. But the army was happy to buy Tomahawks because they were still much more accurate than older alternatives, as fewer than 1 in 14 unguided bombs hit their targets.

Evaluating startups based on the above rules

The great thing is that error rates are measurable, so the above rules give a framework to judge an AI startup quickly. Below is a simple startup example.

Devin AI made quite a splash in March of 2024 with a video demo of an AI developer that can create fully working software projects. The announcement says that Devin was “evaluated on the SWE-Bench” (a relevant benchmark), and “correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted.” So, the current state-of-the-art (SOTA) has a 98% error rate, and they claim to have an 86% error rate. Even if that claim is valid (it wasn’t independently verified), why do their promo videos show success after success? It turns out that the video examples were cherry-picked, the task descriptions were changed, and Devin took hours to complete each task.

In my opinion, Microsoft took the right approach with GitHub Copilot. Although LLMs work surprisingly well for coding, they still make a ton of mistakes and don’t make sense as a stand-alone product. Copilot is a feature integrated into popular IDEs that pops up with suggestions when they are likely to help. You can review, edit, or improve on each suggestion.  

Again, don’t get me wrong. I think coding SOTA will drastically improve over the next few years, and one day, AI will be able to solve 80% of GitHub issues. Devin AI is still far away from that day, although the company has a valuation of $2 billion in 2024.

More formally, the framework for evaluation is:

  1. Find a relevant benchmark for a specific AI use case. 
  2. Find the current state-of-the-art (SOTA) error rate and human error rate on that benchmark.
  3. Is the SOTA better or comparable to the human error rate?
    1. If yes (unlikely): Great, the problem is solved, and you can create a stand-alone AI product by reproducing SOTA results.
    2. If no (likely): Check if there is a niche customer segment that is more tolerant of errors. If yes, you can still have a niche stand-alone product. If you can’t find such a niche, go to the next step.
  4. You can’t release a stand-alone AI product. Wait for SOTA to get better, pour money into research, or go to the next step.
  5. Think about how to integrate AI as a feature into the existing product. Make it easy for users to detect and correct AI’s mistakes. Then, measure AI’s return on investment:

    AI_ROI = Effort_saved_by_AI / Effort_lost_correcting_AI

    If too much user time is spent checking and correcting AI errors (AI_ROI<=1), you don’t even have a feature.
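
The AI_ROI ratio above can be written as a tiny helper. The numbers in the example are hypothetical, purely to show how the threshold works.

```python
def ai_roi(minutes_saved_by_ai, minutes_spent_correcting_ai):
    """Return on investment of an AI feature, per the rule above.

    Values <= 1 mean users lose more time correcting the AI than it
    saves them, so you don't even have a feature.
    """
    return minutes_saved_by_ai / minutes_spent_correcting_ai

# Hypothetical numbers: a code-completion feature saves a user 30 minutes
# per day, while reviewing and fixing its mistakes costs 10 minutes.
roi = ai_roi(minutes_saved_by_ai=30, minutes_spent_correcting_ai=10)
print(roi)  # 3.0 -> worth shipping as a feature
```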

Or, to summarize everything discussed here in one sentence:

Every innovative AI use case will eventually become a feature or a product, once the error rates allow it. If you want to make that happen faster, become a researcher. OpenAI’s early employees spent seven years on AI research before their overnight success with ChatGPT. Ilya Sutskever, OpenAI’s chief scientist, initially didn’t even want to release ChatGPT (based on GPT-3.5) because he was afraid it hallucinated too much. Science takes time.

If you found this article useful, please share.


Working Backwards Book Summary

Why read the Working Backwards book?

[Image: Working Backwards book cover]

Amazon is not very popular right now, with many media outlets criticizing its treatment of workers and its cut-throat culture. Still, love them or hate them, there are some undeniable achievements. Amazon is a tech giant with a market cap of over 1.4 trillion USD. For their first eight years, they operated at a loss as they reinvested all profits into expansion. To me, the most impressive thing about Amazon is that they succeeded in many different business fields. Other tech giants usually have a primary business domain: Microsoft makes software, Apple premium consumer devices, and Facebook social media platforms.

But Amazon was revolutionary in all these fields:

  • E-commerce: First online bookseller and one of the first online stores.
  • E-commerce platform: Fulfillment by Amazon handles logistics for other sellers.
  • Logistics: Amazon Delivery and Amazon Prime (with same-day delivery).
  • Ebooks: Amazon Kindle, the most popular e-ink reader, and Kindle Library, the biggest ebook library.
  • Cloud computing: Amazon Web Services (AWS), the first and largest cloud vendor.
  • Voice assistants: Amazon Echo, the first voice assistant device.
  • Video streaming: Amazon Prime Video and Amazon Studios.
  • Consumer products: Amazon Fire tablet, Amazon Fire TV, and Amazon private labels
  • Retail: Whole Foods Market (acquired) and Amazon Go, the first cashierless convenience store.

Notice that the above domains require completely different know-how and different business principles. E-commerce is a low-margin business, logistics involves a lot of low-skilled employees, and cloud computing requires a small number of high-tech employees. Amazon Studios is an entirely different creative business, with connections to Hollywood and high margins. Amazon Kindle, Echo, and Go required the invention of new technologies. How can a single company excel at all of that?

The book’s thesis is that Amazon excels in different fields because of its unique “Amazonian” culture. 

Amazonian culture

Where does “Amazonian” culture come from? It is never directly stated in the book, but insider stories make it apparent that Jeff Bezos decides on Amazon culture. Jeff’s famous company-wide memos prohibit some practices and prescribe what to do instead. 

On the one hand, that is expected. Amazon, in 2022, had 1.6 million employees. Without strong management at the top, Amazon’s culture would be a mish-mash of the cultures new employees would bring. On the other hand, a strong leader at the top goes against common advice from conferences and management books: empower your employees, have a flat organization, and decide on company culture together with employees. I found similarities between Jeff Bezos and Steve Jobs — you are free to do whatever you want, in a company that is not Amazon. While Steve Jobs insisted on product excellence, Jeff Bezos insisted on following Amazon’s procedures and principles.

Here are the central tenets of the Church of Jeff Bezos, in no particular order.

Decoupled two-pizza teams

Again, this principle goes against common wisdom. Many consultants and books say “communication is key” and “communicate important things to all relevant people.” In practice, this turns into large group meetings and emails with a dozen recipients, so employees spend most of their time in meetings or reading emails.

That is because the number of communication channels grows quadratically with the number of participants: n people have n(n-1)/2 possible channels (a variant of Metcalfe’s law):

[Figure: communication channels grow with the number of participants (image via ScrumAlliance.org)]
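
The quadratic growth is easy to check; the head counts below are arbitrary examples.

```python
def channels(n):
    """Number of pairwise communication channels among n people: n(n-1)/2."""
    return n * (n - 1) // 2

# A two-pizza team of ~8 people has 28 channels; a 100-person department
# has 4,950, which is why large-group coordination drowns in meetings.
for team_size in (5, 8, 10, 50, 100):
    print(team_size, "people ->", channels(team_size), "channels")
```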

An example from the book is a minor change in the Amazon Affiliates program. At first, Amazon gave affiliate commissions only for directly linked products. They decided to expand that to all products purchased by affiliate visitors. It was a straightforward change, and they estimated it could be released in a month. However, they first needed approval from the database team to make changes to the central Amazon database. Then they needed to get the approval and materials from the marketing team. Then they needed to get approval from the legal team. Then they needed to write support documents and instruct support agents. All relevant parties needed to be synchronized on the release date. In the end, a change a few people could implement in a week took six months!

Jeff recognized that these internal dependencies are why companies get slower as they grow. He proposed a radical solution: structure Amazon around independent “two-pizza” teams (fewer than ten people). If a team can’t be fed with two pizzas, it should be broken up. Each team has the resources and authority to release a feature without waiting for other teams. A team should expose its feature as an internal API and document how to use it. In the above example, the affiliate team should not depend on the database team to release a new version. There should be separate affiliate and database APIs, with separate release schedules and documentation. If one team needs a meeting or written approval from another team to release a feature, that is an organizational failure.

But what if a change from one team breaks something or causes a problem? That can happen, but Jeff says breaking changes are not a problem if the change is reversible. Most tech problems can quickly be reverted, e.g., a problematic new feature can be disabled. Company coordination between many teams is needed only when the change is not easily reversible, e.g., shipping faulty devices to customers.

This is not how most companies operate, and it is hard to implement. Bosses like to retain political control over many people and don’t like independent teams doing stuff without their knowledge. Developers don’t like writing internal documentation or exposing an API when it takes less time to connect to a central database. And why write internal documentation when a coworker can just ask you over the phone, by email, or in a meeting?

At Amazon, you need to do it because Jeff said so. And Jeff did such a good job breaking Amazon’s monolith into independent teams that he realized Amazon could expose internal APIs and documentation as public cloud services. That is how Amazon, then an e-commerce company, became a cloud computing vendor. In Q4 2021, Amazon Web Services generated 13% of Amazon’s revenue but 100% of Amazon’s profits (other divisions reinvested all profits). 

Notice that the “two-pizza teams” principle is similar to the “move fast and break things” idea. In both cases, it is more important to move fast, even with occasional problems, because most problems can quickly be fixed.

Six-page memos

In Jeff’s words, “Six-page memos are the smartest thing we ever did.”

In the beginning, Amazon used the same meeting structure as other companies:

  • A meeting organizer would prepare PowerPoint slides. 
  • While slides were presented, meeting participants could interrupt and ask questions.
  • A meeting would end with a discussion and decision on future actions. 

By 2004, Jeff started to hate this structure. On a plane, he had read a 25-page paper, The Cognitive Style of PowerPoint, which explained that bullet points are a terrible way to structure arguments. That paper proposed banning PowerPoint in organizations and using written text instead. The paper had such an influence that Jeff immediately sent an email banning PowerPoint and mandating narrative memos instead. In Jeff’s own words, “writing a good 4-page memo is harder than ‘writing’ a 20-page PowerPoint because the narrative structure of a good memo forces better thought and better understanding” and “If someone builds a list of bullet points in Word, that would be just as bad as PowerPoint.” In other words, bullet points are lazy thinking. 

That directive was met with strong resistance at Amazon. PowerPoint was standard, and making slides was easier than writing narrative memos. A memo needs to stand on its own, as there is no presenter to explain bullet points or answer questions. For many managers, this seemed like an overly academic approach. But Jeff insisted.

From then on, meeting organizers would prepare a six-page memo they would not distribute in advance. Instead, all participants would get and read the memo in the first 20 minutes of a meeting in complete silence. This solves a few problems:

  • Preparation: With silent reading at the beginning, all participants are on the same page. Before, people who prepared or read meeting documents in advance had an advantage in the discussion.

  • Text limit: Because reading is limited to 20 minutes, a memo can’t be longer than six pages. Tricks to cram more text, like using smaller fonts or margins, make no sense with a 20-minute limit.

  • Shareability: A memo is standalone and can be shared with people outside the meeting. That doesn’t work with slides, as they depend on a presenter explaining them.

After reading (Jeff would often finish reading last), participants would discuss and decide on the next steps.

It took some time for six-page memos to become part of Amazon’s culture. When it did, memo writing became competitive. The best writer on the team would produce a memo, and other team members would criticize it. The book mentions that successful six-page memos were distributed around the company as examples, and writing well became a vital skill at Amazon.

Working backwards

Once two-pizza teams are formed, and the new project is defined in a six-page memo, what is the next step? A usual corporate product development process would start with implementing the product, then creating marketing materials, and ending with a press release and support documents. 

Jeff Bezos noticed problems with that order:

  • It is company-centric to start with development first. In the development phase, a company is inclined to cut corners and reuse existing solutions that are not optimal for customers.

  • Project goals are not clear to all employees involved. Even developers are not always clear on which features are necessary and which are optional.

  • The product team often develops features that the marketing team doesn’t find marketable, and lacks features that would be great for marketing.

  • If the price is left unspecified, the product team will often create a product that is too expensive.

Instead, the process should be customer-centric and start with customer needs. Jeff asked teams to work backwards, starting from a target customer. The first thing a team needs to create is a press release for the fictitious product. Industry practice is that press releases should be 300-400 words, not longer. Since that limits the product description, only key features can be listed. A press release also explains use cases, target customers, price, and availability. After approval by management, the entire team reads it and knows what they are working on. The next thing developed after a press release is an internal FAQ (frequently asked questions) that explains everything team members need to know that is not in the press release: which services, resources, or hardware are required, the people involved, minor features, architecture, timeline, budget, edge cases, etc. The purpose of the FAQ is to ensure all team members understand the details and commit to them. After the FAQ is approved, product development starts.

An example given in the book: when Kindle 2 was being developed, Jeff insisted that the press release include “Whispersync.” Whispersync is the ability to wirelessly sync books, bookmarks, and reading progress over a GSM network. Previously, customers needed a cable, a PC, a sync application, and an internet connection to move books to a Kindle. With Whispersync, purchased books would magically appear on your Kindle after a purchase. And the fictitious press release stated that Whispersync would be free with every Kindle. Jeff and the marketing team loved it. The internal FAQ explained that Whispersync would use Whispernet, Amazon’s custom always-on 3G connection. That put a great burden on the product team. They needed to add a 3G modem, negotiate prices with network carriers for an affordable 3G plan, incorporate the cost of 3G into the price of the Kindle and books, and develop new syncing software. If the press release hadn’t insisted on that feature, employees would have been inclined to do what was easiest for them: keep the existing cables and sync software. The fictitious press release made solving that customer pain point obligatory. When Kindle 2 was released, journalists were delighted to discuss the revolutionary Whispersync feature.

Measure input metrics, not output metrics

Companies often focus on what Jeff calls “output metrics”: revenue, profits, stock price, market share, etc. They are “output” because they result from many input metrics, some of which we don’t control. For example, revenue depends on the current economy and seasonality, stock price depends on bear and bull markets, and market share depends on competitors. It is silly to be proud of increased monthly revenue if it was caused by holiday season shopping. It is silly to be proud of the increased stock price if lower interest rates drive it, not your actions. 

Jeff says that the primary metrics should be “input metrics” that you have direct control over. In the case of Amazon, input metrics are inventory size, prices, delivery times and prices, etc. If you have a large inventory, low prices, and fast and affordable delivery, then output metrics like revenue will be good. It is management’s responsibility to decide which input metrics are the right ones.

Jeff believes the right product metrics make the growth flywheel spin faster. Amazon’s e-commerce growth flywheel is below:

[Figure: Amazon’s growth flywheel (image via foundit.substack.com)]

Metrics that Amazon has control over (and are good input metrics) are product selection, prices, and customer experience (which includes delivery experience, website experience, refund experience, etc.).

However, it is more complex than the above diagram suggests. The book gives an example of how, in the early days, Amazon’s input metric was the “number of product detail pages.” However, they then noticed that an increase in the number of listed products didn’t cause an increase in revenue. It turned out that the inventory team added many products that were not popular, were out of stock, or had a long shipping time. They changed the input metric to “number of product page views × percentage of items in stock that can be shipped immediately,” which accounts for both product popularity and inventory. But the book advises against having too many or overly complicated input metrics. One department had many complex input metrics, and as a result employees didn’t understand how to influence them. They switched to simpler metrics so everybody could understand how to contribute to the bottom line.
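
The revised metric from the book can be expressed directly. The catalog data below is invented, purely to show how the composite metric penalizes unpopular or out-of-stock listings that the naive page count rewarded.

```python
def detail_page_metric(product_page_views, fraction_shippable_now):
    """The book's revised input metric: page views weighted by the
    fraction of items in stock that can be shipped immediately."""
    return product_page_views * fraction_shippable_now

# Invented examples: (description, monthly page views, fraction shippable now)
catalog = [
    ("popular gadget, well stocked",   100_000, 0.95),
    ("popular gadget, often sold out", 100_000, 0.20),
    ("obscure item, well stocked",         200, 1.00),
]
for name, views, shippable in catalog:
    print(f"{name}: {detail_page_metric(views, shippable):,.0f}")
```

Under the old metric, all three listings counted equally as “one more detail page”; the weighted metric surfaces the difference.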

Single-threaded leadership

Companies always have a shortage of good managers to lead new projects. As a result, managers often need to manage multiple projects. That is bad. The best way to make a project fail is to give it to someone “30% of their time.” If the project is important, it deserves complete focus from a manager and a team. The book gives examples of times when Jeff moved heads of highly successful divisions to work on new things that made no revenue for years. Some managers thought they were being demoted by working on unprofitable projects, only to achieve wild success after a few years.

In Amazon’s terminology, single-threaded leadership is having a single-threaded owner heading a single-threaded team, both focused on the new project alone. Without dedicated focus, employees would revert to doing legacy work, as legacy work brings in the money.

Bar raiser for hiring

At the beginning, Amazon had a high hiring bar because Jeff Bezos had very high expectations. As the company grew, they noticed that the quality of new hires varied widely. That was mostly caused by urgency bias — a candidate who was a poor fit still got hired because there was an urgent need to fill the position. The new employee often didn’t meet expectations and left after a short period, returning the company to square one.

Even without urgency bias, new employees’ quality varied depending on who interviewed them and which process they used. For example, Jeff preferred candidates who excelled at academia, even if that was not necessary for their work. Other interviewers had lower criteria, did unstructured interviews, or asked questions that didn’t predict future performance. If one interviewer was against the hire, they were still inclined to approve it if other interviewers were eager to hire.

To improve hiring, Amazon found and trained internal “bar raisers.” A bar raiser is an employee outside the team who comes in as an objective third party to check that a new hire satisfies minimum hiring standards. They checked that the procedure was followed, that structured interviews were done, and that each interviewer voted “hire” or “no hire” in writing. Each interviewer needed to write their opinion before final approval, without knowing what the other interviewers thought. 

Conclusion 

I really enjoyed reading this book. No matter what I think of Jeff Bezos, I like his ideas on how to organize a large corporation to behave like a startup. If you share the same sentiment, get the book in Kindle, audiobook, or paper format (I don’t earn a commission):

Working Backwards book on Amazon


You Donate $400/Year Thanks To The Best Business Trick Ever

I’m going to tell you a story about an ingenious business model that the majority of people are not aware of. It costs the average US household around $400 per year. To understand the model, you’ll need to understand three economic concepts: the penny gap, the razor and blades business model, and Milton Friedman’s concept of there being “no such thing as a free lunch.”

Do you know what the penny gap is? If not, it boils down to this one eternal truth: people are cheap. They love free stuff and hate getting their wallets out. Even if you raise a price from free to one penny, the majority of people will refuse to pay that ridiculously tiny amount, unless they really need the product you’re selling. This obviously sucks for businesses.

Which is where the razor and blades model comes in, trying to get around the age-old problem of people being cheap. The trick is this — businesses lure customers in with some cheap product (like razors) or give it away for free. Well, “free.” Think: a free phone with a cellphone plan. The moment companies attract new customers, they then make money on the things the customers need to make the product work or via their service costs. Examples are inkjet printers and ink cartridges, phones and phone plans, gaming consoles and the games that go with them. And, of course, razors and blades — doubly so thanks to Gillette’s elaborate marketing claims of “innovative shaving technology”.

Businesses that succeed in pulling this off make so much money that even Scrooge McDuck drools over their profits. But there’s a catch — they’re going to need to lock customers in. These same companies don’t want competitors with low margins. So how do they stop someone from going to the cheaper ink cartridge shop down the road? Businesses add security modules to ink cartridges, patent blades, often lock phones to one carrier and make sure you can only use licensed games with their corresponding games console.

And sure, it’s smart. But no matter how streamlined their razor and blades model is, it still doesn’t solve the penny gap issue because customers still need to make peace with paying more money for additional products. Customers are human, which means they’re all about saving some of those dollar bills. They bellyache about the blade prices, fill ink cartridges with cheap replacement ink or switch their phone plan as soon as the contract is up. And businesses using the model may get rich, but their customers think they’re basically Satan with a tax identification number.

So what if you could hide those recurring costs? This is the ingenious part:

The indirect razor and blades model is an extension of the razor and blades model in which customers are not aware of the recurring costs, because they pay them indirectly.

Which is exactly what credit card companies do. Customers get credit cards for free. As a result, the average American owns 2.6 credit cards.

Then, every time a customer uses a credit card, there is a credit card processing fee. According to Helcim Inc’s list of interchange fees, US Mastercard and Visa credit card fees are between 1.51% and 2.95%. That doesn’t include extra fees like chargebacks or set-up fees.

Most customers don’t think about the processing fees, because they assume the businesses are shouldering those costs. However, economists know that there ain’t no such thing as a free lunch. Shops aren’t charities and they’re not going to donate money just for the hell of it. They calculate all of their business costs and then add their margin to it. Consider the following examples of “free” stuff:

  • Restaurants with “free service” which cost more than self-service restaurants.
  • Businesses with “free parking” which cost more than those without.
  • Shops in expensive rental locations which have higher prices than the same shops in cheap rental locations.

As such, the effect of processing fees on the final price depends on how many customers use credit cards. If everybody used credit cards, the average price of goods would rise by around 2%. So if you had a choice between buying a laptop with a credit card for $500 or with cash for $490, would you still opt for a credit card? Presumably most people would opt for $490 and would spend the change on lunch. But you don’t have a choice.

You’re not given that option for two reasons. Firstly, for many businesses it simply isn’t convenient to add a credit card surcharge. Secondly, even if businesses wanted to do that, surcharging everyday transactions is illegal in 10 US states. Molly Faust, spokeswoman for American Express, justified their legal stance in the following statement: “We believe that surcharging credit card purchases is harmful to consumers.” How sweet of them to be so concerned for consumers’ well-being!

As a result, most businesses charge the same price regardless of whether a customer pays via cash or card. Which means all customers share the burden of credit card fees. If 50% of Acme Donuts’ customers use a credit card with a 2% fee, then the average price of donuts will be 1% higher, even for those customers who pay for their morning dose of sugar with cash. Apparently AmEx doesn’t find it “harmful to consumers” that even people who don’t own a credit card pay this hidden fee.
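The fee-sharing arithmetic is simple enough to sketch in a few lines of Python. The 2% fee and 50% card-usage share are the illustrative numbers from the Acme Donuts example above, not real data:

```python
def blended_markup(card_share: float, fee_rate: float) -> float:
    """Average markup a merchant must add to cover card fees,
    assuming every customer is charged the same price.

    card_share: fraction of revenue paid by credit card (0.0 to 1.0)
    fee_rate:   processing fee on card transactions (e.g. 0.02 = 2%)
    """
    return card_share * fee_rate

# If everybody paid by card with a 2% fee, prices rise by ~2%:
print(f"{blended_markup(1.0, 0.02):.1%}")  # 2.0%

# Acme Donuts: 50% of customers pay by card, so everyone,
# including the cash customers, pays ~1% more:
print(f"{blended_markup(0.5, 0.02):.1%}")  # 1.0%
```

Note that the cash customers’ subsidy grows linearly with card adoption: the more people flex the plastic, the bigger the hidden fee everyone pays.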

However, credit card companies invented something way better than legal pressure. What if they could motivate customers to flex that plastic all the time, even when it’s not more convenient than paying with cash?

Welcome to reward programmes like Cash Back, Points or Miles. Every time customers use the card they get a “reward”, even though what they get is actually their own money back, paid via higher prices. This prompts customers to use a credit card for a $5 drink despite having a $5 bill in their pocket. Unlike razors and blades, where customers try to consume less, in the indirect razors and blades model customers try to spend more. Doubly ingenious.

You can’t quibble with the results. The total credit card volume in the US in 2014 was $4 trillion — enough to “buy a Nissan Versa for every man, woman and child”. But if that’s the total sales volume, how much do customers pay in transaction fees after reward programs are paid out? The Merchants Payments Coalition calculated that the average household in the US pays more than $400 annually in credit card fees. If customers knew in advance that they would have to pay over $400 per year, would they still use credit cards?
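As a sanity check on that figure, here is a back-of-envelope calculation. The $4 trillion volume is from this article; the average fee rate (2%) and the household count (~125 million) are my own rough assumptions, not numbers from the Merchants Payments Coalition:

```python
# Rough check on the "over $400 per household" claim.
total_volume = 4e12    # $4 trillion US credit card volume (2014)
avg_fee_rate = 0.02    # assumed average processing fee
households = 125e6     # assumed number of US households

gross_fees = total_volume * avg_fee_rate
per_household_gross = gross_fees / households
print(f"${per_household_gross:,.0f} per household, before rewards")  # $640
```

Subtracting the reward payouts from that gross $640 leaves a net figure entirely consistent with the Coalition’s “more than $400”.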

Which raises the question:

What can we, as a society, do about the credit card fee problem?

The Fight Club Approach

This is the radical solution your college-era socialist self would have been proud of: fighting against “evil” banks and credit card companies.

Unlikely as it sounds, this is exactly the approach taken here in Berlin. American visitors to the city are always shocked by the fact that establishments big and small refuse credit cards. Berlin is cheap and prides itself on being alternative. So it’s not exactly surprising that so many shop owners are trying to lower the costs by refusing credit cards.

There’s no disputing the anarchist charm of this. But I think that in the long run, it’s a little silly. Electronic payments are convenient and the future of currency; we can’t just ignore them.

Passing The Hot Potato Approach

This is the legal approach where countries adopt laws which limit how much credit companies can charge, in which ways they can charge, and who foots the bill.

For example, in 2014, the EU introduced legislation limiting credit card fees to 0.3% and debit card fees to 0.2%.

Similarly, in 2013, an anti-trust settlement in the US placed a cap on the fees that banks can charge merchants for handling debit card purchases. This was rolled back in 2016.

Lobbying for such laws can be well-intentioned but is ultimately useless: it doesn’t solve the structural payment problems. Since the cost of the technology remains the same, the credit card companies just end up shifting costs elsewhere. For example, you can cap the processing fee, but the credit card companies and banks then just ask for extra money elsewhere to cover their costs, like raising or introducing set-up fees and monthly maintenance fees.

A Proposal For A Modern Solution

We need to understand why fees are so high. In my opinion, it boils down to three components: high margins, high fraud rates and expensive proprietary transaction systems.

The margins in the US are set by payment networks such as Visa. The two largest credit card companies, Visa and Mastercard, have such a chokehold on the market that they can revise their interchange fees twice a year, in April and October. At the end of the first quarter of 2017, Forbes cited the ten largest card issuers in America as accounting for “almost 88% of total outstanding card balances in the country.” It is extremely hard for a new company to break into this market, as the major players have one key advantage: the network effect. Legislation is needed to help smaller, more efficient competitors carve out a slice of the market.

High fraud rates are a colossal problem for current credit card technology. The Nilson Report calculated that in 2015 credit card fraud totaled $21.84 billion. But that figure doesn’t take into account the indirect costs of fraud, like the cost of issuing replacement cards or the cost of prevention. In 2016, LexisNexis estimated that for every $1 of direct fraud, there is $2.40 in indirect costs. “Yeah, and? Why should I care? The fat cat credit card companies cover that cost.” Again, remember Milton Friedman’s “no such thing as a free lunch”: you’re shouldering the cost of credit card fraud via increased credit card fees. Current credit card technology is inherently insecure.
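Putting those two fraud figures together gives a sense of the total scale. This quick worked example uses only the numbers cited in the paragraph above:

```python
direct_fraud = 21.84e9     # direct card fraud in 2015 (Nilson Report)
indirect_multiplier = 2.4  # $2.40 indirect cost per $1 of direct fraud

# Total cost = direct fraud plus its indirect overhead
total_fraud_cost = direct_fraud * (1 + indirect_multiplier)
print(f"${total_fraud_cost / 1e9:.1f} billion")  # $74.3 billion
```

Roughly $74 billion a year, quietly folded back into the fees that all of us pay.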

And finally, the third issue is that in order to validate credit card transactions you need to use the backend provided by credit card companies. They are proprietary, legacy systems that have no incentive to cut costs.

And because of the issues above, fees are higher than they should be. But by how much? For comparison, at the time of writing (August 2017), the average bitcoin transaction fee was 0.56%. Admittedly, this isn’t a perfect comparison, because bitcoin’s architecture rewards only the miners who win the arms race for the best custom hardware. Still, it shows that a modern crypto-currency can deliver inherently safe transactions at a much lower cost than current credit card fees.
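To make the gap concrete, here is the $500 laptop from earlier run through both rates. The 2% card rate is just a typical figure from the interchange range quoted above; 0.56% is the August 2017 bitcoin average:

```python
purchase = 500.0  # the laptop example from earlier in the article

card_fee   = purchase * 0.02    # typical credit card processing fee
crypto_fee = purchase * 0.0056  # average bitcoin fee, August 2017

print(f"card: ${card_fee:.2f}, bitcoin: ${crypto_fee:.2f}")
# card: $10.00, bitcoin: $2.80
```

More than a $7 difference on a single purchase, which compounds quickly across a household’s yearly spending.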

In my opinion, in order to change a status quo that has existed for the last half-century, legislators need to pass bills which address these issues. In the 21st century, electronic payments are a vital part of common infrastructure, just like roads, the postal service or the internet. And if you look back in history, there’s a particularly relevant comparison to be made. Credit card companies today can be compared to the railroad tycoons of the 19th century. After they built the railroads across the US, these same corporations had the power to “squeeze out competitors, force down prices paid for labor and raw materials, charge customers more and get special favors and treatments from National and State government.” Sounds familiar, right? Sometimes two companies would even compete on the same route with different track widths and different train specifications, as was the case with parts of the New York subway.

What we need now is government intervention, much like when President Eisenhower’s administration introduced the US interstate project, a national network of highways that helped solve the transport issue. Private companies built portions of the interstate highways (and made a profit), but all highways were built to the same exacting standard, connected to each other in a meaningful nationwide network, open for use by all citizens, and linked parts of the country that were of vital interest.

So legislators, if you’re reading this, instead of spending time on laws which cap fees or pass them round the economy like a hot potato, consider focusing on laws which make the payment system more efficient.

The government could demand:

  • Transaction security: Future electronic payments protocols must be cryptographically safe, which would eliminate most fraud costs. That’s a win for companies and a win for consumers.
  • Openness: Popular electronic payments protocols need to be open and have transparent charging.
  • Competition-friendly: Small companies should be able to connect to the open payment network and offer transaction validation service over their servers, encouraging healthy competition.

If each of the above ideas is implemented, there won’t be a need to limit credit card fees. With multiple companies competing over a modern, cryptographically safe protocol, fees will naturally go down. That old Adam Smith chestnut, the invisible hand of the market, will do its job. It might even give us a thumbs up.

We’ll have laid the bricks for a road leading to innovation, not away from it. And hey, things can stay flexible. It’s no big deal if new protocols are introduced, as long as they satisfy security, openness and pro-competition requirements.

Compare that to the current situation, where new players like Apple Pay and PayPal offer better technology but are still proprietary systems. If Apple succeeds in dominating the market with Apple Pay, it will be the new king of the indirect razors and blades model, and will probably respond to that power the way pretty much any organization responds to dominating a market: by taking advantage of consumers. What’s even worse, governments around the world are already familiar with exactly the kind of legislation I’ve outlined above, but for different markets. Just peep the regulations for energy markets or TV and radio broadcasting. They are all run by private companies, but legislators understand that electricity and broadcasting need to be run in a way that’s in the public’s interest — they grasp one basic rule of economics. Namely, that it isn’t in the public’s interest to allow monopolies.

Meanwhile, the credit card industry still seems to be stuck in the decade Mad Men was set in. And funnily enough, my nostalgia for the era of the Beach Boys doesn’t extend to how financial security was handled back then. Every time I hit a restaurant I’m worried that the waiter is going to copy out my credit card details to support his online gambling habit — because it is so freaking easy. But hey, I shouldn’t worry, because the 2% credit card fee includes insurance against this current, inefficient system.

Whether or not you agree with the exact improvements I’ve outlined above, it’s clear that the system needs reform. If you agree that the current credit card system is ridiculous in this day and age, share this article.

 

Authors: Zeljko Svedic and Sophie Atkinson. Reprints are possible under the terms of the Creative Commons BY-NC-ND License, but please send us an email.