Recently, a team at my company has been dealing with a chaotic situation caused by bad API design in a payment service, which gave me the idea to share the story here. Most of the details in this article have been changed because they are sensitive, but the big picture is unaffected.
The background, roughly, is that the payment service provides a web API that supports offline transactions. After a customer has completed an offline transaction (for example, using a credit/debit card to transfer money), the customer can enter the transaction's reference ID on the merchant's website. The website then calls the offline transaction API to verify the reference ID. If the ID is valid, the website increases the user's balance.
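The flow above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `FakePaymentService`, `MerchantSite`, `verify`, and `redeem` are hypothetical names, and the in-memory dictionary stands in for the provider's real web API, whose actual interface is not shown in this article.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakePaymentService:
    """Hypothetical stand-in for the provider's offline transaction API."""

    # Maps reference ID -> (merchant ID, amount) for completed transactions.
    transactions: dict = field(default_factory=dict)

    def verify(self, merchant_id: str, reference_id: str) -> Optional[int]:
        """Return the amount if the reference ID belongs to the merchant, else None."""
        record = self.transactions.get(reference_id)
        if record is None or record[0] != merchant_id:
            return None
        return record[1]


@dataclass
class MerchantSite:
    """Hypothetical merchant website that tops up user balances."""

    merchant_id: str
    payments: FakePaymentService
    balances: dict = field(default_factory=dict)

    def redeem(self, user: str, reference_id: str) -> bool:
        """Credit the user's balance only if the provider confirms the reference ID."""
        amount = self.payments.verify(self.merchant_id, reference_id)
        if amount is None:
            return False
        self.balances[user] = self.balances.get(user, 0) + amount
        return True
```

The important point is that the merchant trusts the provider's answer completely: whatever `verify` says decides whether the balance goes up.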
Everything was perfect until one day the payment service added two parameters to the API without releasing change notes to callers. Simply put, the new parameters allow its callers to specify the ID of a merchant's subsidiary (if any) so the verification will fail if the reference ID belongs to another subsidiary. In order to maintain backward compatibility, both parameters are optional.
This is where the catastrophe started. For some reason, after the two optional parameters were added, malicious users found that a reference ID for a transaction that did not belong to the merchant could pass verification. It is as if the reference ID of an offline payment to Steam could be used to top up a balance on the Epic Games Store. This caused huge losses for the affected merchants: they never received the money, yet the API misled them into increasing a malicious user's balance.
After a great deal of investigation and lengthy communication with the provider, the team found that the provider's API verifies correctly only when the two new parameters are present. By then, however, the damage had been done.
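I do not know what the provider's code actually looks like, so the following is purely a guess at one way this kind of regression can happen, not the real cause. In this hypothetical sketch the two new parameters are collapsed into a single made-up `subsidiary_id` for brevity, and the ownership check ends up living only on the new code path.

```python
from typing import Optional

# Hypothetical ledger: reference ID -> (merchant ID, subsidiary ID).
LEDGER = {
    "REF-STEAM": ("steam", "eu"),
    "REF-EPIC": ("epic", "us"),
}


def verify_buggy(merchant_id: str, reference_id: str,
                 subsidiary_id: Optional[str] = None) -> bool:
    """Hypothetical regression: ownership is checked only on the new path."""
    record = LEDGER.get(reference_id)
    if record is None:
        return False
    if subsidiary_id is not None:
        # New path: merchant and subsidiary are checked together.
        return record == (merchant_id, subsidiary_id)
    # Old path: the merchant check was lost, so any existing ID passes.
    return True


def verify_fixed(merchant_id: str, reference_id: str,
                 subsidiary_id: Optional[str] = None) -> bool:
    """The merchant check always runs; the subsidiary check is the optional part."""
    record = LEDGER.get(reference_id)
    if record is None or record[0] != merchant_id:
        return False
    return subsidiary_id is None or record[1] == subsidiary_id
```

With this sketch, `verify_buggy("epic", "REF-STEAM")` returns `True` even though the transaction went to another merchant, while `verify_fixed("epic", "REF-STEAM")` returns `False`. Both functions behave the same when `subsidiary_id` is supplied, which matches what the team observed.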
So two morals of the story:
- When adding new optional parameters to an existing API, testing that the API still behaves correctly when they are omitted is just as important as testing that the new parameters work as expected. Yet the former is usually not covered.
- Sometimes, a breaking change is acceptable when maintaining backward compatibility becomes expensive.
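To make the first moral concrete, here is a minimal sketch of a test matrix that exercises a verifier both with and without an optional parameter. Everything in it is an assumption for illustration: the `verify` function, the `subsidiary_id` parameter, the `LEDGER` data, and the `contract_failures` helper are hypothetical and not the provider's API.

```python
from typing import Optional

# Hypothetical ledger: reference ID -> (merchant ID, subsidiary ID).
LEDGER = {
    "REF-A": ("merchant-a", "sub-1"),
    "REF-B": ("merchant-b", "sub-9"),
}


def verify(merchant_id: str, reference_id: str,
           subsidiary_id: Optional[str] = None) -> bool:
    """Hypothetical verifier with one optional parameter."""
    record = LEDGER.get(reference_id)
    if record is None or record[0] != merchant_id:
        return False
    return subsidiary_id is None or record[1] == subsidiary_id


# Each case: (merchant, reference ID, subsidiary or None when omitted, expected).
CASES = [
    ("merchant-a", "REF-A", None, True),      # old call style, own transaction
    ("merchant-a", "REF-B", None, False),     # old call style, foreign transaction
    ("merchant-a", "REF-X", None, False),     # old call style, unknown ID
    ("merchant-a", "REF-A", "sub-1", True),   # new call style, right subsidiary
    ("merchant-a", "REF-A", "sub-2", False),  # new call style, wrong subsidiary
    ("merchant-a", "REF-B", "sub-9", False),  # new call style, foreign transaction
]


def contract_failures(verify_fn) -> list:
    """Run every case against verify_fn and return the cases that disagree."""
    failures = []
    for merchant, ref, sub, expected in CASES:
        # Omit the keyword entirely to mimic callers written before the change.
        kwargs = {} if sub is None else {"subsidiary_id": sub}
        if verify_fn(merchant, ref, **kwargs) != expected:
            failures.append((merchant, ref, sub))
    return failures
```

Calling `contract_failures(verify)` returns an empty list for the verifier above. The first three rows are the ones that tend to be forgotten: they describe old callers who have never heard of the new parameter.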
Thank you for reading this.