The researchers started by using a single-agent reinforcement learning (SARL) algorithm, which allows for the use of feature information such as page views, trends on social media, and other data to generate an inventory replenishment model. They then moved on to a multi-agent reinforcement learning (MARL) algorithm, which more fully captures the interplay between retailer and supplier.
“For example, how would the supplier respond if the retailer places a large order during a lockdown?” Xin says. “The [MARL] model provides an estimation of lead time and fill rate—as in, lead time would increase by 30 percent and fill rate would drop by 20 percent if the order volume is twice the normal.”
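To make Xin's example concrete, here is a minimal Python sketch of the arithmetic behind it. It is not the researchers' MARL model. The function name `supplier_response`, the baseline lead time, the baseline fill rate, and the normal order size are all hypothetical, and the sketch assumes the quoted percentages are relative changes applied to those baselines.

```python
def supplier_response(order_units, base_lead_time_days, base_fill_rate,
                      lead_time_increase=0.30, fill_rate_drop=0.20):
    """Apply the quoted example adjustments to hypothetical baselines.

    Assumption: both percentages are relative changes (lead time x 1.30,
    fill rate x 0.80), as in the example for an order twice the normal size.
    """
    lead_time_days = base_lead_time_days * (1 + lead_time_increase)
    fill_rate = base_fill_rate * (1 - fill_rate_drop)
    units_received = order_units * fill_rate
    return lead_time_days, fill_rate, units_received


# Hypothetical baselines: 1,000 units per normal order, 10-day lead time,
# 90 percent of each order filled.
NORMAL_ORDER = 1_000
BASE_LEAD_TIME = 10.0
BASE_FILL_RATE = 0.90

# A normal order, with no adjustment applied.
normal_units = NORMAL_ORDER * BASE_FILL_RATE

# An order twice the normal size, with the quoted adjustments applied.
lead_time, fill_rate, doubled_units = supplier_response(
    2 * NORMAL_ORDER, BASE_LEAD_TIME, BASE_FILL_RATE
)

print(f"Normal order:  {normal_units:.0f} units in {BASE_LEAD_TIME:.0f} days")
print(f"Doubled order: {doubled_units:.0f} units in {lead_time:.0f} days")
```

Under these assumed baselines, doubling the order brings in about 1,440 units after roughly 13 days, compared with 900 units after 10 days: fewer units than the buyer asked for, arriving later.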
In all cases, the algorithms were not simply suggesting inventory replenishment strategies that humans could override. They were making final decisions.
Which came out ahead, the humans or the algorithms? The answer: both.
Algorithms, the researchers find, are optimal for managing slow-moving inventory—products that are important but may not have constant sales, as opposed to fast-moving items that sell every day. As Xin puts it, toilet paper is a fast-moving item, while a toilet brush is slow moving.
“It’s important to keep slow-moving items to offer customers more options,” Xin says. But such items contribute less to daily gross sales, and demand forecasting is time consuming. Managing this inventory with algorithms can free up human buyers to focus on the top-selling items.
Before the implementation of algorithmic product replenishment, Xin says, a single human Tmall buyer managed thousands of products. “It was a tedious process and easy to make mistakes, not to mention many of these products rarely had sales but still took a nontrivial amount of a human’s time,” he says. “Now [the number of products managed] is reduced by 70 percent, and all slow-moving products are managed by algorithms.”
Indeed, more than half of all Tmall products were being replenished by algorithms by the time the study ended in November 2021. The financial impact: for goods with $10 million in daily gross merchandise value, the use of SARL algorithms could improve the company’s bottom line by $19 million annually, while MARL algorithms could boost it by $31 million, the researchers argue.
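For a sense of scale, the short Python sketch below sets those annual gains against a full year of gross merchandise value. This framing is an illustration rather than a calculation from the study. It assumes the $10 million daily figure holds for all 365 days of the year, and the function name `gain_as_share_of_annual_gmv` is hypothetical.

```python
DAILY_GMV_USD = 10_000_000   # daily gross merchandise value from the article
DAYS_PER_YEAR = 365          # assumption: the daily figure holds year-round


def gain_as_share_of_annual_gmv(annual_gain_usd,
                                daily_gmv_usd=DAILY_GMV_USD,
                                days=DAYS_PER_YEAR):
    """Return an annual gain as a fraction of annualized GMV."""
    return annual_gain_usd / (daily_gmv_usd * days)


sarl_share = gain_as_share_of_annual_gmv(19_000_000)  # SARL estimate
marl_share = gain_as_share_of_annual_gmv(31_000_000)  # MARL estimate

print(f"SARL: {sarl_share:.2%} of annual GMV")
print(f"MARL: {marl_share:.2%} of annual GMV")
```

On that assumption, the estimated gains come to roughly 0.5 percent and 0.85 percent of annual gross merchandise value.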
What’s the key to algorithms outperforming in a crisis?
Typically, when large orders are placed, inventory levels rise and the out-of-stock product rate decreases. However, in this case, the products managed by humans had both higher orders and higher sellouts. The main reason for this, Xin says, is that lead time also increases for large orders during major disruptions. When all retailers need inventory, those who want more have to wait longer. This results in more product sellouts. Algorithms are better able to catch the fluctuation in lead times and fill rates during major disruptions and thus eliminate the human tendency to panic buy.
None of this means algorithms will be taking away all human buyers’ jobs anytime soon. Human communication with suppliers is still critical. For example, suppliers may share inside information that can’t be captured by an algorithmic model, and suppliers may find it “weird” if there is no person on the other side of a transaction, Xin says. Additionally, products that require frequent promotion and price changes can’t be managed by algorithm alone because this requires coordination with other business areas, such as marketing.