Explaining recommender systems to product owners
Understanding the benefits and potential biases (in media)
In my presentation at the Data Technology Seminar organized by the European Broadcasting Union, I focused on demonstrating that recommender systems can help public media organizations better fulfill their role in society and reduce content distribution biases.
First of all, I introduced Recombee, a company I co-founded with my fellow researchers and friends. We managed to bootstrap to 40 people in a couple of years. I must say that research is much more interesting and relevant when you can work with real users, especially in recommender systems, where evaluation on offline data is difficult.
From the beginning, we designed our product, a recommender system as a service, to be real-time, domain agnostic and scalable to any traffic. That is why we can easily serve very diverse online platforms such as job boards, e-commerce, real estate and marketplaces. In the media domain alone, you can also find quite a diverse set of online platforms.
In music streaming platforms such as Audiomack, you can recommend not only songs, but also artists or genres.
In social networks such as 9gag, you need to recommend multimedia content uploaded by users. New items flow in rapidly, so you need to be as fast as possible.
When it comes to video streaming platforms such as Prima or Showmax, you have to deal with different types of content, recommending series, video on demand or linear IPTV content with various restrictions.
For recommending news, there is another set of challenges, whether you work for a private or a public media organization. News homepages are probably the most watched scenarios, so you have to empower editors to interact with the recommender engine and give them tools to understand how their actions impact the audience.
Serving all these diverse scenarios is possible thanks to an abstraction layer that enables business rules to adjust the behaviour of the recommender engine for each recommendation request. These rules are typically constructed by product owners in the admin user interface of our service. Product owners in media typically have quite strong opinions on what the business rules should look like, but some of these rules can actually harm the user experience. That is why we work with product owners to give them more insight into what the recommender system is actually doing and how the audience is impacted by the different logics and restrictions encoded in business rules.
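To make the idea of per-request business rules more concrete, here is a minimal sketch in Python. It is not the Recombee API: the `Item` class, the `apply_rules` function and the toy catalog are all hypothetical and only illustrate how a filter and a boost could adjust the scores produced by a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Item:
    # Hypothetical item attributes used by the rules below.
    item_id: str
    genre: str
    age_hours: float


# A filter decides whether an item may be recommended at all;
# a booster multiplies the relevance score produced by the model.
Filter = Callable[[Item], bool]
Booster = Callable[[Item], float]


def apply_rules(scores: Dict[str, float], catalog: Dict[str, Item],
                keep: Filter, boost: Booster, count: int) -> List[str]:
    """Apply one request's rules to model scores and return the top items."""
    adjusted = {}
    for item_id, score in scores.items():
        item = catalog[item_id]
        if keep(item):
            adjusted[item_id] = score * boost(item)
    # Sort by adjusted score (descending), then by id to break ties.
    ranked = sorted(adjusted, key=lambda i: (-adjusted[i], i))
    return ranked[:count]


catalog = {
    "a": Item("a", "news", 2.0),
    "b": Item("b", "sport", 30.0),
    "c": Item("c", "news", 50.0),
}
model_scores = {"a": 0.5, "b": 0.9, "c": 0.7}  # invented scores

# Hypothetical homepage scenario: news only, fresh articles boosted.
homepage = apply_rules(
    model_scores, catalog,
    keep=lambda it: it.genre == "news",
    boost=lambda it: 2.0 if it.age_hours < 24 else 1.0,
    count=2,
)
```

In this toy example the item the model considers most relevant, `b`, is never shown because of the genre filter. That is the kind of side effect of a strict rule that product owners should be able to see.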
How to explain RS impact to product owners? And how can product owners measure the success of content delivery to users?
Real-time data dashboards with insights, which will soon be available in our admin interface, enable you to answer the following questions:
What content was recommended to users in the last hour?
What content is recommended in which scenario?
What content performs best overall?
What content performs well for a particular group of users?
For each scenario, you can monitor which content is recommended, and the dashboards can easily be configured to show, for example, the percentages of particular item segments.
Besides Recombee insights enabling real-time monitoring of your scenarios, we wanted to explain the behaviour of recommender systems at the granularity of individual users and items.
For this task, we have implemented and open-sourced a visual analytics tool called Repsys. This tool enables product owners to answer additional questions regarding their content, audience and recommender system:
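The segment percentages mentioned above come down to a simple aggregation over a log of recommendations. The following sketch assumes a hypothetical log of `(scenario, item_segment)` pairs; the function name and the data are invented for illustration and do not describe how the dashboards are implemented.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def segment_shares(log: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Return, per scenario, the percentage of recommendations in each segment.

    `log` holds one (scenario, item_segment) pair per recommended item.
    """
    counts = defaultdict(Counter)
    for scenario, segment in log:
        counts[scenario][segment] += 1

    shares = {}
    for scenario, segment_counts in counts.items():
        total = sum(segment_counts.values())
        shares[scenario] = {
            segment: 100.0 * n / total for segment, n in segment_counts.items()
        }
    return shares


# Invented recommendation log.
log = [
    ("homepage", "politics"),
    ("homepage", "politics"),
    ("homepage", "sport"),
    ("homepage", "culture"),
    ("article-detail", "sport"),
    ("article-detail", "sport"),
]
shares = segment_shares(log)
```

With this toy log, half of the homepage recommendations fall into the politics segment, while the article detail scenario recommends sport only.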
What content is similar based on user interaction patterns?
How do individual algorithms help users explore the content catalog?
How well do algorithms predict the right audience for a given content cluster?
Which algorithms perform best for a given group of users? And which are biased?
The tool was developed in a joint research laboratory of CTU FIT and Recombee mainly by Jan Safarik and Vojtěch Vančura.
It enables you to explore the item catalog based on the interaction similarity of items. When you explore academic datasets such as Good Reads, clusters of items are well separated and the content of each cluster is uniform (note that item attributes are not used when projecting items).
But when you project news articles from a real news portal, the clusters are not as clean. That is mostly because many news portals still have a static frontpage maintained by editors, and users interact with content that is displayed together on the frontpage. This way, there is interaction similarity between items that would not be similar in an environment powered exclusively by a good recommender system.
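As a rough illustration of what "interaction similarity" can mean, the sketch below computes the cosine similarity between two items from the sets of users who interacted with them, without looking at any item attributes. This is an assumption made for the example; the exact similarity and projection used in Repsys are not described here, and the item names and interactions are invented.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, Set, Tuple


def item_user_sets(interactions: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group (user, item) interactions into a set of users per item."""
    users = defaultdict(set)
    for user, item in interactions:
        users[item].add(user)
    return users


def cosine(a: Set[str], b: Set[str]) -> float:
    """Cosine similarity of two items represented as binary user sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Invented data: "war_report" and "weather" were shown side by side on
# a static frontpage, so the same users clicked both.
interactions = [
    ("u1", "war_report"), ("u1", "weather"),
    ("u2", "war_report"), ("u2", "weather"),
    ("u3", "war_report"),
    ("u4", "recipe"),
]
users = item_user_sets(interactions)
sim_frontpage = cosine(users["war_report"], users["weather"])
sim_unrelated = cosine(users["war_report"], users["recipe"])
```

The two frontpage articles end up highly similar although their topics have nothing in common, which is the effect that blurs the clusters on a real news portal.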
In media, popularity-based algorithms perform quite well, especially contextual bandits that are based not on overall popularity, but on the popularity of items for a particular scenario or group of users.
However, modern collaborative filtering methods promise better personalization capabilities.
When you analyze the performance of the popularity-based algorithm POP on a real audience, you can see that it performs worse for niche users.
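The difference between overall and contextual popularity can be sketched as follows. This covers only the greedy, per-context ranking part; a real contextual bandit would also explore, which is left out here. The class name, the contexts and the clicks are hypothetical.

```python
from collections import Counter, defaultdict
from typing import List


class ContextualPopularity:
    """Ranks items by click counts within a context (no exploration)."""

    def __init__(self) -> None:
        self.global_counts = Counter()
        self.context_counts = defaultdict(Counter)

    def observe(self, context: str, item: str) -> None:
        """Record one click on `item` made in `context`."""
        self.global_counts[item] += 1
        self.context_counts[context][item] += 1

    def recommend(self, context: str, count: int) -> List[str]:
        """Return the most clicked items for the context.

        Falls back to overall popularity for a context with no data.
        """
        counts = self.context_counts.get(context) or self.global_counts
        ranked = sorted(counts, key=lambda i: (-counts[i], i))
        return ranked[:count]


model = ContextualPopularity()
# Invented clicks from two user groups.
for item in ["match", "match", "match", "election"]:
    model.observe("sport_fans", item)
for item in ["election", "election", "election", "budget"]:
    model.observe("politics_fans", item)

for_sport = model.recommend("sport_fans", 1)
for_politics = model.recommend("politics_fans", 1)
for_unknown = model.recommend("new_visitor", 2)
```

Sport fans get `match` even though `election` is the most popular item overall; a visitor from an unseen context falls back to the overall ranking.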
This niche user segment is interested in reading more about Belarus's involvement in the Russian invasion of Ukraine.
You can actually select one particular user from this segment and observe the recommendations produced by the POP algorithm and by the EASE algorithm (collaborative filtering). Evidently, the EASE recommendations are much more relevant to the interaction history of the user.
What is even more interesting, when you compare these two algorithms in Repsys, EASE is not only much more accurate for niche users, it also recommends much more diverse and niche content to the dark blue user cluster on the right (those are cold-start users and users who interacted with popular content only).
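The contrast between POP and EASE for a niche user can be reproduced on a toy example. The sketch below uses the closed-form EASE solution (item-to-item weights obtained from the inverse of the regularized Gram matrix, with a zero diagonal) and NumPy. The interaction matrix, the item names and the regularization value `l2=1.0` are invented for illustration; this is not the Repsys or Recombee implementation.

```python
import numpy as np

ITEMS = ["top_story", "sports_final", "belarus_a", "belarus_b"]

# Invented toy interactions (rows = users, columns = items).
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)


def fit_ease(X: np.ndarray, l2: float = 1.0) -> np.ndarray:
    """Closed-form EASE: B[i, j] = -P[i, j] / P[j, j], with diag(B) = 0."""
    gram = X.T @ X + l2 * np.eye(X.shape[1])
    P = np.linalg.inv(gram)
    B = P / (-np.diag(P))  # divides column j by -P[j, j]
    np.fill_diagonal(B, 0.0)
    return B


def top_unseen(scores: np.ndarray, history: np.ndarray) -> str:
    """Return the best-scoring item the user has not interacted with."""
    masked = np.where(history > 0, -np.inf, scores)
    return ITEMS[int(np.argmax(masked))]


# A niche user who has read only one Belarus article.
niche_user = np.array([0, 0, 1, 0], dtype=float)

pop_scores = X.sum(axis=0)              # POP: overall interaction counts
ease_scores = niche_user @ fit_ease(X)  # EASE: personalized scores

pop_pick = top_unseen(pop_scores, niche_user)
ease_pick = top_unseen(ease_scores, niche_user)
```

On this data POP recommends `top_story`, the globally most popular item, while EASE recommends `belarus_b`, the article that co-occurs with the one in the user's history.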
To conclude:
Editors who decide which content to put on the frontpage based on its overall performance (in the last hour or so) act like bandit-based recommendation algorithms and discriminate against users who prefer niche content.
Good recommendation algorithms help users to explore niche and new content.
Recommender systems can be well controlled and their behaviour explained.
The Repsys tool for evaluating recommendation algorithms will enable RS audits in the future. Please contribute and give us feedback on how it works in your environment.
We aim to build a trustworthy and explainable recommender engine. Let us know if you would like to demo it on your website.