#2 – Sex built on recommendations


How did you get recommendations about sex? In this episode, we look in detail at porn recommendations as a special case that can help all of us start or improve our sex life.


Hi everyone,

You are listening to Unexpected Data, the podcast that explores the world of Data Science without any taboos. I’m Yudan Lin, your host from Vienna, Austria.

When you have a question, it is easy to ask your family and friends. But when it comes to sex, it can be hard to bring it to the table. In some cultures, even a simple hug is associated with strong physical intimacy.

So when they start their sexual journey, many people trust the internet to learn more. Sometimes you are even pushed towards very explicit and graphic images just by typing a few innocent words into your search engine.

As you have already guessed, in today’s episode we will dive deeper into sex. If you are new to this season, I recommend episode 1, about how the industry surrounding sex is adapting to our changing needs while we keep our distance from each other during the coronavirus pandemic.

Warning: There will obviously be some content about sex, which may not be suitable for all listeners.


Before starting this episode, I would like to present a new section called ‘My Story’.

In our modern society, breaking our own isolation and bringing more connections to each of us has never been more important.

Unexpected Data has therefore dedicated this whole section to individual stories.

Each episode tells a personal story, which allows us to understand the singularity of each person and the different cultures that surround us.

Without judgment and sometimes without a name, these storytellers support us in being aligned with ourselves and with the data world that we live in.

So if you have an idea or want to share your story: contact us at hello@unexpecteddata.com


When it is available, institutional sex education is most of the time delivered in schools (1). From my experience, teachers tell us about the anatomy of our bodies and teach us how to prevent the potential risks associated with sexual relationships. As it is not usual to talk about it, you can picture the scene: agitated, giggling teenagers drowning out the serious professional lecture.

But what about how to behave in an intimate situation with your partner? What practical advice or recommendations are there for such a situation?

At a time when the media sexualizes everything, we still turn to the internet to learn. Google knows everything, right?

And a few keystrokes or clicks away, we are exposed to porn.

The word “pornography” comes from a Greek combination of “writing” and “prostitutes” (2). So it all started with drawings, and the most famous collection of them is India’s Kama Sutra. With the development of art in the 19th century, porn also became the subject of pioneering ‘art studios’ in Paris. Then, when the technology became more affordable, this type of art moved to VHS. Some even say that the porn industry played a crucial role in the victory of JVC’s VHS over Sony’s Betamax. Then, progressively, sexual content reached TV, where it was, or still is, shown on Sunday evenings. Now it is all over the internet. Some even argue that the appetite for porn helped to democratize faster connections and to develop online technologies and businesses (3).

A big player in this field is MindGeek. Does it sound familiar?

On the company website (4), they present themselves as the leading technology company in web design, IT, web development and SEO. But they are probably best known as the owner of PornHub and several other top adult content websites.

And as a data scientist at MindGeek, you will not watch porn all day long. As in any tech company, data scientists are asked to draw insights from billions of data points to improve business operations and user experience. According to a MindGeek job posting (5), the Data Science projects include topics like content recommendation engines, ad bidding systems, credit card fraud detection, and computer vision tasks. So, when it comes to a data career, there is more out there than the GAFAM, meaning Google, Amazon, Facebook, Apple and Microsoft. For sure, there may be some contact with sexual content, as the data in PornHub’s yearly reports can show.

That said, what is a recommendation engine or system?

When the number of offers and the competition are high, we consumers are overwhelmed. Not only are companies fighting for our attention, but we are also paralyzed by, or tired of, so many choices. The content or product that really fits us doesn’t emerge from this ocean of noise. This is what the psychologist Barry Schwartz called the paradox of choice (6).

Finding a balanced and satisfying equilibrium for all of us is possible. Recommendation engines offer us a reference point, guidance and a kind of semi-personal assistant. By better understanding the interests and needs of consumers through data, companies can show the right product or service to each specific consumer. On top of that, analyzing the tracked data supports the iterative improvement of their offers.

Amazon democratized the basics of recommendation engines back in the last century with the famous ‘You may also be interested in’ section. Those recommendations are certainly very useful for managing product stock, bringing in more money by selling more, and generating more traffic by catching your attention.

With the rise of social media, these tools were fine-tuned with the accumulation of likes and other data from billions of people who have ‘nothing to hide’. Facebook now has one of the biggest closed advertising platforms, which uses this type of engine to decide what does or does not go into your feed. And when it comes to sex, MindGeek also listens to its visitors and members with surgical precision. All your viewing behaviors (pause, playback, drops, downloads), your navigation behaviors (search keywords, pageviews, etc.) and your interactions (comments, likes, etc.) are tracked and analyzed to refine their products and services.

To keep it simple, there are 3 main algorithms or rules that drive a recommendation engine (7):

The first one is based on the concept of popularity. For example, it could rank the sex videos that are most clicked or viewed according to some criteria, so you may be offered the top 10 videos in your country. This means that all users are shown the same content. To get more clicks or views, eye-catching content is given prominence, and it may not be aligned with the way people want to have sex at that specific moment of their life.
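For those following along in the transcript, here is a minimal sketch of the popularity idea. It is only an illustration: the view log, the country codes, the video ids and the `top_videos` function are all made up for this example and do not describe how any real platform works.

```python
from collections import Counter

# Hypothetical view log: one (country, video_id) pair per view. All data is made up.
views = [
    ("AT", "video_a"), ("AT", "video_b"), ("AT", "video_a"),
    ("AT", "video_c"), ("AT", "video_a"), ("AT", "video_b"),
    ("FR", "video_c"), ("FR", "video_c"), ("FR", "video_a"),
]

def top_videos(view_log, country, n=10):
    """Return the n most viewed video ids for one country."""
    counts = Counter(video for c, video in view_log if c == country)
    return [video for video, _ in counts.most_common(n)]

at_top = top_videos(views, "AT", n=2)
fr_top = top_videos(views, "FR", n=2)
print(at_top)
print(fr_top)
```

Notice that nothing about the individual visitor enters the function: everyone in the same country gets the same list, which is exactly the limit of this first approach.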

The second one is based on your profile and your video history. This algorithm assumes that the videos you have watched will also be watched by others with similar content preferences or similar socio-demographic characteristics. This means that a profile and a watchlist are created for each user before any recommendation is made. Assuming that the data is representative, likelihood models are used to guess which profile or watchlist will fit a first-time visitor to the platform. With time, a longer history of digital activity and improvements to the algorithm, this profile and watchlist become more and more accurate.

With this second algorithm, diversity and variety of content may never surface upfront, and visitors may need to search for it themselves. One good example is the rise of amateur content as a top searched category in PornHub’s 2019 report. In the long term, we may all be suggested the same content, dictated by the majority of likes. Sex beginners may sometimes even think that this is the new normal of sexual relationships.
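To make this second approach more concrete, here is a small sketch under simplifying assumptions. Instead of the likelihood models and socio-demographic profiles mentioned above, it uses a plain overlap measure (Jaccard similarity) between watch histories. The user names, video ids and function names are hypothetical.

```python
# Hypothetical watch histories: user -> set of video ids. All data is made up.
histories = {
    "user_1": {"video_a", "video_b", "video_c"},
    "user_2": {"video_a", "video_b", "video_d"},
    "user_3": {"video_e"},
}

def jaccard(a, b):
    """Share of videos two users have in common (0 = nothing, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_from_similar_users(user, all_histories, n=5):
    """Score unseen videos by the similarity of the users who watched them."""
    seen = all_histories[user]
    scores = {}
    for other, other_seen in all_histories.items():
        if other == user:
            continue
        similarity = jaccard(seen, other_seen)
        if similarity == 0:
            continue  # users with nothing in common contribute nothing
        for video in other_seen - seen:
            scores[video] = scores.get(video, 0.0) + similarity
    # Highest score first, then by id so the order is stable.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return ranked[:n]

suggestions = recommend_from_similar_users("user_1", histories)
print(suggestions)
```

In this toy data, user_3 shares nothing with anyone and therefore gets no suggestion at all, and user_1 is never shown video_e. That is a small-scale picture of both the first-time visitor problem and the lack of diversity described above.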

Last but not least, the third one is based on the content itself. Say you have watched several videos on PornHub and their only common point is the performer. In this case, the recommendation algorithm relies on the basic assumption that you watched all of them because of this specific performer. So, with this understanding of the user’s mindset, it will suggest all the videos in which your favorite performer appears. This implies that each video carries metadata (metatags or keywords) in which the performers are listed.
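The content-based idea can be sketched in a few lines as well. This is a deliberately naive version that assumes each video has a clean set of tags. The catalogue, the tag names and the `recommend_by_common_tags` function are invented for illustration.

```python
# Hypothetical catalogue: video id -> metadata tags (here, performer names only).
catalogue = {
    "video_a": {"performer_x", "performer_y"},
    "video_b": {"performer_x"},
    "video_c": {"performer_x", "performer_z"},
    "video_d": {"performer_z"},
    "video_e": {"performer_x"},
}

def recommend_by_common_tags(watched, tags_by_video):
    """Suggest unseen videos that carry every tag shared by all watched videos."""
    watched_tags = [tags_by_video[v] for v in watched]
    if not watched_tags:
        return []
    common = set.intersection(*watched_tags)
    if not common:
        return []  # no shared tag, so this simple rule has nothing to go on
    return sorted(
        video for video, tags in tags_by_video.items()
        if video not in watched and common <= tags
    )

picks = recommend_by_common_tags(["video_a", "video_b", "video_c"], catalogue)
print(picks)
```

The rule only works as well as the tags do: if a performer is missing from the metadata of a video, that video can never be suggested this way.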

Without perpetual monitoring and maintenance of these algorithms, there is a permanent risk of sinking even deeper into a single sex norm, or of proposing an offer that doesn’t grow with the visitor’s interests and life.
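One simple thing such monitoring could track is how much of the catalogue ever reaches a recommendation list. The sketch below computes this catalogue coverage on a made-up snapshot. The metric choice, the data and the function name are my own assumptions for illustration, not something taken from the sources of this episode.

```python
def catalogue_coverage(recommendations_by_user, catalogue_size):
    """Fraction of the catalogue that appears in at least one recommendation list."""
    recommended = set()
    for videos in recommendations_by_user.values():
        recommended.update(videos)
    return len(recommended) / catalogue_size

# Hypothetical snapshot of what three users were shown, from a catalogue of 10 videos.
snapshot = {
    "user_1": ["video_a", "video_b"],
    "user_2": ["video_a", "video_b"],
    "user_3": ["video_a", "video_c"],
}

coverage = catalogue_coverage(snapshot, catalogue_size=10)
print(coverage)
```

A value that stays low or keeps shrinking over time would be one warning sign that everyone is being steered towards the same few videos.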

The different websites, and certainly the recommendation algorithms, can also be improved by linking these metatags with users’ filtering preferences and search keywords.

As THE best-known porn content provider, MindGeek can even set the tone, trends and innovations in sexual behaviors. One good example of data-powered content success is the Netflix show ‘House of Cards’. It was released without any pilot and matched its viewers’ preferences in terms of actors and storyline. So for sex content, we can imagine the potential of combining their data with other sources, from the GAFAM to telecommunications services.

So think about your data the next time you log in with Facebook or Gmail and their unified systems.

It is well known that peer recommendations are more valuable to us than those of a machine. So direct feedback in PornHub comments, from sextech competitors, or on potential review websites like Rotten Tomatoes for movies could offer a self-regulating recommendation system. It is also important to note that the advice or links shared by your ‘sex expert’ friend can push you to act more than any recommendation engine can, at least for now.

It goes without saying that sexuality is about more than just sex. So from my point of view, it is the responsibility of data experts and, by extension, product experts to provide enough diversity, inclusion and variety of content about the way people have sex and feel confident about the singularity of their body. And more communication about it supports self-empowerment, informed decisions and better health, with lower STD infection rates (1). Fortunately, there is a growing variety of offers to cover it.

To accelerate their business and collect data, it seems that sextech players smaller than PornHub have no choice but to partner with tech giants (8). I am thinking of Google, which offers tools to understand and target users. And even if they can create their own data science ecosystem, the expectations in terms of ethics, confidentiality, integrity and use of the data are certainly very high and must be taken into account in the building phase.

In this race to use user data more efficiently and to guide choices with recommendation engines, some questions are still open: What vision of sex do we want to achieve together? What level of dialogue can be reached between companies and users when it comes to sex? One thing is sure: the challenge for anyone in the digital world is to understand the different advantages of recommendation engines and feedback mechanisms, and to adapt their strategy to be more aligned with their users and to become a reference in their domain.

Congratulations! By listening to this entire episode, you are actively breaking the long-lasting taboo of having a single sex norm. At the same time, you are helping to raise awareness about the application of Data Science in your daily life.

If you like this show, I’ll be delighted if you head over to Podchaser and leave a review. And do let us know what you like in the show, and in particular what you find really useful or entertaining. Let us also know what we could change, as it will help grow this podcast.

Find out even more about Sex and Data Science in our next episode.

If the waiting time is unbearable, I invite you to follow us and contribute on UnexpectedData.com or wherever you listen to your podcasts.


  • Unexpected Data Podcast is a creation and production of Yudan LIN.
  • Music: Focus by A. A. Aalto is licensed under an Attribution-NonCommercial 3.0 International License.
  • Image: Engin Akyurt

2019-2022 © Unexpected Data. All rights reserved.