Tuesday, November 5, 2019

The Privacy Project: How personalization has failed us

Curation by algorithm hasn't lived up to expectations.



Charlie Warzel is off this week, so he's turning the newsletter over to his colleague Thorin Klosowski.


By Thorin Klosowski

Do you remember the moment you first tried a streaming service like Pandora? It felt like magic. You could type in a band name — nearly any band name — and Pandora created a radio station of similar artists. It was a new era of machine curation, where you could skip the dusty record stores and have new media funneled straight into your brain. Now everything is so curated that it's difficult to find content that's truly surprising.

Recommendation algorithms (also called curation algorithms) have been a staple of online services for decades. These formulas operate on a simple premise: They collect data about your habits, compare that data with other people's habits, then recommend items based on the overlap. Allowing media companies to track your viewing, reading and listening habits seems like it offers a clear trade-off: Agree to the surveillance and you get exposed to new media.
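To make that premise concrete, here is a minimal sketch in Python of the collect, compare and recommend steps. It is an illustration only, not any company's actual system: the users, band names and play counts are invented, and cosine similarity is just one common way to compare two people's habits.

```python
from math import sqrt

# Hypothetical listening data: user -> {item: play count}. All names are invented.
HABITS = {
    "ana": {"band_a": 5, "band_b": 3, "band_c": 1},
    "ben": {"band_a": 4, "band_b": 2, "band_d": 5},
    "cal": {"band_c": 4, "band_e": 5},
}


def cosine(a, b):
    """Cosine similarity between two users' play-count dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[item] * b[item] for item in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, habits, top_n=3):
    """Rank items the user has not tried, weighted by how similar each other user is."""
    mine = habits[user]
    scores = {}
    for other, theirs in habits.items():
        if other == user:
            continue
        similarity = cosine(mine, theirs)
        for item, count in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + similarity * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ana", HABITS))
```

In this toy data, "ana" overlaps most with "ben," so the band only "ben" has played ranks first. Nothing outside the habits of similar listeners can ever be suggested, which is the limitation the rest of this newsletter is about.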

The modern era of recommendation algorithms kicked off somewhere around 2010 with the launch of Facebook's now-defunct "Instant Personalization" feature. It gathered your data from different services into one package that advertisers could use to market to you. In exchange, you'd get better recommendations for everything, from restaurants to music. Instant Personalization was one of Facebook's first big privacy overreaches, and we later learned it gave third parties like Spotify, Apple and Amazon access to more data than people originally agreed to. Now all kinds of businesses take that data and package it for advertisers.


Behind every "you might also like" recommendation is an algorithm built on data you've provided. This includes the obvious stuff, like your viewing or listening history, but it may also factor in your age, location or gender. These algorithms are all a little different. For example, Netflix considers some surprising factors, like the time of day, the devices you use and how long you tend to watch. Spotify builds its recommendations by logging what you listen to, funneling that through a genre classification system, then pulling in songs from playlists from other users with similar tastes.

Spotify's complicated algorithm struggles to push the boundaries of your own habits. Listen to a track from Nine Inch Nails and you'll get more Nine Inch Nails on your algorithm-generated Discover Weekly playlist. Maybe it'll toss in something similar sounding, but it's just as likely to throw in a random pop song from the '90s. If you go too off course and listen to a jazz playlist followed by some metal, the whole thing breaks down and you're served up a nonsensical playlist for a week. Even in the best-case scenario, the experience is transactional, and without the thrill of self-discovery — part of the appeal of seeking out new media — the recommendations feel cold.

Movie streaming services often use the simple-seeming "people who watched this also watched" algorithm. These formulas are as likely to recommend something new as they are to solidify a stereotype. Netflix does this all the time, and as The Outline points out, it's terrible at pushing you toward new movies. Most likely because of the feedback loop of movie availability, Netflix always seems to recommend Netflix-produced movies. Netflix at least provides a way to sort movies alphabetically by genre. If you've never dialed all the way down to the A-Z listings by genre in Netflix (or anywhere else), I highly recommend it. Almost every time, I find great movies the algorithms ignore.
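A rough sketch of how a "people who watched this also watched" list can be built, assuming nothing more than a set of titles per viewer. The titles and viewing histories below are invented, and real services almost certainly layer much more on top of simple counting like this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical viewing histories, one set of titles per viewer.
HISTORIES = [
    {"movie_a", "movie_b", "movie_c"},
    {"movie_a", "movie_b"},
    {"movie_a", "movie_d"},
    {"movie_b", "movie_c"},
]


def build_also_watched(histories):
    """Count, for each title, how often every other title shares a viewer with it."""
    pairs = {}
    for history in histories:
        for x, y in combinations(sorted(history), 2):
            pairs.setdefault(x, Counter())[y] += 1
            pairs.setdefault(y, Counter())[x] += 1
    return pairs


def also_watched(title, pairs, top_n=2):
    """Return the titles that most often co-occur with the given title."""
    return [other for other, _ in pairs.get(title, Counter()).most_common(top_n)]


pairs = build_also_watched(HISTORIES)
print(also_watched("movie_a", pairs, top_n=1))
```

Because the ranking is pure co-occurrence, a title only one person has watched (here, "movie_d") almost never rises to the top, while widely watched titles keep getting recommended and watched again. That is one way the feedback loop described above can form.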

If you want to see the outcome of too many algorithms, just head over to any Amazon product page, where you'll get blasted with categories like "Inspired by your recent shopping trends," "Sponsored products related to this item," "Frequently bought together" or "Customers who viewed this item also viewed." Amazon loves to use popularity-based algorithms, where whatever's selling well in the moment gets pushed to the top. This has the unexpected effect of sometimes pushing fringe ideas into the spotlight. These popularity-based suggestions don't improve the shopping experience either, because popularity often has nothing to do with quality.


Our tastes are rarely simple enough for an algorithm to make sense of. You may love the Beatles but hate the Rolling Stones. You might despise William Gibson but adore Neal Stephenson. You may be a fan of "Deep Impact" or "Armageddon," but (probably) not both. Our preferences are often arbitrary, disconnected from the logic an algorithm requires to function. Over at Vox, Kyle Chayka discusses this in regard to fashion, asking, "How good of a tastemaker can a machine ultimately be?" These recommendations rarely go past the boundaries of what we already know and enjoy, and that's not usually how we find our next favorite thing. There's no risk factor with curated media.

Who cares if Amazon doesn't seem capable of recommending a book I want to read? Or if Netflix can't suggest a movie to watch? It's about where the data goes. If the data is sold to data brokers, it's used to create a broader image of us for advertisers. This trade-off is easier to stomach when there's a benefit, but if you don't enjoy the curation, it's just free data for companies to exploit.

It would be nice if algorithms were more transparent, so we could see why they make certain recommendations. Spotify and Netflix do a good job explaining how their systems work, and others could follow suit. It's also time to give us more control over this data and where it goes. Spotify allows you to opt out of third-party ads, but it could do more to empower you to control the flow of your data. In the best scenario, we could opt out of these systems completely, asking companies to delete the data if we don't want the curated recommendations.

Send me your thoughts at privacynewsletter@nytimes.com. Your responses may be shared in an upcoming edition of this newsletter.


From the Archives: Recommendation Algorithms in the '90s

Concerns over privacy and algorithms aren't new. I found mentions dating back to 1999 of people raising privacy issues:

Some users may feel their individuality challenged by the idea that others share their preferences; that something as personal as taste can be rendered in cold, hard algorithms. But the greatest concern is privacy.
"This is a major issue that deserves public discussion," Professor Maes said. The more an agent knows about you, the more useful it can be.
But while some shopping bot companies promise not to sell information to other companies, others give away their software and plan to make money from selling the information they gather on shopping patterns.

That same article goes on to point to an early proposal to protect privacy:

Firefly hired an accounting firm to certify its privacy practices, and Professor Maes was an early backer of proposed industry standards for privacy. The so-called Platform for Privacy Preferences protocols are under discussion by the nonprofit World Wide Web Consortium and enjoy the support of Microsoft, MCI WorldCom, AT&T, I.B.M., Netscape, Oracle and other companies.
But even if these standards — which involve encrypting information and obtaining permission from consumers to transfer personal information — become accepted, online shoppers may not feel comfortable.

The goal of that project was to shift the responsibility of privacy away from the user and over to the browser, but a variety of issues kept it from taking off.


The Privacy Project newsletter from Oct. 29 misidentified the founding father who used the pseudonym Vindex the Avenger. It was Samuel Adams, not John Adams.
