Friday, January 29, 2010

'Census' of BitTorrent files: 99% likely infringing

Remember when we were told how peer-to-peer networks would be used for benevolent purposes, like making available the King James Bible, the works of Shakespeare, and The Odyssey? (See n.3.) Well, not so much. From a "census" of files available on BitTorrent conducted by Princeton University student Sauhard Sahi and Professor Ed Felten, a frequent critic of the entertainment industry and its copyright enforcement efforts:
Overall, we classified ten of the 1021 files, or approximately 1%, as likely non-infringing. This result should be interpreted with caution, as we may have missed some non-infringing files, and our sample is of files available, not files actually downloaded. Still, the result suggests strongly that copyright infringement is widespread among BitTorrent users.
Valuable information to keep in mind while debating net neutrality rules and ISPs' right to manage their networks and fight piracy.

See my post responding to Copycense.


  1. Good to see this kind of work done on a scientific level. I thought the part about video was particularly damning:
    "By this definition, all of the 476 movies or TV shows in the sample were found to be likely infringing."

  2. "and ISPs' right to manage their networks and fight piracy. "

    I'm sorry but an ISP has zero right to fight piracy. That is law enforcement's job. At most it is the job of the ISP to identify infringing content and provide that information to authorities.

  3. @Matt:

    That's absurd, and I'd be interested in any legal support you have for your statement that "an ISP has zero right to fight piracy. That is law enforcement's job." The vast majority of copyright enforcement is done by private parties -- not "law enforcement."

  4. @Ben:

    It is as much the ISP's right/responsibility to fight piracy as it is the US Postal Service's to check every letter and package to ensure that it does not contain any copyrighted material.

  5. @Mark:

    That may be your personal opinion, but it isn't the law. DMCA Section 512(i) makes clear that all of the DMCA safe harbors -- including the Section 512(a) safe harbor for "Transitory Digital Network Communications" systems (i.e., ISPs) -- apply only if the ISP has "adopted and reasonably implemented, and inform[ed] subscribers and account holders of the service provider’s system or network of, a policy that provides for the termination in appropriate circumstances of subscribers and account holders of the service provider’s system or network who are repeat infringers."

  6. I wonder if this study considered out-of-print works and live concert recordings to be infringing. I know some would debate that it doesn't make a difference, as these works still infringe, but to me it is an important distinction.

    Overall, these findings do not surprise me as they only considered the "trackerless" variant of bittorrent, which is totally unregulated. There are whole bittorrent communities/trackers dedicated to the sharing of works that are not commercially available, and many of them are quite efficient in enforcing that rule. They're certainly the minority, but are an example of how bittorrent can be productively employed.

    Are 99% of the works checked out in their little survey pron? I doubt that they are. And if not, then it is unrepresentative of the content available through peer to peer.

    Sorry to burst your tiny little thousand-file bubble.

  8. Anon at 11:37

    Doubtless there are "tons" of files available as torrents that are perfectly legit. Unfortunately, even a casual review of BT sites quickly shows that the largest number of downloaded files are not legit.

    P2P is a very useful means by which to transfer files, but its usefulness is in significant measure overshadowed by those who use it as nothing more than a way to cop a freebie.

    Of course, I assume by your post here that you are not one of those who do misuse P2P.

  9. Matt/Mark/Others of similar mind,

    ISPs want to limit P2P traffic for economic reasons, not some vague sense of corporate citizenry or moral obligation. The enormous bandwidth usage by a small number of users: 1) increases costs, 2) limits usage by other paying users, and 3) creates animosity between the generators of media and the conduit through which people access that media. In other words, they're bad for business.

    I am highly critical of some of the decisions that the recording industry has made regarding technology and piracy, even the one-size-fits-all pricing ($0.99/track for Men at Work, seriously? I can buy the whole CD for $0.25 at a yard sale). But we live in a capitalistic society where companies should be as free as reasonably possible to make the decisions that will make or break them. We only have the right to convince them that it is unwise, and critique and second-guess them at every turn...

  10. 99 percent seems like a perfectly reasonable estimate to me - Check out the volume at sites like The Pirate Bay for television shows that came out this week, brand new movies and pornography. There is no question that all of that is protected material, and it dwarfs what is available in either the questionable category (Abandonware, nostalgia stuff like old Nickelodeon game shows, etc.) or legal range. This is also referring to only what is available, and not what people are actually downloading.

    Actually, there is a question whether all of it is 'protected material'. I have my own recording studio, and we often upload torrents of music that we have recorded - and since my customers write their own music and lyrics they own 100% of the IP. I also have friends who are producing Web television shows, and who do the same. We usually use the 'trackerless' Pirate Bay for distribution.

    So anyone who claims all audio/video is infringing is lying.

    Now a percentage of torrents does infringe. What that percentage is, is open to debate.

  12. FYI - I'm writing a series on the biggest copyright infringers:

    I don't expect the series to make anyone very happy, but I intend to continue.

  13. The Mad Hatter -

    Nobody is claiming that "all audio/video is infringing." However, when the number exceeds 99%, which I totally believe, it is no longer "open to debate."

    By the way, having reviewed your articles allegedly exposing "the biggest copyright infringers," I notice one major omission -- the average internet user. Where is your article on that segment of the population, the one who feels entitled to pilfer the work of others without permission or compensation?

  14. I wonder how much focusing on the trackerless torrents skews the data. With court decisions making it difficult to get away with hosting torrent trackers, it seems like there's a big incentive to move infringing works to trackerless torrents. I use non-infringing torrents all the time (mostly linux distros), but I don't know how to use trackerless torrents.

    If I had to guess, I'd say a lot more than 1% of files available are non-infringing, but that a lot less than 1% of torrent traffic would be non-infringing.

    Anon at 11:37, 14% of the files were pornography, and they counted only one of the files as non-infringing.

  15. Nobody is claiming that "all audio/video is infringing." However, when the number exceeds 99%, which I totally believe, it is no longer "open to debate."

    You should check out the Usenet case (and I'm guessing Ben can link to the summary judgment opinion if he wants to). An algorithm was set up by a statistics expert to generate a random selection of Usenet content, and then the random couple thousand files were reviewed by a copyright expert who found something like 94% to be confirmed or highly likely infringing. And I believe that of the other 6%, the majority were simply files whose contents or status they couldn't determine--I think as far as likely non-infringing, they also found it was less than 1%. The expert reports were upheld by the Court as valid. So, the BitTorrent survey is not surprising.

  17. Much of the content mentioned was protected by Digital Rights Management (Technical Protection Measures is another name used). Are you trying to tell me that Digital Rights Management is ineffective?

    OK, OK, I'm joking. We know that Digital Rights Management is a bad joke, which only punishes honest buyers. Digital Rights Management is anti-consumer. I tend to be damned careful about what I buy, and avoid anything that uses Digital Rights Management whenever possible.

    Unfortunately it isn't always possible to avoid it. A while back I bought a copy of 'Animal House', a low brow comedy. Every time I want to watch it, I have to sit through five minutes of commercials for movies that were in the theaters god knows how many years ago, and which I'll probably never see unless they pop up in the remainder bin at HMV.

    There's no way to avoid watching the commercials. I would have been better off pirating it from that point of view, but hey, I'd like to think that my purchase put a little bit more money in Stephen Furst's pocket.

    At least the TV shows I buy on DVD don't usually have commercials, which reminds me that I have to order the Moccasin Flats DVD set (highly recommended Canadian TV show - adults only though, there's a fair bit of nudity and swearing).


Comments here are moderated. I appreciate substantive comments, whether or not they agree with what I've written. Stay on topic, and be civil. Comments that contain name-calling, personal attacks, or the like will be rejected. If you want to rant about how evil the RIAA and MPAA are, and how entertainment companies' employees and attorneys are bad people, there are plenty of other places for you to go.