Ibosiola et al. (2019b)
Source Details
Ibosiola et al. (2019b) | |
Title: | A Large-Scale Empirical Analysis of DMCA Notices and Online Complaints |
Author(s): | Ibosiola, D., Castro, I., Stringhini, G., Uhlig, S., Tyson, G. |
Year: | 2019 |
Citation: | Ibosiola, D., Castro, I., Stringhini, G., Uhlig, S. and Tyson, G. (2019) A Large-Scale Empirical Analysis of DMCA Notices and Online Complaints. Available: http://www.eecs.qmul.ac.uk/~tysong/files/Lumen-DMCA.pdf (last accessed: 11 June 2019) |
Link(s): | Open Access |
Key Related Studies: | |
Discipline: | |
Linked by: |
About the Data | |
Data Description: | The study extracts data from complaints filed in the Lumen database in 2017, totalling over one billion URL complaints from over 30,000 senders. Metadata were then extracted from the complaints and used for website categorisation, liveness checks, and webpage probes that test whether HTML is mirrored elsewhere. |
Data Type: | Secondary data |
Secondary Data Sources: | |
Data Collection Methods: | |
Data Analysis Methods: | |
Industry(ies): | |
Country(ies): | |
Cross Country Study?: | No |
Comparative Study?: | No |
Literature review?: | No |
Government or policy study?: | No |
Time Period(s) of Collection: |
|
Funder(s): |
Abstract
“Under increasing scrutiny, many web companies now offer bespoke mechanisms allowing any third party to file complaints (e.g., requesting the de-listing of a URL from a search engine). Whereas this self-regulation might be a valuable web governance tool, it places huge responsibility within the hands of these organisations. We argue that this demands close examination. We present the first large-scale study of web complaints (over 1 billion URLs). We find a range of complainants, largely focused on copyright enforcement (DMCA). Whereas the majority of organisations are occasional users of the complaint system, we find a number of bulk senders specialised in targeting specific types of domain. We identify a series of trends and patterns amongst both the domains and complainants. By inspecting the availability of the domains, we also observe that a sizeable portion go offline shortly after complaints are generated. This paper sheds critical light on how complaints are issued, who they pertain to and which domains go offline after complaints are issued.”
Main Results of the Study
The study finds that the overwhelming majority of web complaints are DMCA notices (98.6%). Most of these notices come from a small but active group of complainants, with the top 10 senders accounting for 41% of all notices generated. These include large copyright holders (e.g. Fox), trade organisations (e.g. BPI) and specialist third parties (e.g. Rivendell).
Certain categories of website are more likely to be reported than others: file sharing sites, blogs and adult entertainment sites are reported more regularly than, for example, education or marketing websites. Many of the most frequently reported domains are obscure and not highly popular, with just 3% ranking within Alexa’s Top 1M.
The notice system appears to be effective, with 22% of reported URLs becoming inaccessible within 4 weeks. In this respect, specialist third-party complainants have more success than trade organisations, averaging a 53% versus 8% success rate in shutting down websites within a week.
The study also finds patterns of “cat and mouse” behaviour between complainants and website hosts. Complainants increasingly report URLs in bulk; one sender, for example, reported over 17 million URLs for a file sharing website that hosted only 2 million pages. In response, website hosts create “replica” websites with multiple domain names and IP addresses, or use “unblocking” websites.
Policy Implications as Stated By Author
The study encourages ongoing transparency in the reporting of takedown processes, particularly as the current system is open to misuse through repeat notices and auto-generated URLs. It suggests that automation could improve and streamline the process through filtering of invalid complaints, identification of legitimate complaints, and a means of recourse for website developers.
Coverage of Study
Datasets