In the digital economy, companies are waging a fierce war over data, with data ownership and scraping frequently at the centre of disputes worldwide. In the recent case of hiQ v LinkedIn (2019), the US Supreme Court vacated the Ninth Circuit’s judgment, which had upheld the preliminary injunction granted by the district court, and remanded the case to the Ninth Circuit for further review. A noteworthy incident of public data scraping in the US, the case raises the question of whether social or content platforms ought to be able to bar other operators from scraping (via web crawlers or other means) and using their publicly accessible user data.
Data analysis company hiQ used web crawlers to scrape public personal data such as users’ names, positions, work experience and expertise from LinkedIn, algorithmically analysed the data and sold the results to various employers. LinkedIn served hiQ with a cease-and-desist letter, demanding that it stop accessing and copying data from LinkedIn’s servers, and asserting that hiQ’s data scraping violated federal and state laws including the Computer Fraud and Abuse Act (CFAA). LinkedIn also took a series of technical steps to block hiQ from accessing its website.
For its part, hiQ filed suit against LinkedIn and sought a preliminary injunction prohibiting LinkedIn from employing any legal or technical methods to prevent hiQ from accessing public data. The injunction was granted by the district court on the grounds that publicly accessible data on LinkedIn did not fall within the scope of the CFAA, and that hiQ’s data scraping did not constitute “access without authorisation”. The court further reasoned that giving companies with massive user databases a free hand to decide who can access and use their data may lead to unfair competition.
The remand and the subsequent substantive trial of the case give the courts an opportunity to further determine the ownership of public data and the legal boundary of data scraping.
Data ownership is the starting point, as well as the focus, of all current disputes involving public data scraping. These data are generated by users, collected and controlled by the platforms, and made accessible to the public. So who ultimately owns them – the user, the platform, or the public? There is no consensus. It has also been suggested that the cost of delimiting ownership rights over data is simply too high, and that alternatives to ownership-based protection should be considered.
Judicial practice in China
Under China’s current judicial practice, similar cases are often placed in the analytical framework of the Anti-Unfair Competition Law, which, to some extent, avoids the issue of public data ownership. Courts generally recognise a company’s claim that data obtained during its operation are valuable commercial assets and competitive resources, and therefore should be protected by law.
On this basis, the analysis under the general provisions of the Anti-Unfair Competition Law focuses on evaluating the legitimacy of data scraping and use, considering factors such as: the ease with which such data were obtained in the first place; whether the data were processed; whether companies applied crawler protocols or technical means to restrict third-party data scraping; and whether scrapers were reaping unearned profits by selling homogeneous products. Courts have pursued this angle in both Dianping v Baidu (2015) and the more recent Douyin v Shuabao (2019), which deals with content scraping.
However, the “unfair competition” approach often highlights the rights of business operators while neglecting the role of user consent in data access and circulation, authorised or otherwise.
Users are the owners of the original data, and the fact that they share their information online does not mean they also accept their data being collected and used by third parties for any purpose. Users’ wishes over the whereabouts and use of their own data ought to trump the commercial interests of platforms.
In the case of Weibo v Maimai (2015), the court established the principle of “authorisation by user, platform and user again” for third-party access to personal information via an open application programming interface (API). When it comes to non-personal information, however, clarifying and balancing user wishes and platform interests can be a laborious task. For instance, if a user authorises a third-party platform to access their data, is the platform obligated to facilitate the data transfer? Or, if the platform insists on denying access to or use of such data, will the scraper be absolved of liability on the basis of user consent? The case of Weibo v Toutiao (2017) touched on this issue.
The hiQ case further raises anti-monopoly questions. If a platform controls data essential to the business model of its competitors, and that business model and the resulting differentiated products could benefit the public, does prohibiting others from accessing such open data potentially lead to restricted competition or even data monopoly?
The complexity of data competition cases stems from the multi-layered debates over the types, forms of display and use of data. It is also a tricky balancing act, with business interests on one side, and user choice, open exchange and sharing of data and data security on the other. Data accumulated by platforms and their hard-earned resource advantages merit protection, but we cannot afford to be overprotective and overlook the interests of consumers, operators and the public, lest it lead to unwanted data barriers. As it stands, legislators and courts have yet to settle on any definitive framework or guidelines for corporations to fine-tune their data compliance accordingly.
Wang Yaxi is a partner and Wu Yue is an associate at Yuanhe Partners