Data Scraping for the Training of Generative AI: Lessons from Chinese Case Law and Regulation

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

The collection of data from websites at great scale, so-called data scraping, is the foundation for ChatGPT and most other Generative AI (GenAI) tools. Much of the previous discussion on the regulation of GenAI has focused on the US and EU and not so much on more technical aspects like data scraping. In response, this article focuses on the regulation of data scraping to build and deploy GenAI in China, and reviews applicable regulation and case law. We find that the sectoral approach to AI regulation in China provides important insights into balancing technological progress and societal values, diverging from the laissez-faire attitude in the US and the horizontal approach with the AI Act in the EU.
Original language: English
Pages (from-to): 33-41
Journal: Computer Law Review International
Volume: 25
Issue number: 2
Publication status: Published - 15 Apr 2024
