Meta’s copyright infringements a 'blatant disregard' for authors’ rights
An article published in The Atlantic relating to Meta’s use of the Library Genesis data set has revealed unlawful scraping of copyright-protected material.
The NUJ has expressed its dismay and condemnation following the publication of court documents revealing that Meta trained its AI model, Llama 3, using material in Library Genesis (LibGen) that contains authors’ copyright-protected works. Authors of books in the library, including Sarah Silverman and Junot Díaz, have brought a copyright-infringement lawsuit against Meta.
An investigation by The Atlantic references discussions in which the company's employees considered whether to use licensed material but instead opted to train the model unlawfully, allegedly after gaining consent from Mark Zuckerberg, Meta CEO. One employee is quoted as stating that a licensing deal is “unreasonably expensive”, and a senior manager as noting that licensing is a slow process.
It is unclear which parts of LibGen were used by the technology company to train Llama 3. It is also believed that OpenAI has previously used the library, although the company states: “The models powering ChatGPT and our API today were not developed using these datasets. These datasets, created by former employees who are no longer with OpenAI, were last used in 2021.”
The NUJ submitted a response to the government's Copyright and Artificial Intelligence consultation last month. The consultation received more than 11,000 responses, including submissions from across the creative industries.
Laura Davison, NUJ general secretary, said:
“Meta’s copyright infringements are a blatant disregard for authors’ rights and will continue at scale unless tech giants face consequences for their actions, through enforceable legislative and regulatory standards. We need greater enforcement of copyright law to ensure the works of authors, journalists and freelance creators are protected.”