US publishers tell Common Crawl to stop scraping and delete archive
Posted by thm 1 day ago
Comments
Comment by Stagnant 1 day ago
Comment by conartist6 19 hours ago
Comment by Grimblewald 1 day ago
Comment by conartist6 1 day ago
Comment by mindcrime 1 day ago
The publishers need to rethink their entire take on how the Internet works or any "victory" they earn is going to be extremely Pyrrhic.
Comment by toomuchtodo 1 day ago
Comment by khelavastr 1 day ago
It's absurd to say "you can't read this book to a friend or a robot".
Nobody seems to actually reproduce the copyrighted materials.
The high-dimensional eigendecompositions that underpin AI similarity are among the most literally derivative products of a text that you can imagine.
Comment by conartist6 1 day ago
(my point being that it would be different if the product Common Crawl provides were trained models, but that is not the case: its product is unlawful reproductions of copyrighted data for commercial use)
Comment by joshuaissac 1 day ago
Common Crawl is not a business and is not selling anything.
Comment by conartist6 1 day ago