The recent unveiling of OpenAI's GPTBot web crawler has sparked intrigue and debate about the expanding capabilities of ChatGPT. As website publishers, however, we have grave concerns about how our content may be scraped and reused to train these AI systems without our consent.
Our websites rely on advertising revenue to stay afloat. But if ChatGPT can use GPTBot to scrape and synthesize the information from our sites, why would anyone need to visit and view our actual content? Even if ChatGPT only provides summarized excerpts, this undermines the economic model that allows for quality journalism and content creation.
We do not object to all AI research and development. But the ends don't always justify the means. As more advanced AI comes online, tech companies need ethical frameworks to ensure they don't exploit legal loopholes at the expense of content creators' livelihoods.
Scraper bots like GPTBot may identify themselves, but this is not meaningful consent. There is too much pressure to allow access or risk being left out of AI training data. And even if we did opt out, our content could still be summarized and regurgitated without attribution.
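For publishers who do wish to opt out, OpenAI has documented that GPTBot respects the robots.txt convention. A minimal exclusion, assuming the site serves a robots.txt file at its root, looks like this:

```
# Block OpenAI's GPTBot crawler from the entire site
User-agent: GPTBot
Disallow: /
```

Of course, as argued above, this only prevents future crawling by bots that choose to honor the file; it does nothing about content already collected, nor about scrapers that ignore robots.txt.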
We implore lawmakers to update privacy and IP protections for the AI age. Content scraping for language models should require explicit opt-in agreements, with restrictions on how excerpted material can be displayed. Publishers should be compensated for use of copyrighted material.
The public is rightfully wary of AI’s risks. But for website publishers like us, AI also poses very real economic threats even in its well-intentioned applications. Our voices need to be part of this discussion before it’s too late.