Mastodon.radio people.
Should I add something to our terms saying that it is forbidden to use the content of this site for training large language models/GPTs/"AI"?
I suspect it won't make a difference to what happens, but not having it will guarantee our posts are scraped and abused, and I'm not (personally) OK with that.
@M0YNG legally it won't do anything, however you could update the robots.txt file as per OpenAI's docs https://platform.openai.com/docs/plugins/bot
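For anyone wanting to try the robots.txt route, here is a rough sketch of what the rules might look like, plus a quick check with Python's standard `urllib.robotparser` that they parse the way you'd expect. The user-agent tokens `GPTBot` and `ChatGPT-User` are assumptions here and should be checked against OpenAI's docs; the `is_allowed` helper and the example URL are hypothetical. This only affects crawlers that choose to honour robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block the (assumed) OpenAI crawler tokens, allow everyone else.
# Verify the exact user-agent names against the vendor's documentation.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://mastodon.radio/@M0YNG"  # illustrative URL only
    for agent in ("GPTBot", "ChatGPT-User", "Mozilla/5.0"):
        print(agent, "allowed" if is_allowed(agent, page) else "blocked")
```

Note that Mastodon generates its own robots.txt, so on a real instance the rules would likely need to be added at the reverse proxy or by overriding the served file; how to do that depends on the setup.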
@M0YNG some Mastodon instances say they delete posts older than 3 or 6 months.
@M0YNG I expect anything we publish publicly onto any web site is fair game for being scraped, whether we like it or not, even if we hold the copyright. The only alternative is to move behind closed doors.
If the remote end is polite enough to follow copyright and such config, it's fine, but I'd expect Meta and other AI data hoarders to ignore all of these policy notices and simply do the worst they can.
@M0YNG our instance got scraped for Google AI training. I'm asking myself the same question.
@M0YNG Even if it ends up not being very effective, it’s a good idea to publish an official policy.