
User-generated content is at a turning point. With generative AI models cranking out tons of slop, content repositories are being polluted with low-quality, often useless material. No website is more vulnerable than Wikipedia, the open-source reference site populated entirely with articles created (and revised) by users. How Wikipedia is handling the issue — in light of its strict governance policies — is worth watching, especially for organizations that also rely on user-generated content.
Links from this episode:
The next monthly, long-form episode of FIR will drop on Monday, August 25.
We host a Communicators Zoom Chat most Thursdays at 1 p.m. ET. To obtain the credentials needed to participate, contact Shel or Neville directly, request them in our Facebook group, or email [email protected].
Special thanks to Jay Moonah for the opening and closing music.
You can find the stories from which Shel’s FIR content is selected at Shel’s Link Blog. Shel has started a metaverse-focused Flipboard magazine. You can catch up with both co-hosts on Neville’s blog and Shel’s blog.
Disclaimer: The opinions expressed in this podcast are Shel’s and Neville’s and do not reflect the views of their employers and/or clients.
Raw Transcript:
Shel Holtz (00:00)
@nevillehobson (00:08)
to churn out articles that look convincing on the surface but are riddled with fabricated citations, clumsy phrasing, or even remnants of chatbot prompts like "as a large language model." Until now, the process of removing bad articles from Wikipedia has relied on long discussions within the volunteer editor community to build consensus, sometimes lasting weeks or more. That pace is no match for the volume of junk AI can generate.
So Wikipedia has now introduced a new defense, a speedy deletion policy that lets administrators immediately remove articles if they clearly bear the hallmarks of AI generation and contain bogus references. It’s a pragmatic fix, they say, not perfect, but enough to stem the tide and signal that unreviewed AI content has no place in an encyclopedia built on human verification and trust. This development is more than just an internal housekeeping matter.
It highlights the broader challenge of how open platforms can adapt to the scale and speed of generative AI without losing their integrity. And it comes at a moment when Wikipedia is under pressure on another front: regulation. Just this month, it lost a legal challenge to the UK's Online Safety Act, a ruling that raises concerns about whether its volunteer editors could be exposed to identity checks or new liabilities. The court left some doors open for future challenges, but the signal is clear:
the rights and responsibilities of platforms like Wikipedia are being redrawn in real time. Put together, these two stories, the fight against AI slop and the battle with regulators, show us that even the most resilient online communities are entering a period of profound change. And that makes Wikipedia a fascinating case study for what lies ahead for all digital knowledge platforms. For communicators, these developments at Wikipedia matter deeply. They touch on questions of credibility,
how we can trust the information we rely on and share, and on the growing role of regulation in shaping how online platforms operate. And there are other implications too, from reputation risks when organizations are misrepresented, to the lessons in governance that communicators can draw from how Wikipedia responds. So, Shel, there's a lot here for communicators to grapple with. What do you see as the most pressing right now?
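To make the speedy deletion criteria described above concrete: a screening pass could look for leftover chatbot phrases and citations that don't resolve, then flag matching drafts for a human administrator. The sketch below is a hypothetical illustration in Python; the phrase list, the URL check, and the 0.5 threshold are assumptions for the sketch, not Wikipedia's actual tooling.

```python
import re

import requests  # used only to test whether cited URLs resolve

# Phrases that commonly survive unedited copy-paste from a chatbot.
# This list is illustrative, not Wikipedia's actual criteria.
LLM_LEFTOVERS = [
    "as a large language model",
    "as an ai language model",
    "i cannot browse the internet",
    "my knowledge cutoff",
]


def has_llm_leftovers(text: str) -> bool:
    """True if the article text contains tell-tale chatbot phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in LLM_LEFTOVERS)


def broken_citation_ratio(text: str, timeout: float = 5.0) -> float:
    """Fraction of cited URLs that fail to resolve, a rough proxy for bogus references."""
    urls = re.findall(r"https?://[^\s\]<>}|]+", text)
    if not urls:
        return 0.0
    failures = 0
    for url in urls:
        try:
            resp = requests.head(url, timeout=timeout, allow_redirects=True)
            if resp.status_code >= 400:
                failures += 1
        except requests.RequestException:
            failures += 1
    return failures / len(urls)


def flag_for_speedy_review(text: str) -> bool:
    """Flag an article for human review; a person, not this script, decides on deletion."""
    return has_llm_leftovers(text) or broken_citation_ratio(text) > 0.5
```

The key design point is that a script like this only flags; the removal decision stays with a human administrator, which is consistent with how the policy is described in the episode.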
Shel Holtz (02:52)
We have talked in past episodes about the fact that more obscure articles can have inaccuracies that sit for a long time because nobody reads them, especially not people who would have the right information and could correct them. But by and large, it is a self-correcting mechanism based on community, which is great. It does seem that the shoe is on the other foot here, because when Wikipedia first launched,
I’m sure you’ll recall that schools and businesses banned it. You can’t use this, you can’t trust it. It’s written by regular people and not professional encyclopedia authors. Therefore, you’re going to be marked down if you use Wikipedia, it’s banned. And they fought that fight for a long time and finally became a recognized authoritative site. And here they are now banning something new.
that we're still trying to grapple with. We do need to grapple with it. The AI slop issue is certainly an issue. I worry that they're going to pick up false positives here. Some of the hallmarks of AI writing are also hallmarks of human writing. I mean, if I hear one more person say an em dash is absolutely a sign that it was written by AI, I'm gonna throw my computer out the window.
I’ve been using dashes my entire career. I was using dashes back when I was doing part-time typesetting to make extra money when I was in college. And dashes are routine. There is nothing about them that makes them a hallmark of AI. That is ridiculous. But we are going to see some legit articles tossed out with the slop. The other thing is some of the slop may have promise. It may be
the kernel of a good article. And this is a community platform, so wouldn't people be able to go in and say, wow, this is really badly written, yeah, AI may have done this, but there's not an article on this topic yet, and I have expertise, so I'm gonna go in and start to clean it up? It's a conundrum. What are you gonna do at this point? We haven't had the time to develop the kinds of solutions to this issue that might take root.
And yet the volume of AI slop is huge. The number of people producing it is equally large. And you have to do something. So I think it's trial and error at this point to see what works. And there will be some negative fallout from some of these actions. But you've got to try stuff, take it to the next level, and try the next thing.
@nevillehobson (05:52)
I think there's a really big issue generally that this highlights, and part of it is based on my own experience of editing Wikipedia articles in a couple of cases for an organization, working with people like Beutler Ink. Bill Beutler has been an interview guest on this podcast a few times. And that issue is
the speed of things. The one memorable thing that stays in my mind about using Wikipedia, or trying to progress a change or an addition, is the humongous length of time it takes with the volunteer editor community. The defense typically is, well, they're volunteers, they're not full-time, they're not employees, they're not dedicated; you've got to be patient, they're doing it of their own free will to help things. I get all that, I'm a volunteer myself in many other areas, but…
that's great. But as they themselves are saying, things are moving at light speed with AI slop generation. You can't afford a process where the person editing asks the community, is this good? Are you okay with this? What else? And three to four weeks go by before you get a reply. And often you don't; you have to nudge and so forth. That isn't going to work today. So it needs something better. They have this
really interesting-looking project called AI Cleanup, which is well defined on Wikipedia. They're also developing a non-AI-powered tool called Edit Check that's geared towards helping new contributors fall in line with the policies. So part of the problem, a lot of the time, I think, is the elaborate policies and procedures you've got to follow.
It's not user-friendly for people who don't know all this. And they do have a history of not welcoming newbies readily. So all that's in the background. But this is quite interesting: Edit Check, geared towards helping new contributors fall in line with the policies and writing guidelines. They're also working on adding a paste check to the tool, which will ask users who've pasted a large chunk of text into an article whether they've actually written it. So it's kind of helping that kind of focus.
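As a rough illustration of the paste check idea, here is a minimal sketch of the logic: if a contributor pastes a block of text above some size threshold, prompt them to confirm they wrote it and can source it. The threshold and the prompt wording are assumptions for the sketch; the real Edit Check feature is built into Wikipedia's editing interface and may work quite differently.

```python
# Hypothetical sketch of a paste check; not the actual Edit Check implementation.
PASTE_CHAR_THRESHOLD = 800  # assumed cutoff for "a large chunk of text"


def needs_authorship_prompt(pasted_text: str) -> bool:
    """Decide whether to ask the contributor if they actually wrote the pasted text."""
    return len(pasted_text) >= PASTE_CHAR_THRESHOLD


def on_paste(pasted_text: str) -> None:
    """In a real editor this would trigger a UI dialog rather than a print statement."""
    if needs_authorship_prompt(pasted_text):
        print(
            "You pasted a large block of text. Did you write it yourself, "
            "and can you cite reliable sources for it?"
        )
```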
I think I get what you say, and I don't disagree, by the way, on the discovery of things, that there might be something good here. I get all that, and I hope that continues. But this is urgent; this really does require attention. And I think one of the points in the why-this-matters-to-communicators section is
the big one, I think: reputation risk. I mean, some of the research I did, and this is going back a couple of years now when I was working on a particular project, showed that when you, let's say as a communicator, think about something related to your employer, your client, or some work you're doing about an organization, the first place you will typically go to is Wikipedia, or the first place that shows up in the search results, at least in that old traditional way of searching that we've now moved past, but as it was back then.
You get your, you know, above-the-fold screen full of results on Google. And of the three results you'd go to, ideally the organization's website would be first and foremost, maybe someone else talking about the organization second, and the third result is going to be the Wikipedia entry. And then you have a little box on the right-hand side which summarizes everything, and that's taken from the Wikipedia entry.
So if you have not updated your Wikipedia entry, or it is inaccurate, that's what's going to show up there. So getting this right is good. But unfortunately, that won't work in the day of AI slop, because things change so fast. And just wait till agentic AI gets on the case and you've got all these bots creating content as well. I think the point about dashes and so forth, that isn't going to stop anytime soon, I don't think. And I believe that
presents a big challenge to Wikipedia, where you have human verifiers checking things: this article has got 15 dashes, hey, come on, that's got to be AI-generated. All that kind of thing, they still have to figure out how to do. I think your point that they've got to do something about this, absolutely. And this is probably the one thing they're doing, but there's more they still need to do, and that
is likely to be quite a challenge because of the speed with which this is evolving. So I think it comes down to, I suppose, you, the information consumer, the user of the stuff you're finding: you absolutely need to do your own due diligence, more than ever before. Don't just believe Wikipedia because, hey, it's Wikipedia. It's a community-generated site, and I hear all the stories, I'm sure you have too, that that's the reason why it's no good,
because it's community-generated. I don't buy that argument. I've rarely had any issues. And one thing I do use a lot myself as part of the verification is the talk pages and the edit history pages. You get a feel for, you know, what's been happening here and so forth. Plus, there are services. Beutler has one. There's another one, WikiAlerts, owned by that guy we interviewed in Israel,
that will notify you whenever a page you're paying attention to has changes, and tells you what the changes are. And Wikipedia itself has some pretty good analytics and alerts now, and so forth. So communicators can help with this as well, reviewing pages and content they know about to make sure they don't have any issues they should be concerned about. But that's the regular climate.
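For communicators who want this kind of change monitoring without a commercial service, the public MediaWiki API exposes revision history directly. Below is a minimal sketch that pulls the latest revisions of a page you care about; the page title is a placeholder, and services like WikiAlerts layer alerting and analysis on top of this sort of data.

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"


def recent_revisions(title: str, limit: int = 5) -> list:
    """Fetch the most recent revisions (timestamp, user, edit summary) for a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    resp = requests.get(API_URL, params=params, timeout=10)
    resp.raise_for_status()
    pages = resp.json()["query"]["pages"]
    return pages[0].get("revisions", [])


# Example: see who last touched a page you care about (title is a placeholder).
for rev in recent_revisions("Example Organization"):
    print(rev["timestamp"], rev["user"], rev.get("comment", ""))
```

Logged-in Wikipedia users can get much the same effect with a watchlist; the API route is simply easier to automate.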
Then you have AI literacy. Communicators need to know how, or need to know how to get help, in recognizing AI-generated text. There isn't a single guide; there are lots of people with opinions out there. Your own common sense will often help you. Pitfalls like fabricated citations: how do you really check those? Responsible use in professional contexts: this brings that to the forefront again, like it was originally when, as you mentioned,
organizations, schools, academia banned the use of all this back in the 2000s. Now we're in the 2020s and this is becoming more urgent, it seems to…
Shel Holtz (12:05)
@nevillehobson (12:21)
Shel Holtz (12:32)
The other issue that I think is going to be interesting is that the quality of the output is only going to improve. And while you can tell bad writing from a bad prompt right now, first of all, you're going to have more people learn how to prompt well, which will make it harder to identify that it was done by AI, especially if somebody takes five minutes to go through it, edit it, and make a few revisions. And the AI is just going to crank out better stuff.
@nevillehobson (13:09)
Shel Holtz (13:25)
@nevillehobson (13:46)
Shel Holtz (13:54)
@nevillehobson (14:14)
The famous example I've heard so many times, probably from the very early 2000s, is the British author who asked Wikipedia to correct his entry because his date of birth was wrong. And they said, sorry, you're not a neutral point of view, we can't do that. And they refused to make the change. That struck a lot of people as utterly absurd. But if you actually read the policies on this, it's not absurd at all.
Shel Holtz (15:03)
@nevillehobson (15:03)
is find a neutral-point-of-view source about him or her, tell Wikipedia, and provide the source proving it, which could be, here's a copy of the birth certificate from Somerset House or whatever it was at the time. So it's clear that, and again in the context of communicators, when the first round of guides for PR people came from the CIPR back in 2012,
the need for this was apparent very, very quickly: to educate PR folks particularly, who did not grasp that concept of neutrality and neutral point of view. So they'd have to change a lot if agentic AI got into the mix there. I think the more pressing realization, perhaps, is not agentic AI as an ally of yours, not at all; it's a tool for the bad guys to create really questionable content. You'd never spot that.
And that's where you need another AI to pay attention. So all these things are probably in the mix there somewhere. The project they've got, the AI Cleanup project, as they're calling it, is editing advice for community members. It's a little light on the detail in places, and I haven't drilled into all the stuff on the menu you can go to on how to do this, but it's actually very well thought through,
where it goes into detail you wouldn't think of, you know: broken links, and how to fix those so they're not broken; finding something that does work properly; fixing common mistakes with free images in Wikimedia Commons, where there are copyright issues. So all that's in there too. But what strikes me at the end of it, Shel, is that there are smart people thinking about all of this, and it's great. But I'm not sure that
humans doing this alone are going to be able to keep up with the speed with which this is upon us. And that's where I think AI generally, whether it's agentic or not, I don't know, could be a huge aid: here's something that needs reviewing and checking fast, boom, put your AI on it, not a human. Although you've got to have the human as the kind of
ultimate arbiter of whether something is to be removed or not. So it's a challenge for the very methodology of Wikipedia, it seems to me, this reliance on community-generated content, because the bad guys, and I don't mean only people deliberately doing this, but the bad guys embracing bad agentic AI, are moving faster than you can think. And that's the bigger threat to Wikipedia, it seems to me.
Shel Holtz (17:39)
@nevillehobson (17:55)
Shel Holtz (18:07)
@nevillehobson (18:11)
The post FIR #477: Deslopifying Wikipedia appeared first on FIR Podcast Network.