Wikipedia’s lessons about collaboration

Collaboration is a key component of human activity, in fields as diverse as scientific inquiry, news reporting, the arts, and government. When we work together effectively, we can accomplish big things. We should always seek to improve our understanding of what conditions support effective collaboration. Wikipedia, I believe, holds many of the answers — not in the content of its encyclopedic articles, but in the story of its genesis and growth.

Wikipedia has supported an unprecedented level of collaborative activity in its first two decades. What conditions have permitted the site to become, and remain, such an integral part of our information landscape? That is the question I explore in an essay, “Trusting Everybody to Work Together,” for a forthcoming book celebrating Wikipedia’s 20th anniversary. I review some of the early discourse that drove Wikipedia’s software design. I propose that the proper mix of eight specific, mutually supporting software capabilities has played a significant role, and I argue that closer consideration of these software capabilities should inform future software design, both in the Wikipedia world and beyond.

The Signpost just published the essay. Please take a look.

Posted in core, governance, history, leadership, wiki, Wikimedia Foundation, Wikipedia | Leave a comment

Merry Christmas! Simple things anyone can do for improved digital hygiene

The holiday season is a great time to talk to loved ones about how we can all improve the ways we use technology and the Internet.

There are many areas of concern around protecting your interests and information online. It gets worse every year; but there are also ever-expanding ways to control what digital fingerprints we leave in various places as we live our lives. (I don’t get into all those details here. For background, I suggest the December 20, 2018 episode of Preet Bharara’s podcast, “Stay Tuned,” in which he interviewed tech journalist Kara Swisher.)

Here are a few things you can share with family and friends to help them increase their online privacy, agency, and safety in an increasingly complex and dangerous information landscape. Continue reading

Posted in Beginner how-to, Free licenses, How-to, journalism, Statements of Ethics, Terms of Use, wiki, Wikipedia | Leave a comment

WikiCite conference: Wind in the “Newspapers on Wikipedia” sails

The WikiCite 2018 conference in Berkeley, California was an exciting meeting of the minds. There were a number of good developments for the Newspapers on Wikipedia (NOW) campaign. Here, I’ll recap those that stood out to me, as well as a few points that are unrelated to NOW. (Most of the talk videos linked below are very short, 1-3 minutes.)

This was the third annual WikiCite conference. WikiCite is an initiative to ensure that citation data (broadly defined, including publications, articles, authors, publishing houses, etc.) is well represented as open data on the web. (See also my recent post on Wikimedia Executive Director Katherine Maher’s keynote talk.) WikiCite has a great deal of overlap with NOW: though the primary focus of NOW has been prose on Wikipedia, we have been improving Wikidata in parallel, and we can see the increasing importance of structured data to our project’s broad goal of making information about newspapers more accessible. Continue reading

Posted in edit-a-thon, events, journalism, wiki, Wikidata, Wikipedia, Wikipedia and education | Leave a comment

Katherine Maher on Wikimedia’s evolving strategic priorities: Reflections on her WikiCite 2018 talk

Katherine Maher. Portrait by Gerald Shields, licensed CC BY-SA 3.0.

I relished the opportunity to hear Katherine Maher, executive director of the Wikimedia Foundation, speak about strategic priorities at the November 2018 WikiCite conference. The organization has had its ups and downs in strategic planning over the years. This was my first glimpse of the products of recent strategic planning efforts, and an opportunity to learn how the organization’s thinking and approach are evolving. Some of it was encouraging; some, less so.

The foundation hosts and supports Wikipedia, one of the world’s top websites, and up-and-comers like Wikidata and Wikimedia Commons. Hundreds of thousands of volunteers, however, build and own the sites’ contents, and have no formal affiliation with the organization. The foundation’s annual budget has grown tenfold over ten years; it now approaches $100 million. But there are things money can’t buy. The organization has often struggled to maintain strategic and tactical alignment with the values that drive its volunteer contributors; and those challenges have frequently come to a head around partnerships with other organizations, and around software development. Maher addressed both areas in her talk.

WikiCite, a conference that attracts both Wikimedia experts and library and data professionals, and that focuses heavily on technology and planning for the future, was a natural venue for Maher to debut a talk like this. Below, I’ll recap the major themes, interspersed with my own reflections.

Thanks to Andrew Lih, a video of the full talk, entitled “Essential Infrastructure of the Ecosystem of Free Knowledge,” and the Q&A session is available under a free license on YouTube. The commentary below contains time markers throughout.

Continue reading

Posted in core, events, governance, leadership, Statements of Ethics, systemic bias, wiki, Wikidata, Wikimedia Foundation, Wikipedia | Leave a comment

Proposal for funding from the AI Ethics Challenge

I submitted the following project proposal for the Artificial Intelligence Ethics Challenge:

Unpacking Wikipedia’s Lessons for Journalism

In its 18 years, Wikipedia has established itself as a titan of the Internet, built on a model utterly different from other top websites; this project will help journalists understand it better, and empower them to better access its data, and to draw effective contrasts with other top websites. Continue reading

Posted in Uncategorized | Leave a comment

A proven innovation could benefit Facebook’s users—and its shareholders, too.

Concern about social media and the quality of news is running high, with many commentators focusing on bias and factual accuracy (often summarized as “fake news”). If efforts to regulate sites like Facebook are successful, they could affect the bottom line; so it would behoove Facebook to regulate itself, if possible, in any way that might stave off external action.

Facebook has tried many things, but it has ignored something obvious. It’s something that peer-reviewed studies have identified as a promising approach since at least 2004…the same year Facebook was founded.

Instead of making itself the sole moderator of problematic posts and content, Facebook should offer its billions of users a role in content moderation. This could substantially reduce the load on Facebook staff, and could allow its community to take care of itself more effectively, improving the user experience with far less need for editorial oversight. Slashdot, once a massively popular site, proved prior to Facebook’s launch that distributing comment moderation among the site’s users could be an effective strategy, with substantial benefits to both end users and site operators. Facebook would do well to allocate a tiny fraction of its fortune to designing a distributed comment moderation system of its own. Continue reading

Posted in core, governance, history, journalism, leadership, User experience, wiki, Wikimedia Foundation, Wikipedia | Leave a comment

“Open” everything, and minimal financial needs: Wikipedia’s strengths

What insulates Wikipedia from the criticisms other massive platforms endure? We explored some answers—core values, lack of personalization algorithms, and lack of data collection—in last week’s “How Wikipedia Dodged Public Outcry Plaguing Social Media Platforms.”

But wait, there’s more:

Wikipedia moderation is conducted in the open.

“The biggest platforms use automated technology to block or remove huge quantities of material and employ thousands of human moderators.” So says Mark Bunting in his July 2018 report “Keeping Consumers Safe Online: Legislating for platform accountability for online content.” Bunting makes an excellent point, but he might have added a caveat: “The biggest platforms, like Facebook, Twitter, and YouTube, but not Wikipedia.” Continue reading

Posted in core, governance, history, journalism, leadership, Statements of Ethics, wiki, Wikimedia Foundation, Wikipedia | Leave a comment

How Wikipedia dodged public outcry plaguing social media platforms

Everybody has an opinion about how to govern social media platforms. It’s mostly because they’ve shown they’re not too good at governing themselves. We see headlines about which famous trolls are banned from what sites. Tech company executives are getting called before Congress, and the topic of how to regulate social media is getting play all over the news.

Wikipedia has problematic users and its share of controversies, but as web platforms have taken center stage in recent months, Wikipedia hasn’t been drawn into the fray. Why aren’t we hearing more about the site’s governance model, or its approach to harassment and bullying? Why isn’t there a clamor for Wikipedia to ease up on data collection? At the core, Wikipedia’s design and governance are rooted in carefully articulated values and policies, which underlie all decisions. Two specific aspects of Wikipedia inoculate it from some of the sharpest critiques endured by other platforms.

Wikipedia exists to battle fake news. That’s the whole point.

Wikipedia’s fundamental purpose is to present facts, verified by respected sources. That’s different from social media platforms, which have a more complex project…they need to maximize engagement, and get people to give up personal information and spend money with advertisers. Wikipedia’s core purpose involves battling things like propaganda and “fake news.” Other platforms are finding they need to retrofit their products to address misinformation; but battling fake news has been a central principle of Wikipedia since the early days. Continue reading

Posted in core, governance, journalism, Statements of Ethics, Uncategorized, wiki, Wikipedia | Leave a comment

A peek behind the scenes in the $289M Monsanto verdict

Both sides in a similar Monsanto lawsuit tried to bend Wikipedia to their will

Image by Tori Rector, licensed CC BY-SA

On August 10, a jury hit Monsanto with a $289 million verdict, the latest in a string of lawsuits linking the agrochemical giant’s products and byproducts to cancer.

You may have heard that much; but you probably don’t know that, in the buildup to this series of lawsuits, representatives of both the plaintiffs and the defendant worked to change Wikipedia content to favor their respective positions. The Wikipedia articles they worked on, non-Hodgkin lymphoma and polychlorinated biphenyl (PCB), are viewed upwards of 75,000 times a month.

By the time a communications firm representing plaintiffs suing Monsanto sought advice from my business, Wiki Strategies, in 2014, they were entangled in a slow-moving “edit war” on Wikipedia. On the opposing side of the edit war: a Wikipedia user who asserted he was the social media team lead for Monsanto.

The central issue in that case was whether or not PCBs cause cancer. It’s exactly the kind of thing Wikipedia editors want to get right; people often turn to Wikipedia for scientific information, and despite Wikipedia editors encouraging readers to go to the sources before forming strong opinions, we know that they sometimes take Wikipedia’s word at face value. After advising our prospective client that we could not guarantee a favorable result, and that our goal must be to ensure that Wikipedia’s content adhered better to the scientific consensus on the matter (regardless of how well that matched the plaintiffs’ position), we gladly took on this new client, confident that the project would have a positive impact on Wikipedia. Continue reading

Posted in Beginner how-to, conflict of interest, core, governance, government, How-to, paid editing, wiki, Wikipedia | Leave a comment

“According to Wikipedia…”


Folksinger Odetta holding a guitar and singing

Folksinger Odetta performed “This Little Light…” on David Letterman’s first show after 9/11. Photographed by Jac. de Nijs / Anefo in 1961; public domain.

This week, I heard a wonderful news story about the song “This Little Light of Mine.” It was a thoughtful, in-depth exploration of a beloved piece of Americana—and exactly the kind of news story any Wikipedia contributor (myself, for instance) would be tempted to include as a citation in the relevant Wikipedia article.

Unfortunately, one small detail shakes my confidence in the reporting, and raises a general concern about the way even the most reputable news sources, like NPR’s “All Things Considered,” treat the sources they use.

About halfway through the story, the reporter delves into the origin of the song, stating:

“Wikipedia and some books credit Harry Dixon Loes.”

The problem: Wikipedia, whose production model differs significantly from traditional publishers, is not a distinct entity, and as such cannot itself issue “credit” or make claims.

Diagram showing arrows pointing in both directions between "Wikipedia" and "Press"

Graphic by Niabot/Wikimedia Foundation, licensed CC BY-SA.

While the content on Wikipedia may offer great value on the whole, the site can’t be relied on to verify any specific point. You may have heard, for instance, of the John Seigenthaler hoax, in which a Wikipedia contributor falsely linked Seigenthaler, a journalist and former aide to Attorney General Robert F. Kennedy, to the assassinations of JFK and RFK. Wikipedia contributors strive to correct false or questionable information, and in many cases do an admirable job; but as the Seigenthaler case clearly shows, the process can fail spectacularly. As a core principle, Wikipedia’s designers have always insisted that “anyone can edit,” without first establishing credentials, or even proving their worthy intentions. Many of the contributions that come in are worthwhile. Over time, problematic additions are caught and corrected by other Wikipedia contributors; but by design, the site lacks any mechanism to fully guarantee the accuracy of any specific claim.

It is essential to regard Wikipedia as a platform, enabling individuals to make claims and back them up with authoritative citations, rather than as a publisher using robust editorial processes and offering its claims as reliable facts. Crediting an individual with adding a claim to Wikipedia is fine (though digging up the proper attribution may take some effort); using Wikipedia as a guide to find more reliable sources, and then citing those sources directly, is even better. But crediting Wikipedia itself is problematic, and is something any journalist should avoid, unless mentioning it as one link in a chain leading to a more authoritative source. At best, crediting Wikipedia imparts information with little value to the listener; at worst, it gives the listener undue confidence in a piece of original research that may or may not be true, and suggests to the listener that NPR erroneously regards Wikipedia as an authoritative source for such information.

Wikipedia’s editors offer guidance, in the form of an essay, about citing Wikipedia. Its first words:

“We advise special caution when using Wikipedia as a source for research projects.”

(Even this page carries a banner warning the reader not to take its contents too seriously as an expression of Wikipedia policy; but the point it makes should resonate at first glance with anyone who has carefully considered Wikipedia’s production model.)

Shining a Light on “This Little Light of Mine”

As it turns out, the question of who wrote the spiritual anthem “This Little Light of Mine” provides an excellent example to expose how Wikipedia works, and why it should not be treated as an authority on individual facts.

Cartoon depicting a Wikipedia writer and a news writer citing each other's work

“Citogenesis” by Randall Munroe/xkcd. Licensed CC BY-NC.

Harry Dixon Loes’ name was first added to the Wikipedia article about the song a decade ago. The person who made the addition used the name “SingingSongsOfSunshine”—and that’s about all that anyone could tell you about their identity. (The most trusted Wikipedia administrators could determine the user’s IP address; but that wouldn’t tell you much, and it’s considered highly privileged information, to be used only in combating extreme cases of vandalism or harassment.) This Wikipedia username was only ever used to make three edits to Wikipedia, all on the same day in 2008. All three were minor changes to articles about gospel songs; none included any suggestion about the source for the information added. The person did not voluntarily disclose anything about their identity, nor did they engage in any kind of discussion under that account.

Two years later, somebody else asked on the article’s talk page, “Where is the proof of the original author?” This person didn’t even bother to use a Wikipedia account. Nobody ever responded. In 2012, another Wikipedia contributor placed a banner at the top of the article, indicating that the article had insufficient citations. That banner remains to this day. This person, known as “VernoWhitney,” unlike the others mentioned in this piece, is a dedicated Wikipedia contributor, having made several hundred edits to the site since first registering in 2010.

In February 2018, another user added a citation to justify Mr. Loes’ authorship. The citation added is to an article on ThoughtCo; the article was apparently written in 2017, but lists no sources. Did the ThoughtCo article use the Wikipedia article as its source? I had never heard of ThoughtCo before, but at first glance, I’d say it looks like a “content farm,” the kind of site a Wikipedia contributor is expected to view with great skepticism. It seems entirely plausible that the ThoughtCo author would have used Wikipedia as a source, introducing a problem referred to by xkcd author Randall Munroe as “citogenesis.” Like the earlier Wikipedia user, this individual, going by “Ddallender,” only ever made three tiny edits to Wikipedia, and never disclosed any information about themselves. As with the initial addition in 2008, neither the public nor any Wikipedia administrator has any way to know who this was.

While Wikipedia’s reliability is worthy of healthy skepticism, the site does excel in other areas, many of which set it apart from traditional publishers. Journalists, and critical readers in general, appreciate many of Wikipedia’s features. The research required for this piece, for instance, is enabled by Wikipedia’s radically transparent processes. I did not need any special access to the site, or to its authors, to identify which accounts made what changes, or when. Merely knowing which buttons to click will yield a wealth of information about any Wikipedia article.

On the whole, “All Things Considered” is the kind of journalistic source Wikipedia contributors love to use in composing Wikipedia content. But Wikipedia should be citing “All Things Considered,” not the other way around. I hope that “All Things Considered” will continue to use Wikipedia in its research; but in so doing, it must apply considerable judgment before reporting any of Wikipedia’s contents. Academic studies have found that Wikipedia’s content is often excellent; but part of that excellence derives from the transparency we offer in terms of our sources. We hope to guide our readers to authoritative sources, and enable them to make their own judgment about contentious points, like the authorship of a classic entry in the American songbook. If Wikipedia’s word is reported as final, the public may be misinformed, and “All Things Considered” may engender doubts about its reporting practices among those listeners who are familiar with Wikipedia’s methods.

For more on the connections between Wikipedia and journalism, see “The Future of Journalism in a Wikipedia World.” For a current initiative to improve Wikipedia’s coverage of news outlets, see Newspapers on Wikipedia.


Posted in core, journalism, wiki, Wikipedia, Wikipedia and education, Wikipedian in Residence | Leave a comment