The WikiCite 2018 conference in Berkeley, California was an exciting meeting of the minds. There were a number of good developments for the Newspapers on Wikipedia (NOW) campaign. Here, I’ll recap those that stood out to me, as well as a few points that are unrelated to NOW. (Most of the talk videos linked below are very short, 1-3 minutes.)
This was the third annual WikiCite conference. WikiCite is an initiative to ensure that citation data (broadly defined, including publications, articles, authors, publishing houses, etc.) is well represented as open data on the web. (See also my recent post on Wikimedia Executive Director Katherine Maher’s keynote talk.) WikiCite has a great deal of overlap with NOW: though the primary focus of NOW has been prose on Wikipedia, we have been improving Wikidata in parallel, and we can see the increasing importance of structured data to our project’s broad goal of making information about newspapers more accessible.
Newspapers on Wikipedia: A popular initiative
The WikiCite organizing committee encouraged NOW to engage with the conference, and when I arrived I could immediately see why. Many participants (librarians and Wikimedia enthusiasts, for the most part) were intrigued by what we are doing, and motivated to help out in a variety of ways.
I formally introduced NOW (3-minute video) on the second day of the 3-day conference, focusing on our choice to structure our approach as a “WikiProject,” what that means, and why it has been a good fit. Day 3 was a “hackathon” day; no fewer than four sessions (how gratifying!!) produced tangible accomplishments for NOW. These included:
- My own hackathon session (video intro; video report) had three groups working in parallel: (1) Don Elsborg, a librarian from Colorado, and Satdeep Gill, a longtime Wikimedian, jumped right in to start an article on an Oregon newspaper, the Stayton Mail, engaging with the fundamental work of our initiative. (2) Susanna Ånäs, a Finnish librarian, worked to import a database of Finnish newspapers to Wikidata, and we had an interesting discussion about a pair of U.S.-based Finnish-language newspapers I had recently discovered. (3) Stas Malyshev of the Wikimedia Foundation found that many newspapers’ Wikidata entries had no description, and worked on a scalable/automated method for filling in a basic description on each.
- Mahmoud Hashemi, Stephen LaPorte, Chunliang Lyu, and Sam Walton (video intro; video report) worked on two very cool things: (1) Demonstrating how to make a Wikidata-based citation in a Wikipedia newspaper article (see the Register-Guard article); and (2) Working on an enduring tool that will help campaigns like ours measure progress. (“Campaign” is a useful term, by the way: a campaign to achieve a specific goal in a specific time period is more specific than most WikiProjects, and can occur within one.) They named the project PaceTrack; there is a Google Doc and a GitHub repository. They’ve made much progress, and work is still underway.
- Simon Cobb (video report), a Welsh librarian, created an example of a newspaper infobox built entirely from information on Wikidata. See the Cambrian.
- Rob Fernandez (U.S.-based Wikimedian & librarian) demonstrated the Listeria tool, which can generate automatic lists for Wikipedia campaigns based on a Wikidata search. He created a list for Florida as an example.
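For readers curious what “filling in a basic description” at scale might look like, here is a minimal Python sketch. To be clear, this is my own illustration and not Stas’s actual method: the function names, the record layout (plain dictionaries with `id`, `description`, `place`, and `language` keys), and the wording of the generated descriptions are all assumptions, and the sketch does not talk to Wikidata at all.

```python
from typing import Dict, Iterable, Optional


def draft_description(place: Optional[str] = None,
                      language: Optional[str] = None) -> str:
    """Compose a short, generic description for a newspaper record.

    Hypothetical helper: the phrasing is an assumption, not a Wikidata rule.
    """
    parts = []
    if language:
        parts.append(f"{language}-language")
    parts.append("newspaper")
    description = " ".join(parts)
    if place:
        description += f" published in {place}"
    return description


def fill_missing(items: Iterable[dict]) -> Dict[str, str]:
    """Return {record id: drafted description} for records lacking one.

    Records that already have a description are left alone.
    """
    drafts = {}
    for item in items:
        if not item.get("description"):
            drafts[item["id"]] = draft_description(
                item.get("place"), item.get("language")
            )
    return drafts


if __name__ == "__main__":
    # Made-up sample records; the ids are placeholders, not real Wikidata ids.
    sample = [
        {"id": "example-1", "description": "", "place": "Oregon"},
        {"id": "example-2", "description": "weekly newspaper"},
    ]
    for record_id, text in fill_missing(sample).items():
        print(record_id, "->", text)
```

A real version would also need to read entries from Wikidata and write the results back, and a human would want to review the drafts before saving them; those steps are left out here.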
Some fruitful informal chats
- Mark Graham, Executive Director of the Wayback Machine, told me about their efforts to create archival copies of news items at scale. He also drew my attention to a substantial directory of black newspapers in the U.S., which I immediately used as a reference to expand a newspaper article, and he pointed out a couple of aligned projects to address trust in news media.
- Dan Brickley, founder of schema.org and a Google employee, suggested a number of aligned projects. In addition, he affirmed our general belief that Wikidata is in ascendance as an important source for search results and knowledge panels.
- Joshua Dockery pointed out the “Misinformation Alerts” site, in which humans fact-check algorithm-based misinformation spreading on the web.
- I had a chance to catch up with LiAnna Davis of the Wiki Education Foundation, and learn a bit about how a project like ours fits with their current priorities. One specific point of interest: they are working on their first piece of university curriculum centering on Wikidata. Lane Rasberry and Daniel Mietchen are working with Wiki Ed on this as well.
- While this is not directly “of WikiCite,” I had the chance to visit with Sage Ross, also of Wiki Ed, just before the conference; Sage has made time to guide me, Nicholas Boudreau, and Lane in learning the Python programming language. This started as an effort to build a tool to measure progress in NOW; it’s likely that the PaceTrack project described above will “outpace” us, but regardless, a better understanding of Python can only help in any effort to work closely with the PaceTrack team’s emerging project. Also during our visit, Sage and I dug into Wikidata, and had a really educational session that deepened my understanding of how the site works. Many thanks to Sage!
Relevant grant proposal advances
Coinciding with the WikiCite conference, our colleague Lane Rasberry learned that his proposal to the “Ethics and Governance of AI Initiative” had advanced to the second round, surviving a cut from 500+ applicants to 66. Lane’s proposal, through the Center for Data Ethics at the University of Virginia, is strongly aligned with NOW, and if successful may help us to forge ahead with a second round of our project. His colleague Daniel Mietchen, who has also substantially contributed to NOW by writing most of the code for our progress-tracking map, was an organizer of WikiCite. Lane, Daniel and I took the opportunity to work together in crafting the response to the Round 2 questions, and submitted the application shortly after the conference’s conclusion. It was great to have the chance to work together on this in person! Here is the application we submitted.
Getting the word out
- Konrad Förstner, Professor for Information Literacy at Cologne, interviewed me (Pete Forsyth) and Lane Rasberry about NOW for the Open Science Radio audio podcast. UPDATE 1/9/19: Here’s the link!
- Lane interviewed me about NOW as well, in a shorter video piece; it should be published by early 2019.
- Separate from WikiCite, but coinciding, was the successful newspaper edit-a-thon that NOW founding members Eni Mustafaraj and Emma Lurie hosted at Wellesley College. Wellesley published a nice blog post about it.
Can’t always live in the NOW: General highlights from WikiCite
WikiCite 2018 offered many compelling moments, and of course many were not directly related to the NOW project. Here are a few that stood out to me:
Wikimedia Foundation (WMF) executive director Katherine Maher spoke about Wikimedia as the “essential infrastructure of the ecosystem of free knowledge.” See my recap and commentary here.
The second day centered on group work around strategic questions. Megan Wacha asked a question of one of the groups that Wikimedians will appreciate: In mapping out a product vision, what was your take on the role of Wikimedia volunteers and the Wikimedia Foundation? It was an insightful question, shining a light on an area of disconnect that seemed to crop up at various times in the conference. The group’s answer(s) focused almost entirely on the Wikimedia Foundation, suggesting to me that there wasn’t much understanding of the volunteers of Wikimedia as a separate entity. It seems to me that there was a great deal of technical learning and individual networking at the conference, as well as good strategic work (Day 2) and technical work (Day 3). But I’m not sure that the librarians and professionals in attendance had many opportunities to learn about the Wikimedia movement’s culture, values, or social norms. Megan’s question highlighted the point concisely, and I find myself hoping that future conferences might seek out ways to make cultural learning a more central component.
Many of my Signpost colleagues, past and present, were in attendance. Phoebe Ayers, Andrew Lih, Rob Fernandez, Rosie Stephenson-Goodnight, Lane Rasberry and I discussed what it might take to put together a thorough history of Wikimedia’s in-house newspaper, perhaps as an oral history, and perhaps in time for Wikipedia’s upcoming 20th birthday. I had discussed similar ideas with former editor Sage Ross the week before the conference.
Tpt pointed out to me that the 2019 Community Wishlist Survey, an annual effort to identify the top 10 projects of interest to members of the Wikimedia editing community for work by the Wikimedia Foundation’s developers, was likely to include improved export of books from Wikisource in formats like PDF, ePub, etc. This proved true; it was announced as the #4 priority. I look forward to seeing an improved mechanism for sharing valuable Wikimedia content more widely in offline formats!
John Mark Ockerbloom, digital librarian at the University of Pennsylvania, brought many ideas; the one that stood out most to me was a tool to make it easier to search copyright renewals. He has a database of copyright renewals for U.S. periodicals, which he introduced in a lightning talk. He talked about the value of maintaining such a list outside of Wikimedia, where an expert can assert responsibility for things like completeness.
Of course, in this blog post I have touched only on what most caught my attention. There were all kinds of great things happening; if these topics speak to you, I urge you to explore the conference wiki pages, including video links, notes on Etherpad pages, etc.
(Note: Post updated with items on the Signpost and on copyright renewals after initial publication.)