Over the past year or so, about 40 volunteer developers of all skill levels have been working collectively to build a new tool that scrapes government websites to compile a comprehensive list of public meetings.
The volunteers are a part of City Scrapers, a program run by City Bureau, the Chicago nonprofit that calls itself a civic journalism lab. City Scrapers was created in partnership with ProPublica Illinois to help streamline City Bureau’s Documenters program, which pays community members to attend and take notes on Chicago city meetings.
“As we’re trying to figure out how to send Documenters to as many meetings as possible, we were realizing that it would take a lot of time for us to do it manually — or we could figure out a different solution,” Darryl Holliday, City Bureau’s co-founder and News Lab director, told me.
This week in Solution Set we’re going to take a look at City Bureau’s City Scrapers. We’ll dig into how City Bureau created the program, learn how it collaborated with other organizations, and discuss how it’s going to use the information it’s scraped.
Solution Set is a weekly report from The Lenfest Institute for Journalism and the Solutions Journalism Network. Every Thursday, we take an in-depth look at one cool thing in journalism, share lessons, and point you toward other useful resources.
We’re partnering with GroundSource so you can now also get Solution Set delivered each week via text message. You can sign up by clicking here or by texting SOLUTION to (215) 544–3524.
A quick programming note: The People Powered Publishing Conference, an engagement-focused gathering in Chicago, which is organized by City Bureau and a host of others, kicks off today. I’m having major FOMO that I’m not there, but my Lenfest Institute colleague Cheryl Thompson-Morton is at #PPPC18, and you should say hello if you’re there too.
Here’s the TLDR:
• The Challenge: With its growing Documenters program, City Bureau needed to find a way to compile a calendar of Chicago public meetings in one place.
• The Strategy: It created City Scrapers, a volunteer effort to scrape city websites for meeting details, and built a centralized web app to house all the information.
• The Numbers: About 40 people volunteered to help with the year-long project.
• The Lessons: The project enabled City Bureau to create an important product for its newsroom, but it also provided training opportunities for Chicagoans and a non-monetary way for people to contribute to the nonprofit organization.
• The Future: Documenters.org, the app developed from the City Scrapers data, is scheduled to launch in January.
• Want to know more?: Scroll down for a link to a City Scrapers guide City Bureau created that has tips, best practices, and a lot more you need to know to start your own version of the program.
Every week, groups of Chicagoans fan out to attend and cover public meetings from various government bodies in the city as part of City Bureau’s Documenters program.
The program trains people how to cover and report on public meetings, and then City Bureau pays them for their coverage. Individuals document meetings in a number of ways. For example, some have live-tweeted a meeting of the city’s Justice Advisory Council while others have created a Google Doc full of notes on a meeting of the City Colleges of Chicago Board.
Hi, I'm Rebecca, and I'll be live-tweeting the Justice Advisory Council meeting this morning.
— Rebecca Stoner (@rstoner1023) November 9, 2018
The documentation is then shared back with the community and participating news organizations to help inform their coverage of various government bodies in the city.
The Documenters program launched in 2014 under the auspices of another Chicago nonprofit, but City Bureau took it over in 2016 and has built up and expanded the program. Since then, City Bureau has also decided to double down on covering public governance meetings in the city.
But as the program grew, City Bureau discovered another problem: There wasn’t a centralized place where all the public meetings in the city were listed.
“You have to go to dozens of different websites to get all of the information on when meetings are, what time, what day, where’s the location,” Holliday said. “There’s no centralized location in Chicago…where you can see all of a city’s public governance meetings in one place.”
Since it didn’t otherwise exist, City Bureau decided to create its own repository of Chicago’s public meetings. But in order to list all the meetings, it needed to identify all of them and then find a way to catalog them.
It decided the best way to approach the challenge was to build web scrapers, which regularly pull information from city websites.
“We made a big list…of every meeting holding body in the city and then began the process of creating these web scrapers that run once a day and update on any new meetings that come up and any meetings that were cancelled or changed,” Holliday said.
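To make the daily update concrete, here is a minimal Python sketch of how one day's scrape could be compared with the previous day's to surface new, cancelled, and changed meetings. It is an illustration only, not City Bureau's actual code: the `Meeting` fields, the `meeting_id` identifier, and the `diff_meetings` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meeting:
    # All field names are assumptions for this sketch, not City Bureau's schema.
    meeting_id: str            # hypothetical stable identifier for one meeting
    agency: str
    start: str                 # ISO 8601 date and time
    location: str
    status: str = "confirmed"  # "confirmed" or "cancelled"


def diff_meetings(previous, current):
    """Compare two daily scrapes and report what differs between them."""
    prev = {m.meeting_id: m for m in previous}
    curr = {m.meeting_id: m for m in current}

    # Meetings that appear today but were not in yesterday's scrape.
    new = sorted(i for i in curr if i not in prev)
    # Meetings that were known yesterday and are newly marked as cancelled.
    cancelled = sorted(
        i for i in curr
        if i in prev
        and curr[i].status == "cancelled"
        and prev[i].status != "cancelled"
    )
    # Meetings still taking place whose details (time, place, etc.) changed.
    changed = sorted(
        i for i in curr
        if i in prev
        and curr[i] != prev[i]
        and curr[i].status != "cancelled"
    )
    return {"new": new, "cancelled": cancelled, "changed": changed}
```

Running a comparison like this once a day would give a short list of meetings to add or update on the calendar instead of a full manual recheck of every agency website.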
Building the web scrapers was a lengthy process. In August 2017, City Bureau launched its City Scrapers program in partnership with ProPublica Illinois. It was a volunteer effort to build the scrapers and create the website where the information would live. (Scroll way down to The Future to learn more about the website itself. This section will mostly focus on the process.)
City Bureau quickly realized that this would be a big task. It put out an open call for developers (here’s a Google Doc with the language it used), and it also turned to Chicago’s civic tech community to ask for help (more on this in The Lessons!). It worked with ProPublica Illinois news applications developer David Eads to help design the program.
“He had the tech skills to get us started with Python and with the scraping and the whole event schema that we set up: What would it take to do this, how can we onboard new people and pair them up with more experienced coders?” Holliday said.
The team began holding weekly meetings in City Bureau’s newsroom. In addition to the weekly meeting, City Scrapers held a Design-A-Thon day to begin to create the design for the Documenters app.
When it asked people to volunteer, City Bureau was clear about the time commitment and the requirements to participate. City Bureau invited people of all different skill levels to join City Scrapers. It also fed participants, which people always appreciate.
It also created a dedicated Slack channel so team members could keep in touch and work together remotely. City Scrapers also developed a code of conduct, which it included on its GitHub page.
“The City Bureau Labs community welcomes contributions from everyone,” the code read. “We prioritize learning and leadership opportunities for under-represented individuals in tech and journalism. We hope that working with us will fill experience gaps (like using git/github, working with data, or having your ideas taken seriously), so that more under-represented people will become decision-makers in both our community and Chicago’s tech and media scenes at large.”
City Scrapers was led by a core committee of six people (Eads, Holliday, and other volunteers) who helped organize each meeting and took on different roles to facilitate the process. One member greeted attendees and triaged participants by skill. New coders were sent to another leader who oversaw onboarding, introduced them to Python, and helped them set up the system on their computers. People who had some Python experience were paired with more experienced coders, and the most experienced coders were assigned the most difficult tasks.
“GitHub was a really useful tool here. We were able to label all of the issues,” Holliday said. “At any given time there would be dozens of issues, and we could label them as ‘good first issue,’ ‘needs help,’ or ‘needs research,’ things like that. As people were triaged at meetings, we could look at GitHub and see what fit their skills.”
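The label-based triage Holliday describes can be sketched in a few lines of Python. This is a hedged illustration rather than the team's real workflow: the three label names come from the quote above, but the skill levels, the mapping in `LABELS_BY_SKILL`, the issue dictionaries, and the `issues_for` helper are all assumptions made for this example.

```python
# Hypothetical mapping from a volunteer's skill level to the issue labels
# that might suit them. Only the label names themselves come from the article.
LABELS_BY_SKILL = {
    "new": {"good first issue"},
    "intermediate": {"good first issue", "needs help"},
    "experienced": {"needs help", "needs research"},
}


def issues_for(skill, issues):
    """Return titles of open issues whose labels match a skill level."""
    wanted = LABELS_BY_SKILL[skill]
    return [
        issue["title"]
        for issue in issues
        # Keep an issue if it shares at least one label with the wanted set.
        if wanted & set(issue["labels"])
    ]
```

With a list of issues exported from the project's tracker, an organizer could call `issues_for("new", issues)` at the door and hand a first-time coder a task tagged as a good first issue.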
Everything the City Scrapers created is open-source, and City Bureau has also created a guide for how to build your own scrapers program.
In addition to the core committee of six people, there were about 40 volunteers who contributed to the project.
Their work will serve about 500 people who have participated as Documenters with City Bureau.
City Bureau identified more than 100 government bodies and committees to scrape for the project. (Illinois has nearly 7,000 local governments — the most of any state.)
The project was initially supported by a $50,000 grant from Democracy Fund and Knight Foundation. And City Bureau’s continued work will be supported by a $1 million grant the MacArthur Foundation gave City Bureau earlier this year.
Together with Detroit public radio station WDET, City Bureau received a $50,000 grant to expand Documenters to Detroit. As part of that pilot, City Bureau built scrapers to compile a comprehensive list of public meetings in Detroit.
They contracted with a couple of the coders who had previously volunteered for the Chicago scrapers, and they were able to build out the Detroit scrapers in just about a month.
“It can be done much faster,” Holliday said.
• Expand the pipeline: City Bureau intentionally moved slowly, though, in its initial work in Chicago. It could have hired developers to quickly build the scrapers, but it wanted to use the work as a chance to build community and get more people involved with City Bureau.
Expanding access to journalism is also one of City Bureau’s core goals — it offers 10-week paid reporting fellowships for early-career journalists — and it wanted to use City Scrapers as an opportunity to offer trainings.
“We put out a call for volunteer coders to join us at our newsroom once a week and help us…knock out a scraper for each of these agencies but also to provide a really valuable learning opportunity for people who are new to coding, people who are often marginalized within the coding world,” Holliday said. “The civic tech space has a lot of the same problem journalism does in my opinion: A lack of diversity, an elitist quality to it that doesn’t leave much room for people who might not have gone to the best schools or had been able to take the unpaid internships to learn the trade. We wanted to provide a space where people who were new to coding could come in with a relatively easy task but learn Python in the process.”
City Scrapers used a practice called pair-sharing, in which two developers work together on a task. This worked well for City Bureau given the project’s educational focus, and it allowed coders of different abilities to work together.
One of the volunteers, Rebecca Wei, also created a series of video tutorials.
In fact, City Bureau ended up hiring one of the volunteer developers, Pat Sier, as its first full-time staff developer.
• Partner outside of news: As it looked for volunteers to help with the project, City Bureau turned to Chicago’s civic tech community.
Staffers used their networks to recruit potential volunteers, and many of the people who eventually participated — including some who were on the organizing committee — had not heard of City Bureau before beginning their work.
But Holliday said the project appealed to people involved with civic tech because it offered an opportunity to build something that had real practical impact on the community.
“The biggest two reasons were that it offered this free educational opportunity that they weren’t always getting or wanted more of, and it was the kind of project that had a very practical output,” he said. “They could see where their work was going throughout the whole process. We’d update them on how the app was going and what the plans were for the Documenters program. They could see the live tweets when Documenters were going out to cover meetings, and those were the meetings that they were scraping. It was a direct result of them making it easier for us to do the work.”
• Do the legwork before: Before it began City Scrapers, City Bureau took a number of key steps to put the program on the path toward success.
Importantly, the organization set goals for the project and outlined why, specifically, it thought the project was worth pursuing. “In building an aggregator for public meetings, we had two key goals: identifying assignments for our Documenters, as well as making all local public meeting information truly accessible for everybody,” City Bureau wrote in its City Scrapers guide.
It then identified which agencies it wanted to scrape, and before it began building anything it created a spreadsheet with details about each organization. (Here’s its template.) It also consulted the state’s open meetings law to understand which agencies were required to meet in public.
City Bureau went through a similar process in Detroit when it launched the program there.
When launching into a project like City Scrapers, it’s crucial to do work ahead of time to ensure that the project has clear goals and is feasible.
• Everything ends: With the forthcoming launch of the site that City Scrapers helped build, City Bureau is changing the role of the project.
Since it added a full-time developer, City Bureau decided to end the first phase of the project. It held a barbecue to celebrate the achievement, which also gave volunteers who didn’t want to continue an opportunity to exit the project.
“So now we still communicate with the remaining volunteer group via the Slack channel we created and still work with a handful of the coders on ongoing tasks related to the scrapers (i.e. when they break and when we come across new agencies every now and then that need new scrapers). So we’ll keep moving in that direction as far as the Chicago coders go, along with updating them on the app progress so they know that the work they put in was appreciated and is going to good use,” Holliday told me in an email.
City Bureau is launching the site City Scrapers helped develop in early January. The site, which is built as a web app, will house the regularly updated schedule of meetings. It will also serve as a central hub for the Documenters program. Documenters will be able to pick up assignments through the app, and they’ll also publish their notes from the meetings there. (You can sign up here to get notified when it launches.)
“It will basically take all of the information and make it scraped and streamline it into one single website so you can filter by your interests, by date, by time, by location, and search for public meetings that are of interest to you. It also allows Documenters to create accounts and claim assignments. It’s both a CMS — it allows for the processing of more documentation from Documenters — but it also provides this public service, which is that now there’s a single location where all the public meetings [are listed.]”
Information from Detroit and Chicago will be in the app, but as City Bureau continues to grow it’s thinking about what the role of the app will be moving forward.
One of the core City Scrapers volunteer committee members recently moved to Pittsburgh for graduate school and began a scraping operation there using City Bureau’s open-source code. But because that effort isn’t directly affiliated with City Bureau, the organization isn’t sure whether its data should be imported as well.
“The next step, aside from building scrapers, is connecting with us around whether that work should go into the Documenters app,” Holliday said. “If there’s a question around whether a Documenters program should be created, it’d be much easier and simpler to include it in our app, and as we build and grow, that becomes the hub for this kind of work.”
Want to know more?
• Here’s the full City Scrapers guide from City Bureau. It features a step-by-step process for how to set up your own scrapers program and it has a number of excellent resources.
• I frequently come back to this 2015 Nieman report Melody Kramer wrote about how news orgs can rethink their approach to membership. Programs like City Scrapers can empower people to feel connected to their favorite news orgs in ways beyond just financial contributions.
• Heather Bryant published an essay this week at the Membership Puzzle Project looking at how news orgs can do a better job investing in their communities: “Many newsrooms are under-practiced at building quality relationships with audiences, especially those that have been poorly served due to race, gender, education, or income,” she wrote.
• Want to learn more about City Bureau? Here’s a great profile from Nieman Lab.
Anything to add?
Is your news organization thinking about training programs or alternative forms of membership? I’d be curious to learn more. Please get in touch with any ideas or questions.
We’ll be off next week for Thanksgiving. Have a great holiday!
Creative Commons photo of a Chicago Police Accountability Task Force community meeting by Daniel X. O'Neil