Migrahack at CSUN

Just got back last night from the Migrahack at Cal State Northridge. I’ll post photos and more later, but I wanted to get the links from my presentations up here. I got a chance to sit in on Ron Campbell’s Census workshop because I finished mine early. Campbell was one of my early data journalism heroes. Sure enough, after just 5 minutes of his workshop I walked away with IPUMS and Census Reporter. The NAHJ student journalists gave me inspiration and a shot of determination. Anyway, I’ll be updating my presentations in the next few weeks with more Tableau, BLS, Census Reporter and IPUMS, and I’ll post the information here when that happens.



Posted in Uncategorized

Los Angeles County is full of Germans…or?

I’m here somewhere near Burbank getting ready for Migrahack at Cal State University at Northridge, or, as I have heard people here call it, CSUN (as in see-sun). Anyway, I’ve been playing around with some Census figures: the amount of time men and women spend commuting to work, or how early men leave home for work every day compared to women. Then I stumbled across the ancestry table. Who knew there were so many people in L.A. County who claimed German ancestry: 495,046. Not a lot when you consider there are nearly 10 million people here. But it’s No. 1. EXCEPT for the 7 million people shoved into “other groups.” You tell me which group is missing in the table below. I am not sure why the Census puts the information together like this. It’s an example for the class of what you can find in the Census, or in this case what’s sometimes missing.
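For the class, here is a minimal sketch of the arithmetic behind that comparison. It only uses the figures quoted in this post; the round totals (10 million residents, 7 million in “other groups”) are my approximations, not exact Census counts, and the function name is just for illustration.

```python
def share_of_population(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Figures quoted in the post. The two round numbers are approximations.
GERMAN_ANCESTRY = 495_046      # people reporting German ancestry
OTHER_GROUPS = 7_000_000       # "other groups", rounded
COUNTY_TOTAL = 10_000_000      # "nearly 10 million", rounded up

german_pct = share_of_population(GERMAN_ANCESTRY, COUNTY_TOTAL)
other_pct = share_of_population(OTHER_GROUPS, COUNTY_TOTAL)

print(f"German ancestry: about {german_pct} percent of the county")
print(f"Other groups: about {other_pct} percent of the county")
```

Put that way, the No. 1 named group is roughly one in twenty residents, while the unnamed bucket is most of the county.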

Posted in Data-Driven Journalism, Journalism, Open data, scrapers

School of Data: Evidence is Power

I was looking for a new AngelaWoodall.com blog theme and found one that happened to have this video in the example, with journalists from around the world, from villages to major cities, eager to use data for transparency and accountability. It took me about a minute to choose the template after watching the video (I took it as a sign), although I am still not sure where they got “Wrangle the Newsfloor.”
(Just as an aside, this is the first time I paid for a template, so it might be a coincidence, but the quality and the instructions included in the deal were worth the money. I’m just putting it out there for any of you who have hit your head against the wall with WordPress. There’s a reason why Tech Liminal has WordPress Support Group meetings.)

Posted in Data-Driven Journalism, Gov 2.0, Hackers, Hitting your head against the wall, Journalism, Journo-Apps, news apps, Open data, scrapers

Data Diving How-to

The timing was unbelievable: My rocket-fast Internet fizzled on a Friday and by Sunday had crashed completely. Four of six neighbors also lost all connection. One of them is a Pandora project manager. I have no idea what the other ones do but for five days we were offline. And I was trying to get ready to teach an Institute for Justice and Journalism storytelling with data workshop on Feb. 28.
Which is really the point of this post. I won’t drag out the details of the Internet drama except to say it took an incredible amount of yelling and Twitter to get online again. The statement, “I will write a letter to every senator on the Comcast-Time Warner merger oversight committee starting with Sen. Leahy!” was involved.
Anyway, here is the presentation with links and lessons for the data-diving and data viz sessions, which did not put a single person to sleep! I started with the most basic searches and spreadsheets and worked up to scrapers (speaking of which, there’s a scraper workshop coming up soon that I’ll send word about). I’ll be updating the presentation with California Secretary of State campaign contribution data as well as how to background a nonprofit or charity. For now this is a great start. Looks like I may be in Northridge for a workshop in May and at SF State next semester.

Until then, happy hunting!

Posted in Data-Driven Journalism, Hackathon, Journalism, Open data

Migrahack workshop Feb. 28

I’ll be leading two workshops at the upcoming HandsOn Tech and Institute for Justice and Journalism’s Migrahack in San Jose Feb. 28: “Engaging & Insightful Storytelling with Numbers.” Migrahack’s unstoppable Claudia Nunez organized a roster of workshops taught by fabulous people. Here are the details, and below are the workshops I am handling.

9:00 a.m.-10:00 a.m.

Intro to DataViz: A guide to designing with data and open software. Learn about free tools available to make your reports stand out with data.

1:30 p.m.-3:30 p.m.

The basics of data diving: Where to find the data you want and what to do with it. We will use an example to walk through the basic steps in a project using immigration data, then wrap up with a “real-life hit your head against the wall” example and talk about where to go for help, both technical and with the data.

Posted in Data-Driven Journalism, Hackathon, Journalism, Journo-Apps, Open data, scrapers

Scrapers: ScraperWiki

In just a few months the world of data scrapers has changed – for the better. At least if you consider point-and-click instead of bang-your-head-against-the-wall an improvement. But there are limits to the technology and I just found one of them.

ScraperWiki, which I consider to be an early innovator of scrapers, has made it incredibly easy to extract data from the web and PDFs. They started testing it a few months ago. I finally had a go a few weeks ago (holidays and all) and it worked seamlessly on already structured data from Data.gov. No coding, no waiting around.

I’m writing a book about the struggle to save a family ranch in Sonoma County so I was looking at farmers markets and other ag data. Today I was trying to make a list of farmers markets within a 100-mile radius from California Federation of Farmers Markets search results. I copied the column of market cities, pasted it into a Google spreadsheet, sorted the cities alphabetically and started numbering them. There were gaps between the rows so it was tedious. Then it dawned on me: Why am I cutting and pasting when I could just grab the results with ScraperWiki? However, what I got back from the California Federation was not pretty. The web is still a little wild out there because people throw together all kinds of stuff that machines can’t read. That’s one of the things the open source community is campaigning to get local governments to do: make data machine readable so you don’t get sludge in your beautiful machine. (And finally, I’ll have Import.io up soon. They’re getting more streamlined too. Good news for journalism!)
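For anyone stuck at the same cut-and-paste stage, here is a minimal sketch of that cleanup in Python. It is not what ScraperWiki does; it just assumes you have pasted the column of cities into a text string, and the function name and sample cities are made up for illustration.

```python
def clean_and_number(pasted_text):
    """Turn a pasted column into a sorted, numbered list with no blank rows."""
    # Split the pasted text into rows and trim stray whitespace.
    rows = [line.strip() for line in pasted_text.splitlines()]
    # Drop the empty rows (the "gaps" that make hand-numbering tedious).
    cities = [row for row in rows if row]
    # Sort alphabetically, ignoring upper/lower case.
    cities.sort(key=str.lower)
    # Number the rows starting at 1.
    return list(enumerate(cities, start=1))


# Illustrative sample only, not the actual search results.
sample = """Sebastopol

Petaluma

Santa Rosa
"""

for number, city in clean_and_number(sample):
    print(number, city)
```

Repeated cities are kept on purpose, since two markets in the same city are still two markets.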


Posted in Data-Driven Journalism, Gov 2.0, Hitting your head against the wall, Journo-Apps, news apps, Open data, scrapers

Digital First, the end

Civic Playground started as an in-house project while I was a reporter at the Oakland Tribune, which is part of the Bay Area News Group and Digital First Media. It was the CEO of the whole umbrella group, Digital First Media, who decided, in the spirit of tech start-ups, to throw some ideas at the wall and see if they stuck. The project as a whole was called ideaLab. I just heard that ideaLab was not one of the ideas that stuck, as one of the fellows describes here. I love it that DFM was willing to try, although there was a lot of grumbling about how the money and gear we received as fellows could have paid for salaries and raises in the newsrooms. Since I left, staffs have shrunk more than I imagined they could while still keeping a paper filled with news.

As you can tell, Civic Playground is independent from ideaLab and instead of focusing on apps I’m now working with newsrooms and reporters. I have been feeling the air start to leak out of the balloon that got inflated around technology in 2011 and 2012. Feels like a bit of reality has set in. 

We were supposed to host scraper workshops at Hacks/Hackers and SPJ already but the holiday schedules involved were too much. So look for them in 2014. For now, I am testing ScraperWiki’s new interface and I finally got to know Import.io better. I want to first try their Udemy.com how-to session.

In the meantime, happy New Year, 2014. 

Posted in Data-Driven Journalism, Gov 2.0, Hackers, Journalism, Journo-Apps, news apps, Open data, scrapers

Are these mobile news apps scary or brilliant or just the future?

The zeitgeist of the mobile era (era here meaning the past and coming few years) might be “Currency is Time.” The line came from the San Francisco head of News Republic, a mobile news aggregation app from Mobiles Republic. He was demoing it last night at the ONA-organized “What users want from mobile news.” The app is not about excellence in news or saving journalism or anything so squishy. Mobiles Republic is a tech company using news to make a buck. But the hope among “content providers” is that News Republic will bring more readers to their stories. If it’s successful, I hope news companies give reporters a cut.

In any case, News Republic is moving toward journalism somewhat untethered from its origins. Circa takes it a step further and, depending on how you look at it, the app could be considered brilliant or terrifying. I vote for the category of “intriguing but unsettling.” The mobile-native app, still in the angel investor stage, chops up news into facts, quotes, stats, images and videos. It destroys the idea of the article by stringing bits together to tell a story. But that may make for the more pleasurable experience on mobile.

It’s intriguing and tempting and I have been wondering since I started a heavily data-driven project in August about whether it is okay to abandon narrative sometimes. No story, no infographics and no data visualization.

How many times have you read to the third graph and trailed off because, my god, the article just kept going on and on? I thought about instances in which economic data could be reported straight out. For example, politicians in Louisiana want to stop a lawsuit against the oil and gas companies because, they say, the industry drives employment and the local economy. But it’s not true. Pages about the conflict have filled the papers, and some of the national ones have been inaccurate and lack data to back up the claims by the politicians. I would list the claims, the data and the source of the data in a table with links to the original sources and the spreadsheets the reporter is using. The digital director of an online site said his team helps journalists get readers to engage with their stories (he’s from a town that comments a lot on the local reporting, unlike the Bay Area). But sometimes, he said, he has to ask if the story is worth reading when it’s ignored. Maybe the information is important but will turn out better in raw form as opposed to a story. That’s true for mobile, anyway, according to serial news innovator and Circa news director David Cohn, whom I am fond of because he showed up at News Hack SF a couple years ago talking about unicorns and rainbows to explain the latest start-up he was working on.

So yes, “atomizing” the news is intriguing. But unsettling: What are the consequences?






Posted in Data-Driven Journalism, Journalism, Journo-Apps, news apps, Open data, Uncategorized