Google Refine 2.0: Making databases easier to read

By DANIELA RODRIGUEZ

For journalists, going through databases that seem to hold an endless amount of data can be overwhelming. Databases offered by the U.S. Census Bureau or the FBI can also be very confusing at first.

Instead of relying on Excel or Access as the go-to desktop applications for organizing data downloaded from these websites, there is now an alternative: an application called Google Refine 2.0. It is downloaded and runs on your own computer, but you work with it through a web browser. It helps keep data organized. The data you work with stay on your machine and are not uploaded, so there is no need to worry about the information you are gathering being found online.

When Google acquired Metaweb, the deal also brought along Freebase Gridworks, “an open source software project for cleaning and enhancing entire data sets,” said David Huynh of Google Search Infrastructure. With Google behind it, the application is likely to be a very useful one. The previous version, Freebase Gridworks 1.0, proved very useful for people who did a lot of research and gathered public data from government agencies.

Google Refine turns messy data into neat, organized data. It also flags inconsistencies and makes merging different data sets easier. It finds misspelled labels and different terms used for the same thing, which cuts down on redundancies and helps group related data together.
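To give a rough idea of how labels that differ only in formatting can be grouped, here is a minimal Python sketch of one common approach, often called fingerprint keying: each label is reduced to a normalized key, and labels that share a key are treated as the same thing. This is an illustration under my own assumptions, not Google Refine's actual code; the function names `fingerprint` and `cluster_labels` and the sample agency labels are made up for the example.

```python
import string
from collections import defaultdict


def fingerprint(label):
    """Reduce a label to a normalized key (hypothetical helper for illustration).

    Steps: trim whitespace, lowercase, drop ASCII punctuation,
    then sort the unique words and join them with single spaces.
    """
    cleaned = label.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster_labels(labels):
    """Group labels that share a fingerprint; keep only groups with real variants."""
    groups = defaultdict(list)
    for label in labels:
        groups[fingerprint(label)].append(label)
    # A group is only interesting if it holds more than one distinct spelling.
    return [group for group in groups.values() if len(set(group)) > 1]


# Made-up sample labels, as they might appear in a messy agency column.
sample = ["Police Dept.", "police dept", "Dept. Police", "Fire Dept.", "Sheriff"]
clusters = cluster_labels(sample)
print(clusters)
```

A sketch like this only catches differences in capitalization, punctuation, spacing and word order. True misspellings, such as a dropped letter, would need a fuzzier comparison, and a person should still review each group before merging anything.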

It works like an Excel worksheet, but with more features. It is also free to download and can be used as often as the user wants at no cost. It may be confusing at first, but videos have been posted to help new users get acquainted with the basic features and functions of Google Refine 2.0.

I think this would be a great tool for journalists, especially for compiling data from a database for a news story. I have been looking at the databases government agencies make accessible to the public, and they can be very inconsistent and very confusing to read.

Here are some short videos to help you get to know Google Refine 2.0:

Google Refine 2.0-Introduction (Video 1 of 3): http://www.youtube.com/watch?v=B70J_H_zAWM&feature=plcp

Google Refine 2.0-Data Transformation (Video 2 of 3): http://www.youtube.com/watch?v=cO8NVCs_Ba0&feature=relmfu

Google Refine 2.0-Data Augmentation (Video 3 of 3): http://www.youtube.com/watch?v=5tsyz3ibYzk&feature=relmfu
