Useful tools to review, refine, clean, analyze, visualize and publish data
However, most interesting to me are the tools that the interviewees mention and which Alex calls the "Newsroom Stack". Any number of those tools may be used in sequence to get from your set of data to useful insights. I used the additional comments from the journalists to add to my own list of useful data tools; some key ones are listed below, and the rest are on the Health Data Innovation Tools page. Let me know what other tools you think I should add.

- Microsoft Excel - still the standard for many as the easy first stop to review data
- Data Science Toolkit - collection of useful tools to extract and convert text, GIS and other data (my overview here)
- ScraperWiki - provides software and instructions to extract data and information from web sites
- Google Refine - clean, organize, refine (duh!) and explore your new datasets
- Overview - clean, visualize and interactively explore large documents and data sets (started by AP)
- The PANDA Project - the new newsroom data appliance
- Stat/Transfer - converts data between formats of statistical analysis packages
- Ruby on Rails - powerful open source web framework for budding programmers, with helpful Ruby libraries like Remote Table (mapping); Django is the comparable framework for Python
- Python - programming language, very useful for data analysis and visualization
- JavaScript - prototype-based scripting language
- R - open source software environment for statistical computing and graphics
- Git - to track versions of code and share with others
Data visualization and GIS packages
- Protovis/D3 - JavaScript-based libraries for very slick visualizations
- MetaLayer - discover and share insights from data via infographics
- WEAVE - Web-based Analysis and Visualization Environment
- PostGIS - spatially enabled PostgreSQL server
- TileMill - design studio to create maps, powered by MapBox
- Leaflet - JavaScript library to create interactive maps
Databases
- MySQL
- PostgreSQL - open source object-relational database system
- SQLite - lightweight embedded database engine; the SQLite Manager Firefox extension allows SQL queries without setting up a full database
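To give a feel for what "SQL queries without setting up a full database" can look like in practice, here is a minimal sketch in Python using the sqlite3 module from the standard library. The table name, column names and numbers are invented purely for illustration; your own data would take their place.

```python
import sqlite3

# Hypothetical sample data: (county, year, cases) rows made up for this example.
rows = [
    ("Adams", 2011, 120),
    ("Adams", 2012, 135),
    ("Baker", 2011, 80),
    ("Baker", 2012, 95),
]


def total_cases_by_county(records):
    """Load records into an in-memory SQLite database and sum cases per county."""
    conn = sqlite3.connect(":memory:")  # no server to install, no file on disk
    try:
        conn.execute("CREATE TABLE cases (county TEXT, year INTEGER, cases INTEGER)")
        conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", records)
        cursor = conn.execute(
            "SELECT county, SUM(cases) FROM cases GROUP BY county ORDER BY county"
        )
        return cursor.fetchall()
    finally:
        conn.close()


totals = total_cases_by_county(rows)
print(totals)  # [('Adams', 255), ('Baker', 175)]
```

Passing a file name instead of ":memory:" would keep the data between sessions, which may be handy once a dataset is worth revisiting.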
Here are the articles; check back on the O'Reilly Radar data page for more:
This should put you in the right mood to have a look at the "Effective Data Visualization" presentation by Hjalmar Gislason (aka @datamarket) at Strata this week. It's a great account of the considerations necessary for anyone who wants to create visualizations. Very useful: if you download the PDF from Slideshare, the slides contain links to more information online.