Central Statistical Organizations - still emerging from the pre-Internet data publishing world
- HTML (71%) and PDF (64%) are the most prevalent forms of data distribution (which confirms my own experience; too many organizations simply put reports online that are traditionally published on paper)
- Excel (55%) is used far more often than CSV (17%) or TXT (2%), even though it is a proprietary format
- Only 9 CSOs (5%) use interactive graphics; 58% don't provide graphics at all (except as part of reports)
- Only 21% of CSOs enable users to customize downloads, while the vast majority offers predefined documents for download. 21% don't provide any download functionality at all.
There is a growing number of web tools with useful functionality to make data available, engaging and even fun. Free tools like Tableau Public, ArcGIS.com, Google Motion Charts, and others offer great possibilities to share and visualize data, and most of them require little coding or developer knowledge. More comprehensive solutions like Socrata, Space-Time Research, or Tableau Server offer far more sophisticated possibilities for publishing, visualizing and exploring data. For a longer list of tools, check the Resources & Tools section. CSOs need to make use of these opportunities.
The study didn't explore the granularity of the data offered, i.e. whether a CSO shares microdata, detailed tabulations, simple tabulations, or only estimates. I spend a fair amount of time looking at CSO websites and talking to people at CSOs while searching for health-related data. Many organizations are hesitant to provide access to detailed data or even microdata, mostly citing confidentiality reasons (for more background reading on motivations to share or not to share data, check here). This is another lost opportunity. Academic and other researchers need microdata to unleash the full power of statistical tools and maximize the insight gained from the data.
There are three ways to deal with the problem. CSOs can ask data users to sign data use agreements that ensure proper and secure storage and use of the data. They can ask researchers to come to their offices or dedicated research data centers. Or they can use software tools to ensure confidentiality. Software from Space-Time Research, for example, uses a combination of techniques to enable work with microdata while ensuring that any viewable or downloadable results are fully de-identified.
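To give a rough idea of what the software route can look like, here is a minimal sketch of one common disclosure-control technique, small-cell suppression: users get tabulations built from the microdata, but any cell based on fewer than a minimum number of records is hidden. This is not how Space-Time Research or any other product works internally. The function name, the threshold of 5, and the sample records are all made up for illustration.

```python
from collections import Counter


def tabulate_with_suppression(records, keys, min_count=5):
    """Count records per combination of `keys`; hide cells below `min_count`.

    Hypothetical helper for illustration. Suppressed cells are returned
    as None so the caller can decide how to display them.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {cell: (n if n >= min_count else None) for cell, n in counts.items()}


# Made-up microdata: one dict per surveyed person.
records = (
    [{"region": "North", "condition": "diabetes"}] * 12
    + [{"region": "South", "condition": "diabetes"}] * 7
    + [{"region": "South", "condition": "rare_disease"}] * 2
)

table = tabulate_with_suppression(records, ["region", "condition"])
for cell, n in sorted(table.items()):
    print(cell, n if n is not None else "suppressed")
```

Suppressing small cells on its own is usually not enough: if row or column totals are published as well, a hidden cell can often be recovered by subtraction, so real systems combine several techniques.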
The discussions in Amman showed that CSOs across the Arab world are aware of the problem. Now it's time to work on the solution.