
Olympic Games coverage & what IOC and NBC can learn from the open data movement

on Sun, 08/12/2012 - 10:54

Today, the Games of the XXX Olympiad are coming to a close. Every four years in August, the Olympic Summer Games become part of people’s lives around the world. In 2008, 4.7 billion people or 2/3 of the world's population saw part of the Beijing Olympics, according to Nielsen. Naturally, conversations during those two weeks (and afterwards) keep coming back to the Games. And in an increasingly connected world, conversations about the Olympic Games are going global on Twitter, Facebook, blogs and other social media.

One would think that NBC as the exclusive broadcast and online rights holder in the US would help inform and facilitate that global conversation about the Games. But as a never-ending stream of rants (follow #NBCfail on Twitter) and ample coverage on blogs and news sites show, they are failing their audience. Top events including the opening and closing ceremonies are not broadcast live but delayed until prime time to maximize ad revenue. Online streams are only available to subscribers of specific cable packages, and they are low-res and choppy on top of that. Broadcast content posted online by fans of the Olympic Games gets removed quickly because of copyright infringement. And NBC prime time coverage fixates on American athletes, largely ignoring foreign athletes and sports where Americans are not likely to medal. I would love to see the best of Olympic sports without a national angle, but that’s nowhere to be found in the US.

The open data movement is currently gaining a lot of traction. Governments and organizations get several benefits and opportunities from opening up their data. Open data obviously increase transparency by providing interested parties a closer look. More accessibility of data will enable others to hold data publishers accountable, but also to provide feedback and input. “No matter who you are, most of the smartest people in the world don’t work for you” (Sun co-founder Bill Joy). Opening up data provides those smart people a chance to create products, services, analyses and insights from the data that the data publisher could never have dreamed of. And by enabling innovators like entrepreneurs, developers, journalists and others to develop innovative and (potentially) useful products and services, it can power whole new ecosystems like the ones around weather data and GPS data. Open data can help data publishers and at the same time contribute to the greater public good. 

How does this relate to the Olympic Games? According to the Olympic Charter, "The International Olympic Committee (IOC) takes all necessary steps in order to ensure the fullest coverage by the different media and the widest possible audience in the world for the Olympic Games." How would the IOC get the ‘fullest coverage’? By opening up data from the Olympic Games, ideally along with video, audio and imagery, and making them available so that innovators can create more of the products and services that audiences want.

Obviously, that won't happen because the licensing of broadcast rights provides a major contribution to the budget for the Olympics. For the US alone, NBC spent $2.3B for broadcast and online rights for the 2010 Vancouver and 2012 London Olympics, and $4.38B for rights to all Games until 2020. So the IOC won’t be able to simply share the complete feed from the Olympic Broadcasting Services for free. But there are a few things they can do:

  • Share comprehensive data from the Olympic Games, including information about the athletes, real-time feeds with results, and other data. Right now, the IOC is still not using its data treasure troves to create an open data Olympics. Even so, some amazing visualizations have brought more insights about the London Olympics.
  • Work with broadcasters to remove those legacy national viewing restrictions online (e.g. you can’t watch BBC in the US). These national Olympic Games video monopolies stifle competition and lead to sub-par coverage (did I mention #NBCfail?). Allowing competition between broadcasters from different countries will force all of them to provide the coverage that their audiences want (until they do this, there is always TunnelBear).
  • Prevent broadcasters from creating walled gardens around the content and ensure live coverage of all events. Sports need to be watched live (especially since the ubiquitous social media are natural born spoilers), and Jeff Jarvis argues that this also makes economic sense.  
  • Ask broadcast partners to share their content as "open content" and allow audiences to reuse and redistribute broadcast content. Broadcasters will have exclusive rights for first broadcast, but innovators and the crowd can then repackage the content, show highlights, show coverage from different countries, and so much more. This will create buzz, stimulate conversation, and may just drive up broadcast viewership overall. For now, however, people who want to share stories will have to get creative, like the Wall Street Journal with its homemade highlights.

The Olympic Games are iconic. They show that sports, competition and team play are important around the world. They can inspire us to lead more healthy and active lives (a message that would be more consistent if we could get rid of those counterproductive McDonald’s and Coca-Cola commercials and endorsements during the Games). By taking a lesson from the Open Data movement, the IOC and NBC have a huge opportunity to expand coverage and audiences for the Games, and make watching, following and talking about them even more fun, in person and online. Fingers crossed for Sochi 2014.

10 recommendations for open (government) data publishers

on Fri, 07/13/2012 - 06:31
I am back home from the hugely motivating and energizing International Open Government Data Conference, hosted by Data.gov and the World Bank. A tremendous group of people got together at the World Bank over the last three days to discuss a broad range of topics related to open government data (also see my post from Day 1). Participation in the conference shows that the open (government) data community is really thriving: 450 in-person participants from over 50 countries, 4000 online participants, 2000 tweets. 162 speakers covered a lot of ground: Great case studies of open data initiatives from national, state and city governments in developed and developing countries. Insights from data users, including developers, entrepreneurs, data journalists, academic researchers and others. Advice and updates from technologists and platform providers. Discussions about standardization. Great stuff. I got a better understanding of the benefits of opening up data along with a number of recommendations for open data work; many of those are just as applicable to other open data efforts.
 
Benefits of open government data
Governments have several benefits from opening up their data: it increases trust in the government by providing more transparency and accountability. It helps improve public services. It stimulates economic activity and generates jobs. As the UK government was surprised to find out, it can also help the government improve the use of their own data. It helps to increase the exchange of information among government agencies (which are often siloed) and improve collaboration. And as an additional perk, open data will also lead to savings by reducing work on specific data requests. Here is how to get there:
 
1. Get the data out there
It's easy to get stuck in discussions about platforms, formats and standards. It's also easy to delay releasing the data to work on them until they are as good as can be. Instead, governments should start with a focus on getting the data out there. Value-added work on the data can follow later, e.g. restructuring the data with an eye for external users or linking / combining different datasets. These tasks could also be done by external players, who may re-package data for specific audiences and make them easier to use. In general, the prioritization of data releases should follow user demand rather than just publication of data that are easy or convenient to publish.
 
2. Make the data open
Open data should be just that. Open. Available to use for anyone, available for any use (commercial and non-commercial), available to be redistributed. A proliferation of licenses and different types of licenses can seriously hamper the usefulness and ultimately the impact and success of open data.
 
3. Make using the data easy
Different data users will prefer different methods of accessing data, e.g. analysts/scientists/data journalists tend to want to download detailed data, developers want ongoing API access, and less sophisticated data users want query tools and data visualizations. Publishers should offer access at least via download and API. In addition, data need to be properly documented. Data by themselves can be misleading or used in the wrong context. Governments should make sure that data collection and subsequent work on the data are properly documented, and that the data are labeled, in machine- and human-readable formats.
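
As a rough illustration of the "download plus API plus documentation" idea, here is a minimal Python sketch that serves one small table three ways: as CSV for download, as JSON for an API response, and as a plain-text data dictionary for human readers. The dataset, field names and function names are all invented for this example; a real publisher would likely rely on one of the existing platforms rather than hand-rolled code.

```python
import csv
import io
import json

# Hypothetical dataset and data dictionary, invented for illustration only.
ROWS = [
    {"county": "Adams", "year": 2011, "clinics": 4},
    {"county": "Baker", "year": 2011, "clinics": 7},
]
DATA_DICTIONARY = {
    "county": "County name",
    "year": "Calendar year of the count",
    "clinics": "Number of public clinics open at year end",
}


def to_csv(rows):
    """Serialize rows as CSV text, the bulk-download format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(DATA_DICTIONARY), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_api_response(rows):
    """Serialize rows plus field documentation as JSON, the API format."""
    return json.dumps({"fields": DATA_DICTIONARY, "data": rows}, indent=2)


def to_readme(dictionary):
    """Render the same data dictionary as plain text for human readers."""
    return "\n".join(f"{name}: {meaning}" for name, meaning in dictionary.items())


if __name__ == "__main__":
    print(to_csv(ROWS))
    print(to_api_response(ROWS))
    print(to_readme(DATA_DICTIONARY))
```

Keeping a single data dictionary and deriving both the machine-readable and the human-readable documentation from it is one way to stop the two from drifting apart.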
 
4. Get as much data out there as possible
Governments should share all data that are collected with tax money and can benefit their citizens. Two key reasons stand against sharing data: privacy and national security. Those two are very valid arguments, and data have to be vetted carefully to be released without unreasonable (!) risks to either. (We need to keep in mind, though, that in today's world of databases and social media, privacy often cannot be guaranteed; instead, the risk of identification needs to be managed.) However, experience shows that governments often use both reasons as an excuse to avoid having to share data. Finally, governments need to avoid sharing only selected data that fit their political agenda or only "toy datasets" that are of limited use, and then claiming openness.
 
5. Don't start from scratch
There are several powerful platforms that simplify the process of publishing data online, including open source and commercial ones. In many cases, this should eliminate the need for custom solutions. Present at the conference were Socrata, CKAN, BuzzData, Microsoft Open Data Initiative, Junar and the new Open Government Platform (aka "Data.gov in a box"). In addition, there are providers like Datamarket.com and Knoema that provide white-label platforms for data publishers. These platforms can provide significant savings in investment and cost, but choosing the right platform requires careful consideration (more on that later).
 
6. Engage with data users
Governments should maximize the impact of open data. To that end, they need to engage with users to encourage and promote the use of their data. Starting this process is significantly easier if there already exists a vibrant entrepreneurial and developer community in the country. Governments should also engage with data users to get feedback on release priorities, and to discuss data quality and possibilities for improvement. Conversely, data users should never be afraid to ask for data, be persistent (tenacious?) in trying to get access, and provide feedback on data use cases and quality issues. Launching an open data platform is the easy part; it's much harder to make it sustainable and create an ecosystem around it (see related post here).
 
7. Make the data discoverable and linkable
A key problem in the open data field is the discoverability of data across platforms (it better be pretty straightforward to find data within one platform). To that end, data publishers should keep in mind that machines are one of the key audiences for the data. Schema.org and other standards can enable their data to show up properly in search results. In addition, there is a need for interoperability and linkage of data, which makes standards for open (government) data essential. David Eaves compared these standards with the shipping containers that revolutionized and scaled global trade; similarly, standards help make open data efforts scalable. However, cumbersome standards can also be like a straitjacket that hampers progress. Standards development should be very targeted to drive adoption, then expand from there (you can't have breadth and depth in the beginning).
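
To make the "machines are an audience" point concrete, here is a minimal Python sketch that builds a schema.org `Dataset` description as JSON-LD. The `Dataset` and `DataDownload` types and the properties used here are part of the schema.org vocabulary; the helper function name and every value passed to it are placeholders made up for this example.

```python
import json


def dataset_jsonld(name, description, license_url, download_url, encoding_format):
    """Build a schema.org Dataset description as a JSON-LD string."""
    record = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "license": license_url,
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }
    return json.dumps(record, indent=2)


# All values below are placeholders, not a real dataset.
markup = dataset_jsonld(
    name="Public clinics by county",
    description="Number of public clinics open at year end, by county.",
    license_url="https://example.org/open-license",
    download_url="https://example.org/data/clinics.csv",
    encoding_format="text/csv",
)
print(markup)
```

The resulting string would typically be embedded in the dataset's landing page inside a script element of type `application/ld+json`, so that crawlers can read the description without parsing the page layout.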
 
8. Focus on local circumstances
There is a danger for governments to fall victim to geek dazzle (going for the big shiny new toy) or - in the case of developing countries - donor dazzle (doing what donors want). Instead, they should focus on their specific circumstances and implement a pragmatic solution. Especially in countries where there are inequalities in terms of access to technology and funding, open data can increase the digital divide by allowing an advantaged few to make better use of the data for higher gain (information/data is power!). In particular with regard to databases that (help) establish rights (e.g. land ownership), data releases should wait for full due diligence.
 
9. Measure the success of open data
The impact of open data should not (just) be measured by the number of datasets shared (although that's a helpful indicator as well). The best validation of success of data publishing is data use. Measuring whether and how the data are used is hard. There are anecdotes about the success and usefulness of open data, but very little in terms of systematic or quantitative measurements. The few numbers mentioned (for the US) include $100B for economic activity generated by GPS and weather data, and a potential of $350-400B in impact by opening up health data (based on a McKinsey study).
 
10. Identify data gaps
Governments have to have data to be able to share them. Capturing and highlighting gaps in existing data helps direct efforts to fill those gaps. This is another reason to engage with the entrepreneurial community; in the absence of useful data, entrepreneurs and developers may start (or already have started) their own data collection efforts, e.g. via crowdsourcing.
 
Let me know what other pieces need to be part of this list. I'd be happy to edit and expand.

Nuggets of wisdom from the International Open Government Data Conference

on Tue, 07/10/2012 - 16:15

The first day of the International Open Government Data Conference with over 700 registered attendees brought a wealth of insights, information, and social media activity around open government data. Keynotes by World Bank President Jim Kim (in fact his first public speech as World Bank President), World Bank Managing Director Caroline Anstey and US CIO Steven VanRoekel highlighted the importance, transformative power, and economic impact of open data (more on the economic impact in a separate post). A whirlwind overview of insights from 29 virtual lightning talks provided a lot of context and insight. And presentations by open data advocates and implementers from Brazil, Kenya, Mexico, Moldova, Senegal, UK, and the US brought information from the trenches of implementing open data initiatives and platforms:

  • It's vital to involve all stakeholders and data owners in the planning of an open data initiative
  • Launching a platform is easy; the real work starts with making it sustainable and creating an ecosystem around the data (also see related post from earlier today)
  • Open data folks need to involve and engage data users and citizens to encourage use of the data
  • Coders (as key data users) should be involved early on in the process
  • Open data should be accompanied by open source software (e.g. Open Government Platform, aka "Data.gov in a box")
  • Standardization is vital for open data, but can also be a cumbersome straitjacket, so it needs to be introduced carefully

Below is a selection of more nuggets from the Twitter-maniacs in the audience (#IOGDC actually trended in the US in the morning). If you missed the conversation this morning, read on:

Creating an ecosystem around open government data

on Tue, 07/10/2012 - 08:41
The launch of Open Kenya has been much discussed and written about (e.g. here, here and here). In the panel discussion about "Putting data to work", Al Kags (@alkags) pointed out that the real work happens after the launch of an open data platform. Speaking for the Open Institute and the government of Kenya, Al just outlined his lessons learned for creating an ecosystem that makes use of the data:
  1. Build community catalysts (via websites/ communities)
  2. Build skills (boot camps, master classes, university classes)
  3. Embed change agents (inside media houses, codeforafrica and related organizations)
  4. Rapid prototyping (e.g. in incubator spaces; Kenya has 6 incl. iHub)
  5. Create proof of concept (challenge, seed funding)
  6. Scale success (venture funds)

For the success of open government data initiatives, other panelists in the "Making Data Work" panel from Moldova, Mexico, Brazil and the US agreed that it's key to bring together representatives from different ministries and agencies to make data available. But just as critical is the involvement of other stakeholders - and citizens in particular - to make use of the data and enable sustainability of the effort.

Health Data Innovation at the International Open Government Data Conference

on Mon, 07/09/2012 - 16:25

This week, fans of open government data from all over the world are converging on Washington, DC to attend the International Open Government Data Conference, organized by Data.gov, the World Bank and the Open Development Technology Alliance. I will be there and look at health data and innovation in particular. Registration is closed, but here are more ways to listen in or participate:

For all of you interested in health data innovation, here are some thoughts on open data, health data and innovation as casual prep reading for IOGDC. Open data - and open government data in particular - are fundamental enablers of innovation. However, opening up health data with sensitive information about individuals has a few fundamental differences compared to other government data. In particular, privacy and de-identification, control over the data, and data linkage need to be addressed carefully. In addition, stimulating innovation requires facilitating and encouraging the creative use of data. This suggests a number of considerations for data owners / holders as they open up their data:

Open data 

  • Share as much data as you can: owners / holders of health related data have a moral obligation to share data (responsibly!) because health-related data can be used to improve health and save lives.  More about what 'sharing responsibly' means down in the 'health data' section. The Open Data movement and the related national and sub-national open data sites like Data.gov are absolutely key to health data innovation.
  • Make the data easy to find: posting the data on organizational websites is a good start (even better if the site is search engine optimized), but to reach broader audiences, the data should also be posted on open data sites, data catalogs, data repositories and data markets. 
  • Make the data as open as possible: Tim Berners-Lee suggested a five-star scheme to rank openness: making data available (*) - providing structured data (**) - using non-proprietary formats (***) - using URIs to identify data (****) - linking data to other data (*****)
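
The five-star scheme in the last bullet can be sketched as a simple checklist in Python. The `Release` class and its field names are invented for this example, and treating the levels as cumulative (a star only counts if every lower star is earned) is my reading of the scheme, not something spelled out above.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical description of a published dataset."""
    available: bool        # published at all
    structured: bool       # machine-readable structure, e.g. a spreadsheet
    non_proprietary: bool  # open format such as CSV
    uses_uris: bool        # things in the data are identified by URIs
    linked: bool           # links to other data


def stars(release):
    """Count stars; each level only counts if all lower levels are met."""
    levels = [
        release.available,
        release.structured,
        release.non_proprietary,
        release.uses_uris,
        release.linked,
    ]
    count = 0
    for met in levels:
        if not met:
            break
        count += 1
    return count


scanned_pdf = Release(True, False, False, False, False)
csv_file = Release(True, True, True, False, False)
print(stars(scanned_pdf), stars(csv_file))  # prints "1 3"
```

A checklist like this is mainly useful as a quick self-assessment before a release; it says nothing about data quality or documentation.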

Health data

  • Manage privacy risk with proper de-identification: With data about individuals, there is no guarantee of complete privacy. Huge marketing, social media, and other databases, as well as improving techniques of probabilistic linkage make it easier to identify individuals in de-identified datasets. Hence, sharing de-identified data means managing the risk of identification. Since many data owners equate lowering the risk of identification with providing less detail (e.g. providing county or even state instead of postal code of residence), the shared data become less useful. However, there are increasingly sophisticated methods to help data sharers manage the risk of identification.
  • Put individuals in control of their data: it is hard to say who actually owns healthcare data, but as Fred Trotter over at O'Reilly Radar points out, it's really about who controls access to the data. For healthcare data, there is a lot of fine print that patients sign off on (and then there is legislation like HIPAA in the US), which put most control in the hands of the provider. For survey and study data, participants' consent provides clear provisions about how the data can be used. However, consent often doesn't include provisions for broader data sharing and open data. Consent and privacy statements should anticipate opening the data; in addition, individuals/patients should have access to their health data (e.g. via a Blue Button, used by the VA and now many other providers) so they can take the initiative and share their data.
  • Enable linking all data about a patient: For real health data innovation, data about an individual (data from different providers, family history, lifestyle information, location data) need to be available in combination. Some research databases already provide access to linked data, but we need better ways of pulling together and opening patients' data to allow research, analysis and innovation (again, with proper privacy safeguards and patient consent).
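
The trade-off described in the de-identification bullet above (less geographic detail, lower risk of identification) can be illustrated with a small Python sketch. It counts how many records share the same combination of attributes, which is the basic idea behind k-anonymity, a technique this post does not name and that I am using here only as an example. The records and the `smallest_group` helper are invented; real de-identification involves far more than this one check.

```python
from collections import Counter

# Invented records: (postal code, county, birth year). Not real data.
RECORDS = [
    ("20001", "North", 1950),
    ("20002", "North", 1950),
    ("20003", "North", 1950),
    ("30001", "South", 1982),
    ("30002", "South", 1982),
]


def smallest_group(records, key):
    """Return the size of the smallest group of records sharing the same key.

    A group of size 1 means one person is unique on those attributes.
    """
    counts = Counter(key(record) for record in records)
    return min(counts.values())


# Detailed release: postal code plus birth year.
detailed = smallest_group(RECORDS, key=lambda r: (r[0], r[2]))
# Generalized release: county plus birth year.
generalized = smallest_group(RECORDS, key=lambda r: (r[1], r[2]))
print(detailed, generalized)  # prints "1 2"
```

In this toy data, releasing postal codes leaves every person unique, while releasing only the county puts each person in a group of at least two. The price is exactly the loss of detail the bullet describes.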

Innovation based on health data

  • Make the data ridiculously easy to use: data that are being shared should be cleaned, well documented (documentation about data collection and any subsequent data editing/prepping), properly labeled, and provided with all relevant metadata (ideally following standards like DDI or SDMX-HD). In addition, a lot of innovation comes from entrepreneurs, developers and other folks outside the health & healthcare fields, who may need additional guidance about proper use of the data.
  • Target all relevant audiences: different data users will have different preferences in terms of data formats, granularity (microdata, indicator data, individual data points), and type of access (query tools, dataset download, API). Offering different options will increase the potential uses but also increase cost/resources needed. In addition, data visualizations based on the data can help stimulate interest, insights and innovation.
  • Don't make people find data. Make data find the people. (open data law that Tim O'Reilly has been sharing for years): as US CTO Todd Park puts it, you have to "market the hell out of your data" to stimulate innovation, via communication, marketing, and events that stimulate use (e.g. app or visualization challenges, hackathons, datapaloozas). Done right, entire data ecosystems can emerge from open data, as evidenced by the successful examples of weather data, GPS data, and - more recently - the great progress around health data initiated by the energetic HealthData.gov team.

The next three days at the International Open Government Data Conference should bring a wealth of new insights around these and additional topics. Stay tuned for updates, and ping me via Twitter if you'd like to discuss or meet up at the conference.
