AVERAGE PATH LENGTH is the average shortest path, or number of ‘hops’, from that node to every other node in the network.
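As a minimal sketch of this metric, the Python below computes one node's average shortest path length with a breadth-first search. The toy network, the node names, and the function name `average_path_length` are hypothetical. The sketch assumes links are directed and that nodes which cannot be reached are left out of the average; the source does not say how those cases were handled.

```python
from collections import deque

def average_path_length(graph, source):
    """Mean number of hops from `source` to every node reachable from it."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in hops:
                # First visit in a breadth-first search is the shortest path.
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    others = [d for n, d in hops.items() if n != source]
    # Assumption: a node that reaches nothing has no defined average.
    return sum(others) / len(others) if others else float("nan")

# Hypothetical toy network: keys are nodes, values are the nodes they link to.
toy_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

print(average_path_length(toy_graph, "A"))
```

In this toy network, A reaches B and C in one hop and D in two, so its average is 4/3.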
CONNECTIVITY is the total number of incoming and outgoing links.
STRENGTH ASYMMETRY is the relative difference in median strength of all outgoing vs incoming links – (Log (median strength incoming links / median strength outgoing links))
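The formula above can be sketched in Python as follows. The function name and the sample strengths are hypothetical. The source does not state the base of the logarithm, so the natural log is assumed here, and the node is assumed to have at least one incoming and one outgoing link with a non-zero median strength.

```python
import math
from statistics import median

def strength_asymmetry(incoming_strengths, outgoing_strengths):
    """log(median incoming strength / median outgoing strength).

    Assumption: natural log; the source only says "Log".
    """
    return math.log(median(incoming_strengths) / median(outgoing_strengths))

# Hypothetical link strengths for one node (each between 0 and 1).
incoming = [0.2, 0.4, 0.6]   # median 0.4
outgoing = [0.1, 0.2, 0.8]   # median 0.2
asymmetry = strength_asymmetry(incoming, outgoing)
print(asymmetry)
```

With the ratio written this way, a positive value means the node's incoming links are typically stronger than its outgoing links, and zero means the two medians are equal.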
Link thickness reflects link ‘strength’, defined here as the number of votes for a ‘strong direct link’ divided by the number of people who viewed that node pair. (e.g., if 10 people looked at a node pair and 5 drew a link, strength = 0.5)
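The strength definition above is a simple ratio. A minimal Python sketch, with a hypothetical function name and the numbers taken from the example in the text:

```python
def link_strength(votes_for_link, viewers):
    """Share of the people who viewed a node pair who drew a strong direct link."""
    if viewers <= 0:
        raise ValueError("at least one person must have viewed the node pair")
    return votes_for_link / viewers

# The example from the text: 10 people viewed the pair and 5 drew a link.
example = link_strength(5, 10)
print(example)
```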
LINK ASYMMETRY is the relative difference in outgoing vs incoming links – (Log (# outgoing links / # incoming links))
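As a rough sketch, link asymmetry and the connectivity count defined earlier can both be derived from a list of directed links. The link list, node names, and function names below are hypothetical. The natural log is assumed because the source does not give a base, and the asymmetry is only defined for a node with at least one incoming and one outgoing link.

```python
import math

def link_counts(links, node):
    """Return (number of outgoing links, number of incoming links) for `node`."""
    outgoing = sum(1 for src, _ in links if src == node)
    incoming = sum(1 for _, dst in links if dst == node)
    return outgoing, incoming

def link_asymmetry(links, node):
    """log(# outgoing links / # incoming links); assumes both counts are non-zero."""
    outgoing, incoming = link_counts(links, node)
    return math.log(outgoing / incoming)

def connectivity(links, node):
    """Total number of incoming and outgoing links."""
    return sum(link_counts(links, node))

# Hypothetical directed links, written as (source, target) pairs.
toy_links = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "A"), ("C", "D")]

print(link_asymmetry(toy_links, "A"), connectivity(toy_links, "A"))
```

Here node A has three outgoing links and one incoming link, so its asymmetry is log(3) and its connectivity is 4. A positive asymmetry marks a node that influences more challenges than it is influenced by.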
Tools to Anonymize Personal Data
For example, the ability to anonymize our personal data has a positive direct influence on many other challenges, such as the number of people who will voluntarily share personal data and the ability to catalyze a critical mass of community engagement. If many people who have asthma attacks voluntarily share those events with the public, the broad patterns could help identify geographic outbreaks related to air quality conditions. More people would participate if they knew their personal identity was stripped from the event so that they would not be at risk of being denied health care coverage.
WE ARE DATA
How can #WeTheData benefit from (and avoid being harmed by) the explosion of data we generate every day?
We used an ecological network approach, developed by Vibrant Data Labs, to make sense of this messy problem and identify Grand Challenges for catalyzing positive change.
“A problem well defined is a problem half-solved.”
– John Dewey
“We're mapping collective understanding of the problem and using the network structure to spark creative solutions where they're most needed.”
– Eric Berlow, Ph.D., Ecologist | Complexity Scientist | Founder, Vibrant Data Labs
Economic Opportunity
Human Health and Wellness
Civil and Political Rights
Environmental Sustainability
Science, Education, and Human Knowledge
ability to accrue personal value from offering personal data
flexible allocation of costs
ability to be locally relevant
ease with which micro-entrepreneurs can enter the market
ability to protect against malicious uses
increased efficiency and effectiveness of public services
visibility of small success stories and examples of value from data
direct utility of open data to those providing it
personal accountability
direct utility of tools and platforms to those using them
ability to detect/self-correct unintended consequences of Vdat
transparency and accountability of large institutions
ability of everyday people to monetize dormant skills and assets
development of a marketplace for data, analytics, and other data services
transparency and accountability of information providers
degree of platform openness (copy and modify)
reputation system that engenders trust among participants
system of rewards for participation that are sensitive to context
degree to which the participants have a shared problem
ease of editing / adding to a dataset
ability of tools to be highly customizable
proportion of the population motivated by social rewards
ability to convert data into action
ability to collaboratively co-create
ability to collaboratively analyze and share insights about data
ability to make informed decisions based on data
ability to collaboratively improve platforms
formation of communities of shared interest or action around data
ability to catalyze a critical mass of community engagement
proportion of the population that is functionally data literate
ability to easily manipulate data granularity
reduction in cost of computation
computational power of small devices
ability to use data to make predictions
proportion of public that can critically evaluate conclusions drawn from data
ability to easily find patterns across multiple data streams
ability to intuitively explore and answer questions with data
proportion of the public using data to inform daily life decisions
ability to fact check online information
ease of access to novel statistical methods of pattern discovery
ability to see broad trends and place ourselves in context
ability to check and validate data quality
total number of people offering data viz and analysis tools/services
availability of automatic language translation tools
degree to which UI design enables participation by diverse groups
degree to which underlying technologies and data are invisible
ease to users of managing personal data access permissions
legal/policy framework for personal digital rights management
legal framework for accessing/sharing copyright protected data
proportion of data from large institutions that is accessible
concentration of data and access to data in few corporations
tools to anonymize sensitive data
ability to control data access permissions
proportion of population with access to information infrastructure
wireless connectivity and bandwidth
accessibility of stored open data
ability to integrate info from multiple sources (e.g. spatial data in real time)
tools for seeing our own 'personal data exhaust'
ease of discovering datasets
number of people with easy access to data
proportion of real time data accessible by mobile phone
proportion of online info and media protected from censorship
ability to see under the hood of tools to avoid oversimplification
inter-connectivity of mobile apps
incentives for large institutions to open up data for social good
degree to which large institutions don't have anything to hide
ease with which everyday people can share data
ease of adding to / modifying metadata
more coherent data standards
reduction in cost of networking and storage
data self description
ability to automate interoperability of different datasets
accuracy and reliability of real time info with respect to purpose
ability to collaboratively clean and filter data
ability to share, edit, and update public data
ability and ease of cleaning of data
level of clarity, simplicity, and utility of data created by sensors
availability of suitable sensors
amount of personal data voluntarily shared
ability of people and institutions to be real time sensor networks
total number of people contributing data
ease with which everyday people can collect/create data
Higher Goals
Value Derived
Action & Collaboration
Analysis
Access & Circulation
Organization
Creation
How we nourish higher goals like broader economic opportunity, civil and political rights, human health, etc
How we derive value from data
How we turn that meaning into action and catalyze communities
How we analyze and discover meaning in data
How easy they are to access and share, and how we control access permissions
How they are stored and organized
How data are created
~90 challenges and >3,500>1700 links identified by the community!
availability of automatic language translation tools
Reaching people across geographical, cultural and linguistic boundaries requires translation. For example, a large proportion of basic information on the web is not accessible to Arabic speakers. Sharing and leveraging open data will be limited if there are no tools that enable translation. Beyond direct linguistic translation, this may include cultural differences, such as which units are used for data.
more coherent data standards
For disparate data sets to be able to “talk to each other”, to be used in the same analysis, and be used to drive new discovery, they must be combinable. That requires standardization. Data standards have traditionally been very difficult to implement because people are always coming up with new data-types. Open, flexible standards are essential to enable truly Vibrant data.
degree to which UI design enables participation by diverse groups
As basic performance capabilities - computing power, networking, even the power of analytics tools - increase, the world will see a shift in emphasis from simple performance to a better design of that performance to fit the needs of real people. UI design is an important element in making the power of technology fit with the ways of learning, the language, the visual skills and other characteristics of real people.
proportion of the population that is functionally data literate
Refers to the proportion of people who understand enough about data and analysis to derive real value from them. Finding patterns and meaning in data will be simplified with the development of new analysis and visualization tools. The proportion of people who are “data literate” will drive the rate of development of tools that make it easier for people without a lot of specialized training to analyze data for meaning.
degree to which underlying technologies and data are invisible
Some may want to explore all that is possible with a given tool, while others may want to just use the basics. Design plays a key role in enabling novices to immediately understand the use of a given technology or data set, while enabling deeper explorations over time. By artfully hiding or exposing details and functionality, designers will enable more people to share and use data in ways that suit them.
ability to easily manipulate data granularity
Tools that enable the manipulation of data granularity enable people with different skills and interests to look at common data sets in their own way. Sometimes, it is important to dig into the details. Sometimes, it is more appropriate to just have a “bird's eye view”. The more our data analysis tools can accommodate both ways of looking (and those in between) the more likely that more people will find value in them.
ease to users of managing personal data access permissions
Another issue refers to the importance of putting flexible tools for assigning data access permissions in the hands of people, not large institutions (reference “ability to control data access permissions”). These tools must involve a minimum of effort and management overhead on the part of individuals and be largely automated, otherwise they will be too complicated to be useful.
level of clarity, simplicity, and utility of data created by sensors
Many sensors today gather data in ways that are either proprietary or idiosyncratic. We need new methods for sensor data to be made more easily accessible and understandable from the very moment it is first created. This may be through improvements of devices themselves or through methods that automatically format and make available data by way of post-processing.
degree of platform openness (copy and modify)
Openness here is defined as free to copy and change. Open source has been a major enabler for many organizations to collaborate, share and leverage expertise. It enables innovation and local cultural relevancy as people adapt tools and technologies to local needs. Similarly, by allowing the open sharing of data and (appropriate) modification of information such as meta-tags, the utility of a database can be improved tremendously.
legal/policy framework for personal digital rights management
People will not be willing to openly share their personal data unless they have assurances that their privacy will be protected. Current privacy agreements - implemented by corporations and institutions - do little to assure people of the protection of their privacy. Legal and policy frameworks that return control over privacy to individuals, along with technology tools to help individuals manage those protections, will be vital for data vibrancy.
legal framework for accessing/sharing copyright protected data
Tight legal restrictions limit people's ability to work with and build on the data of others. Too often, such restrictions favor powerful enterprises, who “lock down” access. Without any restrictions, however, people may not be willing to share personal data for fear of violations of privacy. Legal frameworks require a balance between openness and individual protections, without those protections becoming a tool for abuse by powerful interests.
proportion of data from large institutions that is accessible
While data is much more easily shared, laws, institutional power, access to technology and other resources can create barriers to openness and transparency. Even in highly democratic societies, large firms and institutions have the means to gather information about individuals (for advertising, or surveillance). Real data vibrancy will only occur when this one-sided approach to data gathering and possession gives way to much broader circulation of data.
reputation system that engenders trust among participants
People who remotely collaborate need tools to assess the trustworthiness of others they share data with. Reputation systems have been developing online, and many firms are working on generalized reputation scores that apply across domains. But beyond simple scores, people will need other tools, including assurances about recourse in the event that trust is breached, in order to feel safe sharing information broadly.
concentration of data and access to data in few corporations
Large, consumer-facing companies guard consumer data they collect. On the one hand, it means that corporations might protect those data from abuse. On the other hand, this hoarding of data reduces the chances for people to discover unexpected new meaning and value.
tools to anonymize sensitive data
Tight legal restrictions limit people’s ability to work with and build on the data of others. Too often, such restrictions favor powerful enterprises, who “lock down” access. Without any restrictions, however, people may not be willing to share personal data for fear of violations of privacy. Legal frameworks require a balance between openness and individual protections, without those protections becoming a tool for abuse by powerful interests.
ease of adding to / modifying metadata
Metadata is information about data - descriptions of context, characterizations, labels, all ways that help people know what a data set or element is, and how it might be used. Metadata is a key component to make data sets more compatible and useful across boundaries. Therefore, we need open systems that allow metadata to be modified appropriately.
ability to control data access permissions
As data access becomes more open, people will need the means of setting personal data permissions. Current data permissions systems (e.g., those provided by online services or corporations with regard to personal data) are not nearly flexible or powerful enough, and generally favor the corporation as “owner” of the data. This will have to change for people to willingly share and circulate their data in a more open way.
availability of suitable sensors
Most smart phones have many sensors on them - GPS, accelerometers, cameras, microphones, etc. A more vibrant ecology of data creation will depend on new types of affordable mobile sensors to collect data in new ways. For instance, after the 2011 earthquake in Japan, small mobile radiation sensors deployed by citizens were critical to mapping radiation plumes around the Fukushima power plant.
reduction in cost of computation
Since the late 1960s computing performance has continuously doubled in power every 18 months, while the cost per unit of computing has dropped precipitously. This is a factor that will continue to fuel the growth of open and vibrant data exchange, as more and more people can ultimately afford to access computing power - and thus, digital technology.
reduction in cost of networking and storage
Network access in more remote - and usually poorer - parts of the world remains an acute issue. As with the cost of computing, digital storage technologies continue to decline in price and increase in power. Solid state storage devices promise to provide a more robust option for small devices, while new, open standards for servers have created a proliferation of storage “in the cloud”.
computational power of small devices
Since the late 1960s computing performance has continuously doubled in power every 18 months, while the cost per unit of computing has dropped precipitously. This is a factor that will continue to fuel the growth of open and vibrant data exchange, as more and more small devices become increasingly powerful computers. Right now, an average smartphone has more computing power within it than was used in the entire Apollo moon landing.
proportion of population with access to information infrastructure
For people to openly exchange data and build on each other's insights, a very basic ingredient is simply access to technology and digital data. “Cloud” based storage, smart phones, wireless networking technologies and many other factors have made this a reality for an increasing number of people, but remote and poorer regions of the world still lack that basic access. As access increases, so does vibrancy.
wireless connectivity and bandwidth
For vdat to become real, wireless connectivity needs to be extended to the underserved people of the world. Amid the Arab Spring uprisings, governments cut off carrier services to prevent people from organizing. New tools, like peer-to-peer connectivity of mobile devices, could be one way of ensuring connectivity for all.