AI is only as good as the data it is built on
The foundation for any application of artificial intelligence is data – and the current thinking is that the more, the better. Many companies are trying to advance the field of AI in the hope of revolutionising our lives and businesses.
But any capability of AI is only as good as the data it is built on.
We live in a world where data is a closely guarded resource. As a result, much of the innovation in AI has come from large companies which control vast amounts of data. For example, Google has mapped and photographed millions of miles of streets all over the world for the Street View tool on Google Maps, and can use this data to create AI-enabled directions features.
One factor that has allowed Google to achieve leaps forward in AI research is breaking down its internal data silos. In 2012, the company standardised data collection across many different applications so that the data it held could be shared across different parts of the organisation and used in AI projects to improve its tools.
Collecting unique data
Smaller companies cannot behave like the tech giants because they simply do not have enough data, or the money to send cars around the world taking photos of every street or to map the entire surface of the earth. For an early-stage tech company looking to advance the field of artificial intelligence, the quantity of data held by other companies is often out of reach. Instead, these companies often look to collect unique data.
But they also fear sharing or exchanging their own data, as ownership of unique data is a key asset. This has resulted in a fragmented industry for the smaller players.
To overcome some of the obstacles associated with fragmentation, we need to think in a more coordinated and open way about how we can share data and so make room for more innovation. Data which follows clear, defined standards provides greater transparency, and helps to coordinate data structures for future collaboration and interoperation of products. In the field of AI, this means data sets can be created, shared, and combined to drive innovation.
The communication protocols which power the Internet have been among the most powerful open standards of modern times. The level of technological development we see today would hardly be possible without this global information-sharing resource. As we move into the next decade, open standards around data in other areas will accelerate innovation in the next generation of cutting-edge technologies.
There are many different applications of AI where this could be relevant. At BIOS, for example, our work centres on encoding and decoding signals between the brain and body to treat chronic health conditions. The nervous system carries an enormous amount of data back and forth within the body, so we use AI to identify the signals that matter for treating diseases.
Researchers understand the need to share data and standards in order to increase the pace of scientific and technological advancement in this complex field. For example, the Allen Institute for Brain Science are committed to openly sharing their research data on the brain in various mammals to enable breakthrough discoveries in the research of others.
But researchers are not the only ones collecting data in this area, and it’s time for companies to start sharing in the same way. BIOS are committed to contributing to the development of open standards, for example through collaboration with IEEE to produce the recently published Neurotechnologies for Brain-Machine Interface Standards Roadmap. Standards will help those developing brain-machine interface technologies to work together cohesively.
We believe this will enable the industry to move on from purely scientific discussions of neurotechnology to actually applying it in the real world, paving the way for faster adoption of both our own technology and collaborative efforts across the industry.
Open data shouldn’t equal lack of privacy
Open standards ultimately increase the accessibility, availability, and affordability of these technologies. To be clear, this is not to say that personal data should be shared freely. But we believe that open standards which build in measures for privacy and responsible use of data enable the community to advance more quickly, helping researchers and collaborators and ultimately resulting in improved treatments for patients.
We would encourage all early-stage tech companies looking to develop AI capabilities to engage with open standards in their industry and play their part in ensuring we have the open standards necessary for accelerating next-generation technologies.
This year marked the tenth annual International Open Data Day, which featured local events around the world exploring the benefits of open data in communities. This initiative should help companies seeking to harness the power of AI to consider their own role and responsibility in data sharing, and in the open standards which will enable it.