What is a Strategic Data Lake?

Sovereign states, or unions of states, could consider establishing strategic data lakes as part of their overall AI policy. These Strategic Data Lakes (SDLs) may be either centralised or distributed, the latter being virtual data lakes created through federated learning or other privacy-enhancing techniques. Domestic entities and appropriately authorised international actors may draw upon the data in these lakes to advance responsible AI innovation. SDLs may also be deployed as a mechanism of value and exchange, for example to secure access for the nation’s entities to data lakes curated in other jurisdictions, or to build International Strategic Data Lakes (ISDLs) in cooperation with trusted international allies. The provision of data to SDLs may be encouraged through soft nudges and voluntary mechanisms, or mandated by regulation where the SDL is likely to enhance the public good (such as improving safety and environmental outcomes in national industry). A mandatory approach to data mirrors existing practices within the country’s resource sectors, where information becomes ‘open file’ and accessible to all in certain circumstances.

The creation of SDLs not only symbolises a shift towards a data-driven economy but also represents an investment in the intellectual capital of the nation or union. It empowers entities to generate value from a country’s data and helps to grow, attract, and retain AI skills within the nation or union while advancing the frontiers of AI.