Worldwide Dataset (Free to Access) Captures Earth in Finest Ever Detail
The free dataset, WorldStrat, will be presented at the NeurIPS 2022 conference in New Orleans. It includes nearly 10,000km² of free satellite images, showing every type of location, urban area and land use from agriculture, grasslands and forests to cities of every size and polar ice caps.
The dataset includes locations in the Global South and those needing humanitarian aid, which are often underrepresented in satellite imagery because such imagery is usually collected for commercial gain and therefore disproportionately features wealthier regions.
The scientists, from UCL and Oxford University, say the collection enables worldwide analysis of terrain to tackle global challenges such as responding to natural and man-made disasters, managing natural resources and urban planning.
Work on WorldStrat began in 2021, and since it launched in June 2022 it has been downloaded over 3,000 times.
Project lead Dr Julien Cornebise (UCL Computer Science) said: “The combination of high-resolution commercial imagery and machine learning has huge potential to enable planetwide analyses, which could help to tackle all kinds of global challenges – the problem is that commercial data are often locked behind a paywall.
“ESA’s TPM programme made our project possible by providing free access to data that would normally be very expensive.”
The team used data from the Airbus SPOT 6 and SPOT 7 satellites, commissioned by the ESA and launched in 2012 and 2014 respectively. The satellites can provide imagery at resolutions as high as 1.5m per pixel, meaning that each pixel represents a 1.5m by 1.5m area on the ground.
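To make the resolution figure concrete, here is a minimal Python sketch of the arithmetic behind "1.5m per pixel". It assumes square pixels, and the function names are illustrative; they are not part of the WorldStrat toolbox.

```python
def pixel_area_m2(resolution_m: float) -> float:
    """Ground area covered by one square pixel, in square metres."""
    return resolution_m * resolution_m


def pixels_per_km2(resolution_m: float) -> float:
    """Number of pixels needed to cover one square kilometre."""
    return 1_000_000 / pixel_area_m2(resolution_m)


SPOT_RESOLUTION_M = 1.5  # highest SPOT 6/7 resolution quoted in the article

print(pixel_area_m2(SPOT_RESOLUTION_M))          # 2.25
print(round(pixels_per_km2(SPOT_RESOLUTION_M)))  # 444444
```

At this resolution, a single square kilometre of ground corresponds to roughly 444,000 pixels.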
The scientists used around 4,000 highly detailed images from the SPOT satellites. Even though these images have high spatial resolution, they have low temporal resolution, meaning in this context that the satellites do not revisit and recapture each site regularly. This is because images taken by the satellites were originally intended to be used for specific commercial applications rather than longer-term analyses.
To compensate, the team also used freely available, lower-resolution images from the Copernicus Sentinel-2 satellites. These have a higher temporal resolution, meaning each site is recaptured at regular intervals of about five days. The team matched each SPOT image with 16 Sentinel-2 images, using around 64,000 in total.
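The pairing of one high-resolution image with 16 lower-resolution revisits can be sketched in Python as follows. The article does not say how the 16 Sentinel-2 images were chosen, so the "closest in time" rule, the function name, and the dates below are all hypothetical and serve only to illustrate the idea.

```python
from datetime import date, timedelta

REVISIT_DAYS = 5           # Sentinel-2 revisit interval quoted in the article
LOW_RES_PER_HIGH_RES = 16  # Sentinel-2 images paired with each SPOT image


def nearest_revisits(spot_date, revisit_dates, k=LOW_RES_PER_HIGH_RES):
    """Return the k revisit dates closest in time to spot_date, in date order.

    Hypothetical selection rule, used for illustration only.
    """
    closest = sorted(revisit_dates, key=lambda d: abs((d - spot_date).days))[:k]
    return sorted(closest)


# Hypothetical schedule: one revisit every five days, starting 1 January 2021.
start = date(2021, 1, 1)
revisits = [start + timedelta(days=REVISIT_DAYS * i) for i in range(73)]
spot_date = date(2021, 6, 15)  # hypothetical SPOT acquisition date

paired = nearest_revisits(spot_date, revisits)
print(len(paired))                   # 16
print(4_000 * LOW_RES_PER_HIGH_RES)  # 64000
```

The last line reproduces the article's total: around 4,000 SPOT images, each matched with 16 Sentinel-2 images, gives around 64,000 lower-resolution images.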
The researchers also designed the dataset to support machine learning applications that extend and enhance it, for example by further improving the image resolution. To allow the development of further applications, the scientists provide an artificial intelligence toolbox along with the full source code, enabling developers to reproduce, extend and transform the work.
Dr Cornebise continued: “Thousands of data users from around the world have already downloaded WorldStrat – and we look forward to seeing the ways in which they extend and improve it, using machine learning techniques.”
The project was supported by ESA’s Phi-Lab as part of the ESA-funded QueryPlanet project. The work was carried out by Dr Julien Cornebise with Ivan Orsolic and Dr Freddie Kalaitzis (Oxford University).
Direct to WorldStrat Dataset
About Gary Price
Gary Price (firstname.lastname@example.org) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.