A new paper about cities 🏙️, their structure 🧩 and satellites 🛰️!
"Decoding (urban) form and function using spatially explicit deep learning" by me and @darribas
https://doi.org/10.1016/j.compenvurbsys.2024.102147
We took spatial signatures and built a neural network to pick them up from Sentinel 2. And since it didn't show great results, we mixed in some geographic data science to learn if and how much it helps.
A quick 🧵 on work that took us 3 years to finish 🙃.
The inclusion of spatial modelling 'on top' of the predictions from the CNN proved to consistently improve the performance of the modelling pipeline, while some other techniques showed occasional success.
In the end, we managed to improve the accuracy of the prediction by almost 0.2 in some cases, bringing the quality of the tricky signature prediction up to the level of common land use or land cover prediction models.
The link to the open access paper is in the first post. The code is available at https://urbangrammarai.xyz/signature_ai/ and the intermediate data at https://doi.org/10.6084/m9.figshare.26203367.
@martinfleis Awesome, this seems like a good thread to ask geo x ai questions 🤓
I always wondered why it's not more popular to take multiple scenes as inputs as in: video model or a transformer with positional embeddings encoding a sequence of scenes.
In addition I expected positional encodings of time and space so that the model can learn: this chip is from North America in December, so e.g. I can expect snow in the image.
Most earth observation models still seem to work on image data only?
@djh For sure. I share this annoyance. Also:
– Not even trying to understand miltispectral bands.
– Sample-normalizing calibrated, linear data, thereby losing information.
– Augmenting in ways that might make sense for snapshots but do not reflect anything that you would actually see in geospatial data.
– Assuming there is little information in fine details because they’re used to Bayer demosaicked images, where that’s interpolated anyway.
@vruba Wait it's not even about transformers.
It's just that I rarely see anyone in the geo space working on remote sensing machine learning projects treating it as a geo task and not simply pretending it's just images.
Examples, like I said: adding positional encoding of time and space:
- e.g. day in the year so we can pick up that snow in winter is reasonable
- e.g. coordinates or at least some sort of quadkey token so we can pick up that snow in winter in Berlin is reasonable