More than for any other post in this blog, the usual disclaimer applies here.
Let’s face it: what Google has implemented with the Earth Engine is very appealing since it is the first solution for Earth Observation data exploitation which concentrates all the open access EO data, the computing resources and the processing algorithms. This is the Remote Sensing Scientist dream. Or is it?
Talks and posters at ESA Living Planet Symposium this week show that an increasing number of people are using GEE to do science. One of the reasons put forward is the possibility of sharing the scripts, so that other people can reproduce the results. This is, in my opinion, an incorrect statement. Let’s have a look at a definition of reproducible research:
An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. —D. Donoho
One important term here is complete. When you use GEE, or any other non free software like Matlab, even if you share your scripts, the core of the algorithms you are using is just a black box which can’t be inspected. Actually, the case of GEE is even worse than the one of non free software running locally. Google could change the implementation of the algorithms and your scripts would yield different results without you being able to identify why. Do you remember the “Climategate”? One of the main conclusions was:
… the reports called on the scientists to avoid any such allegations in the future by taking steps to regain public confidence in their work, for example by opening up access to their supporting data, processing methods and software, and by promptly honouring freedom of information requests.
During one of my presentations at the Living Planet Symposium I decided to warn my fellow remote sensers about the issues with GEE and I put a slide with a provocative title. The room was packed with more than 200 people and somebody tweeted this:
So it seems I was able to get some attention, but a 2-minute slide summarised in a 140 character tweet is not the best medium to start this discussion.
As I said during my presentation, I fully understand why scientists are migrating towards GEE and I don’t blame them. Actually, there is nobody to blame here. Not even Google. But in the same way that, after many years of scientists using non free software and publishing in non open access journals, we should take a step back and reflect together about how we want to do Earth Observation Science in a sustainable (which is the perenniality of GEE?) and really open way.
What I was suggesting in the 3 last bullet points in my slide (which don’t appear in the tweeted picture1) is that we should ask ESA, the European Commission and our national agencies to join efforts to implement the infrastructure where:
- all data is available;
- and every scientist can log in and build and share libre software for doing science.
And this is much cheaper than launching a satellite.
This is not to criticise what the agencies are doing. ESA’s Thematic Exploitation Platforms are a good start. CNES is developing PEPS and Theia which together are a very nice step forward. But I think that a joint effort driven by users’ needs coming from the EO Science community would help. So let’s speak up and proceed in a constructive way.