Python implementation of the Computational Geometry & Data Structures workshop
Part of the VGIscience Summer School 2017 was a workshop on Computational Geometry & Data Structures held by Prof. Dr. Sabine Storandt. Participants were taught a basic yet powerful approach for extracting relevant points from a large set of points.
We first implemented a “dumb” linear scan of the data, inspecting every single point’s spatial relation to a specific area of interest and its data value against a specific threshold. This is obviously a costly and slow approach.
The then presented “Priority Grid” approach instead uses preprocessing and an additional data structure to reduce the number of actually “inspected” points significantly. Rectangular grid cells are draped over the entirety of points and then each point is assigned to the cell enclosing it. Additionally, all the points assigned to a cell are ordered by their data value. While this preprocessing takes its time and results in additional memory usage, later queries are highly optimized: Instead of comparing each point’s coordinates to the area of interest, only those of intersecting cells (which are few) have to be inspected. Additionally, as the points were sorted by value, the lookup within each cell can stop once a data value below the threshold has been seen.
The benefit of a simple preprocessing step to sort the points into a tailored data structure became apparent.
The workshop focussed on an exemplary implementation in C++, but I decided to follow in a Jupyter Notebook as I am most proficient in Python and prefer its simplicity and clarity. While I did not manage to do the full implementation during the workshop, I got the basics done and finished the rest later, adding visualisations and interactive elements (which helped me find a bug in my code).
You can look at a static, non-interactive version of the Jupyter Notebook but please feel encouraged to run it on your own. I left lots of notes and suggestions on potential problems and promising improvements so there is lots of work for practising if you feel inclined.
My repository provides the notebook as well as a plain Python export and a static HTML version.