[Figure: hexagonal lattice rotated, with matching pairs of two regular lattices circled]

Monday, April 8, 2024

Eliminating close gaps between points

One earlier posting (When do point sets intersect (part 1)) touched on Poisson disk sampling (PDS). A grid-based approximation was used there because it served the particular case considered. Many good algorithms exist; e.g. Bridson's algorithm (2007) runs in $\mathcal{O}(d\,|A|)$ time, where $A\subset\mathbb{R}^d$ is a point set dense enough that the Poisson disk radius $r$ is smaller than the mean distance $l_0$ between points: $r < l_0$.
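Bridson's method is not used below, but for reference, a minimal 2D sketch of it might look like the following. The function name, the unit-square domain handling, and the candidate count `k=30` are my own choices, not from the original paper's code:

```python
import math
import random

def bridson_pds(width=1.0, height=1.0, r=0.1, k=30, seed=0):
    """2D Poisson disk sampling via Bridson (2007): a background grid with
    cell size r/sqrt(2) holds at most one sample per cell, so a conflict
    check only needs to scan a constant number of neighbouring cells."""
    rng = random.Random(seed)
    cell = r / math.sqrt(2)
    gw, gh = int(width / cell) + 1, int(height / cell) + 1
    grid = [[None] * gh for _ in range(gw)]

    def grid_idx(p):
        return int(p[0] / cell), int(p[1] / cell)

    def fits(p):
        # Scan the 5x5 block of cells around p for a conflicting sample.
        gx, gy = grid_idx(p)
        for i in range(max(gx - 2, 0), min(gx + 3, gw)):
            for j in range(max(gy - 2, 0), min(gy + 3, gh)):
                q = grid[i][j]
                if q is not None and math.dist(p, q) < r:
                    return False
        return True

    p0 = (rng.uniform(0, width), rng.uniform(0, height))
    samples, active = [p0], [0]
    gx, gy = grid_idx(p0)
    grid[gx][gy] = p0

    while active:
        idx = rng.randrange(len(active))
        px, py = samples[active[idx]]
        for _ in range(k):                     # k candidates in the annulus [r, 2r)
            rad = r * (1 + rng.random())
            ang = 2 * math.pi * rng.random()
            q = (px + rad * math.cos(ang), py + rad * math.sin(ang))
            if 0 <= q[0] < width and 0 <= q[1] < height and fits(q):
                samples.append(q)
                active.append(len(samples) - 1)
                i, j = grid_idx(q)
                grid[i][j] = q
                break
        else:
            active.pop(idx)                    # no candidate fit: retire this point
    return samples
```

The grid lookup makes each candidate test O(1), which is where the linear overall cost comes from.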

PDS is needed in some manifold operations (e.g. for creating a well-behaved kernel or a balanced sampling), and quite fast algorithms exist for metric spaces. The defining property of PDS is that natural-neighbor (NN) samples tend to lie at distance $r$. We are interested in a case where the PDS distance $r < l_0$ is used to remove close points (red circles in Fig. 1), so that the distance histogram becomes truncated (e.g. in order to apply algorithms that are numerically sensitive to small gaps). We also want the samples not to be attracted by local voids, so we fill in some temporary bogus samples (green in Fig. 1).

Fig. 1. The initial set (black dots) is modified by removing close points (red circles) and adding bogus points (green) to free spaces.

The algorithm eliminates some points from the dense areas of the point cloud (PC) $A$ and adds the same number of points to a PC $B$. Each iteration $i$ operates on the set $C_i = A_i \cup B_i$ over steps 1-4. Initially $i := 0$, $A_0 := A$ and $B_0 := \{\}$:

  1. Iteration $i$: Produce a Delaunay triangulation $(C_i, T_i)$.
  2. Remove some points $p \in C_i$ that are very close to another point.
  3. Add the same number of points to the centers $c_t$ of the largest triangles $t \in T_i$.
  4. Go to step 1 as long as changes happen in steps 2 and 3.
  5. Output $A_n$ or $C_n$, depending on the application case.
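A minimal 2D sketch of the steps above, assuming SciPy is available. The helper name, the greedy resolution of close pairs, and the fixed iteration cap are my own choices:

```python
import numpy as np
from scipy.spatial import Delaunay, cKDTree

def eliminate_close_points(A, r, max_iter=100):
    """Remove points closer than r, back-fill the centres of the largest
    Delaunay triangles, and iterate until the point set stabilises."""
    C = np.asarray(A, dtype=float).copy()
    bogus = np.zeros(len(C), dtype=bool)           # marks the points of B
    for _ in range(max_iter):
        # Step 2: for every pair with distance <= r, drop one endpoint.
        keep = np.ones(len(C), dtype=bool)
        for a, b in cKDTree(C).query_pairs(r):
            if keep[a] and keep[b]:
                keep[b] = False
        n_removed = int((~keep).sum())
        if n_removed == 0:
            break                                   # Step 4: no changes left.
        C, bogus = C[keep], bogus[keep]
        # Steps 1 and 3: triangulate, refill centres of the largest triangles.
        tri = Delaunay(C)
        p = C[tri.simplices]                        # shape (n_triangles, 3, 2)
        u, v = p[:, 1] - p[:, 0], p[:, 2] - p[:, 0]
        areas = 0.5 * np.abs(u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0])
        largest = np.argsort(areas)[::-1][:n_removed]
        centres = p[largest].mean(axis=1)
        C = np.vstack([C, centres])
        bogus = np.concatenate([bogus, np.ones(len(centres), dtype=bool)])
    return C[~bogus], C                             # A_n and C_n (Step 5)
```

Note that the refilled centres can themselves conflict with old points; the next removal pass resolves that, which is exactly why the scheme iterates.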
This is actually just a scheme for a family of algorithms, and it leaves room for many variations:
  • Step 1: $(C_i,T_i)$ can be approximative, especially when the dimensionality $d > 3$. Then a neighbor set $NN(p)$ (in the 2D case a triangle) can have a variable size, i.e. the property $|NN(p)| \equiv d+1$ characteristic of Delaunay simplices does not need to hold.
  • Step 2: The number $k$ of points to be removed can be controlled. For example: remove $p$ or $q$ if $\|p-q\| \le l_{min}$, and increase $l_{min}$ slowly until it reaches the target value $r$.
  • Step 3: The size of a triangle $t$ corresponds to its measure (area/volume/hypervolume/...) or an approximation of it. Addition may not be possible at some iterations, but it may resume later. If there is an initial set with a much higher density, the added points can be substituted by their nearest matches; otherwise the added points $B$ are excluded from the final result.
  • Step 4 can include a control for the current $l_{min}$ to approach the Poisson disk radius $r$.
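For the measure in Step 3, the exact $d$-simplex measure is cheap to compute from an edge-matrix determinant; a small helper (name my own) could be:

```python
import numpy as np
from math import factorial

def simplex_measure(vertices):
    """Measure (length/area/volume/...) of a d-simplex given its d+1 vertices."""
    v = np.asarray(vertices, dtype=float)
    edges = v[1:] - v[0]                 # d edge vectors from the first vertex
    return abs(np.linalg.det(edges)) / factorial(len(v) - 1)
```

For example, `simplex_measure([[0, 0], [1, 0], [0, 1]])` gives the triangle area 0.5; the same code handles tetrahedra and higher simplices unchanged.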
Step 3 fills in synthetic points $b \in B$, which do not belong to the original data set. This prevents the Poisson points from concentrating at the edges of gaps in the data (a phenomenon that is prevalent in higher dimensions).

The iteration time of the above algorithm depends on the ratio $\lambda = r/l_0$. And here is the interesting question: how fast does the iteration time grow as $r$ grows towards $l_0$? There has to be a certain point where no space is left for the addition of bogus points...
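To probe that question empirically, one first needs $l_0$ for the data at hand. A quick estimate of it as the mean nearest-neighbour distance (helper name my own, SciPy assumed):

```python
import numpy as np
from scipy.spatial import cKDTree

def mean_nn_distance(A):
    """Estimate l_0 as the mean nearest-neighbour distance of the set A."""
    A = np.asarray(A, dtype=float)
    d, _ = cKDTree(A).query(A, k=2)   # k=2: the nearest hit is the point itself
    return d[:, 1].mean()
```

On a regular grid with spacing $h$ this returns exactly $h$; for random data it gives the $l_0$ against which $\lambda = r/l_0$ can be swept while timing the iteration.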

Anyway, if this algorithm scheme is used just to eliminate close points, one has to remove the synthetic points. Misquoting Hamlet: "There are more things about creating synthetic points on a manifold, Horatio, than are dreamt of by your brain".