Intrepid Monkey in the Point Cloud
A maddening walk-through of some basic concepts of computational geometry. Quite a lot of it will be about point clouds, 3D shape recognition and machine-learning pre-processing. This is learning by doing, errors included. Everything will happen in slow motion.
Hexagonal lattice rotated
Matching pairs of two regular lattices circled
Wednesday, July 31, 2024
Lattice lambada in higher dimensions
Thursday, June 20, 2024
Tangentially, yours!
A problem popped up recently on the mental horizon: how close does a predicted trajectory of a ship come to the real trajectory? For a moment, let us consider only the unit sphere and unit vectors, and we happily identify points $p$ of a unit sphere centered at the origin $O$ with vectors $p-O$... Also, we identify a point $p= (\theta,\phi)$, expressed by its longitude $\theta$ and latitude $\phi$, with its 3D map $p\rightarrow q$ image $q= (\cos\phi\cos\theta,\, \cos\phi\sin\theta,\, \sin\phi)$. So be aware, reader. (And if the notation gets you baffled, check the notation page...)
Let us assume that an arc $[p_0,b]$ is part of the predicted trajectory and $p_1$ is the ground truth (pun intended). Here $p_0$ can be the last reference point and $b= \hat{p}_1$ is the prediction for $p_1$. The problem can be presented as finding the arc distance of the point $p_1$ from the great circle $C$ defined by the arc $[p_0,b]\subset C$. This great circle $C$ can be identified with a unit vector $c$, which is a normal of the plane of $C$.
OK, $c\perp b$ and $c\perp p_0$, that is, $c= \pm(p_0\times b)^0$, and we can find a vector $a\in [p_0,b]\subseteq C$ which is the projection of $p_1$ onto the plane of $C$, scaled so that $\|a\|=1$: \[a= (p_1 - (p_1\cdot c)\, c)^0.\;\; (1)\] Now, the arc lengths shown in Fig. 1 are: \[\alpha_\perp= |[a,p_1]|= \pi/2-\cos^{-1}(c\cdot p_1),\;\; (2) \\ \alpha_\parallel= |[a,p_0]|= \cos^{-1}(a\cdot p_0).\;\; (3)\]
Eq. (2) is useful in estimating the relative rendezvous error $e_p=|[a,p_1]|/|[p_0,p_1]|$, and Eq. (3) leads to the relative time error $e_t= |[p_0,a]|/|[p_0,b]|-1$. The estimate of $e_t$ assumes time information was not part of the training of the predictor, which is not always the case.
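As a quick sketch (my own, not from any reference implementation), Eqs. (1)-(3) translate into a few lines of NumPy. The function name `arc_errors` and the clipping of dot products against rounding noise are my additions:

```python
import numpy as np

def unit(v):
    """Normalize v to unit length (the ^0 operator of the text)."""
    return v / np.linalg.norm(v)

def arc_errors(p0, b, p1):
    """Relative rendezvous error e_p and time error e_t on the unit sphere.

    p0: last reference point, b: predicted point, p1: ground truth,
    all given as 3D unit vectors.
    """
    c = unit(np.cross(p0, b))                  # normal of the great circle C
    a = unit(p1 - np.dot(p1, c) * c)           # Eq. (1): projection of p1 onto C
    alpha_perp = np.pi / 2 - np.arccos(np.clip(np.dot(c, p1), -1.0, 1.0))  # Eq. (2)
    alpha_par = np.arccos(np.clip(np.dot(a, p0), -1.0, 1.0))               # Eq. (3)
    arc_p0_p1 = np.arccos(np.clip(np.dot(p0, p1), -1.0, 1.0))
    arc_p0_b = np.arccos(np.clip(np.dot(p0, b), -1.0, 1.0))
    e_p = abs(alpha_perp) / arc_p0_p1          # relative rendezvous error
    e_t = alpha_par / arc_p0_b - 1.0           # relative time error
    return e_p, e_t
```

For example, with the equator as the great circle ($p_0=(1,0,0)$, $b=(0,1,0)$) and $p_1$ nudged slightly off it, $e_p$ stays small while $e_t$ reports how far short of $b$ the ship actually got.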
Now, the question arises... Should we use spherical geometry here? Would the computations of $e_p$ and $e_t$ become faster, or would the numerical accuracy be better, or both? The latter is important, since it is easy to run into numerical instability at distances below 20 km. As will be revealed soon (after the Midsummer Feast, I hope), the answer is a surprisingly close call. In general, a lot of trigonometric trickery can be substituted with vector formulas (using either geometric algebra or vector algebra) without much loss in computational efficiency, when the operations happen in modern processing environments and the intermediate results have enough memory to dwell in...
But while you are waiting, here is the way where all points get projected to the plane defined by the points $p_0$, $p_1$ and $b$: $l_\perp= \sqrt{1-\cos^2(\beta)}\,\|p_1-p_0\|$ and $l_\parallel= \cos(\beta)\,\|p_1-p_0\|$, where $\cos(\beta)= (p_1-p_0)^0\cdot (b-p_0)^0$.
It yields quite good results as long as the distances are under 200 km ($|[u,v]|/\|u-v\| - 1 < 32\times10^{-3}$).
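A minimal sketch of that chord-based shortcut (the function name is mine); since only dot products and norms appear, it works directly on 3D position vectors without an explicit change of basis:

```python
import numpy as np

def planar_errors(p0, b, p1):
    """Chord-based approximations l_perp and l_par in the plane of p0, p1, b."""
    u = p1 - p0                        # chord to the ground truth
    v = b - p0                         # chord to the prediction
    cos_beta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    l_perp = np.sqrt(max(0.0, 1.0 - cos_beta**2)) * np.linalg.norm(u)
    l_par = cos_beta * np.linalg.norm(u)
    return l_perp, l_par
```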
Monday, April 8, 2024
Eliminating close gaps between points
One posting (When do point sets intersect (part 1)) touched on Poisson disk sampling (PDS). A grid-based approximation was used there because... it served the particular case considered there. There are many good algorithms; e.g. Bridson's algorithm (2007) runs in $\mathcal{O}(d\,|A|)$, where $A\subset\mathbb{R}^d$ is a point set dense enough that the Poisson disk radius $r$ is smaller than the mean distance $l_0$ between points: $r< l_0$.
PDS is sometimes needed in manifold operations (e.g. for creating a well-behaving kernel, or a balanced sampling), and quite fast algorithms exist for metric spaces. The defining property of PDS is that natural neighbor (NN) samples tend to be at distance $r$. We are interested in a case where the PDS distance $r < l_0$ is used to remove close points (red circles in Fig. 1) so that the distance histogram becomes cut off below $r$ (e.g. to apply algorithms which are numerically sensitive to small gaps). We also want the samples not to get attracted by local voids, therefore we fill in some temporary bogus samples (green in Fig. 1).
Fig. 1. The initial set (black dots) is modified by removing close points (red circles) and adding bogus points (green) to free spaces. The algorithm eliminates some points from the dense areas of the PC $A$ and adds the same number of points to a PC $B$. The algorithm focuses on a set $C_i=A_i\cup B_i$ on each iteration $i$ over steps 1-3. Initially $i:= 0,\; A_0:= A$ and $B_0:= \{\}$:
1. Iteration $i$: Produce a Delaunay triangulation $(C_i,T_i)$.
2. Remove some points $p\in C_i$ which are very close to another point.
3. Add the same number of points to the centers $c_t$ of the largest triangles $t\in T_i$.
4. Go to 1 as long as changes happen in steps 2 and 3.
5. Output $A_n$ or $C_n$, depending on the application case.
- Step 1: $(C_i,T_i)$ can be approximate, especially when the dimensionality $d>3$. Then a set $NN(p)$ (in the 2D case a triangle) can have a variable size, i.e. the property $|NN(p)| \equiv d+1$ characteristic of Delaunay simplices need not hold.
- Step 2: The number $k$ of points to be removed can be controlled. For example: remove $p$ or $q$ if $\|p-q\| \le l_{min}$. Increase $l_{min}$ slowly until it reaches the value $r$ you want.
- Step 3: The size of a triangle $t$ corresponds to its measure (area/volume/hypervolume...) or an approximation of the measure. Addition may not be possible at some iterations, but may resume later. If there is an initial set with much higher density, the added points can be substituted by their nearest matches; otherwise the added points $B$ are dropped from the final result.
- Step 4 can have a control for the current $l_{min}$ to approach the Poisson disk radius $r$.
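Here is a rough 2D sketch of steps 1-4 using SciPy's Delaunay triangulation (all names are mine). One deviation from the text, for the sake of guaranteed termination: a refill centroid is only added if it lies at least $r$ from everything else, so additions never create new close pairs. Since a point's nearest neighbor is always a Delaunay neighbor, "no short Delaunay edge" implies the whole pairwise distance histogram is cut at $r$:

```python
import numpy as np
from scipy.spatial import Delaunay

def triangle_area(p, q, s):
    """Area of the 2D triangle (p, q, s)."""
    return 0.5 * abs((q[0]-p[0])*(s[1]-p[1]) - (q[1]-p[1])*(s[0]-p[0]))

def eliminate_close_gaps(C, r, max_iter=50):
    """Steps 1-4: remove points with a Delaunay edge shorter than r,
    refill centroids of the largest triangles, iterate to convergence."""
    C = np.asarray(C, float)
    for _ in range(max_iter):
        tri = Delaunay(C)                        # step 1: triangulate
        drop = set()                             # step 2: short Delaunay edges
        for s in tri.simplices:
            for i, j in ((0, 1), (1, 2), (0, 2)):
                p, q = s[i], s[j]
                if p not in drop and q not in drop and \
                        np.linalg.norm(C[p] - C[q]) < r:
                    drop.add(max(p, q))
        if not drop:                             # step 4: nothing changed
            break
        C = np.delete(C, list(drop), axis=0)
        tri = Delaunay(C)                        # step 3: refill largest triangles
        areas = [triangle_area(*C[s]) for s in tri.simplices]
        new_pts = []
        for i in np.argsort(areas)[::-1]:
            cand = C[tri.simplices[i]].mean(axis=0)
            pool = np.vstack([C] + new_pts) if new_pts else C
            if np.linalg.norm(pool - cand, axis=1).min() >= r:
                new_pts.append(cand)             # far enough from everything
            if len(new_pts) == len(drop):
                break
        if new_pts:
            C = np.vstack([C] + new_pts)
    return C
```

This sketch returns $C_n$ and does not track $A$ and $B$ separately; bookkeeping the bogus set $B$ (for step 5's $A_n$ output) is a straightforward extension.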
Saturday, October 28, 2023
Interpolation between curves
The audience approached me with the following message:
- Hey, you! Give us something other than points, even just for once!
Well, here it comes. One project had an interesting problem: given two ships starting approximately from the same point, what could be a way to interpolate their routes? A simple answer is to use positions after a time shift so that the interpolated movement starts from a nice initial point $p_{ab}= (p_{a0}+p_{b0})/2$, where $p_{a0}=p_a(t_a)$ is a chosen reference point of ship $a$ at the moment $t=t_a$, $p_{b0}= p_b(t_b)$ is the closest point ship $b$ gets to $p_{a0}$ on its own route at $t=t_b$, and $t_{ab}=(t_a+t_b)/2$, see Fig. 0. Then:\[p(t-t_{ab})+p_{ab}=w_a p_a(t+t_a) + w_b p_b(t+t_b),\] where $w_a,w_b,\; w_a+w_b= 1$ are the interpolation weights and $p(0)= 0$. But this leads to some awkward outcomes when the ships move at different speeds. It is possible to get loops and back-and-forth movements, for example.
Fig. 0: Linear interpolation between ship positions.
Another way is to consider the routes without speeds and treat the steering process (rudder movements and such, all summarized as orientation change per cruising length) as the signal to be interpolated. The speeds can then be processed separately. Let us define an operation $K= C(P)$ which extracts an oriented curvature $K= \{\kappa(s)\,|\,s\in S\}$ along a curve $P= \{p(s)\,|\, s\in S\}$, where $s$ is the curve length parameter over $S= [0,L]\subset\mathbb{R}$. This is similar (but not equal) to differentiating twice. Then one needs an inverse operation $P= C^{-1}(K, p_0,\beta_0)$, which is similar (but not equal) to integrating twice, with two boundary conditions: $p_0= p(s=0)$ and $\beta_0$, the initial tangential direction of $P$ at $s=0$. With this tool set, one can express the curvature interpolation of curves:\[P_{ab}=C^{-1}\left( w_a C(P_a) + w_b C(P_b),\; p_{ab},\,\beta_{ab}\right)\]
All that remains is to define $C$ and its inverse. The curvature is\[\kappa(s_i)= 2\,\frac{\beta_i - \beta_{i-1}}{l_i+ l_{i-1}},\] where the indexing of the edge lengths $l_i$ and tangential orientations $\beta_i$ is shown in Fig. 1. The inverse process requires an initial point $p_i=p_{ab}$ at $i= 1$ and an initial tangential orientation $\beta_i= \beta_{ab}$. Then an Euler-like segment-by-segment construction rule is: \[ \beta_i= \beta_{i-1} + l_{i-1}\kappa_{i-1},\\ p_i= p_{i-1} + l_i \left(\cos(\beta_i),\,\sin(\beta_i) \right).\]
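A minimal NumPy sketch of $C$ and its inverse (names are mine). Note one deliberate change: instead of the Euler-like update $\beta_i= \beta_{i-1} + l_{i-1}\kappa_{i-1}$, the reconstruction inverts the $\kappa$ formula exactly, $\beta_i= \beta_{i-1} + \tfrac{1}{2}(l_{i-1}+l_i)\,\kappa_i$, so that a discrete curve round-trips through $C$ and $C^{-1}$ without drift:

```python
import numpy as np

def curvature(P):
    """C(P): edge lengths, initial heading, and discrete oriented curvature."""
    d = np.diff(P, axis=0)                           # edge vectors
    l = np.linalg.norm(d, axis=1)                    # edge lengths l_i
    beta = np.unwrap(np.arctan2(d[:, 1], d[:, 0]))   # tangential orientations
    kappa = 2.0 * np.diff(beta) / (l[1:] + l[:-1])   # kappa(s_i) at inner vertices
    return l, beta[0], kappa

def reconstruct(l, beta0, kappa, p0):
    """C^-1(K, p0, beta0): segment-by-segment construction of the curve."""
    P = [np.asarray(p0, float)]
    beta = beta0
    for i in range(len(l)):
        P.append(P[-1] + l[i] * np.array([np.cos(beta), np.sin(beta)]))
        if i < len(kappa):                           # turn before the next segment
            beta += 0.5 * (l[i] + l[i + 1]) * kappa[i]
    return np.array(P)
```

The curvature interpolation $P_{ab}$ then amounts to resampling both curves onto a common arc-length grid and feeding $w_a C(P_a)+w_b C(P_b)$ back through `reconstruct`.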
Did I mention Euler? It means Euler integration, which is notorious for producing a cumulative $\mathcal{O}(\epsilon)$ error due to the discretization size $\epsilon$. But this can be alleviated with better integration procedures, e.g. second- or third-order Runge-Kutta, see Hairer & Wanner 1993. Here, though, we demonstrate the results with a much too sparse segmentation, just to show both that the method works and what the perilous effects of noise and errors in the path description are.
Fig. 2 shows how the curve P3 (the rightmost one) has a small detail at the top which has a prominent effect on the interpolations, since P2 also has a similar, yet larger, part.
Fig. 4: Interpolation between curves P1 and P2.
Finally, Fig. 5 shows a case where $w_1= w_2=w_3= 1/3$. The initial contours counter each other, but the kink to the left at the end is common to all and shows up in the final result. Note how the straightened length reaches further than the interpolant curves themselves.
If this feels somewhat odd and awkward, maybe it is because it is. It is a novel way to interpolate over several curves. There are some similar approaches (see e.g. the Adobe curve drawing method (2017)), but they do not focus on the interpolation problem. And then there are moving-window approaches, which do not utilize the curvature representation.
Tuesday, October 3, 2023
How common is an isosceles triangle?
An isosceles triangle $t$ is one which has two sides of equal length, but it should not be equilateral. (Logically speaking every equilateral triangle is also isosceles in three different ways, but let us not be that logical.) And talking about equality, we need a relative tolerance $\epsilon_{12}$ for the two equal sides and another relative tolerance $\epsilon_3$ for the third side. And it will be hard to dictate the mostly psychological limit where a triangle stops being equilateral and starts being isosceles.
Anyway, we are going towards a predicate $P(t,\epsilon_{12},\epsilon_3)$ which is true when the triangle is isosceles within the given two limits:\[P(t,\epsilon_{12},\epsilon_3) \equiv |l_1 - l_2| < \epsilon_{12}\, l_{12} \text{ and } \epsilon_3\, l_{12} < |l_{12} - l_3|,\] where $l_{12}= (l_1+l_2)/2$ is the mean of the two candidate edge lengths, $\epsilon_{12} \ll \epsilon_3$, and $\{l_1,l_2,l_3\}$ are the edge lengths of the triangle $t$. In plain English: two sides of $t$ are of approximately equal length, with a relative difference of $\epsilon_{12}$ at most, and the third edge length is significantly different from those two. So, how often does one encounter such a triangle?
The tolerance condition visualized as a red segment.
Another formulation would define a predicate $P_3(.)$ for an equilateral triangle $t$:\[P_3(t,\epsilon_3)\equiv \max_i(l_i) - \min_i(l_i) < \epsilon_3\, \text{mean}_i\, l_i.\] Then, an isosceles triangle would have a predicate $P_{12}(.)$:\[P_{12}(t,\epsilon_{12},\epsilon_3) \equiv |l_1 - l_2| < \epsilon_{12}\, l_{12} \text{ and not } P_3(t,\epsilon_3).\]
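The two predicates fit in a few lines of Python (a sketch; the names `p3` and `p12` are mine, and the pair check is run over all three side pairings since "any two sides" may be equal):

```python
import numpy as np

def edge_lengths(t):
    """Edge lengths of a triangle given as a (3, 2) vertex array."""
    return np.linalg.norm(t - np.roll(t, 1, axis=0), axis=1)

def p3(t, eps3):
    """P_3: the triangle is (approximately) equilateral."""
    l = edge_lengths(t)
    return l.max() - l.min() < eps3 * l.mean()

def p12(t, eps12, eps3):
    """P_12: isosceles (some pair of sides nearly equal) but not equilateral."""
    if p3(t, eps3):
        return False
    l = edge_lengths(t)
    for i, j in ((0, 1), (0, 2), (1, 2)):
        l_mean = 0.5 * (l[i] + l[j])
        if abs(l[i] - l[j]) < eps12 * l_mean:
            return True
    return False
```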
One almost misses probabilistic geometry here... (Just as there is geometric algebra and algebraic geometry, there is also probabilistic geometry and geometric probability. The latter is -uhm- somewhat constructive and experimental, and the former theoretically solid but somewhat obscure.) So, let's go along the geometric probability road and choose the topology of the probabilistic measure. Delaunay triangulation it is, and with it, the evident boundary effect. The boundary effect depends on the number of points when the PC shape is fixed (in this case, fixed to a unit square with a boundary of length 4). We have already outlined how to measure the 'boundariousness' of a PC, and how to define a boundary point. But now we do a census over the inhabitants of the Delaunay triangulation, approximating an infinite 2D plane with random points at certain densities.
On to the results: below left is the ratio $R$ of observed isosceles triangles in a set $P$ of points in a unit square. The small size of $P$ is revealed by the fluctuations between different computations (top right corner). An indication that $R$ really is of first order in the shape accuracy limit $\epsilon_{12}$ is in the detail at the right, where the ratio $R/\epsilon_{12}$ is shown to stay within a limited range while $\epsilon_{12}$ spans a thousandfold range.
A simplistic explanation for the magnitude of $R$ is that the isosceles top vertex inhabits a narrow vertical band, when the best candidate for the base edge is scaled to unit length, see the sketch below. The band increases its width asymptotically towards infinity, and has some slight complexities at the spots where the equilateral triangles are. And long needle-like triangles (those whose unit-scaled presentation has the top vertex close to infinity) are rarer than others, but this does not get analyzed this time. The previous image is a better verification of the $R\in\mathcal{O}(\epsilon_{12})$ behavior.
Img. 3. Isosceles triangles (striped zone) with the base edge as a unit segment. The part below is symmetrical and omitted. A missing spot is the domain of equilateral triangles.
There is something to scrutinize in Fig. 2. The Delaunay process (which should be familiar to the reader at this point) favors close-to-equilateral triangles because of the so-called Delaunay property (every triangle $t$ defines a circumscribed circle which contains no points other than those of $t$), and maybe it produces more isosceles triangles, too? This is true, but requires a bit more finesse, and is left out of this text. An extension to 3D is obvious, but rather fruitless.
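The first-order behavior $R\in\mathcal{O}(\epsilon_{12})$ can also be sanity-checked with a quick Monte Carlo sketch. Note the assumption: this samples uniformly random triangles in the unit square, not the Delaunay census of the figures, so the constant differs even though the scaling should not:

```python
import numpy as np

def isosceles_ratio(n, eps12, eps3, seed=0):
    """Fraction of n uniform random triangles that satisfy P_12."""
    rng = np.random.default_rng(seed)
    T = rng.random((n, 3, 2))                               # vertices in unit square
    l = np.linalg.norm(T - np.roll(T, 1, axis=1), axis=2)   # edge lengths per triangle
    equilateral = l.max(axis=1) - l.min(axis=1) < eps3 * l.mean(axis=1)
    isosceles = np.zeros(n, dtype=bool)
    for i, j in ((0, 1), (0, 2), (1, 2)):                   # any pair may be "equal"
        isosceles |= np.abs(l[:, i] - l[:, j]) < eps12 * 0.5 * (l[:, i] + l[:, j])
    return (isosceles & ~equilateral).mean()
```

Making $\epsilon_{12}$ ten times larger should make the observed ratio roughly ten times larger, which is exactly the first-order claim.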












