Let us have a unit simplex with $d+1$ vertices, one of them (with an index 0) at origo. The second vertex $e_1= (1,0,...)$ has length $\lambda_1=1$, where $\lambda_n$ means the perpendicular distance of each new vertex from the simplex face formed by earlier vertices.
That way: \[\lambda_n^2= 1 - \sum_{i=1}^{n-1} (\lambda_i/(i+1))^2 \] and we can assemble a matrix $E\in\mathbb{R}^{d\times d}$ from the vertices starting from the origo:
For example $\lambda_2= \sqrt{3}/2$ and $\lambda_3= \sqrt{2/3}$ are heights of an equilateral triangle and of a regular tetrahedron, respectively. The rows $e_i\in E$ have the following property:
That means simplex edges starting from origo are unit vectors with $60^o$ angle between all of them. Sounds a tight grid!? Well, not always a maximally tight one... E.g. with $d= 24$ a denser packing of unit spheres is possible, but the sampling presented here (when seen as a packing problem) performs consistently well in most dimensions.
Now, we refresh the procedure in the 2021 post. We form an index matrix $A\subset\mathbb{Z}^{n\times d}$
to find unique rows $A_u\subseteq A$: \[A= \{\text{round}(E^{-1}p/\epsilon)\mid p\in P\}\] and finally the sampling \[\{P\}_\epsilon= EA_u\epsilon.\] If one attacks the census info $n(q), q\in \{P\}_\epsilon$, which tells how many samples $p\in P$ gets rounded to $q$, we have achieved a histogram with a rather nice spatial regularity, which does not occur with the Poisson disc sampling. Otherwise, these two methods (Poisson and rounding by a slanted grid $E$) have equal usage in ML. The time complexity is of $\mathcal{O}(nd)$ when a nice hashing method per each column will be used to find unique rows of $A$. (The ranges at each column of $A$ can be balanced by first centering $P$ to origo).


No comments:
Post a Comment