squarenet package¶
Subpackages¶
Submodules¶
squarenet.artist module¶
- squarenet.artist.sqplot(grid, verbose, style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', cfg=None)[source]¶
Render a structured grid — static snapshot or morphing animation.
- Parameters:
grid (np.ndarray) – Input structured grid of shape (…, D) where D is 2 (2-D) or 3 (3-D).
style (str) – One of
"checkerboard","mesh", or"scatter"."scatter"draws every point as a plain dot: coloured by depth (3rd coordinate after the isometric rotation) in 3-D, or in black in 2-D. No background grid is shown.cfg (dict, optional) – Rendering configuration. Defaults to
default_config(). Missing keys are filled in from the defaults, so you only need to specify what you want to override. Extra key for scatter:"cmap_scatter"(default"viridis").
- Returns:
fig (matplotlib.figure.Figure)
ani (matplotlib.animation.FuncAnimation or None) –
Nonewhencfg["animate"]is False.
squarenet.bubbletree module¶
squarenet.core module¶
- squarenet.core.carthesian_sort(gridmap, points, max_iter=100, method='fast', backend='numpy', loop=None, loopseq='decreasing', verbose=2)[source]¶
Supported:¶
method [fast, robust, ultimate] backend [numpy, jax, torch]
Goal¶
Given an unordered point cloud:
points.shape = (N, D)
rearrange point indices into a structured cartesian grid:
grid.shape = (N1, N2, …, ND)
such that neighbouring cells of the grid contain spatially coherent points.
GRID INTERPRETATION¶
A grid index map bijectively to a point index:
point index: n <=> grid index: f(n) = (n1, n2, …, nD)
Each grid axis defines a local neighbour relation.
For axis d, let define:
next(n, d) = f-1(n1,…, nd+1,…,nD)
such that P(next(n,d)) is the neighbour obtained by incrementing the d-th grid coordinate.
The objective is therefore:
nearby euclidean points -> nearby grid cells
Note that the reciprocal property
nearby grid cells -> nearby euclidean points
is NOT guaranteed.
Indeed, datasets may contain cracks, holes, disconnected clusters, folds, or other topological singularities.
Gridification will naturally tend to close cracks, stitch nearby boundaries together and overlap the folds.
This algorithm is primarily designed for speed and scalability on large datasets. It is NOT an optimal transport solver.
Therefore, users should always inspect the resulting grid, especially near boundaries where geometric distortions are more likely to appear.
HEURISTIC ORDERING PRINCIPLE¶
For each spatial dimension d, define d spatial heuristics:
H_d(point) -> scalar
Heuristics must be orthogonal and monotonic Typical choice: cartesian heuristics
H_0 = x H_1 = y H_2 = z …
One could think on an improved version of the algorithm which would learn heuristics that best fit the dataset but cartesian are already pretty good
The desired property is:
H_d(P(n)) <= H_d(P(next(n, d)))
for every point index n and every axis d.
In words:
values of heuristic d should increase along grid axis d.
DISORDER METRIC¶
A local inversion occurs when:
H_d(P(n)) > H_d(P(next(n, d)))
The total disorder is simply the number of such violations.
Pseudo-definition:
- disorder =
sum over all axes d sum over all point pairs (current, next) of inversion count
FAST METHOD¶
The simplest strategy consists in repeatedly sorting each grid axis independently.
Pseudo-code¶
initialize gridmap
for iteration in range(max_iter):
for axis d in range(D):
sort grid indexes along axis d using heuristic H_d
compute disorder
- if disorder == 0:
stop
Properties¶
- Advantages:
extremely fast
fully vectorized
memory efficient
surprisingly effective
- Limitations:
for loop on the axes
-> early axes dominate later ones -> result in weak axes
only adjacent comparison is a weak accuracy criterion
-> disorder is blind on what happens on diagonals -> may converge toward local minima
ROBUST METHOD¶
To reduce axis domination and improve stability, sorting is performed only on random independent subgrids at each step, thus learning is much more progressiv
At every iteration:
each axis is randomly partitioned
only selected cartesian sub-blocks are sorted
different blocks are used for different axes
Result:
smoother convergence
better axis symmetry
fewer local minima
improved isotropy
Pseudo-code¶
for iteration:
generate random cartesian subgrids
for axis d:
for subgrid in selected_subgrids[d]:
sort only inside this subgrid
compute disorder
ULTIMATE METHOD¶
Even robust cartesian sorting still treats grid lines as parallel and therefore almost independent.
This can produce:
stratification
layered artifacts
weak coupling between parallel slices
To solve this, a second refinement stage is introduced.
The grid is embedded into a “hash table” containing shifted/sheared copies of the grid.
Repeated cyclic shears produce strong cross-line coupling.
This allows information to propagate between previously independent cartesian lines.
High-level idea¶
repeat:
shear grid into staggered hash table
sort along one axis
project back to grid
Effect¶
The repeated shearing progressively destroys artificial parallel structures and greatly reduces stratification. ============================================================ SUMMARY ============================================================
- FAST:
deterministic global line sorting (cartesian sort)
- ROBUST:
stochastic partial sorting preserving axis symmetry
- ULTIMATE:
sheared multi-pass relaxation reducing stratification artifacts
COMPLEXITY¶
A few hundreds iterations will be largely enough for convergence in most of the cases.
Let:
N = total number of points
Each iteration performs approximately:
O(N (log N +D)) operations
with highly vectorized NumPy operations.
squarenet.neighbormap module¶
squarenet.sampler module¶
- squarenet.sampler.loaddatasets(force=False)[source]¶
Download dataset files from GitHub into a local cache directory.
- Parameters:
force (bool) – If True, re-download files even if they already exist.
- squarenet.sampler.place_at(points, position=(0, 0))[source]¶
Normalize point cloud to [0,1]^D and shift it at the specified position
- Parameters:
points (numpy.ndarray) – Input point cloud.
position (Target position of the minimum corner after normalization.)
- Returns:
point cloud normalized and placed at the
- Return type:
numpy.ndarray
- squarenet.sampler.samplepoints(method='gaussian', size=(1000000, 2), plot_points=False)[source]¶
Generate point clouds using various sampling strategies.
- Parameters:
method (str) –
Sampling method. Supported values:
General methods (any dimension D): - “square” : uniform in [-1, 1]^D - “ball” : uniform in the unit ball - “ring” : Gaussian with radial offset - “gaussian” : standard normal distribution - “spiky” : L^alpha ball (alpha < 1) - “holy” : hypercube with spherical holes
Dataset-based methods (fixed dimensions): - “barbara”, “lena”, “france”, “germany” : 2D datasets - “everest” : 3D dataset (elevation-based sampling)
size (tuple of int) – (N, D) where N is the number of points and D the dimension. Note: D must match the dataset dimension for dataset-based methods.
plot_points (bool, optional) – If True, display the generated points.
- Returns:
Array of shape (N, D) containing sampled points.
- Return type:
numpy.ndarray
squarenet.squarenet module¶
- exception squarenet.squarenet.ConvergenceWarning[source]¶
Bases:
UserWarningRaised when optimization does not converge.
- class squarenet.squarenet.SquareNet(gridshape, max_iter=1000, warnings_=True, verbose=2, backend='numpy', device='cpu')[source]¶
Bases:
objectGrid straightening algorithm for D-dimensional point clouds.
SquareNet reorganizes an unordered set of points into a structured grid by iteratively sorting indices along each spatial axis. The goal is to enforce a topology where neighboring points in the original space are also close on the grid.
The grid is internally represented as an array of indices of shape
(N1, ..., ND), where D is the dimensionality of the embedding space.- Parameters:
gridshape (tuple of int) – Shape of the target grid. The product of these dimensions must exactly match the total number of input points (bijective gridification).
max_iter (int, default=100) – Maximum number of iterations allowed for the sorting procedure.
warnings (bool, default=True) – If True, emits a
ConvergenceWarningif the grid is not perfectly ordered at the end of the process.backend ({"numpy", "torch" or "jax"}, default="numpy") – input and output backend.
- grid¶
The structured grid of indices, with shape
gridshape.- Type:
ndarray
- invert_grid¶
Inverse mapping from point indices to grid coordinates.
- Type:
ndarray, jax or numpy
- invgridflat¶
Flattened version of the inverse mapping for fast indexing.
- Type:
ndarray, jax or numpy
- learning_curve¶
History of the disorder metric across iterations.
- Type:
list of float
- 1. Run ``fit()`` with your preferred backend.
- 2. Call ``SquareNet.map()`` on any feature indexed like the points (N, \*C)
- To build a tensor version of it (*G, \*C) based on the points learned multi-indexes
- fit(points, method='fast', fit_with_numpy=False)[source]¶
Fit the grid to a point cloud.
- Parameters:
points (ndarray of shape (N, D) or str) – Input point cloud or sampling method name.
method (fast, robust or ultimate.) – robust can be up to 5 time slower, but will probably give a better grid. ultimate can be up to 30 time slower but will give the best results among the three methods.
fit_with_numpy – wether to apply conversion to numpy (just for the fit) and fit with numpy method. Might be faster, always consider it as an option. Default is to fit on the backend (and device) given at init e.g. numpy, torch or jax.
- Returns:
None – The method updates the internal state of the grid in-place.
see
squarenet.core
- mapidx(index)[source]¶
Cloud index -> grid index.
- Returns:
directly usable for indexing.
- Return type:
tuple of arrays
- neighbormap(max_sample_size=20000000, max_window_size=31, criterion='rank', thresholdcut=1, projection=(0, 1), log2=False)[source]¶
Compute and display a neighborhood map from gridded points.
For each point X (or a subsample of sample_size if the dataset is too large), scans a square window of radius wr = ws//2 in the grid centered on X, and adds 1 to each cell if the point Y found in the cell matches the criterion.
- Parameters:
log2 (bool, default False) – If True, applies log2 scaling to counts.
max_sample_size (int, default 20 million) – sample size for the number of pairs (X, Y)
max_window_size (int, default=31) –
Note that
wr, the window radius, is defined aswr = window_size // 2. Search window size, i.e. the size of the window in which the neighbor map is computed, such that: gridindex(X) - gridindex(Y) <= wr` where <= is relative to the L∞ norm on grid indices.Warning
You will get no information, and thus no garantee at all, on what happens outside the search window. Conversely, using a very large search window dilutes the information, since the probability of sampling pairs for a given grid offset decreases.
Choosing the window size is therefore a tradeoff between macroscopic information and microscopic information
criterion (str, optional) – Method used to define neighborhoods (“rank” or “value”).
thresholdcut (int or float, optional Cutoff threshold for connections, interpreted according to the criterion.) –
If criterion is “rank”: thresholdcut = k means Y matches X
if Y is among the k nearest points to X. - If criterion is “value”: thresholdcut is a distance threshold, and Y matches X if the distance(X, Y) <= thresholdcut.
projection (tuple of int, optional) – Axes of the grid used for 2D projection.
- Returns:
None – Prints the resulting neighborhood matrix.
see
squarenet.neighbormap
- plot(style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', **kwargs)[source]¶
Display the mapped grid as a static figure or animation.
Many rendering options (scales, colors, figsize, DS, frames …) can be passed as keyword arguments or configured once via
self.plot_config[key] = value. print(self.plot_config) to see all available parameters).- Parameters:
style (str) – Rendering style:
"checkerboard","mesh"or"scatter".animate (bool) – If True, produce a morphing animation (grid → identity). If False, render a static plot.
save_path (str or None) – None → display with
plt.show(). str → save to disk
- Returns:
fig (matplotlib.figure.Figure)
ani (matplotlib.animation.FuncAnimation or None)
see
squarenet.artist
- search_sorted(X, n_iter=2, side='left')[source]¶
X must be a SINGLE point, search_sorted will return a multiindex IX in the grid. IX is the grid index of PROBABLY one of the closest points to X in the first (side = ‘left’) or last (side = ‘right’) quadrant (relative to X) of the space. The search is a greedy coordinate descent: each dimension is refined by applying 1D searchsorted on row, then columns… Augmenting n_iter increases accuracy.
squarenet.utils module¶
squarenet.views module¶
- squarenet.views.lazylocalview(*args, **kwargs)[source]¶
wrapper on localview for cleverly requesting a (small) subset of the views. args[0] must be the selection mask which specify wich views are requested
- Returns:
dictionary key -> View[key]
- Return type:
WindowCollector
- squarenet.views.localview(Xmap, wr, D, boundary='reflect', **pad_kwargs)[source]¶
Compute local neighborhood views of a data Xmap (related to the point cloud) on the latent grid.
The function applies a sliding window over the grid to collect the local view
- Parameters:
D – dimension of the points
wr (int or sequence of int) – window radius per grid axis. Window size per axis is: ws = 2 * wr + 1
boundary (str) – Padding mode = boundary condition passed to np.pad (e.g. “constant”, “reflect”, “edge”, …)
**pad_kwargs – additional arguments forwarded to np.pad, see Numpy doc
- Returns:
- Return type:
np.ndarray
Module contents¶
- class squarenet.SquareNet(gridshape, max_iter=1000, warnings_=True, verbose=2, backend='numpy', device='cpu')[source]¶
Bases:
objectGrid straightening algorithm for D-dimensional point clouds.
SquareNet reorganizes an unordered set of points into a structured grid by iteratively sorting indices along each spatial axis. The goal is to enforce a topology where neighboring points in the original space are also close on the grid.
The grid is internally represented as an array of indices of shape
(N1, ..., ND), where D is the dimensionality of the embedding space.- Parameters:
gridshape (tuple of int) – Shape of the target grid. The product of these dimensions must exactly match the total number of input points (bijective gridification).
max_iter (int, default=100) – Maximum number of iterations allowed for the sorting procedure.
warnings (bool, default=True) – If True, emits a
ConvergenceWarningif the grid is not perfectly ordered at the end of the process.backend ({"numpy", "torch" or "jax"}, default="numpy") – input and output backend.
- grid¶
The structured grid of indices, with shape
gridshape.- Type:
ndarray
- invert_grid¶
Inverse mapping from point indices to grid coordinates.
- Type:
ndarray, jax or numpy
- invgridflat¶
Flattened version of the inverse mapping for fast indexing.
- Type:
ndarray, jax or numpy
- learning_curve¶
History of the disorder metric across iterations.
- Type:
list of float
- 1. Run ``fit()`` with your preferred backend.
- 2. Call ``SquareNet.map()`` on any feature indexed like the points (N, \*C)
- To build a tensor version of it (*G, \*C) based on the points learned multi-indexes
- fit(points, method='fast', fit_with_numpy=False)[source]¶
Fit the grid to a point cloud.
- Parameters:
points (ndarray of shape (N, D) or str) – Input point cloud or sampling method name.
method (fast, robust or ultimate.) – robust can be up to 5 time slower, but will probably give a better grid. ultimate can be up to 30 time slower but will give the best results among the three methods.
fit_with_numpy – wether to apply conversion to numpy (just for the fit) and fit with numpy method. Might be faster, always consider it as an option. Default is to fit on the backend (and device) given at init e.g. numpy, torch or jax.
- Returns:
None – The method updates the internal state of the grid in-place.
see
squarenet.core
- mapidx(index)[source]¶
Cloud index -> grid index.
- Returns:
directly usable for indexing.
- Return type:
tuple of arrays
- neighbormap(max_sample_size=20000000, max_window_size=31, criterion='rank', thresholdcut=1, projection=(0, 1), log2=False)[source]¶
Compute and display a neighborhood map from gridded points.
For each point X (or a subsample of sample_size if the dataset is too large), scans a square window of radius wr = ws//2 in the grid centered on X, and adds 1 to each cell if the point Y found in the cell matches the criterion.
- Parameters:
log2 (bool, default False) – If True, applies log2 scaling to counts.
max_sample_size (int, default 20 million) – sample size for the number of pairs (X, Y)
max_window_size (int, default=31) –
Note that
wr, the window radius, is defined aswr = window_size // 2. Search window size, i.e. the size of the window in which the neighbor map is computed, such that: gridindex(X) - gridindex(Y) <= wr` where <= is relative to the L∞ norm on grid indices.Warning
You will get no information, and thus no garantee at all, on what happens outside the search window. Conversely, using a very large search window dilutes the information, since the probability of sampling pairs for a given grid offset decreases.
Choosing the window size is therefore a tradeoff between macroscopic information and microscopic information
criterion (str, optional) – Method used to define neighborhoods (“rank” or “value”).
thresholdcut (int or float, optional Cutoff threshold for connections, interpreted according to the criterion.) –
If criterion is “rank”: thresholdcut = k means Y matches X
if Y is among the k nearest points to X. - If criterion is “value”: thresholdcut is a distance threshold, and Y matches X if the distance(X, Y) <= thresholdcut.
projection (tuple of int, optional) – Axes of the grid used for 2D projection.
- Returns:
None – Prints the resulting neighborhood matrix.
see
squarenet.neighbormap
- plot(style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', **kwargs)[source]¶
Display the mapped grid as a static figure or animation.
Many rendering options (scales, colors, figsize, DS, frames …) can be passed as keyword arguments or configured once via
self.plot_config[key] = value. print(self.plot_config) to see all available parameters).- Parameters:
style (str) – Rendering style:
"checkerboard","mesh"or"scatter".animate (bool) – If True, produce a morphing animation (grid → identity). If False, render a static plot.
save_path (str or None) – None → display with
plt.show(). str → save to disk
- Returns:
fig (matplotlib.figure.Figure)
ani (matplotlib.animation.FuncAnimation or None)
see
squarenet.artist
- search_sorted(X, n_iter=2, side='left')[source]¶
X must be a SINGLE point, search_sorted will return a multiindex IX in the grid. IX is the grid index of PROBABLY one of the closest points to X in the first (side = ‘left’) or last (side = ‘right’) quadrant (relative to X) of the space. The search is a greedy coordinate descent: each dimension is refined by applying 1D searchsorted on row, then columns… Augmenting n_iter increases accuracy.