squarenet package

Subpackages

Submodules

squarenet.artist module

squarenet.artist.atmost3D(gridpoints, projection=(0, 1, 2))[source]
squarenet.artist.default_config()[source]
squarenet.artist.kill_long_edges(x, threshold=1)[source]
squarenet.artist.sqplot(grid, verbose, style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', cfg=None)[source]

Render a structured grid — static snapshot or morphing animation.

Parameters:
  • grid (np.ndarray) – Input structured grid of shape (…, D) where D is 2 (2-D) or 3 (3-D).

  • style (str) – One of "checkerboard", "mesh", or "scatter". "scatter" draws every point as a plain dot: coloured by depth (3rd coordinate after the isometric rotation) in 3-D, or in black in 2-D. No background grid is shown.

  • cfg (dict, optional) – Rendering configuration. Defaults to default_config(). Missing keys are filled in from the defaults, so you only need to specify what you want to override. Extra key for scatter: "cmap_scatter" (default "viridis").

Returns:

  • fig (matplotlib.figure.Figure)

  • ani (matplotlib.animation.FuncAnimation or None) – None when cfg["animate"] is False.

squarenet.bubbletree module

squarenet.core module

squarenet.core.carthesian_sort(gridmap, points, max_iter=100, method='fast', backend='numpy', loop=None, loopseq='decreasing', verbose=2)[source]

Supported:

method [fast, robust, ultimate] backend [numpy, jax, torch]

Goal

Given an unordered point cloud:

points.shape = (N, D)

rearrange point indices into a structured cartesian grid:

grid.shape = (N1, N2, …, ND)

such that neighbouring cells of the grid contain spatially coherent points.

GRID INTERPRETATION

A grid index map bijectively to a point index:

point index: n <=> grid index: f(n) = (n1, n2, …, nD)

Each grid axis defines a local neighbour relation.

For axis d, let define:

next(n, d) = f-1(n1,…, nd+1,…,nD)

such that P(next(n,d)) is the neighbour obtained by incrementing the d-th grid coordinate.

The objective is therefore:

nearby euclidean points -> nearby grid cells

Note that the reciprocal property

nearby grid cells -> nearby euclidean points

is NOT guaranteed.

Indeed, datasets may contain cracks, holes, disconnected clusters, folds, or other topological singularities.

Gridification will naturally tend to close cracks, stitch nearby boundaries together and overlap the folds.

This algorithm is primarily designed for speed and scalability on large datasets. It is NOT an optimal transport solver.

Therefore, users should always inspect the resulting grid, especially near boundaries where geometric distortions are more likely to appear.

HEURISTIC ORDERING PRINCIPLE

For each spatial dimension d, define d spatial heuristics:

H_d(point) -> scalar

Heuristics must be orthogonal and monotonic Typical choice: cartesian heuristics

H_0 = x H_1 = y H_2 = z …

One could think on an improved version of the algorithm which would learn heuristics that best fit the dataset but cartesian are already pretty good

The desired property is:

H_d(P(n)) <= H_d(P(next(n, d)))

for every point index n and every axis d.

In words:

values of heuristic d should increase along grid axis d.

DISORDER METRIC

A local inversion occurs when:

H_d(P(n)) > H_d(P(next(n, d)))

The total disorder is simply the number of such violations.

Pseudo-definition:

disorder =

sum over all axes d sum over all point pairs (current, next) of inversion count

FAST METHOD

The simplest strategy consists in repeatedly sorting each grid axis independently.

Pseudo-code

initialize gridmap

for iteration in range(max_iter):

for axis d in range(D):

sort grid indexes along axis d using heuristic H_d

compute disorder

if disorder == 0:

stop

Properties

Advantages:
  • extremely fast

  • fully vectorized

  • memory efficient

  • surprisingly effective

Limitations:
  • for loop on the axes

-> early axes dominate later ones -> result in weak axes

  • only adjacent comparison is a weak accuracy criterion

-> disorder is blind on what happens on diagonals -> may converge toward local minima

ROBUST METHOD

To reduce axis domination and improve stability, sorting is performed only on random independent subgrids at each step, thus learning is much more progressiv

At every iteration:

  • each axis is randomly partitioned

  • only selected cartesian sub-blocks are sorted

  • different blocks are used for different axes

Result:

  • smoother convergence

  • better axis symmetry

  • fewer local minima

  • improved isotropy

Pseudo-code

for iteration:

generate random cartesian subgrids

for axis d:

for subgrid in selected_subgrids[d]:

sort only inside this subgrid

compute disorder

ULTIMATE METHOD

Even robust cartesian sorting still treats grid lines as parallel and therefore almost independent.

This can produce:

  • stratification

  • layered artifacts

  • weak coupling between parallel slices

To solve this, a second refinement stage is introduced.

The grid is embedded into a “hash table” containing shifted/sheared copies of the grid.

Repeated cyclic shears produce strong cross-line coupling.

This allows information to propagate between previously independent cartesian lines.

High-level idea

repeat:

shear grid into staggered hash table

sort along one axis

project back to grid

Effect

The repeated shearing progressively destroys artificial parallel structures and greatly reduces stratification. ============================================================ SUMMARY ============================================================

FAST:

deterministic global line sorting (cartesian sort)

ROBUST:

stochastic partial sorting preserving axis symmetry

ULTIMATE:

sheared multi-pass relaxation reducing stratification artifacts

COMPLEXITY

A few hundreds iterations will be largely enough for convergence in most of the cases.

Let:

N = total number of points

Each iteration performs approximately:

O(N (log N +D)) operations

with highly vectorized NumPy operations.

squarenet.neighbormap module

squarenet.neighbormap.distfunction(x, y, axis)[source]
squarenet.neighbormap.distkernel(x, xview)[source]
squarenet.neighbormap.neighbormap(grided_points, mapidx, criterion='rank', thresholdcut=1, kernel=<function distkernel>, windowradius=10, projection=(0, 1), sample_size=1000000)[source]

squarenet.sampler module

squarenet.sampler.list_methods()[source]

Return available sampling methods.

squarenet.sampler.loaddatasets(force=False)[source]

Download dataset files from GitHub into a local cache directory.

Parameters:

force (bool) – If True, re-download files even if they already exist.

squarenet.sampler.place_at(points, position=(0, 0))[source]

Normalize point cloud to [0,1]^D and shift it at the specified position

Parameters:
  • points (numpy.ndarray) – Input point cloud.

  • position (Target position of the minimum corner after normalization.)

Returns:

point cloud normalized and placed at the

Return type:

numpy.ndarray

squarenet.sampler.plotpoints(points)[source]

plot a 2D projection of the point cloud

squarenet.sampler.samplepoints(method='gaussian', size=(1000000, 2), plot_points=False)[source]

Generate point clouds using various sampling strategies.

Parameters:
  • method (str) –

    Sampling method. Supported values:

    General methods (any dimension D): - “square” : uniform in [-1, 1]^D - “ball” : uniform in the unit ball - “ring” : Gaussian with radial offset - “gaussian” : standard normal distribution - “spiky” : L^alpha ball (alpha < 1) - “holy” : hypercube with spherical holes

    Dataset-based methods (fixed dimensions): - “barbara”, “lena”, “france”, “germany” : 2D datasets - “everest” : 3D dataset (elevation-based sampling)

  • size (tuple of int) – (N, D) where N is the number of points and D the dimension. Note: D must match the dataset dimension for dataset-based methods.

  • plot_points (bool, optional) – If True, display the generated points.

Returns:

Array of shape (N, D) containing sampled points.

Return type:

numpy.ndarray

squarenet.squarenet module

exception squarenet.squarenet.ConvergenceWarning[source]

Bases: UserWarning

Raised when optimization does not converge.

class squarenet.squarenet.SquareNet(gridshape, max_iter=1000, warnings_=True, verbose=2, backend='numpy', device='cpu')[source]

Bases: object

Grid straightening algorithm for D-dimensional point clouds.

SquareNet reorganizes an unordered set of points into a structured grid by iteratively sorting indices along each spatial axis. The goal is to enforce a topology where neighboring points in the original space are also close on the grid.

The grid is internally represented as an array of indices of shape (N1, ..., ND), where D is the dimensionality of the embedding space.

Parameters:
  • gridshape (tuple of int) – Shape of the target grid. The product of these dimensions must exactly match the total number of input points (bijective gridification).

  • max_iter (int, default=100) – Maximum number of iterations allowed for the sorting procedure.

  • warnings (bool, default=True) – If True, emits a ConvergenceWarning if the grid is not perfectly ordered at the end of the process.

  • backend ({"numpy", "torch" or "jax"}, default="numpy") – input and output backend.

grid

The structured grid of indices, with shape gridshape.

Type:

ndarray

invert_grid

Inverse mapping from point indices to grid coordinates.

Type:

ndarray, jax or numpy

invgridflat

Flattened version of the inverse mapping for fast indexing.

Type:

ndarray, jax or numpy

learning_curve

History of the disorder metric across iterations.

Type:

list of float

1. Run ``fit()`` with your preferred backend.
2. Call ``SquareNet.map()`` on any feature indexed like the points (N, \*C)
To build a tensor version of it (*G, \*C) based on the points learned multi-indexes
fit(points, method='fast', fit_with_numpy=False)[source]

Fit the grid to a point cloud.

Parameters:
  • points (ndarray of shape (N, D) or str) – Input point cloud or sampling method name.

  • method (fast, robust or ultimate.) – robust can be up to 5 time slower, but will probably give a better grid. ultimate can be up to 30 time slower but will give the best results among the three methods.

  • fit_with_numpy – wether to apply conversion to numpy (just for the fit) and fit with numpy method. Might be faster, always consider it as an option. Default is to fit on the backend (and device) given at init e.g. numpy, torch or jax.

Returns:

  • None – The method updates the internal state of the grid in-place.

  • see squarenet.core

invert_map(features)[source]

Map grid data back to cloud ordering.

Parameters:

features (jax or numpy ndarray of shape (*gridshape, *C))

Return type:

jax or numpy ndarray of shape (N, *C)

invert_mapidx(index)[source]

Grid index -> cloud index.

map(features)[source]

Map cloud data to grid structure.

Parameters:

features (jax, or numpy ndarray of shape (N, *C))

Return type:

jax or numpy ndarray of shape (*gridshape, *C)

mapidx(index)[source]

Cloud index -> grid index.

Returns:

directly usable for indexing.

Return type:

tuple of arrays

neighbormap(max_sample_size=20000000, max_window_size=31, criterion='rank', thresholdcut=1, projection=(0, 1), log2=False)[source]

Compute and display a neighborhood map from gridded points.

For each point X (or a subsample of sample_size if the dataset is too large), scans a square window of radius wr = ws//2 in the grid centered on X, and adds 1 to each cell if the point Y found in the cell matches the criterion.

Parameters:
  • log2 (bool, default False) – If True, applies log2 scaling to counts.

  • max_sample_size (int, default 20 million) – sample size for the number of pairs (X, Y)

  • max_window_size (int, default=31) –

    Note that wr, the window radius, is defined as wr = window_size // 2. Search window size, i.e. the size of the window in which the neighbor map is computed, such that: gridindex(X) - gridindex(Y) <= wr` where <= is relative to the L∞ norm on grid indices.

    Warning

    You will get no information, and thus no garantee at all, on what happens outside the search window. Conversely, using a very large search window dilutes the information, since the probability of sampling pairs for a given grid offset decreases.

    Choosing the window size is therefore a tradeoff between macroscopic information and microscopic information

  • criterion (str, optional) – Method used to define neighborhoods (“rank” or “value”).

  • thresholdcut (int or float, optional Cutoff threshold for connections, interpreted according to the criterion.) –

    • If criterion is “rank”: thresholdcut = k means Y matches X

    if Y is among the k nearest points to X. - If criterion is “value”: thresholdcut is a distance threshold, and Y matches X if the distance(X, Y) <= thresholdcut.

  • projection (tuple of int, optional) – Axes of the grid used for 2D projection.

Returns:

  • None – Prints the resulting neighborhood matrix.

  • see squarenet.neighbormap

plot(style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', **kwargs)[source]

Display the mapped grid as a static figure or animation.

Many rendering options (scales, colors, figsize, DS, frames …) can be passed as keyword arguments or configured once via self.plot_config[key] = value. print(self.plot_config) to see all available parameters).

Parameters:
  • style (str) – Rendering style: "checkerboard", "mesh" or "scatter".

  • animate (bool) – If True, produce a morphing animation (grid → identity). If False, render a static plot.

  • save_path (str or None) – None → display with plt.show(). str → save to disk

Returns:

  • fig (matplotlib.figure.Figure)

  • ani (matplotlib.animation.FuncAnimation or None)

  • see squarenet.artist

search_sorted(X, n_iter=2, side='left')[source]

X must be a SINGLE point, search_sorted will return a multiindex IX in the grid. IX is the grid index of PROBABLY one of the closest points to X in the first (side = ‘left’) or last (side = ‘right’) quadrant (relative to X) of the space. The search is a greedy coordinate descent: each dimension is refined by applying 1D searchsorted on row, then columns… Augmenting n_iter increases accuracy.

to_backend(x)[source]

squarenet.utils module

squarenet.utils.breakpoint()[source]
squarenet.utils.dualgrid(grid, xp, N, IJ, D)[source]
squarenet.utils.dualgridflat(grid, xp, N)[source]
squarenet.utils.from_backend(x)[source]

Convert torch/jax/numpy array to numpy safely.

squarenet.utils.index_identity(shape)[source]
squarenet.utils.printmatrix(arr)[source]
squarenet.utils.progress_bar(it, total, bar_length=30)[source]
squarenet.utils.project(gridpoints, feature_axes=(0, 1), index=0)[source]
squarenet.utils.show_search_result(left, right, true, points, sn)[source]
squarenet.utils.to_backend(x, backend='numpy', device='cpu', warnings_=True)[source]

Convert numpy array to target backend.

squarenet.views module

squarenet.views.lazylocalview(*args, **kwargs)[source]

wrapper on localview for cleverly requesting a (small) subset of the views. args[0] must be the selection mask which specify wich views are requested

Returns:

dictionary key -> View[key]

Return type:

WindowCollector

squarenet.views.localview(Xmap, wr, D, boundary='reflect', **pad_kwargs)[source]

Compute local neighborhood views of a data Xmap (related to the point cloud) on the latent grid.

The function applies a sliding window over the grid to collect the local view

Parameters:
  • Xmap (np.ndarray) – Input data of shape (*G, *C)

  • D – dimension of the points

  • wr (int or sequence of int) – window radius per grid axis. Window size per axis is: ws = 2 * wr + 1

  • boundary (str) – Padding mode = boundary condition passed to np.pad (e.g. “constant”, “reflect”, “edge”, …)

  • **pad_kwargs – additional arguments forwarded to np.pad, see Numpy doc

Returns:

Local neighborhood views of shape (*G, *C, *ws)

Return type:

np.ndarray

Module contents

class squarenet.SquareNet(gridshape, max_iter=1000, warnings_=True, verbose=2, backend='numpy', device='cpu')[source]

Bases: object

Grid straightening algorithm for D-dimensional point clouds.

SquareNet reorganizes an unordered set of points into a structured grid by iteratively sorting indices along each spatial axis. The goal is to enforce a topology where neighboring points in the original space are also close on the grid.

The grid is internally represented as an array of indices of shape (N1, ..., ND), where D is the dimensionality of the embedding space.

Parameters:
  • gridshape (tuple of int) – Shape of the target grid. The product of these dimensions must exactly match the total number of input points (bijective gridification).

  • max_iter (int, default=100) – Maximum number of iterations allowed for the sorting procedure.

  • warnings (bool, default=True) – If True, emits a ConvergenceWarning if the grid is not perfectly ordered at the end of the process.

  • backend ({"numpy", "torch" or "jax"}, default="numpy") – input and output backend.

grid

The structured grid of indices, with shape gridshape.

Type:

ndarray

invert_grid

Inverse mapping from point indices to grid coordinates.

Type:

ndarray, jax or numpy

invgridflat

Flattened version of the inverse mapping for fast indexing.

Type:

ndarray, jax or numpy

learning_curve

History of the disorder metric across iterations.

Type:

list of float

1. Run ``fit()`` with your preferred backend.
2. Call ``SquareNet.map()`` on any feature indexed like the points (N, \*C)
To build a tensor version of it (*G, \*C) based on the points learned multi-indexes
fit(points, method='fast', fit_with_numpy=False)[source]

Fit the grid to a point cloud.

Parameters:
  • points (ndarray of shape (N, D) or str) – Input point cloud or sampling method name.

  • method (fast, robust or ultimate.) – robust can be up to 5 time slower, but will probably give a better grid. ultimate can be up to 30 time slower but will give the best results among the three methods.

  • fit_with_numpy – wether to apply conversion to numpy (just for the fit) and fit with numpy method. Might be faster, always consider it as an option. Default is to fit on the backend (and device) given at init e.g. numpy, torch or jax.

Returns:

  • None – The method updates the internal state of the grid in-place.

  • see squarenet.core

invert_map(features)[source]

Map grid data back to cloud ordering.

Parameters:

features (jax or numpy ndarray of shape (*gridshape, *C))

Return type:

jax or numpy ndarray of shape (N, *C)

invert_mapidx(index)[source]

Grid index -> cloud index.

map(features)[source]

Map cloud data to grid structure.

Parameters:

features (jax, or numpy ndarray of shape (N, *C))

Return type:

jax or numpy ndarray of shape (*gridshape, *C)

mapidx(index)[source]

Cloud index -> grid index.

Returns:

directly usable for indexing.

Return type:

tuple of arrays

neighbormap(max_sample_size=20000000, max_window_size=31, criterion='rank', thresholdcut=1, projection=(0, 1), log2=False)[source]

Compute and display a neighborhood map from gridded points.

For each point X (or a subsample of sample_size if the dataset is too large), scans a square window of radius wr = ws//2 in the grid centered on X, and adds 1 to each cell if the point Y found in the cell matches the criterion.

Parameters:
  • log2 (bool, default False) – If True, applies log2 scaling to counts.

  • max_sample_size (int, default 20 million) – sample size for the number of pairs (X, Y)

  • max_window_size (int, default=31) –

    Note that wr, the window radius, is defined as wr = window_size // 2. Search window size, i.e. the size of the window in which the neighbor map is computed, such that: gridindex(X) - gridindex(Y) <= wr` where <= is relative to the L∞ norm on grid indices.

    Warning

    You will get no information, and thus no garantee at all, on what happens outside the search window. Conversely, using a very large search window dilutes the information, since the probability of sampling pairs for a given grid offset decreases.

    Choosing the window size is therefore a tradeoff between macroscopic information and microscopic information

  • criterion (str, optional) – Method used to define neighborhoods (“rank” or “value”).

  • thresholdcut (int or float, optional Cutoff threshold for connections, interpreted according to the criterion.) –

    • If criterion is “rank”: thresholdcut = k means Y matches X

    if Y is among the k nearest points to X. - If criterion is “value”: thresholdcut is a distance threshold, and Y matches X if the distance(X, Y) <= thresholdcut.

  • projection (tuple of int, optional) – Axes of the grid used for 2D projection.

Returns:

  • None – Prints the resulting neighborhood matrix.

  • see squarenet.neighbormap

plot(style='checkerboard', animate=False, save=True, save_path='sqrnet/plot', **kwargs)[source]

Display the mapped grid as a static figure or animation.

Many rendering options (scales, colors, figsize, DS, frames …) can be passed as keyword arguments or configured once via self.plot_config[key] = value. print(self.plot_config) to see all available parameters).

Parameters:
  • style (str) – Rendering style: "checkerboard", "mesh" or "scatter".

  • animate (bool) – If True, produce a morphing animation (grid → identity). If False, render a static plot.

  • save_path (str or None) – None → display with plt.show(). str → save to disk

Returns:

  • fig (matplotlib.figure.Figure)

  • ani (matplotlib.animation.FuncAnimation or None)

  • see squarenet.artist

search_sorted(X, n_iter=2, side='left')[source]

X must be a SINGLE point, search_sorted will return a multiindex IX in the grid. IX is the grid index of PROBABLY one of the closest points to X in the first (side = ‘left’) or last (side = ‘right’) quadrant (relative to X) of the space. The search is a greedy coordinate descent: each dimension is refined by applying 1D searchsorted on row, then columns… Augmenting n_iter increases accuracy.

to_backend(x)[source]