Tuesday, April 7, 2026

More on Neural Networks and Simplices

As a follow-up to the earlier entry Research Diary: NN from Simplex-Wise Linear Interpolations, today we look at mesh quality and task geometry. I'm also personally curious whether simplex quality behaves differently across practical tasks such as computer vision, language models, or stock trading.

Triangulation properties, captured by mesh quality measures from computational geometry (e.g., Shewchuk 2002 for 2D and 3D; Knupp 2001, which takes a more algebraic approach using classical matrix groups and extends to higher dimensions), can influence predictive behavior. These measures are usually designed to avoid poorly shaped simplices and to produce smooth surface approximations, which is why they are widely used in computer graphics, but it is still not clear how to translate them to machine learning models. Common loss functions also weight errors differently (MSE amplifies the effect of outliers, while MAE dampens it), so any analysis of model-induced triangulations should take the choice of loss into account.
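As a concrete illustration of such quality measures, here is a minimal sketch (assuming SciPy is available) that Delaunay-triangulates random 2D points and computes the classical radius-ratio quality, which equals 1 for an equilateral triangle and tends to 0 for degenerate slivers. The helper name `triangle_quality` is mine, not from the cited papers:

```python
import numpy as np
from scipy.spatial import Delaunay

def triangle_quality(pts):
    """Radius-ratio quality 2*r_in/r_circ for a 2D triangle.
    Equals 1 for an equilateral triangle, approaches 0 as it degenerates."""
    a = np.linalg.norm(pts[1] - pts[0])
    b = np.linalg.norm(pts[2] - pts[1])
    c = np.linalg.norm(pts[0] - pts[2])
    s = (a + b + c) / 2.0                                     # semiperimeter
    area = max(s * (s - a) * (s - b) * (s - c), 0.0) ** 0.5   # Heron's formula
    if area == 0.0:
        return 0.0
    r_in = area / s                      # inradius
    r_circ = a * b * c / (4.0 * area)    # circumradius
    return 2.0 * r_in / r_circ

rng = np.random.default_rng(0)
points = rng.random((50, 2))
tri = Delaunay(points)
qualities = np.array([triangle_quality(points[s]) for s in tri.simplices])
print(qualities.min(), qualities.mean())   # worst and average simplex quality
```

Aggregating the per-simplex values (minimum, mean, histogram) is how mesh quality is usually summarized for a whole triangulation.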

A key point is that “good” simplex geometry is not universal. Delaunay triangulations—unique for points in general position—are often preferred because they avoid elongated simplices. However, for data with cylindrical or other directional patterns, elongated simplices may actually approximate the data better (see Shewchuk 2002 for an illustration). This suggests that mesh quality depends on the underlying data structure rather than being an absolute criterion. An interesting experiment is to start with a neural network built from a Delaunay triangulation and then optimize it with gradient descent.
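The experiment above can be set up directly with SciPy: `LinearNDInterpolator` built on a Delaunay triangulation is exactly a simplex-wise linear model, and its vertex values are the parameters one would then hand to gradient descent. A minimal sketch with an arbitrary synthetic target of my own choosing:

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator
from scipy.spatial import Delaunay

rng = np.random.default_rng(1)
X = rng.random((200, 2))              # training inputs in the unit square
y = np.sin(4 * X[:, 0]) * X[:, 1]     # synthetic smooth target (illustrative)

tri = Delaunay(X)                     # Delaunay triangulation of the inputs
f = LinearNDInterpolator(tri, y)      # linear on each simplex, exact at vertices

# Prediction = barycentric-weighted combination of vertex values
print(f(np.array([[0.3, 0.7]])))
```

Treating the vertex values `y` (and possibly the vertex positions) as trainable and minimizing a loss over them would be the "Delaunay initialization then gradient descent" experiment suggested above.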

From the simplex-wise linear model (SWLM) perspective, gradient-based optimization may favor induced SWLM representations whose geometry aligns with the data and the task. But not all such representations are equally well suited to a given structure, so increasing model capacity, which effectively refines the induced triangulation, may eventually yield diminishing returns or even backfire.

This perspective also raises a broader structural question: classical triangulation algorithms build partitions of space using discrete geometric rules, while gradient-based training arrives at a simplex-wise linear representation through continuous optimization. It is not yet clear whether this process implicitly favors certain triangulation structures, or what criteria might drive that preference.