Genetic programming (GP) has long been used to evolve computer programs for tasks like symbolic regression, but it suffers from bloat—unnecessary growth of program size. A new algorithm, Minimalist Genetic Programming (MGP), offers a different approach by framing program induction as a syntactic derivation problem rather than an evolutionary search.
The Bloat Problem in Genetic Programming
According to a preprint on arXiv by Leonardo Trujillo, genetic programming is based on two core insights. First, any learning task can be posed as a program induction problem, where the goal is to construct a symbolic hierarchical model expressed as a syntax tree. Second, that induction problem is treated as a search problem, with evolution used to locate the desired model. However, standard GP systems are prone to bloat, which makes it difficult to find exact solutions in tasks like symbolic regression.
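To make the idea of a syntax tree and of bloat concrete, here is a minimal sketch. It assumes a nested-tuple encoding and a three-operator function set chosen purely for illustration; neither comes from the preprint. The two trees compute the same function, but the second carries a redundant subtree, which is the kind of growth the term bloat describes.

```python
import operator

# Hypothetical function set, chosen only for this illustration.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, x):
    """Evaluate a syntax tree at input x."""
    if tree == "x":                      # the input variable
        return x
    if isinstance(tree, (int, float)):   # a numeric constant
        return tree
    op, left, right = tree               # an internal node: (operator, left, right)
    return OPS[op](evaluate(left, x), evaluate(right, x))

def size(tree):
    """Count the nodes in a syntax tree."""
    if isinstance(tree, tuple):
        _, left, right = tree
        return 1 + size(left) + size(right)
    return 1

# x*x + x, written compactly.
compact = ("+", ("*", "x", "x"), "x")
# The same function with a redundant (x - x) subtree attached.
bloated = ("+", ("*", "x", "x"), ("+", "x", ("-", "x", "x")))

print(size(compact), size(bloated))
print(evaluate(compact, 3), evaluate(bloated, 3))
```

Both trees return the same value for every input, yet the second has almost twice as many nodes. An exact-recovery benchmark rewards the first form.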
A New Inspiration: The Minimalist Program
MGP is also biologically inspired, but it draws on the Minimalist Program, a theory of human language syntax, rather than on evolution. In minimalism, syntax is understood as an optimal solution linking two mental systems. The core computational process in MGP is a binary set formation operator called $MERGE$, which incrementally constructs complex syntactic structures through a simple Markovian process. As the paper explains, MGP discovers the core building blocks of symbolic expressions and incrementally combines them using $MERGE$.
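The following sketch shows one way to read "binary set formation" and "incremental Markovian construction". It is not the preprint's implementation. The lexicon, the random choice of the next item, and the `derive` and `depth` helpers are all assumptions made for illustration, and the summary above does not say how MGP selects items or maps sets back to expressions.

```python
import random

def merge(a, b):
    """Binary set formation: combine two syntactic objects into the set {a, b}.

    Sets are unordered, so merge(a, b) == merge(b, a). In this sketch,
    merging an object with itself collapses to a singleton set.
    """
    return frozenset((a, b))

def derive(lexicon, steps, rng):
    """Build a structure one MERGE at a time.

    Assumption: each step combines the current object with one lexical
    item, so the next state depends only on the current one.
    """
    current = rng.choice(lexicon)
    for _ in range(steps):
        current = merge(current, rng.choice(lexicon))
    return current

def depth(obj):
    """Nesting depth of a derived object; lexical items have depth 0."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0

# Hypothetical lexicon of atomic syntactic objects.
lexicon = ["x", "1", "+", "*"]
structure = derive(lexicon, steps=3, rng=random.Random(0))
print(structure, depth(structure))
```

Each application of `merge` adds exactly one level of nesting, so the structure grows only as fast as the derivation proceeds.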
Benchmark Results on Symbolic Regression
The proposed system was benchmarked on symbolic regression tasks that are known to be difficult for standard GP due to bloat. The results show that when a proper lexicon of atomic syntactic objects is chosen, MGP consistently produces the exact ground truth model on a set of symbolic regression tasks where standard GP struggles to do the same.
| Aspect | Standard Genetic Programming | Minimalist Genetic Programming (MGP) |
|---|---|---|
| Core approach | Evolutionary search | Syntactic derivation via MERGE |
| Primary operator | Crossover, mutation | Binary set formation (MERGE) |
| Process | Population-based evolution | Incremental Markovian construction |
| Bloat propensity | High (prone to bloat) | Not stated directly; recovers exact models on bloat-prone tasks |
| Symbolic regression performance | Struggles to find exact ground truth | Consistently produces exact ground truth, given a proper lexicon |
Implications for Program Induction
The preprint concludes that the insights of minimalism are relevant to program induction and, given the potential MGP exhibits, should be explored further. Titled 'Minimalist Genetic Programming,' it is available on arXiv and authored by Leonardo Trujillo.