Real problems or phenomena can be quite complex, with many aspects and factors affecting them and causing their particular behavior. Some of these factors have more influence, some have less, and some may be negligible. In modeling, one usually wants to reduce the factors to a minimum, attempting to keep all the essential ones and to sort out the rest. A good model can therefore be characterized as the minimal set of aspects or factors needed to reproduce a phenomenon or a particular behavior.

Schematically, the process of modeling can be outlined in the following way, with a "real problem" marking the origin of the process:

A model can be valid in various ways:

- It can be empirically valid by showing numerical (i.e. quantitative) similarities to what is observed.
- It can be behaviorally valid by showing qualitatively similar behavior to what is observed.
- It can be structurally valid if its effects show structural similarities to what is observed.
- It can be valid in terms of applicability if it serves its purpose and suggests answers to open questions. It is not valid in this respect if it leads to mathematically unsolvable problems or resists attempts at simulation.

Sandve, G.K., Nekrutenko, A., Taylor, J., & Hovig, E. (2013). Ten Simple Rules for Reproducible Computational Research. PLOS Computational Biology 9(10).

- For Every Result, Keep Track of How It Was Produced
- Avoid Manual Data Manipulation Steps
- Archive the Exact Versions of All External Programs Used
- Version Control All Custom Scripts
- Record All Intermediate Results, When Possible in Standardized Formats
- For Analyses That Include Randomness, Note Underlying Random Seeds
- Always Store Raw Data behind Plots
- Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
- Connect Textual Statements to Underlying Results
- Provide Public Access to Scripts, Runs, and Results
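Some of these rules translate directly into a few lines of code. As an illustration of the rule on random seeds, here is a minimal Python sketch (the seed value and record layout are invented for the example): the seed is fixed, stored next to the result, and re-used to reproduce the run exactly.

```python
import json
import random

seed = 20130101  # hypothetical seed; record it alongside the results
random.seed(seed)
result = [random.randint(0, 100) for _ in range(5)]

# store the seed next to the output so the run can be reproduced exactly
record = {"seed": seed, "result": result}
print(json.dumps(record))

# re-running with the same seed reproduces the result
random.seed(seed)
assert [random.randint(0, 100) for _ in range(5)] == result
```

The same idea applies to any library with its own random number generator: whatever seeds the analysis depends on should end up in the stored record, not only in the script.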

A basic kind of model can be seen in the formula for the famous Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ..., which is generated by iteratively summing the last two numbers of the sequence and appending the result to the end of it.

\(F_n=F_{n-1}+F_{n-2}\) with \(F_0=0\) and \(F_1=1\)
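The recurrence can be turned into a few lines of code; a minimal Python sketch (the function name is chosen for illustration):

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers F_1 ... F_n."""
    seq = [1, 1]
    while len(seq) < n:
        # sum the last two numbers and append the result to the end
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

print(fibonacci(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```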

This simple model has the unusual feature of requiring no abstraction. It reproduces what is modeled reliably, without loss of information. Nevertheless, it is able to demonstrate a fundamental aspect of modeling: the ability to reduce complexity with respect to what is modeled.

This aspect becomes obvious if we consider a very long Fibonacci sequence, say of several thousand numbers. The formula that reproduces this sequence is definitely easier to handle than the whole sequence itself. If, additionally, the formula is expressed as an algorithm that can be read and executed by a computer, the reduction in complexity becomes obvious, and measurable in terms of basic information units: the bits needed to store the sequence or the formula on a hard disk. This measure is known as Kolmogorov complexity (also called algorithmic complexity), named after the mathematician Andrey Kolmogorov.

Click on the image to see a YouTube clip of Craig Reynolds' original 1986 Boids simulation.

At the other extreme of the scope of modeling, one could regard computer-generated models that enable machines to coordinate and to learn how to move. The so-called "continuously self-modeling machine" works by generating a model of its environment, of itself, and of its possibilities in this environment, and by iteratively updating this model with experiences. It thereby learns how to use its limbs in order to move forward.
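The iterative model-update loop can be illustrated with a deliberately simplified sketch. This is not the algorithm of Bongard et al., just a toy in the same spirit: a "robot" narrows down candidate self-models (here, guesses for a single unknown motor gain) by comparing each model's prediction with the outcome it actually observes. All names, the gain, the noise level, and the thresholds below are invented for the illustration.

```python
import random

random.seed(0)

TRUE_GAIN = 1.7  # hidden property of the "body"; the robot must discover it

def act(command):
    """Execute a motor command and return a noisy sensor reading."""
    return TRUE_GAIN * command + random.uniform(-0.05, 0.05)

# candidate self-models: guesses for the unknown gain
candidates = [g / 10 for g in range(1, 31)]  # 0.1, 0.2, ..., 3.0

for _ in range(20):
    command = random.uniform(-1.0, 1.0)
    observed = act(command)
    # keep only the models whose prediction matches the experience
    surviving = [g for g in candidates if abs(g * command - observed) < 0.2]
    if surviving:
        candidates = surviving

print("surviving self-models:", candidates)
```

After a handful of trials, only gains close to the true value remain; the real robot does something analogous in a far higher-dimensional model space, and additionally uses the surviving model to plan its next movements.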

Click on the image to see a YouTube clip in which Hod Lipson explains the principle of the robot.

Bongard, J., Zykov, V., & Lipson, H. (2006). Resilient Machines Through Continuous Self-Modeling. Science 314(5802), 1118-1121.