Systems theory

A particularly interesting and advanced systems theory is the Theory of Social Systems as suggested by Niklas Luhmann. This theory is scientifically relevant as it draws awareness to something which has been excluded from scientific research and reasoning for the greater part of scientific history: the observer.

Luhmann N. (1995) Social Systems. Stanford University Press.

The arguments in favor of this awareness are simple and well known from philosophical reasoning: since nothing can be perceived other than through the set of senses that humans are endowed with, one cannot reasonably speak about an unobserved and thus objective reality. Everything that is, is observed in one way or another, and therefore filtered through (human) senses. It always appears to someone (or to something) and hence is never objectively given. Therefore, one cannot speak about any reality without considering its observer.

What is essential in Luhmann's theory is that this awareness of the observer does not succumb to subjectivism or to disbelief in the possibility of analytical science. On the contrary, Luhmann's suggestions on how to consider the observer build on several insights from 20th century science and connect the social sciences with other modern disciplines such as information theory, game theory, network science and, last but not least, the systems sciences.

The Spencer-Brownian distinction/indication dual

One of Luhmann's main reference authors was the mathematician George Spencer-Brown, who developed a formal calculus of observation. According to this conception, observation is a dual operation of drawing a distinction and indicating one of the distinct parts as the currently relevant one. Observation hence is a binary choice. The observer (if observed in its own right) appears to observe by differentiating its world into bisections and indicating one of them as the one relevant for further operations (i.e. for further observations). This conception is formal to the extent that it allows conceiving an air conditioning system as observing its "world" by differentiating warm and cold temperatures and indicating one of them as the reason for sending an on-signal to a heater, in the same way as a computer differentiating binaries and indicating one of them as the state from which to start the next computation, or an organism distinguishing usable resources from unusable ones and indicating the usable ones as relevant for maintaining its existence. In the social sciences, however, one would conceive an observation as the dual act of distinguishing, for instance, a group of people with one attribute (e.g. being poor) from others with another attribute (e.g. being rich) and directing investigative attention to the one group - to the poor - and away from the others. The others are currently not relevant and therefore excluded from observation. Only in another observation, a consecutive one, can attention be directed towards this group, at the expense, however, of again excluding something else from observation (that is, of distinguishing something and indicating this as the relevant side of the distinction).
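The formal character of this conception can be sketched in a few lines of code (a toy illustration; the function names and the temperature threshold are assumptions, not part of Spencer-Brown's calculus):

```python
# A toy sketch of observation as a distinction/indication dual:
# a distinction splits the "world" into two sides, an indication
# marks one of them as relevant for the next operation.

def observe(value, distinction):
    """Draw a distinction over a value and indicate one side."""
    return "marked" if distinction(value) else "unmarked"

# An air conditioning system "observing" its world of temperatures:
too_cold = lambda t: t < 18.0               # the distinction: cold vs. warm
indication = observe(15.5, too_cold)        # the indication: "marked"
send_on_signal = (indication == "marked")   # indicated side triggers the heater
```

The same two-step scheme covers the computer and the organism examples above; only the distinction changes.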

In order to grasp this distinction/indication dual formally, Spencer-Brown suggested to indicate it with a special sign, which he called the mark, and mentioned three ways to deal with the distinction it indicates: one way is to repeat the distinction and thereby accept it - "yes, poor people are relevant". A second way is to cross the sides of the distinction and indicate the other side (which, like repetition, is also another observation) - "no, rich people are more relevant". And the third way is to observe what the distinction does - to observe the act of distinction and indication itself. If one opts for this third way, one gets to see the form of observation, the pure act of observing. Scientifically, hence, one gets confronted with the question of who this observer is and how, under what conditions, with what constraints, what prejudices etc., an observation is performed. A scientifically sound observation of poor people is hence forced to include all the preconditions and latent reasons an observer might bring to the observation. This third way thus implies an observation of the (first) observation, a so-called second-order observation. It directs attention away from the question "What is observed?" to the question "How is it observed?" As we shall see below, Luhmann suggests that this is the actually relevant way of posing scientific questions in modern societies.

Information theory

The Spencer-Brownian way of defining observations can be aligned with another important concept of 20th century science - the theory of information as suggested by Claude E. Shannon (1948).

Shannon, C.E. (1948) A mathematical theory of communication. In: Shannon, C.E., Weaver, W. (eds) The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, pp. 29–125.

The technical problem behind Shannon's famous suggestion was the lossless transfer of information between a sender and a receiver. Considering the set of symbols with which information can be transferred - for example the letters of the alphabet - Shannon suggested to measure the expectation of a symbol being chosen from a given set of possible symbols. The certainty or uncertainty about a symbol being chosen, that is, the Shannon entropy, then depends on the size of the set of different symbols and on the information that determines a choice. Formulated in terms of observations, this corresponds to the question of how many single observations (i.e. distinction/indications) are necessary to unambiguously determine a symbol in a given set of symbols. For example, to select the letter h unambiguously from an alphabet of 26 letters, one needs 5 distinction/indications of the form "the letter is in the first half of the alphabet", "the letter is in the second half of the first half of the alphabet", "the letter is in the first half of the second half of the first half of the alphabet", and so on.

Binary choice

Assuming that all letters in this alphabet have equal probabilities of occurring in a message (which of course is not the case in natural languages), one letter has the occurrence probability \(\frac{1}{26}\). To unambiguously distinguish 26 letters, one hence needs a minimum set of 5 observations, corresponding to a set of 5 binary choices (i.e. "yes / no", resp. "1 / 0"), expressed in the relation \(2^5=32\). Note that \(32 > 26\): this set provides slightly more selection possibilities than needed, but it is the minimum set for unambiguously distinguishing 26 letters. (The ASCII code, for example, uses binary numbers with seven digits and therefore allows capturing \(2^7=128\) different symbols.)
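The halving procedure described above can be made concrete in a short sketch (an illustration of the counting argument, not code from Shannon's paper):

```python
import string
from math import ceil, log2

def select(letter, alphabet):
    """Determine a letter by repeated halving: each step is one
    distinction ("which half?") plus one indication ("this half")."""
    choices = []
    candidates = list(alphabet)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first, second = candidates[:mid], candidates[mid:]
        if letter in first:
            choices.append(0)       # indicate the first half
            candidates = first
        else:
            choices.append(1)       # indicate the second half
            candidates = second
    return choices

choices = select("h", string.ascii_lowercase)
# len(choices) == 5 == ceil(log2(26)): five binary choices suffice
```

The list of 0s and 1s is precisely the sequence of binary choices that unambiguously determines the letter.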

In general, the formula for the minimally needed set size is \(2^I=N\), with \(I\) indicating the number of observations and \(N\) indicating the number of letters in the alphabet. Alternatively, this can be expressed with the binary logarithm as \(I=\log_2 N\). In alphabets with equally likely letters, the occurrence probability \(p\) of a letter is always \(\frac{1}{N}\). This can be reformulated into \(I=\log_2 \frac{1}{p}\), with \(I\) now expressing the information content of a single letter or, alternatively formulated, its surprise value. Shannon reasoned that the more likely or the more often a letter or sign occurs in a message, the lower its surprise value, and vice versa. Rare signs hence have high surprise values. In order to mathematically represent the average information content of a message composed from \(n\) different letters, Shannon suggested the famous formula:

\[I=\sum_{i=1}^{n} p_i \cdot \left(-\log_2 p_i\right) = -\sum_{i=1}^{n} p_i \log_2 p_i\]
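A quick numerical check of this formula (an illustration, not taken from Shannon's paper): for the uniform 26-letter alphabet it indeed yields \(\log_2 26 \approx 4.70\) bits per letter, and any unequal distribution yields less.

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy: the expected surprise value, in bits per symbol."""
    return sum(p * -log2(p) for p in probabilities if p > 0)

# Uniform alphabet of 26 letters: entropy equals log2(26), about 4.70 bits.
uniform = [1 / 26] * 26

# A skewed four-symbol alphabet carries less information per symbol:
skewed = [0.5, 0.25, 0.125, 0.125]
# entropy(skewed) == 1.75 bits, below the uniform maximum log2(4) == 2 bits
```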

The crux of this definition of information is that it builds on the probability of a letter in the context of other possible letters. A letter has no probability on its own. Its probability is always determined by its context or, more exactly, by the conditions that reduce the probability of the other letters in context to occur. The decisive question hence is not "what makes a letter likely to be chosen for transferring a message?", but rather "what makes the other letters in context unlikely, so that, relative to them, the letter in question becomes sufficiently likely to be the one that is chosen?"

This attention to the enabling conditions of context can be fruitfully applied to any kind of phenomena or events. As it often raises interesting scientific, in particular sociological, questions, it is one of the foundation stones on which Niklas Luhmann built his Theory of Social Systems. His methodological approach suggests considering events or phenomena by themselves as highly unlikely and then asking for the constraints that make them likely. What other options have to be excluded in order to make the phenomenon in question sufficiently likely to occur? The fact, for example, that this web site contains information on systems theory is by itself highly unlikely, given the set of possibilities for other kinds of content on web sites. The question "what makes this kind of content likely?" then directs attention to the plurality of enabling conditions, such as, for example, the fact that this is not a site about music or sports, or that its author is not a biologist or geographer, and so on.

The productivity of this approach shows in particular when applied to questions concerning the peculiarity of systems to spawn a particular eigenbehavior, that is, a kind of behavior that cannot easily be explained by observing the behavior of the components of the system. An issue in this respect that Luhmann was particularly concerned with is the emergence of social systems. Given the proverbial selfishness of humans - as expressed, for instance, in the famous phrase homo homini lupus - it seems highly unlikely for any social interaction to emerge in the first place. Luhmann suggested to regard this problem in terms of a conception known under the title "double contingency".

Double Contingency

The conception of "double contingency" was suggested by the sociologist Talcott Parsons, an early proponent of systems theory. The conception builds on Shannon's theory of information and is meant to explain the emergence of human interaction in general.

A situation of “double contingency” confronts an interlocutor EGO with the problem of what action, be it a gesture, an utterance, or a way of behavior, to choose from the set of possible actions or behaviors, so that ALTER in her turn can choose an appropriate action from her set of options which leaves EGO with a chance to again choose an action so that ALTER again can react, and so on. The crucial point in this theoretical setting is that EGO and ALTER face a vast set of possible actions and do not know which one of them would be appropriate to establish an interaction. They do not have too few options for communication, but too many. The problem consists in the contingency of the preconditions on both sides of the potential interaction. EGO and ALTER have no information about which one of their options might provide them with a good chance to maintain their interaction.

In Parsons’ conception, this lack of information is solved by an a priori synchronized “shared symbolic system" - a common culture, for instance, or a common language, common behavioral habits, etc. - which constrains the possibility spaces of the interlocutors to an extent at which at least some action becomes sufficiently likely. However, two interlocutors sharing the same symbolic system can be regarded as unlikely in its turn. Luhmann thus objected that Parsons' “shared symbolic system" should rather be seen as a consequence, and not as a precondition, of communication. He argued that these constraints, too, cannot have developed without interaction. Therefore, the situation of double contingency has to be reconsidered on a very basic level, without any pre-given relational assets. Referring to constructivism, Luhmann conceives EGO and ALTER as self-referentially closed “black boxes”, as systems which act solely on the basis of their own on-board means and perceive everything coming from “outside” as an irritation. Luhmann uses the term irritation to indicate that, when understood stringently, this kind of system can ascribe any inputs only retrospectively - i.e., when interaction will have taken off - to an “outside”, that is, to an “external world” or to another system. Systems can find ways to cope with these irritations, they can memorize these ways, and they can - as it would seem to an observer - eventually adapt to these irritations. If this happens concurrently on both sides of the EGO-ALTER relation, an observer might interpret this process as the emergence of interaction. All this in spite of the fact that the systems themselves do nothing else than internally change their modes of operation, without any concept of an “outside”.

In its most abstract form, in Luhmann’s conception, these on-board means of EGO and ALTER are conceived in terms of George Spencer-Brown’s dual act of distinction and indication. With each act of distinction and indication, systems generate states (or conditions) which in the next step can be differentiated again in order to indicate one part of them, thereby generating the next state as precondition for another distinction/indication. If the resulting sequence of distinction/indications can be memorized, for example in the form of a particular symbol ("A, not B", "women, not men"), the set of options of this system becomes structured: the selection of some actions from the set of options becomes more likely than that of others. Compare this to Shannon’s conception of the selection of a certain symbol from a given set being formalized as a sequence of binary decisions, with information defined as the number of decisions needed to unambiguously determine the symbol.
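The basic intuition - structure emerging from mutual irritation without any pre-given shared symbols - can be sketched in a toy simulation. This is an illustration of the idea only, not Luhmann's own model; all names and parameters are invented:

```python
import random

# Two "black boxes" choose actions from an option set, know nothing of
# each other, and merely reinforce whatever happened to coincide.

random.seed(1)
OPTIONS = list(range(10))           # the set of possible actions

class BlackBox:
    def __init__(self):
        # initially unstructured: every option is equally likely
        self.weights = {a: 1.0 for a in OPTIONS}

    def act(self):
        actions = list(self.weights)
        return random.choices(actions, weights=list(self.weights.values()))[0]

    def reinforce(self, action):
        # memorize a way of coping with an "irritation"
        self.weights[action] += 5.0

ego, alter = BlackBox(), BlackBox()
for _ in range(2000):
    a, b = ego.act(), alter.act()
    if a == b:                      # an observer would call this "contact"
        ego.reinforce(a)
        alter.reinforce(b)

# Over many rounds, a few actions tend to dominate both option sets;
# an observer might describe this as emergent, structured interaction,
# although each box only changed its internal mode of operation.
```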


What is described with the example of two interlocutors can be generalized to systems in general. Systems interact by way of distinction/indication, where interaction may also refer to the operations a system undertakes to maintain its own existence. In this case, a system can be seen as differentiating itself from its environment by indicating itself as the option for further operations. In this respect, Luhmann suggests defining systems as the difference they maintain with respect to their environment. He famously expresses this with the following formula:

System definition

On first sight, this formula might seem paradoxical, as it apparently claims an equality of two entities which are not equal. Readers familiar with the assignment of variables, as common in software programming, however, will read this formula as the expression of a process, implying that a system is a system if it manages to maintain a delimitation from its environment, if it manages to distinguish itself from this environment, if only for a short moment in time. A system hence is never on its own, but always in relation to its environment. It actually is the difference that it makes with respect to its environment. And it "makes" this difference always to someone, that is, to an observer. In this respect, Luhmann stresses the implication of another proposed definition of information, the one by Gregory Bateson, according to which information is a difference that makes a difference. Here as well, making a difference implies a someone to whom this difference is made. No difference without an observer.
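The reading via variable assignment can be made explicit in a short sketch (the function name and the string values are invented for illustration):

```python
# Read "system = difference(system, environment)" as an assignment,
# not an equation: with every operation the system re-produces itself
# as the difference it maintains against its environment.

def difference(system, environment):
    """One operation of self-delimitation (purely illustrative)."""
    return {"inside": system, "outside": environment}

system = "initial operations"
environment = "everything else"
for _ in range(3):                  # a process in time, not a static state
    system = difference(system, environment)

# The "system" now exists only as a history of maintained delimitations.
```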

Higher-order observations

A nice little example for raising awareness of observer-dependency was given by Heinz von Foerster, another important reference thinker of Luhmann. Usually, mathematicians consider a sequence of numbers as ordered if a formula can be proposed that reliably generates the sequence, such as, for example, the formula \(F_n=F_{n-1}+F_{n-2}\) with \(F_0=0\) and \(F_1=1\), which generates the famous Fibonacci sequence \(1, 1, 2, 3, 5, 8, 13, 21, 34, \ldots\). With respect to this definition of orderliness, von Foerster suggested to consider the following sequence:

\[8, 5, 4, 9, 1, 7, 6, 3, 2, 0\]

Mathematicians will agree that there is no obvious way to compress this sequence into a formula. They might therefore conclude that the sequence is unordered. However, as von Foerster points out, the sequence can appear as highly ordered if one changes the observational viewpoint. The sequence is simply alphabetically ordered if written down not in digits but in terms of their English numerals:

\[eight, five, four, nine, one, seven, six, three, two, zero\]
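That this is indeed an alphabetical ordering can be verified mechanically (a small check for illustration, not from von Foerster's text):

```python
# Sorting the digits 0-9 by their English numerals reproduces
# von Foerster's apparently disordered sequence exactly.
numerals = {
    0: "zero", 1: "one", 2: "two", 3: "three", 4: "four",
    5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine",
}

sequence = sorted(numerals, key=lambda n: numerals[n])
print(sequence)   # [8, 5, 4, 9, 1, 7, 6, 3, 2, 0]
```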

Just changing the viewpoint, hence, can turn disorder into order. This simple fact, however, has far-reaching consequences, since the observer itself is usually observed as ordered, implying that it is observed by another observer, which in its turn again has to be considered as observed, and so on. Since each observer has to be observed in order to be, and this is true for any conceivable observer, there cannot be any "first" observer. As a consequence, the observer has to be thought of as circularly constituted. This, however, complies with basic assumptions about the constitution of systems. As discussed in the context of PageRank or of neural networks, for instance, systems emerge from a concurrent interaction of a multitude of components, of which none can reasonably be classified as the most important or the "first".