What kind of multi-region interactions and emergence in the brain?

All brain properties are network (or circuit) properties. After all, no region is an island and although most of brain connectivity is relatively local, there’s a tremendous amount of non-local connectivity, too.

Let’s consider two kinds of networks or network interactions to further clarify what is meant by “network properties”. In a Type I network, the nodes – brain regions in this case – carry out (“compute”) fairly specific functions (Figure 1A). For example, in the context of fear extinction, the hippocampus determines contextual information (where is the learning taking place?) and the VTA computes omission prediction errors (the animal was expecting an aversive event but it actually didn’t happen). (Another example would be object processing in the visual system.) In this scenario, a process of interest (say, fear extinction) is viewed as a network property that depends on the interactions of the brain regions involved. That is to say, it is necessary to investigate the orchestration of multiple regions to understand how the regions, collectively, carry out the processes of interest. Importantly, however, the collective properties of the system are not accessible, or predictable, from the behavior of the individual regions alone. In other words, the multi-region function, F(R1, R2, …, Rn), is poorly understood from considering f(R1), f(R2), and so on.

Figure 1: What kind of multi-region circuits? A) Type I network with nodes with well-defined computational properties. B) Type II network where the “primitive function” involves the interactions of two (or more) regions.

Poorly understood in what sense? In a near-decomposable system, lesion of R1, for example, will cause a deficit to the network that is directly related to the putative function of R1. However, this is not the outcome in an interactionally complex system. For example, consider multi-species ecological systems in which the introduction of a new species, or the removal of an existing one, causes completely unexpected knock-on effects. The claim being made here is that, in many cases, we need to consider brain networks in much the same way: as a complex system that is not well approximated by simple decompositions.

Now let’s turn to Type II networks, where nodes do not instantiate specific functions (Figure 1B). Instead, two or more regions working together instantiate the basic function of interest, such that its implementation is distributed across regions. It is easy to provide an example of a Type II network if we consider computational models where undifferentiated units are trained together to perform a function of interest (such as typical neural network models). But are there examples of this type of situation in the brain? Multi-area functions are exemplified by reciprocal dynamics between the frontal eye fields and the lateral intraparietal area in macaques supporting persistent activity during a delayed oculomotor task (Hart and Huk, 2020). Based, among other things, on the tight link between these areas at the trial level, the authors suggest that the two areas should be viewed as a “single functional unit” (see Murray et al., 2017 for a computational model).

In rodents, motor preparation requires reciprocal excitation across multiple brain areas (Guo et al., 2017). Persistent preparatory activity cannot be sustained within cortical circuits alone, but in addition requires recurrent excitation through a thalamocortical loop. Inactivation of the parts of the thalamus reciprocally connected to the frontal cortex results in strong inhibition of frontal cortex neurons. Conversely, the frontal cortex contributes major driving excitation to the higher-order thalamus in question. What is more, persistent activity in frontal cortex also requires activity in the cerebellum and vice-versa (Gao et al., 2018), revealing that persistent activity during motor planning is maintained by circuits that span multiple regions. The claim, thus, is that persistent motor activity is a circuit property that requires multiple brain regions. In such a case, one cannot point to a specific brain part and label “working memory” as residing there.

It could be argued that, in the brain, the two types of networks discussed here – with and without well-defined node functions – are not really distinct and that what differs is the granularity of the node function. After all, if one could decompose the function “persistent motor activity” into basic primitives, it is conceivable that they could be carried out in separate regions. In such a case, we would revert to the situation of networks with nodes that compute well-defined functions. Put another way, a skeptic could quibble that, in the brain, a putative Type II network is a reflection of our temporary state of ignorance. The conjecture advanced here is that, in the brain, such reductive reasoning will fare poorly in the long run: It is not the case that one can develop a system of primitive properties that, together, span the functions/processes of interest. In many cases, network properties are not reducible to component interactions of well-defined sub-functions.


Gao, Z., Davis, C., Thomas, A. M., Economo, M. N., Abrego, A. M., Svoboda, K., … & Li, N. (2018). A cortico-cerebellar loop for motor planning. Nature, 563(7729), 113-116.

Guo ZV, Inagaki HK, Daie K, Druckmann S, Gerfen CR, Svoboda K. 2017. Maintenance of persistent activity in a frontal thalamocortical loop. Nature 545:181–186. DOI: https://doi.org/10.1038/nature22324

Hart, E., & Huk, A. C. (2020). Recurrent circuit dynamics underlie persistent activity in the macaque frontoparietal network. Elife, 9, e52460.

Murray JD, Jaramillo J, Wang XJ. 2017. Working memory and Decision-Making in a frontoparietal circuit model. The Journal of Neuroscience 37:12167–12186. DOI: https://doi.org/10.1523/JNEUROSCI.0343-17.2017

Neuroscience explanations

Although neuroscience studies are incredibly diverse, one way to summarize them is as follows: “Area or circuit X is involved in behavior Y” (where a circuit is a group of areas). A lesion study might determine that patients with damage to the so-called cortex of the anterior insula have the ability to quit smoking easily, without relapse, leading to the conclusion that the insula is a critical substrate in the addiction to smoking (1). Why? Quitting is hard in general, of course. But it turns out to be easy if one’s anterior insula is nonfunctional. It’s logical, therefore, to surmise that, when intact, this region’s operation somehow promotes addiction. An activation study using functional MRI might observe stronger signals in parts of the visual cortex when participants view pictures of faces compared to when they are shown many kinds of pictures that don’t contain faces (pictures of multiple types of chairs, shoes, etc.). This could prompt the suggestion that this part of the visual cortex is important for the perception of faces. A manipulation study could enhance activity in prefrontal cortex in monkeys, and observe an improvement in tasks that require careful attention to visual information.

Many journals require “significance statements” in which authors summarize the importance of their studies to a broader audience. In the instances of the previous paragraph, the authors could say something like this: 1) the insula contributes to conscious drug urges and to decision-making processes that precipitate relapse; 2) the fusiform gyrus (the particular area of visual cortex that responds vigorously to faces) is involved in face perception; and 3) the prefrontal cortex enhances performance of behaviors that are challenging and require attention.

Figure . Because little is known about how brain mechanisms bring about behaviors, neuroscientists permeate their papers with “filler” verbs as listed above, most of which do not add substantive content to the statements made. Figure from Krakauer, J. W., Ghazanfar, A. A., Gomez-Marin, A., MacIver, M. A., & Poeppel, D. (2017). Neuroscience needs behavior: correcting a reductionist bias. Neuron, 93(3), 480-490.

The examples above weren’t gratuitous; all were important studies published in respected scientific journals (2). Although these were rigorous experimental studies, they don’t quite inform about the underlying mechanisms (3). In fact, if one combs the peer-reviewed literature, one finds a plethora of filler terms – words like “contributes”, “involved”, and “enhances” above (see Figure) – that stand in for the processes we presume did the “real” work. This is because, by and large, neuroscience studies don’t sufficiently determine, or even strongly constrain, the underlying mechanisms that link brain to behavior.


[1] Naqvi et al. (2007).

[2] Addiction: Naqvi et al. (2007); faces: Kanwisher et al. (1997); attention: Noudoost et al. (2010).

[3] But the stimulation studies described by Noudoost et al. (2010) come closest.

The Networked Brain

Chapter 8: Complex systems: the science of interacting parts

What kind of object is the brain? The central premise of this book is that it cannot be neatly decomposed into a set of parts, so that each one can be understood on its own. Instead, it’s a highly networked system that needs to be understood differently. The language that is required is the one of complex systems, which we now describe in intuitive terms. Whereas mathematics is needed to formalize it, illustration of its central concepts provides the reader with “intuition pumps”. Thinking in terms of complex systems frees us from the shackles of linear thinking, enabling explanations built with “collective computations” that elude simplistic narratives.

Kelp carpets[1]

Many coastal environments are inhabited by a great variety of algae, including a brown seaweed called kelp[2]. The distribution of kelp can be very uneven, with abundance in some places, near absence in others. Ecologists noticed that in some coastal communities with tide pools and shallow waters largely devoid of kelp and other algae, killer whales (also called orcas) are also plentiful. The orcas don’t eat kelp, so the negative relationship between the two must be purely incidental, right? Sea otters are a frequent member of coastal habitats, too, and their population has rebounded strongly since they gained protected status in 1911, just at a time when they numbered only 2,000 worldwide. Could they be the ones responsible for the lack of kelp in some areas? Sea otters don’t consume kelp either, so what could be going on? It turns out that sea urchins are one of the most prevalent grazers of algae and kelp. And otters snack on sea urchins in large amounts. Therefore, because the presence of otters suppresses the urchin population, they have a direct impact on the kelp carpeting along the coast: the more otters, the fewer the urchins, and the richer the kelp. We see that otters and kelp are linked via a double negative logic: if you suppress a suppressor, the net effect is an increase (of kelp in this case).

But how about the orcas? The preferred meals of killer whales are sea lions and other whales, which are larger and richer in the fat content they need. But in some places these favored foods have become scarce. And with the large increase of otters given conservation efforts, the orcas appear to have turned to them as replacement meals. Altogether, we have a four-way relationship: ↑ killer whales → ↓ otters → ↑ urchins → ↓ kelp. This cascade is not a one-of-a-kind illustration of indirect effects; it’s at the core of how ecological systems function. In other words, complex webs of interrelationships with many indirect effects – in fact, with multiple-step-removed indirect effects – are pretty much the norm.
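The double-negative logic can be spelled out in a few lines of Python. This is only an illustrative sketch: the chain of species comes from the text, but encoding each "eats" link as a sign of -1 and the helper name `indirect_effect` are assumptions made here for illustration.

```python
from math import prod

# Each link is (source, target, sign); -1 means the source suppresses the target.
# The -1 encoding is an illustrative assumption, not an ecological measurement.
CASCADE = [
    ("killer whales", "otters", -1),
    ("otters", "urchins", -1),
    ("urchins", "kelp", -1),
]

def indirect_effect(chain, start, end):
    """Sign of the net effect of `start` on `end` along a linear chain."""
    names = [chain[0][0]] + [target for _, target, _ in chain]
    i, j = names.index(start), names.index(end)
    # Multiply the signs of every link between the two species.
    return prod(sign for _, _, sign in chain[i:j])

otters_on_kelp = indirect_effect(CASCADE, "otters", "kelp")
orcas_on_kelp = indirect_effect(CASCADE, "killer whales", "kelp")
print(otters_on_kelp, orcas_on_kelp)
```

Two suppressions in a row multiply to a positive effect (otters help kelp), while three multiply to a negative one (orcas hurt kelp), even though neither animal eats kelp.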

Bacterial decision making

Bacteria are extremely simple creatures. But when they are grown in a medium containing glucose and carbon dioxide, they can make all twenty kinds of amino acids, which are the building blocks of proteins, the latter being the ones that do a lot of the heavy lifting in living things[3]. When a specific amino acid is added to a glob of bacteria that is generating amino acids, the biosynthesis of that specific amino acid stops soon thereafter. But how do bacteria know that they no longer need to synthesize that particular amino acid?

In the 1950s, biochemists started to understand that amino acids are manufactured via several steps, starting with an initial “precursor” that is modified by a series of reactions leading to the amino acid. We can represent this in the following way: P → I1 → I2 → … → amino acid, where P stands for the chemical precursor and I for the various intermediate products. They also discovered that introducing a certain amino acid (call it amino acid A) could terminate the synthesis of other amino acids, indicating that amino acid A was having a negative effect somewhere along a synthesis pathway like the one above. What was more interesting though was that providing the amino acid isoleucine inhibited the production of isoleucine itself. If the system is producing a specific amino acid (isoleucine), one could imagine that adding more of it would further increase the overall amount of this compound. But just the opposite was found. Thus, the system automatically adjusted itself to prevent the overproduction of isoleucine.

This is an example of negative feedback regulation. Feedback seems simple enough, thermostats and auto-pilot systems being common in the modern world. However, it carries the core of a fundamental property: the ability of a system, even if very basic, to regulate itself. In its simplest form the idea is rather benign and unproblematic. But feedback muddies our intuitions about causation.

Figure 8.1. Feedforward, feedback, and interacting systems. System 1 is purely feedforward, while system 2 contains negative feedback. System 3 contains two chains, one that produces A, one that produces B. Each of them contains negative feedback. The chains are also positively coupled so that production of A encourages production of B, and vice versa.

Consider again a system without feedback (call it system 1) (Figure 8.1). A precursor P causes some intermediate product I, which produces another chemical, and so on, until the last intermediate in the chain produces A. As far as causation goes, system 1 is straightforward. Now, consider system 2, which includes negative feedback. Here, if there’s too much of A, the production of A itself will be inhibited so that its concentration will not increase further. System 2 is not hard to understand, but with the small change (A loops back on the system), A has a causal effect on itself, which means that A is both an effect (A is produced by the system) and a cause (A affects itself). Let’s up the ante now and consider system 3, which has feedback loops as well as interactions between two amino-acid pathways. Here, amino acids A and B affect and reinforce each other; production of A stimulates the growth of B, and production of B stimulates the growth of A. But production and growth of A and B are not unbounded because they have negative loops within their respective systems. Clearly, mapping out the mechanisms of production of the amino acids in system 3 is substantially more challenging than in the other two examples.
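To get a feel for how the three systems differ, here is a toy simulation. It is a sketch under loudly stated assumptions, not real biochemistry: the intermediate steps are collapsed into a single synthesis rate, negative feedback is modeled as a linear "brake" that shuts synthesis off as the product nears a ceiling `cap`, and the parameter values are arbitrary.

```python
def simulate(feedback, coupling=0.0, steps=2000, dt=0.01, rate=1.0, cap=5.0):
    """Euler-integrate two product pools, A and B (all parameters hypothetical).

    feedback: if True, each product throttles its own synthesis near `cap`.
    coupling: how strongly each product boosts synthesis of the other.
    """
    a = b = 0.0
    for _ in range(steps):
        # Negative feedback: the more product there is, the weaker the synthesis.
        brake_a = max(0.0, 1.0 - a / cap) if feedback else 1.0
        brake_b = max(0.0, 1.0 - b / cap) if feedback else 1.0
        da = (rate + coupling * b) * brake_a
        db = (rate + coupling * a) * brake_b
        a, b = a + dt * da, b + dt * db
    return a, b

system1 = simulate(feedback=False)               # feedforward only
system2 = simulate(feedback=True)                # negative feedback
system3 = simulate(feedback=True, coupling=0.5)  # feedback plus mutual boost
print(system1, system2, system3)
```

In this sketch, system 1 keeps accumulating product for as long as it runs, system 2 levels off below the ceiling, and system 3 approaches the same ceiling faster because A and B push each other along while their own brakes keep them bounded.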

There is nothing odd, or potentially mysterious, about system 3. All the functioning is mechanistic, in the sense that all parts function according to the standard rules of chemistry and physics. All that is present is interdependence. Why do we even need to bring this up? In many experimental disciplines researchers are trained to think about causation pathways like that of system 1, and to some extent of system 2. Thus, causation appears to work in relatively simple ways; for example, higher levels of cholesterol “cause” (with multiple intermediate steps) greater heart-disease problems. As elaborated in this chapter, however, systems like system 3 can exhibit “complex” behaviors and “emergent” properties that are qualitatively different from those seen in simpler cases. And if the systems studied in biology are heavily interdependent, the field needs a change in perspective to move forward.

Of predators and prey

The Italian biologist Umberto D’Ancona was a prolific scientist who published over 300 papers and described numerous species. While studying fish catches in the Adriatic Sea, he noticed that the abundance of certain species increased markedly during the years of World War I[4], a time when fishing intensity was reduced because of the war. Puzzled by the observation, he discussed it with the Italian mathematician and physicist Vito Volterra, who had become interested in mathematical biology and whose daughter D’Ancona happened to be courting (incidentally, the two would later marry). It’s worth pointing out that at the time that D’Ancona made his observations ecology was not yet a systematic field of study (Charles Elton’s now-classic Animal Ecology was published in 1927).

In the early 1920s, Volterra, and independently Alfred Lotka, mathematically described how interactions between a predator and its prey could be precisely written out (in Volterra’s case, prey being fish, and predator being fishermen). While we don’t need to concern ourselves with the equations here, the model specifies that the number of predators, y, decays in the absence of prey, and increases based on the rate at which they consume prey. At the same time, the number of prey, x, grows if left unchecked, and decays given the rate at which it is preyed upon. The key point here is that x depends on y and, conversely, y depends on x. This interdependence means that we can eschew a description in terms of simple causation (say, “predation causes prey numbers to fall”) and consider the predator-prey system as a unit. Put differently, predator and prey numbers co-evolve, and as such characterizing and understanding them implies studying the “system,” predator plus prey.
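For readers who do want to peek under the hood, the verbal description above corresponds to the standard textbook form of the model: dx/dt = αx − βxy for the prey and dy/dt = δxy − γy for the predators. The sketch below integrates these two equations numerically. The parameter values and starting populations are arbitrary assumptions chosen only to produce visible cycles, and the Runge-Kutta integrator is a convenience of this sketch, not part of the model.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, DELTA, GAMMA = 1.0, 0.5, 0.2, 0.6

def lotka_volterra(x, y):
    """Rates of change for prey x and predators y."""
    dx = ALPHA * x - BETA * x * y   # prey grow unchecked, shrink when eaten
    dy = DELTA * x * y - GAMMA * y  # predators grow by eating, decay otherwise
    return dx, dy

def invariant(x, y):
    """Quantity that stays constant along an exact solution of the model."""
    return DELTA * x - GAMMA * math.log(x) + BETA * y - ALPHA * math.log(y)

def simulate(x0=4.0, y0=1.0, dt=0.01, steps=3000):
    """Integrate with the classical fourth-order Runge-Kutta scheme."""
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(steps):
        k1x, k1y = lotka_volterra(x, y)
        k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate()
print(min(xs), max(xs), min(ys), max(ys))
```

Neither population settles down: each rises and falls around its balance point (x = γ/δ, y = α/β), driven by the other. The `invariant` function is one way to see the pair as a single unit, since it is a property of the joint state (x, y) rather than of either population alone.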

By doing so, we aren’t saying that there are no causal interactions taking place. Fishermen do kill fish and have an immediate impact on their population. But we can treat the predator-plus-prey pair as the object of interest. Whereas this is a relatively minor conceptual maneuver in this case, it will prove instrumental when a larger constellation of actors interacts.

Against reductionism

The Lotka-Volterra predator-prey model formalized the relationship between a single predator species and a single prey species. Of course, natural habitats are not confined to two species but, as the killer whale and kelp example illustrated, multi-species interactions are the norm. Thus, unraveling an entire set of interconnections is required for deeper understanding.

The prevailing modus operandi of science can be summarized as follows: “explain phenomena by reducing them to an interplay of elementary units which could be investigated independently of each other.”[5] Such a reductionistic approach reached its zenith, perhaps, with the success of chemistry and particle physics in the twentieth century. In the present century, its power is clearly evidenced by dramatic progress in molecular biology and genetics. At its root, this attitude to science “resolves all natural phenomena into a play of elementary units, the characteristics of which remain unaltered whether they are investigated in isolation or in a complex”.

In the 1940s and 1950s, “systems thinking” started to offer an alternative mental springboard. Scholars surmised that many objects of study could be studied in terms of collections of interacting parts, an approach that could be applied to physical, biological, and even social problems. The framework developed, which some called complex systems theory, doesn’t challenge the status and role of “elementary” units (no one was about to rescind Nobel prizes such as Ernest Rutherford’s for the atomic model!). Again in the words of one of its chief proponents, Von Bertalanffy, quoted in the previous paragraph, it “asserts the necessity of investigating not only parts but also relations of organization resulting from a dynamic interaction and manifesting themselves by the difference in behavior of parts in isolation and in the whole organism”.

What does it mean to say “difference in behavior of parts in isolation and in the whole organism”? Enter emergence, a term originally coined in the 1870s to describe instances in chemistry and physiology where new and unpredictable properties appear that aren’t clearly ascribable to the elements from which they arise[6]. When amino acids organize themselves – that is, self-organize – into a protein, the protein can carry out enzymatic functions that the amino acids on their own cannot. More importantly, they behave differently as part of the protein than they would on their own. But it’s actually more than that. The dynamics of the system (that is, the protein) closes off some of the behaviors that would be open to the components (amino acids) were they not captured by the overall system. Once folded up into a protein, the amino acids find their activity regulated – they behave differently. Thus, one definition of emergence is as follows: a property that is observed when multiple elements interact that is not present at the level of the elements. Accordingly, it becomes meaningful to talk about two levels of description, a lower level of elements, and a higher level of the system.

The growth of the complex systems approach was quickly popularized by expressions such as “system,” “gestalt,” “organism,” “wholeness,” and of course the much-used “the whole is more than the sum of its parts.” In a manner that anticipated debates that would persist for decades, and still do, Von Bertalanffy stated as early as 1950 that “these concepts have often been misused, and they are of a vague and somewhat mystical character.”[7] Even more presciently, he said that the “exact scientist therefore is inclined to look at these conceptions with justified mistrust.”

Consider research in biology. The stunning developments of molecular biology, for one, raise the hope that all seemingly emergent properties can eventually be “explained away” and thereby deduced from lower level characteristics and laws – the “higher” level can therefore be reduced to the “lower” level. Reduction to basic physics and chemistry becomes, then, the ultimate goal of scientific explanations. In this view, emergence is relegated to a sort of “promissory reductionism” – if not outright discredited – given that at a more advanced stage of science emergent properties will be entirely captured by lower-level properties and laws. No doubt, it is extremely hard to argue against this line of argument. As the philosopher Terrence Deacon nicely states, looking at the world in terms of constituent parts of larger entities seems like an “unimpeachable methodology.” It is as old as the pre-Socratic Greek thinkers and remains almost an “axiom of modern science.”[8]

Both scientifically and philosophically speaking, the friction caused by the idea of emergence arises because it’s actually unclear what precisely emerges. For example, what is it about amino acids as part of proteins that differs from free floating ones? The question revolves around the exact status of “emergent properties.” Philosophers formalize the terms used by talking about the ontological status of emergence, that is, concerning the proper existence of the higher-level properties. Do emergent properties point to the existence of new laws that are not present at the lower level? Is something fundamentally irreducible at stake? These questions are so daunting that they remain by and large unsolved – and subject to vigorous intellectual battles.

Figure 8.2. Levels of explanation and scientific reduction. (A) Describing airplane aerodynamics in terms of elementary particles such as quarks is clearly not very useful. (B) Molecular configurations in three dimensions may be investigated by determining how their properties depend on chemical interactions between amino acids (which themselves determine protein structure).

Fortunately, we don’t need to crack the problem here, and can instead use lower and higher levels pragmatically when they are epistemically useful – when the theoretical stance advances knowledge. To provide an oversimplified example, we don’t need to worry about the status between quarks and aerodynamics. Massive airplanes are of course made of matter, which is an agglomeration of elementary particles such as quarks (when put together quarks form things like protons and neutrons, the stable components of atomic nuclei). But when engineers design a new airplane, they consider the laws of aerodynamics, the study of the motion of air, and particularly the behavior of a solid object, such as an airplane wing, in air – they need no training at all in particle physics! So there’s no need to really agonize about the “true” relationship between aerodynamics and particle physics. The practical thing to do is simply to study the former.

One could object to the example above because the inherent levels of particle physics and aerodynamics are far removed (Figure 8.2), one level too “micro” and the other too “macro.” More interesting cases present themselves when the constituent parts and the higher-level objects are closer to each other. For example, the behavior of an individual ant and the collective behavior of the ant colony; or the flight of a pelican and the V-shape pattern of the flock. And of course, amino acids and proteins. As the researcher Alicia Juarrero says, it’s particularly intriguing when purely “deterministic systems exhibit organized and apparently novel properties, seemingly emergent characteristics that should be predictable in principle, but are not in fact”[9]. And it’s all the more fascinating when the systems involved are made of very simple parts that obey straightforward rules. Understanding higher-level properties without having to solve the ontological question – are these properties truly new? – is clearly beneficial.

We encountered John von Neumann previously in Chapter 2. He not only was one of the major players in defining computer science as we know it, but his contributions to mathematics and physics are astounding. For instance, in 1932 he was the first to establish a rigorous mathematical framework for quantum mechanics. One of his smaller contributions was the invention of cellular automata, and this without the aid of computers, just with pencil and paper. (Another “minor” contribution by von Neumann was the invention of game theory, which is the study of mathematical models of conflict and cooperation.) A simple way to think of cellular automata is to imagine a piece of paper onto which a regular grid is drawn. Each “cell” of the grid can be in one of two states (active or inactive, or 0 or 1; think of a computer bit). The cells transition state according to simple, but precise rules, depending on the state of the cells’ neighborhood. Different types of spatial neighborhood arrangements can be utilized but consider the simplest case, with just the cells to the left and to the right of a reference cell. A rule could turn the center cell active if either neighboring cell is active (called the OR rule); another rule could turn the center on if both neighbors are active (called the AND rule). If the cells start at some state, for instance a random configuration of 0s and 1s, one can let them change states according to a specific set of rules and observe the overall behaviors that ensue (imagine a screen with pixels turning on and off). Remarkably, even simple cellular automata can exhibit rather complex behaviors, including the formation of hierarchically organized patterns that fluctuate periodically.
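The pencil-and-paper procedure just described translates almost directly into code. The following is a minimal sketch of a one-dimensional automaton using the OR and AND rules from the text. Two details are assumptions of this sketch rather than anything stated above: the row of cells wraps around at its ends, and a cell becomes inactive whenever its rule is not satisfied.

```python
import random

def or_rule(left, right):
    """Center cell turns on if either neighbor is active."""
    return 1 if (left or right) else 0

def and_rule(left, right):
    """Center cell turns on only if both neighbors are active."""
    return 1 if (left and right) else 0

def step(cells, rule):
    """Update every cell at once from its left and right neighbors.

    The row is treated as a ring (an assumption), so the first and
    last cells are neighbors of each other.
    """
    n = len(cells)
    return [rule(cells[(i - 1) % n], cells[(i + 1) % n]) for i in range(n)]

def run(cells, rule, generations):
    """Return the list of successive rows, starting with the initial one."""
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history

random.seed(0)
start = [random.randint(0, 1) for _ in range(31)]
for row in run(start, or_rule, 10):
    print("".join("#" if c else "." for c in row))
```

Swapping `or_rule` for `and_rule` in the last loop, or starting from a single active cell instead of a random row, is an easy way to see how much the overall pattern depends on the rule and the starting configuration.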

Although cellular automata were not widely known outside computer science circles, the idea was popularized more broadly with the invention of the Game of Life (or simply, Life). The game has attracted much interest not least because of the surprising ways in which patterns can evolve. From a relatively simple set of rules, some of the observed patterns are reminiscent of the rise, fall, and alterations of a society of living organisms, and have been used to illustrate the notion that “design” and “organization” can emerge in the absence of a designer[10].
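The text does not list the rules of Life, so the sketch below relies on the standard ones, stated here as background rather than as part of the chapter: each cell has eight neighbors, a live cell survives if two or three of them are live, and a dead cell comes alive if exactly three are. The set-based representation and the name `life_step` are choices of this sketch.

```python
from collections import Counter

def life_step(live):
    """Advance a set of live (row, col) cells by one generation."""
    # Count, for every cell, how many live neighbors it has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly three neighbors; survival on two or three.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "glider": five live cells that travel diagonally across the grid.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = life_step(state)
print(sorted(state))
```

After four generations the same five-cell shape reappears one row down and one column to the right. Nothing in the rules mentions movement, yet a "moving object" is a perfectly good description at the higher level.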

The examples provided by cellular automata, and others discussed in this chapter, suggest that we can adopt a pragmatic stance regarding the “true” standing of emergence. We can remain agnostic about the status itself, but adopt a complex systems framework to advance the understanding of objects with many interacting elements. Let’s discuss some ways in which this viewpoint is taking place in the field of ecology, the research area that originated the predator-prey models of Lotka and Volterra.

How do species interact?

Ecology is the scientific study of interactions between organisms and their environment. A major topic of interest centers around the cooperation and competition between species. One may conjure investigators withstanding the blazing tropical sun to study biodiversity in the Amazon, or harsh arctic winters to study fluctuations in the population of polar bears. Although such field work is necessary to gather data, theoretical work is equally needed.

What are the mechanisms of species coexistence?[11] And how does the enormous diversity of species seen in nature persist despite differences in the ability to compete for survival? Diversity indeed. For example, a 25-hectare plot in the Amazon rainforest contains more than 1,000 tropical tree species. As we’ve seen, in the 1920s mathematical tools to model the dynamics of predator-prey systems were developed. The equations for these systems were further extended and refined in the subsequent decades, and continue to be the object of much research. The study of species coexistence focuses almost exclusively on pairs of competitors, so that when considering large groups of plants or animals, the strategy is to look at all possible couples. For example, one studies 3 pairs when 3 species are involved, or 6 pairs when 4 species are considered; more generally, n(n − 1)/2 pairwise interactions between n species. Do we lose anything when examining only pairwise interactions? Higher-order interactions are missed, as when the effect of one competitor on another depends on the population density of a third species, or an even larger number of them. For example, the interaction between cheetahs and gazelles might be affected by hyenas, as the latter can easily challenge the relatively scrawny cheetahs after the kill, especially when not alone (Figure 8.3).
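The pair counts quoted above (3 pairs for 3 species, 6 pairs for 4) are simply the number of ways to choose 2 items out of n. A short sketch makes the bookkeeping explicit and also enumerates the three-way groupings that a purely pairwise analysis never examines. The species list is illustrative; "lion" is a hypothetical fourth species added here only to reach n = 4.

```python
from itertools import combinations
from math import comb

# Illustrative community; "lion" is a hypothetical addition to reach four species.
species = ["cheetah", "gazelle", "hyena", "lion"]

pairs = list(combinations(species, 2))    # what pairwise studies examine
triples = list(combinations(species, 3))  # candidate higher-order interactions

print(len(pairs), "pairs:", pairs)
print(len(triples), "triples:", triples)

# The number of pairs grows as n(n - 1)/2.
for n in (3, 4, 10, 1000):
    print(n, "species ->", comb(n, 2), "pairs")
```

Even the pairwise bookkeeping grows quickly with community size, and each triple, such as cheetah, gazelle, and hyena, is a place where a third party could change how a pair interacts.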

Figure 8.3. Species interactions. (A) Two-way interaction, such as between predator and prey. (B) A higher-order interaction occurs when an additional element affects the way the two-way interaction behaves.

The importance of higher-order effects is that, at times, they make predictions that diverge from what would be expected from only pairwise interactions. In a classic paper entitled “Will a large complex system be stable?” the theoretical biologist Robert May showed formally that community diversity destabilizes ecological systems. In other words, diverse communities lead to instabilities such as the local elimination of certain species. Recent theoretical results show, however, that higher-order interactions can cause communities with greater diversity to be more stable than their species-poor counterparts, contrary to classic theory that is based on pairwise interactions[12]. These results illustrate that to understand a complex system (diverse community) of interacting players (species), we must determine (emergent) properties at the collective level (including coexistence and biodiversity). Not only do we need to consider interactions, but we need to describe them richly enough for collective properties to be unraveled.

Neural networks

Ideas about complex systems and the closely related movement of cybernetics didn’t take long to start influencing thinking about the brain. For example, W. Ross Ashby outlined in his 1952 book Design for a Brain the importance of stability. Cybernetics researchers were interested in how systems regulate themselves and avoid instability. In particular, when a system is perturbed from its current state, how does it automatically adapt its configuration to minimize the effects of such disturbances? Not long afterward, the field of artificial neural networks (or simply neural networks) started to materialize. The growth of this new area proceeded in parallel with “standard” artificial intelligence. Whereas the latter sought to design intelligent algorithms by capitalizing on the power of newly developed computers, neural networks looked at the brain for inspiration. The general philosophy was simple: collections of simple processing elements, or “neurons,” when arranged in particular configurations called architectures generate sophisticated behaviors. And by specifying how the connections between artificial neurons change across time, neural networks learn new behaviors.

Many types of architecture were investigated, including purely feedforward and recurrent networks. In feedforward networks, information flows from an input layer of neurons where the input (for instance, an image) is registered, to one or more intermediate layers, eventually reaching an output layer, where the output is coded (indicating that the input image is a face, say). Recurrent networks, where connections can be both feedforward and feedback, are more interesting in the context of complex systems. In this type of organization, at least some connections are bidirectional and the systems can exhibit a range of properties. For example, competition can occur between parts of the network, with the consequent suppression of some kinds of activity and the enhancement of others[13]. Interested in this type of competitive process, in the 1970s, Stephen Grossberg, whom we mentioned in the previous chapter, developed Adaptive Resonance Theory. In the theory, a resonance is a dynamical state during which neuronal firings across a network are amplified and synchronized when they interact bidirectionally – they mutually support each other (see Figure 7.6 and accompanying text). Based on the continued development of the theory in the decades since its proposal, these types of bidirectional, competitive interactions have been used to explain a large number of experimental findings across the areas of perception, cognition, and emotion, for example[14].

Nonlinear dynamical systems

As we’ve seen, in the second half of the twentieth century complex systems thinking began to flourish and influence multiple scientific disciplines, from the social to the biological. The ideas gained considerable momentum with the development of an area of mathematics called nonlinear dynamical systems. It’s no exaggeration to say that nonlinear dynamical systems provide a language for complex systems. This branch of mathematics studies techniques that allow applied scientists to describe how objects change in time. It all started with the discovery of differential and integral calculus by Isaac Newton and Gottfried Leibniz in the last decades of the seventeenth century. Calculus is the first monumental achievement of modern mathematics and many consider it the greatest advance in exact thinking[15]. Newton, for one, was interested in planetary motion, and used calculus to describe the trajectories of planets in orbit.[16]

Research in dynamical systems revealed that even putatively simple systems can exhibit very rich behaviors. At first, this was rather surprising because mathematicians and applied scientists alike believed deterministic systems behave in a fairly predictable manner. Because of this intuition, many techniques relied on “linearization,” that is, considering a system to be approximately linear (at least when small perturbations are involved). What is a linear system? In essence, it is one that produces an output by summing its inputs: the more the input, the more the output, and in exact proportion to the inputs. Systems like this are predictable, and thus are stable, which is desirable when we design a system. When you change the setting on the ceiling fan to “2” it moves faster than at “1”; when set to “3” you don’t want it spinning out of control all of a sudden!

The field of nonlinear systems tells us that “linear thinking” is just not enough. Approximating the behavior of objects via linear systems does not do justice to the complexity of behaviors observed in real situations, as most clearly demonstrated by a property called chaos. Confusingly, “chaos” does not refer to erratic or random behavior but, instead, to a property of systems that follow precise deterministic laws but appear to behave randomly. Although the precise definition of “chaos” is mathematical, we can think of it as describing complex, recurring, yet not exactly repeatable behaviors. (Imagine a leaf floating in a stream caught between rocks and circling around them in a way that is repeating but never identical.) The theoretical developments in nonlinear dynamics were extremely influential because until the 1960s even mathematicians and physicists thought of dynamics in relatively simple terms[17].
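A small numerical sketch can make “deterministic yet seemingly random” concrete. The example below uses the logistic map, a standard textbook system that is not discussed in this chapter; it is swapped in here purely for illustration. The parameter value r = 4, the starting points, and the function name are likewise assumptions of the sketch.

```python
def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 100) -> list:
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0.

    The rule is fully deterministic: the same x0 always yields the same
    sequence. The choice r = 4 is an assumption made for this illustration.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


if __name__ == "__main__":
    a = logistic_trajectory(0.2)
    b = logistic_trajectory(0.2 + 1e-10)  # a nearly identical starting point

    # Distance between the two trajectories at each step.
    gaps = [abs(x - y) for x, y in zip(a, b)]
    for step in (0, 5, 20, 40, 60):
        print(f"step {step:3d}: gap = {gaps[step]:.3e}")
```

Running the sketch shows the two trajectories tracking each other closely at first and then drifting apart until they bear no resemblance to one another, even though nothing random was ever introduced.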

The field of dynamical systems has greatly enriched our understanding of natural and artificial systems. Even those with relatively simple descriptions can exhibit behaviors that are not possible to predict with confidence. Nonlinear dynamical systems not only contribute to our view of how interacting elements behave, but also define both a language and a formal system to characterize “emergent” behaviors. In a very real sense, they have greatly helped demystify some of the vague notions described in the early days of systems thinking. We now have a precise way to tackle the question of “the whole is greater than the sum of its parts.”

The brain as a complex system

Complex systems are now a sprawling area encompassing applied and theoretical research. The goal of this chapter was to introduce the reader to some of its central ideas (a rather optimistic proposition without writing an entire book!). Whereas the science of complexity has evolved enormously in the past 70-odd years, experimental scientists are all too often anchored on a foundation that is skeptical of some of the concepts discussed in this chapter. But with the mathematical and computational tools available now, there is little reason for that anymore.[18] What are some of the implications of complex systems theory for our goal of elucidating brain functions and how they relate to parts of the brain?

Interactions between parts. The brain is a system of interacting parts. At a local level, say within a specific region, populations of neurons interact. But interactions are not only local to the area, and a given behavior relies on communication between many regions. Anatomical connectivity provides the substrate for interactions that span multiple parts of the cortex, as well as bridging cortex and subcortex. This view stands in sharp contrast to a “localizationist” framework that treats regions as relatively independent units.

Levels of analysis. This concept is related to the previous one, but emphasizes a different point. All physical systems can be studied at multiple levels, from quarks up to the object of interest. It’s not always valuable to study the multiple levels (worrying about quarks in aerodynamics, say). But in the brain, studying multiple levels and understanding their combined properties is essential. One can think of neuronal circuits from the scale of a few neurons in a rather delimited area of space, to larger collections across broader spatial extents. Multiple spatial scales will be of interest, including large-scale circuits with multiple brain regions spanning cortex and subcortex. A possible analogy is the investigation of the ecology of the most biodiverse places on earth, including the Amazon forest and the Australian barrier reef. One can study these systems at very different spatial scales, from local patches of the forest and a few species to the entire coral reef with all its species.

Time, process. Complex systems, like the brain, are not static – they are inherently dynamic. As in predator-prey systems, it is useful to shift one’s perspective from one of simple cause-and-effect to that of a process that evolves in time – a natural shift in stance given the interdependence of the parts involved. When we say a “process” there need not be anything nebulous about it. For example, in the case of three-body celestial orbits under the influence of Newtonian gravity, the equations can be precisely defined and solved numerically to reveal the rich pattern of paths traversed[19].

Decentralization, heterarchy. Investigating systems in terms of the interactions between their parts fosters a way of thinking that favors decentralized organization. It is the coordination between the multiple parts that leads to the behaviors of interest, not a master “controller” that dictates the function of the system. In many “sophisticated” systems, and the brain is no exception, it is instinctive to think that many of its important functions depend on centralized processes. For example, the prefrontal cortex may be viewed as a convergence sector for multiple types of information, allowing it to control behavior (see Chapter 7). A contrasting view favors distributed processing via interactions of multiple parts. Accordingly, instead of information flowing hierarchically to an “apex region” where all the pieces are integrated,

[1] “Intuition pumps” comes from Dennett (2013).

[2] Kelp ecosystem example from Carroll (2016).

[3] Bacterial regulation example from Carroll (2016).

[4] https://en.wikipedia.org/wiki/Lotka-Volterra_equations (July 28, 2020)

[5] This and the next two quotes are from von Bertalanffy (1950, pp. 219-220).

[6] Paragraph largely based on Juarrero (2000, p. 7). The term emergence appears to have first been proposed in the 1870s by George Henry Lewes in his book Problems of Life and Mind and taken up by Wilhelm Wundt in his book Introduction to Psychology.

[7] von Bertalanffy (1950, p. 225)

[8] Deacon (2011).

[9] Juarrero (1999, p. 6).

[10] https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life

[11] Paragraph draws from Levine et al. (2017).

[12] Bairey et al. (2016). Just a few years ago, Levine et al. (2017, p. 61) pointed out that “higher-order interactions need to be demystified to become a regular part of how ecologists envision coexistence, and identifying their mechanistic basis is one way of doing so.”

[13] Some forms of competition are possible in feedforward networks, too.

[14] Grossberg (2021).

[15] See quote by von Neumann (https://en.wikipedia.org/wiki/Calculus).

[16] The problem of stability was central to celestial mechanics. For example, what types of trajectories do two bodies, such as the earth and the sun, exhibit? The so-called two-body problem was completely solved by Johann Bernoulli in 1734 (his brother Jacob is famous for his contributions in the field of probability, including the first version of the law of large numbers). For more than two bodies (for example, the moon, the earth, and the sun), the problem has vexed mathematicians for centuries. Remarkably, the motion of three bodies is generally non-repeating, except in special cases. See https://en.wikipedia.org/wiki/Three-body_problem and http://www.sciencemag.org/news/2013/03/physicists-discover-whopping-13-new-solutions-three-body-problem.

[17] More technically, until the 1960s, attractors were thought in terms of simple geometric subsets of the phase space (roughly speaking, the possible states of a system), like points, lines, surfaces, and simple regions of three-dimensional space (https://en.wikipedia.org/wiki/Attractor).

[18] von Bertalanffy (1950) stated that concepts like “system” and “wholeness,” to which we could add “emergence” and “complexity,” are vague and even somewhat mystical, and indeed many scientists displayed mistrust when faced with these concepts.

[19] Šuvakov and Dmitrašinović (2013). See also footnote 16.

The Networked Brain

Chapter 1: From one area at a time to networked systems

We begin our journey into how the brain brings about the mind: our perceptions, actions, thoughts, and feelings. Historically, the study of the brain has proceeded in a divide-and-conquer way, trying to figure out the function of individual areas – chunks of gray matter that contain neurons in either the cortex or subcortex – one at a time. The book makes the case that, as the brain is not a modular system, we need conceptual tools that can help us decipher how highly networked, complex systems function.

In 2016, a group of investigators published a map of the major subdivisions of the human cerebral cortex – the outer part of the brain – in the prestigious journal Nature (Figure 1.1). The partition delineated 180 areas in each hemisphere (360 in total), each representing a unit of “architecture, function, and connectivity.”[1] Many researchers celebrated the new result, highlighting the long-overdue need to replace the de facto standard called the “Brodmann map.” Published in 1908 by Korbinian Brodmann, the map describes approximately 50 areas in each hemisphere (100 in total) based on local features, such as cell type and density, that Brodmann discovered under the microscope.

Figure 1.1. Map of brain areas of the cortex published in 2016 with the hope of replacing the standard Brodmann map of 1908. In the 2016 map, each hemisphere (or half of the brain) contains 180 areas. Figure from Glasser et al. (2016).

Notwithstanding the need to move past a standard created prior to the First World War, the 2016 cartographic description builds on an idea that was central to prior efforts: brain tissue should be understood in terms of a set of well-defined, spatially delimited sectors. Thus the concept of a brain area or region[2]: a unit that is both anatomically and functionally meaningful. The notion of an area/region is at the core of neuroscience as a discipline, with its central challenge of unravelling how behaviors originate from cellular matter. Put another way, how does function (manifested externally via behaviors) relate to structure (such as different neuron types and their arrangement)? How do groups of neurons – the defining cell type of the brain – lead to sensations and actions?

As a large and heterogeneous collection of neurons and other cell types, the central nervous system – with all of its cortical and subcortical parts – is a formidably complex organ. (The cortex is the outer surface with grooves and bulges; the subcortex comprises other cell masses that sit underneath. We’ll go over the basics of brain anatomy in Chapter 2.) To unravel how it works, some strategy of divide and conquer seems to be necessary. How else can it be understood without breaking it down into subcomponents? But this approach also exposes a seemingly insurmountable chicken-and-egg problem: if we don’t know how it works, how can we determine the “right” way to subdivide it? Finding the proper unit of function, then, has been at the center of the quest to crack the mind-brain problem.

Historically, two winners in the search for rightful units have been the neuron and the individual brain area. At the cellular level, the neuron reigns supreme. Since the work of Ramón y Cajal[3], the Spanish scientific giant who helped establish neuroscience as an independent discipline, the main cell type of the brain is considered to be the neuron (which comes in many varieties both in terms of morphology and physiology). These cells communicate with one another via electrochemical signaling. If they are sufficiently excited by other neurons, their equilibrium voltage changes and they generate a “spike”: an electrical signal that propagates along the neuron’s thin extensions (called axons), much like a current flowing through a wire. The spike from a neuron can then influence downstream neurons. And so on.

At the supra-cellular level, the chief unit is the area. But what constitutes an area? Dissection techniques and the study of neuroanatomy during the European Renaissance were propelled to another level by Thomas Willis’s monumental Cerebri anatome published in 1664. The book rendered in exquisite detail the morphology of the human brain, including detailed drawings of subcortical structures and the cerebral hemispheres containing the cortex. For example, Willis described a major structure of the subcortex, the striatum, that we’ll discuss at length in the chapters to follow. With time, as anatomical methods improved with more powerful microscopes and diverse stains (which mark the presence of chemical compounds in the cellular milieu), more and more subcortical areas were discovered. In 1819, the German anatomist Karl Burdach described a mass of gray matter that could be seen in slices through the temporal lobe. He called the structure the “amygdala,” given that it’s shaped like an almond[4] (“amygdala” means almond in Latin); it is now famous for its contributions to fear processes. And techniques developed in the second half of the 20th century revealed that it’s possible to delineate at least a dozen subregions within its overall territory.

The seemingly benign question — what counts as an area? – is far from straightforward. For instance, is the amygdala one region or twelve? This region is far from an esoteric case. All subcortical areas have multiple subdivisions, and some have boundaries that are more like fuzzy zones than clearly defined lines. The challenges of partitioning the cortex, the outer laminated mantle of the cerebrum, are enormous too. That’s where the work of Brodmann and others, and more recently the research that led to the 180-area parcellation (Figure 1.1), comes in. It introduces a set of criteria to subdivide the cortex into constituent parts. For example, although neurons in the cortex are arranged in a layered fashion, the number of cell layers can vary. Therefore, identifying a transition between two cortical sectors is aided by differences in cell density and layering.

How modular is the brain?

When subdividing a larger system – one composed of lots of parts – the concept of modularity comes into play. Broadly speaking, it refers to the degree of interdependence of the many parts that comprise the system of interest. On the one hand, a decomposable system is one in which each subsystem operates according to its own intrinsic principles, independently of the others – we say that this system is highly modular. On the other hand, a nondecomposable system is one in which the connectivity and inter-relatedness of the parts is such that they are no longer clearly separable. Whereas the two extremes serve as useful anchors to orient our thinking, in practice one finds a continuum of possible organizations, so it’s more useful to think of the degree of modularity of a system.

Science as a discipline is inextricably associated with understanding entities in terms of a set of constituent subparts. Neuroscience has struggled with this modus operandi since its early days, and debates about “localizationism” versus “connectionism” – how local or how interconnected brain mechanisms are – have always been at the core of the discipline. By and large, a fairly modular view has prevailed in neuroscience. Fueled by a reductionistic drive that has served science well, most investigators have formulated the study of the brain as a problem of dissecting the multitude of “suborgans” that make it up. To be sure, brain parts are not viewed as isolated islands, and are understood to communicate with one another. But, commonly, the plan of attack assumes that the nervous system is decomposable[5] in a meaningful way in terms of patches of tissue (as in Figure 1.1) that perform well-defined computations. If we can only determine what these are.

There have been proposals of non-modular processing, too. The most famous example is that of Karl Lashley who, starting in the 1930s, defended the idea of “cortical equipotentiality,” namely that most of the cortex functions jointly, as a unit. Thus, the extent of a behavioral deficit caused by a lesion depended on the amount of cortex that was compromised – small lesions cause small deficits, large lesions cause larger ones. Although Lashley’s proposal was clearly too extreme and was rejected empirically, many ideas of decentralized processing have historically been entertained by neuroscientists. Let’s discuss some of their origins.

The networked brain

The field of artificial intelligence (AI) is said to have been born at a workshop at Dartmouth College in 1956. Early AI focused on the development of computer algorithms that could emulate human-like “intelligence,” including simple forms of problem solving, planning, knowledge representation, and language understanding. A parallel and competing approach – what was to become the field of artificial neural networks, or neural networks, for short – took its inspiration instead from natural intelligence, and adopted basic principles of the biology of nervous systems. In this non-algorithmic framework, collections of simple processing elements work together to execute a task. An early example was the problem of pattern recognition, such as recognizing sequences of 0s and 1s. A more intuitive, modern application addresses the goal of image classification. Given a set of pictures coded as a collection of pixel intensities, the task is to generate an output that signals a property of interest; say, output “1” if the picture contains a face, “0” otherwise. The underlying idea behind artificial neural networks was that “intelligent” behaviors result from the joint operation of simple processing elements, like artificial neurons that sum their inputs and generate an output if the sum exceeds a certain threshold value. We’ll discuss neural networks again in Chapter 8, but here we emphasize their conceptual orientation: thinking of a system in terms of collective computation.
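The artificial neuron just described, one that sums its inputs and produces an output only if the sum exceeds a threshold, can be written in a few lines. This is a minimal sketch, not a model from the literature cited here: the function name, the weights, and the threshold value are all chosen only for illustration.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0.

    Hypothetical helper for illustration: the weights and threshold are
    set by hand here, whereas a learning rule would adjust them over time.
    """
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


if __name__ == "__main__":
    # With equal weights and a threshold of 1.5, the unit responds only
    # when both of its inputs are active at the same time.
    weights = (1.0, 1.0)
    for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(pattern, "->", threshold_neuron(pattern, weights, threshold=1.5))
```

A single unit like this can only do very little. The conceptual bet of neural networks is that many such units, suitably connected, jointly carry out tasks that none of them could perform alone.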

The 1940s and 1950s were also a time when, perhaps for the first time, scientists started systematically developing theories of systems generally conceived. The intellectual cybernetics movement was centrally concerned with how systems regulate themselves so as to remain within stable regimes; for example, normal, awake human temperature remains within a narrow range, varying less than a degree Celsius. Systems theory, also called general systems theory or complex systems theory, tried to formalize how certain properties might originate from the interactions of multiple, and possibly simple, constituent parts. How does “wholeness” come about in a way that is not immediately explained by the properties of the parts?

Fast forward to 1998 when Duncan Watts and Steven Strogatz published a paper entitled “Collective dynamics of ‘small world’ networks”[6]. The study proposed that the organization of many biological, technological, and social networks gives them enhanced signal-propagation speed, computational power, and synchronization among parts. And these properties are possible even in systems where most elements are connected locally, with only some elements having “arbitrary” connections. (For example, consider a network of interlinked computers, such as the internet. Most computers are only connected to others in a fairly local manner; say, within a given department within a company or university. However, a few computers have connections to other computers that are geographically quite far.)

Watts and Strogatz applied their techniques to study the organization of a social network containing more than 200,000 actors. As we’ll discuss in Chapter 10, to make a “network” out of the information they had available, they considered two actors to be “connected” if they had appeared in a film together. Although a given actor was only connected to a small number of other performers (around 60), they discovered that it was possible to find short “paths” between any two actors. (The path A – B – C links actors A and C, who have not participated in the same film, if both of them have co-acted with actor B.) Remarkably, on average, paths containing only four connections (such as the path A – B – C – D – E linking actors A and E) separated a given pair of actors picked at random from the set of 200,000. The investigators dubbed this property “small world” by analogy with the popularly known idea of “six degrees of separation”, and suggested that it is a hallmark of many types of network – one can travel from A to Z very expediently.
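The notion of a “path” between actors can be made concrete with a toy example. The sketch below builds a co-appearance network from a handful of made-up films and finds the number of connections on the shortest path between two actors using breadth-first search. The film casts, the function names, and the choice of breadth-first search are assumptions for illustration; this is not the procedure or data used by Watts and Strogatz.

```python
from collections import deque
from itertools import combinations


def build_costar_graph(films):
    """Link every pair of actors who appear together in at least one film."""
    graph = {}
    for cast in films.values():
        for actor in cast:
            graph.setdefault(actor, set())
        for a, b in combinations(cast, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def path_length(graph, start, goal):
    """Number of connections on a shortest path, or None if no path exists."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for costar in graph[actor]:
            if costar == goal:
                return dist + 1
            if costar not in seen:
                seen.add(costar)
                queue.append((costar, dist + 1))
    return None


if __name__ == "__main__":
    # Hypothetical casts: each film links only "neighboring" actors.
    films = {
        "film_1": ["A", "B"],
        "film_2": ["B", "C"],
        "film_3": ["C", "D"],
        "film_4": ["D", "E"],
    }
    graph = build_costar_graph(films)
    print(path_length(graph, "A", "C"))  # path A - B - C
    print(path_length(graph, "A", "E"))  # path A - B - C - D - E

    # One additional film acts as a shortcut between distant actors.
    films["film_5"] = ["A", "D"]
    print(path_length(build_costar_graph(films), "A", "E"))
```

In the toy network, a single extra film linking two otherwise distant actors shortens the route between A and E, which is the intuition behind how a few non-local connections can keep paths short in a mostly local network.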

The paper by Watts and Strogatz, and a related paper by Albert-László Barabási and Réka Albert that appeared the following year[7], set off an avalanche of studies on what has become known as “network science” – the study of interconnected systems comprised of more elementary components, such as a social network of individual persons. This research field has grown enormously since then, and novel techniques are actively being applied to social, biological, and technological problems to refine our view of “collective behaviors.” These ideas resonated with research in brain science, too, and it didn’t take long before investigators started applying network techniques to study their data. This was particularly the case in human neuroimaging, which employs Magnetic Resonance Imaging (MRI) scanners to measure activity throughout the brain during varied experimental conditions. Network science provides a spectrum of analysis tools to tackle brain data. First and foremost, the framework encourages researchers to conceptualize the nervous system in terms of network-level properties. That is to say, whereas individual parts – brain areas or other such units – are important, collective or system-wide properties must be targeted.

Neuroscientific explanations

Neuroscience seeks to answer the following central question: How does the brain generate behavior?[8] Broadly speaking, there are three types of study: lesion, activity, and manipulation. Lesion studies capitalize on naturally occurring injuries, including due to tumors and vascular accidents; in non-human animals, precise lesions can be created surgically, thus allowing much better control over the affected territories. What types of behavior are impacted by such lesions? Perhaps patients can’t pay attention to visual information as they used to; or maybe they have difficulty moving a limb. Activity studies measure brain signals. The classical technique is to insert a microelectrode into the tissue of interest and measure electrical signals in the vicinity of neurons (it is also possible to measure signals inside a neuron itself, but such experiments are more technically challenging). Voltage changes provide an indication of a change in state of the neuron(s) closest to the electrode tip. And determining how such changes are tied to the behaviors performed by an animal provides clues about how they contribute to them. Manipulation studies directly alter the state of the brain by either silencing or enhancing signals. Again, the goal is to see how sensations and actions are affected.

Although neuroscience studies are incredibly diverse, one way to summarize them is as follows: “Area or circuit X is involved in behavior Y” (where a circuit is a group of areas). A lesion study might determine that patients with damage to the so-called cortex of the anterior insula have the ability to quit smoking easily, without relapse, leading to the conclusion that the insula is a critical substrate in the addiction to smoking[9]. Why? Quitting is hard in general, of course. But it turns out to be easy if one’s anterior insula is nonfunctional. It’s logical, therefore, to surmise that, when intact, this region’s operation somehow promotes addiction. An activation study using functional MRI might observe stronger signals in parts of the visual cortex when participants view pictures of faces compared to when they are shown many kinds of pictures that don’t contain faces (pictures of multiple types of chairs, shoes, etc.). This could prompt the suggestion that this part of the visual cortex is important for the perception of faces. A manipulation study could enhance activity in prefrontal cortex in monkeys, and observe an improvement in tasks that require careful attention to visual information.

Many journals require “significance statements” in which authors summarize the importance of their studies to a broader audience. In the instances of the previous paragraph, the authors could say something like this: 1) the insula contributes to conscious drug urges and to decision-making processes that precipitate relapse; 2) the fusiform gyrus (the particular area of visual cortex that responds vigorously to faces) is involved in face perception; and 3) the prefrontal cortex enhances performance of behaviors that are challenging and require attention.

Figure 1.2. Because little is known about how brain mechanisms bring about behaviors, neuroscientists permeate their papers with “filler” verbs as listed above, most of which do not add substantive content to the statements made. Figure from Krakauer (2017).

The examples above weren’t gratuitous; all were important studies published in respected scientific journals[10]. Although these were rigorous experimental studies, they don’t quite inform about the underlying mechanisms[11]. In fact, if one combs the peer-reviewed literature, one finds a plethora of filler terms[12] – words like “contributes”, “involved”, and “enhances” above (Figure 1.2) – that stand in for the processes we presume did the “real” work. This is because, by and large, neuroscience studies don’t sufficiently determine, or even strongly constrain, the underlying mechanisms that link brain to behavior.

Scientists strive to discover the mechanisms supporting the phenomena they study. But what precisely is a mechanism? Borrowing from the philosopher William Bechtel, it can be defined as “a structure performing a function in virtue of its parts, operations, and/or organization. The functioning of the mechanism is responsible for one or more phenomena”[13]. Rather abstract, of course, but in essence how something happens. The more clear-cut we can be about it, the better. For example, in physics, precision actually involves mathematical equations. Note that mechanisms and explanations are always at some level of explanation. A typical explanation about combustion motors in automobiles will invoke pistons, fuel, controlled explosions, etc. It will not discuss these phenomena in terms of particle physics, for instance; it won’t invoke electrons, protons, or neutrons.

We currently lack an understanding of most brain science phenomena. Therefore, when an experiment finds that changes occur in, say, the amygdala during classical aversive conditioning (learning that a once-innocuous stimulus is now predictive of a shock; see Chapter 5), we might find that cell responses there increase in parallel to the observed behavior – as the behavior is acquired, cell responses concomitantly increase. Although this is a very important finding, it remains relatively shallow in clarifying what’s going on. Of course, if via a series of studies we come to discern how amygdala activity increases, decreases, or stays the same when learning changes accordingly, we are closer to legitimately saying that we grasp the underlying mechanisms.

Pleading ignorance

How much do we know about the brain today? In the media, there is no shortage of news about novel discoveries explaining why we are all stressed, overeat, or cannot stick to new-year commitments. General-audience books on brain and behavior are extremely popular, even if we don’t count the ever-ubiquitous self-help books, themselves loaded with purported insights from brain science. And judging from the size of graduate school textbooks (some of which are even hard to lift), current knowledge is a deep well.

In reality, we know rather little. What we’ve learned barely scratches the surface.

Consider, for example, a recent statement by an eminent neuroscientist: “Despite centuries of study of brain–behavior relationships, a clear formalization of the function of many brain regions, accounting for the engagement of the region in different behavioral functions, is lacking”[14]. A clear-headed description of our state of ignorance was given by Ralph Adolphs and David Anderson, both renowned professors at the California Institute of Technology, in their book The Neuroscience of Emotion:[15]

 We can predict whether a car is moving or not, and how fast it is moving, by ‘imaging’ its speedometer. That does not mean that we understand how an automobile works. It just means that we’ve found something that we can measure that is strongly correlated with an aspect of its function. Just as with the speedometer, imaging [measuring] activity in the amygdala (or anywhere else in the brain), in the absence of further knowledge, tell us nothing about the causal mechanism and only provides a ‘marker’ that may be correlated with an emotion.

Although these authors were discussing the state of knowledge regarding emotion and the brain, it’s fair to say that their summary applies to neuroscience more generally – the science of brain and behavior is still in its (very) early days.

The gap – no, gulf – between scientific knowledge and how it is portrayed by the general media is sizeable indeed. And not only a piece in a popular magazine in a medical office, but a serious article in, say, the New York Times or The Guardian, newspapers of some heft. The problem even extends to most science communication books, especially those with a more clinical or medical slant.

Mechanisms and complexity in biology

How does something work? As discussed above, science approaches this question by trying to work out mechanisms. We seek “machine-like” explanations, much like describing how an old, intricate clock functions. Consider a Rube Goldberg apparatus (for an example, see Figure 1.3), accompanied by directions on how to use it to turn a book page[16]:

Figure 1.3. Rube Goldberg apparatus as an example of mechanical explanation. The text describes another example.

(1) Turn the handle on a toy cash register to open the drawer.

(2) The drawer pushes a golf ball off a platform, into a small blue funnel, and down a ramp.

(3) The falling golf ball pulls a string that releases the magic school bus (carrying a picture of Rube Goldberg) down a large blue ramp.

(4) Rube’s bus hits a rubber ball on a platform, dropping the ball into a large red funnel.

(5) The ball lands on a mousetrap (on the orange box) and sets it off.

(6) The mousetrap pulls a nail from the yellow stick.

(7) The nail allows a weight to drop.

(8) The weight pulls a cardboard ‘cork’ from an orange tube.

(9) This drops a ball into a cup.

(10) The cup tilts a metal scale and raises a wire.

(11) The wire releases a ball down a red ramp.

(12) The ball falls into a pink paper basket.

(13) The basket pulls a string to turn the page of the book.

The “explanation” above works because it provides a causal narrative: a series of cause-and-effect steps that slowly but surely lead to the outcome. Although this example is artificial of course (no one would turn a page like that), it epitomizes a style of explanation that is the gold standard of science.

Yet, biological phenomena frequently involve complex, tangled webs of explanatory factors[17]. Consider guppies, small fish native to streams in South America, which show geographical variation in many kinds of traits, including color patterns. To explain the morphological and behavioral variation among guppies, the biologist John Endler suggested that we consider a “network of interactions” (Figure 1.4). The key point was not to focus on the details of the interactions, but on the fact that they exist. Complex as it may look, Endler’s network is “simple” as far as biological systems go. It doesn’t involve bidirectional influences (double-headed arrows), that is, those in which A affects B and B affects A in turn (see Chapter 8). Most biological systems, however, do include such bidirectional influences.

Figure 1.4. Multiple explanatory factors that influence morphological and behavioral variation among South American guppies. Figure from Endler (1995).

Contrast this state of affairs with the vision encapsulated by Isaac Newton’s statement that “Truth is ever to be found in simplicity, and not in the multiplicity and confusion of things”[18]. This stance is such an integral part of the canon of science as to constitute a form of First Commandment. Newton himself was building on the work of René Descartes, the French polymath who helped canonize reductionism (Chapter 4) as part of Western thinking and philosophy. To Descartes, the world was to be regarded as a clockwork mechanism. That is to say, in order to understand something it is necessary to investigate the parts and then reassemble the components to recreate the whole[19] – the essence of reductionism. Fast-forward to the second half of the twentieth century. The dream of extending the successes of the Cartesian mindset captivated biologists. As Francis Crick, one of the co-discoverers of the structure of DNA, put it, “The ultimate aim of the modern movement in biology is to explain all biology in terms of physics and chemistry”[20]. Reductionism indeed.

So, where is neuroscience today? The mechanistic tradition set off by Newton’s Principia Mathematica – arguably the most influential scientific achievement ever – is a major conceptual driving force behind how brain scientists think. Although many biologists view their subject matter as different from that of physics, for example, scientific practice is very much dominated by a mechanistic approach. The present book embraces a different way of thinking, one that revolves around ideas of “collective phenomena”, ideas of networks, and ideas about complexity. The book is as much an account of what we know about the brain as it is a text meant to stimulate thinking about the brain as a highly complex network system.

Before we can start our journey, we need to define a few terms and learn a little about anatomy.

[1] The full quote from the paper abstract was “we delineated 180 areas per hemisphere bounded by sharp changes in cortical architecture, function, connectivity, and/or topography” (Glasser et al., 2016; p. 171).

[2] The terms “area” and “region” are not distinguished in the book. Neuroscientists more commonly use the former to specify putatively well-delineated parts.

[3] For his work on the structure of the nervous system, Ramon y Cajal was awarded the Nobel Prize in 1906.

[4] Burdach actually described what is currently called the “basolateral amygdala”. Other parts were added later by others, notably Johnston (1923). See Swanson and Petrovich (1988).

[5] When communicated by the media, neuroscience findings are almost exclusively phrased in highly modular terms. We’ve all heard headlines about the amygdala being the “fear center in the brain”, the existence of a “reward center,” as well as “spots” where memory, language, and so on, take place. Whereas the media’s tendency to oversimplify is clearly at play here, neuroscientists are at fault, too.

[6] Watts and Strogatz (1988).

[7] Barabási and Albert (1999).

[8] For a related discussion, see Krakauer et al. (2017).

[9] Naqvi et al. (2007).

[10] Addiction: Naqvi et al. (2007); faces: Kanwisher et al. (1997); attention: Noudoost et al. (2010).

[11] But the stimulation studies described by Noudoost et al. (2010) come closest.

[12] Krakauer et al. (2017).

[13] Bechtel (2008, p. 13).

[14] Genon et al. (2018, p. 362): Although the statement refers to “many regions,” the point applies to most if not all regions.

[15] Adolphs and Anderson (2018, p. 31).

[16] Instructions 1-13 are verbatim from Woodward (2013, p. 43).

[17] Example borrowed from Striedter (2005) based on the work by Endler (1995).

[18] Cited by Mazzocchi (2008, p. 10).

[19] Mazzocchi (2008, p. 10).

[20] Mazzocchi (2008, p. 10).

What kind of brain network? Overlapping and dynamic

In a highly networked system like the brain, we need to shift from thinking in terms of isolated brain regions, and adopt the language of networks: Networks of brain regions collectively support behaviors. The network itself is the unit, not the brain area (Figure 1). Consequently, processes that support behavior are not implemented by an individual area, but depend on the interaction of multiple areas, which are dynamically recruited into multi-region assemblies (more on dynamic aspects below).

Figure 1. What’s the rightful functional unit of interest? Historically, the brain area took center spot. A better unit is a network of brain regions working together.

Functional networks are based on the relationships of the signals in disparate parts of the brain, not on the status of their physical connections. The spatial scale of functional circuits varies considerably, from those linking nearby areas to large ones crisscrossing the brain. The most intriguing networks are possibly those discovered with functional MRI. To identify networks, investigators capitalize on “clustering methods”, general-purpose algorithms from computer science that sort basic elements (here, areas) into different groups. The objective is to subdivide a set of elements into natural clusters, also known as communities. (These are also called modules by network researchers, but this is confusing in the case of neuroscience given the meaning of “modularity”.)

Intuitively, a community should have more internal than external associations. For example, if we consider the set of all actors in the US, we can group them into theater and film clusters (theater actors work with and know each other more so than they work/know film actors). This notion can be formalized: communities are determined by subdividing a set of objects by maximizing the number of within-group connections, and minimizing the number of between-group connections. Remember that a connection in a graph is a link between two elements that share the relationship in question, such as between two theater actors who’ve worked together, or two actors that were in the same movie. Thus, theater actors will tend to group with other theater actors, and less so with film actors, and vice versa.
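To make the within-group versus between-group idea concrete, here is a minimal Python sketch. The actor names, the collaboration links, and the helper `count_edges` are all invented for illustration; actual community-detection algorithms optimize a more refined score than these raw counts, so treat this only as a picture of the intuition.

```python
# Hypothetical toy collaboration graph: each edge joins two actors who worked together.
edges = [
    ("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"),    # a theater circle
    ("Dee", "Eli"), ("Dee", "Fay"), ("Eli", "Fay"),  # a film circle
    ("Cy", "Dee"),                                   # one cross-over collaboration
]

def count_edges(edges, community_of):
    """Return (within, between) edge counts for an assignment of nodes to groups."""
    within = sum(1 for a, b in edges if community_of[a] == community_of[b])
    return within, len(edges) - within

# A sensible partition and a scrambled one (both are assumptions of this example).
good = {"Ana": "theater", "Ben": "theater", "Cy": "theater",
        "Dee": "film", "Eli": "film", "Fay": "film"}
bad = {"Ana": "theater", "Ben": "film", "Cy": "theater",
       "Dee": "film", "Eli": "theater", "Fay": "film"}

good_counts = count_edges(edges, good)  # many within-group links, few between
bad_counts = count_edges(edges, bad)    # the reverse
print(good_counts, bad_counts)
```

The sensible partition keeps six of the seven links inside a group, whereas the scrambled one keeps only two, which is why a partitioning method would prefer the former.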

The most popular partitioning schemes parse individual elements (brain regions in a brain network, persons in a social network, etc.) into unique groupings – a node belongs to one and exactly one community. Based on functional MRI data at rest, the study by Yeo and colleagues discussed above described a seven-community division of the entire cortex, where each local patch of tissue belongs to a single community. In other words, the overall space is broken into disjoint communities. Their elegant work has been very influential and their 7-network partition was adopted as a sort of canonical subdivision of the cortex (see Figure 2). (Intriguingly, they also described an alternative 17-community subdivision of the cortex, but this one didn’t become very popular, likely because 17 is “too large” for neuroscientists to wrap their heads around.) Whereas discrete clusters simplify the description of a system, are they too inflexible, leading to the loss of valuable information?

Figure 2. Seven-network parcellation of Yeo et al. The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J Neurophysiol 106: 1125–1165.

Think about the actors mentioned above. Perhaps they neatly subdivide into theater and film groups, and perhaps into some other clear subgroups (Broadway and Off-Broadway theater performers?). Yet, real-world groupings are seldom this simple, and in this case a particular artist might belong to more than one set (acting in both theater and film, say). In fact, several scientific disciplines, including sociology and biology, have realized the potential of overlapping network organization. For example, the study of chemical interactions reveals that a substantial fraction of proteins interact with several protein groups, indicating that actual networks are made of interwoven sets of overlapping communities. How about the brain?

Consider the versions of the connectivity “core” discussed previously, which contains regions distributed across the cortex, or across the entire brain. These areas are not only strongly interconnected, but also linked with many other parts of the brain. Put another way, by definition, regions of the core are those talking to lots of other ones. Traditional disjoint network partitioning schemes emphasize the within-community grouping while minimizing the between-community interactions. But regions of the core are both very highly interconnected and linked with disparate parts of the brain. So, how should we think about them?

Network science has additional tools that can help. One of them is to think of nodes as having a spectrum of computational properties. Two things matter: how well connected a node is, and how widely distributed its links are. Nodes that are particularly well connected are called hubs (with a meaning similar to that in “airport hub”), a property that is formally captured by a mathematical measure called centrality. Hubs come in many different flavors, such as connector hubs that have links to many communities, and provincial hubs that are well connected within their particular community. We can thus think of the former nodes as more “central” in the overall system than the latter.

Nodes that work as connector hubs are distinctly interesting because they have the potential to integrate diverse types of signals if they receive connections from disparate sources, and/or to distribute signals widely if they project to disparate targets. They are a particularly good reminder that communities are not islands; nodes within a community have connections both within their community and to other clusters.
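The distinction between provincial and connector hubs can be sketched in a few lines of Python. The network below is invented, and the second measure used here, the participation coefficient, is a measure from the network-science literature that the text itself does not name; it is an assumption of this example, chosen because it is 0 when all of a node's links stay inside one community and grows as the links spread across communities.

```python
from collections import Counter

# Hypothetical toy network with two communities, "X" and "Y".
community = {"a": "X", "b": "X", "c": "X", "d": "X",
             "e": "Y", "f": "Y", "g": "Y"}
edges = [("a", "b"), ("a", "c"), ("a", "d"),               # "a" is well connected inside X
         ("d", "e"), ("d", "f"), ("d", "g"), ("d", "b"),   # "d" also reaches into Y
         ("e", "f")]

neighbors = {n: set() for n in community}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

def degree(node):
    """Number of links attached to the node (the simplest centrality measure)."""
    return len(neighbors[node])

def participation(node):
    """1 minus the sum, over communities, of the squared share of the node's links."""
    k = degree(node)
    per_community = Counter(community[m] for m in neighbors[node])
    return 1.0 - sum((count / k) ** 2 for count in per_community.values())

for node in ("a", "d"):
    print(node, degree(node), round(participation(node), 2))
```

In this toy network, node "a" behaves like a provincial hub (three links, all inside its own community), whereas node "d" behaves like a connector hub (five links split across both communities).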

Networks are overlapping

We suggested that a better brain unit is a network, not a region. But in highly interconnected systems like the brain, subdividing the whole system into discrete and separate networks still seems too constraining. (The approach is more satisfactory in engineering systems, which are often designed with the goal of being relatively modular.) An alternative is to consider networks as inherently overlapping. In this type of description, collections of brain regions – networks – are still the rightful unit, but a given region can participate in several of these, like the actors discussed previously. Thinking more generally, we can even describe a system in terms of communities but allow every one of its nodes to belong to all communities, simultaneously. How would this work (Figure 3)?

Figure 3. Community organization in the brain.
(A) Standard communities are disjoint (inset: colors indicate separate communities), as illustrated via three types of representation. The representation on the right corresponds to a schematic correlation matrix.
(B) Overlapping communities are interdigitated, such that brain regions belong to multiple communities simultaneously (inset: community overlap shown on the brain indicated by intersection colors).
From: Najafi, M., McMenamin, B. W., Simon, J. Z., & Pessoa, L. (2016). Overlapping communities reveal rich structure in large-scale brain networks during rest and task conditions. Neuroimage, 135, 92-106.

In a study in my lab, we allowed each brain region to participate in a community in a graded fashion, which was captured by membership values varying continuously between 0 and 1; 0 indicated that the node did not belong to a community, and 1 that it belonged uniquely to a community. One can think of membership values as the strength of participation of a node in each community. It’s also useful to conceive of membership as a finite resource, such that it sums to 1. For example, in the case of acting professionals, a performer could belong to the theater cluster with membership 0.7 and to the film cluster with membership 0.3 to indicate the relative degree of participation in the two. In my lab’s study, we found that it was reasonable to subdivide cortical and subcortical regions into 5, 6, or 7 communities, like other algorithms have suggested in the past. But we also uncovered dense community overlap which was not limited to “special” hubs. In many cases, the entire community was clearly a meaningful functional unit, while at the same time most of its nodes still interacted non-trivially with a large set of brain regions in other networks.
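The bookkeeping behind graded membership is simple enough to sketch in Python. This is not the algorithm used in the study, which the text does not describe; the function `normalize_memberships` and the raw counts are hypothetical and only illustrate the idea of membership as a finite resource that sums to 1.

```python
def normalize_memberships(raw):
    """Rescale non-negative affinities so that they sum to 1."""
    total = sum(raw.values())
    if total == 0:
        raise ValueError("node has no affinity to any community")
    return {name: value / total for name, value in raw.items()}

# Hypothetical performer with 14 theater productions and 6 films.
performer = normalize_memberships({"theater": 14, "film": 6})

# A node in a disjoint partition is the special case with all weight on one community.
specialist = normalize_memberships({"theater": 5, "film": 0})

print(performer, specialist)
```

The first performer ends up with memberships of 0.7 (theater) and 0.3 (film), matching the example in the text; the second corresponds to the standard case in which a node belongs to exactly one community.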

The results of our study, and related ones by other groups, suggest that densely overlapping communities are well suited to capture the flexible and task-dependent mapping between brain regions and their functions. The upshot is that it’s very difficult to subdivide a highly interconnected system without losing a lot of important information. What we need is a set of tools that allow us to do this in sophisticated ways. And we need them both to think about how networks are organized in space, as discussed in this section, and in time, to which we turn next.

Networks are dynamic

The brain is not frozen in place but is a dynamic, constantly moving object. Accordingly, its networks are not static but evolve temporally. As an individual matures from infancy to adolescence to adulthood and old age, the brain changes structurally. But the changes that I want to emphasize here are those occurring at much faster time scales, those that accompany the production of behaviors as they unfold across seconds to minutes.

Functional connections between regions – the degree to which their signals covary – are constantly fluctuating based on cognitive, emotional, and motivational demands. When someone pays attention to a stimulus that is emotionally significant (it was paired with mild shock in the past, say), increased functional connectivity is detected between the visual cortex and the amygdala. When she performs a challenging task in which an advance cue indicates that she may earn extra cash for performing it correctly, increased functional connectivity is observed between parietal/frontal cortex (important for performing the task) and the ventral striatum (which participates in reward-related processes). And so on. Consequently, network functional organization must be understood dynamically (Figure 4). In the past decade, researchers have created methods to delineate how networks change across time, informing how we view social, technological, and biological systems.
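One simple way to track fluctuating functional connections is to compute the correlation between two signals inside a window that slides along time. The Python sketch below assumes this sliding-window approach, which is only one of several methods and not necessarily the one used in the studies alluded to above; the two signals and the function names are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def sliding_correlation(x, y, width, step):
    """Correlation of x and y inside successive windows of `width` samples."""
    starts = range(0, len(x) - width + 1, step)
    return [pearson(x[s:s + width], y[s:s + width]) for s in starts]

# Hypothetical signals: region B ignores region A at first, then follows it.
region_a = [0, 1] * 8
region_b = [0, 0, 1, 1, 0, 0, 1, 1] + [0, 1] * 4

fc_over_time = sliding_correlation(region_a, region_b, width=8, step=4)
print(fc_over_time)
```

With these toy signals the windowed correlation climbs from 0 to 0.5 to 1 as the second region becomes coupled to the first, a small-scale version of a functional connection strengthening over time.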

Figure 4. Brain networks are dynamic. (a,b) Specific network properties (‘network index’) evolve across time. (c,d) A region’s grouping with multiple networks evolves across time as indicated by the ‘membership index’ (inset: purple region and its functional connections to multiple networks). The region indicated in purple increases its coupling with one of the networks and stays coupled with it for the remainder of the time.

Brain networks are dynamic. For example, the fronto-parietal network mentioned previously is engaged by many challenging tasks, such as paying attention to an object, maintaining information in mind, or withholding a prepotent response. If a person transitions mentally from, say, listening passively to music to engaging in one of these functions (say, she needs to remember the name of a book just recommended to her), the state of the fronto-parietal network will correspondingly evolve, such that the signals in areas across the network will increasingly synchronize, supporting the task at hand (holding information in mind).

There’s a second, more radical, way in which networks are dynamic. That’s when they are viewed not as fixed collections of regions, but instead as coalitions that form and dissolve to meet computational needs. For instance, at time t1 regions R1, R2, R7, and R9 might form a natural cluster; at a later time t2 regions R2, R7, and R17 might coalesce. This shift in perspective challenges the notion of a network as a coherent unit, at least for longer periods of time. At what point does a coalition of regions become something other than community X? For example, the brain areas comprising the fronto-parietal network exist irrespective of the current mental operation; for one, the person could actually be sleeping or under anesthesia. The areas in question may not be strongly communicating with each other at all. Should it be viewed as a functional unit? When the regions become engaged by a mental operation, their signals become more strongly synchronized. But when along this process should the network be viewed as “active”? As the mind fluctuates from state to state, we can view networks cohering and dissolving correspondingly – not unlike a group of dancers merging and separating as an act progresses. The temporal evolution of their joint states is what is important.

Functional connectivity in the brain

A physical connection between two regions allows them to exchange signals, that much is clear. But there’s another kind of relationship that we need to entertain – what we call a functional connection. Let’s first consider an example unrelated to the brain, where in fact there aren’t any physical connections. Genes are segments of DNA that specify how individual proteins are put together, and a protein itself is made of a long chain of amino acids. Proteins have diverse functions, including carrying out chemical reactions, transporting substances, and serving as messengers between cells. We can think of genes that guide the building of proteins that have related functions (for example, acting as hormones in the body, such as insulin and growth hormone) as “functionally connected.” The genes themselves aren’t physically connected but they are functionally related. Here, we’ll see how functional connectivity is a useful concept in the case of the brain.

At first glance, the notion of an architecture anchored on physical connections goes without saying. Region A influences region B because there’s a pathway from A to B. However, the distinction between anatomy and function becomes blurred very quickly. Connections are sometimes “modulatory,” in which case region A can influence the probability of responding at B, and at times “driving,” in which case they actually cause cells in B to fire. In many instances, the link between A and B is not direct but involves so-called interneurons: A projects first to an interneuron (often in area B itself) which then influences responses in other cells in B. The projections from interneurons to other cells in B can be excitatory or inhibitory, although they are often inhibitory. Of course, the strength of the fiber itself is critical. Furthermore, the presence of multiple feedforward and feedback pathways, as well as diffuse projections, muddies the picture. Taken together, we see that connections between regions are not simply binary (they exist or not, as in a computer), and even a single weight value (say, a strength of 0.4 on a scale from 0 to 1) doesn’t capture the richness of the underlying information.

Functional connectivity thus answers the following question: how coordinated is the activity of two brain regions that may or may not be directly joined anatomically? (Figure 1). The basic idea is to gauge if different regions form a functional unit. What do we mean by “coordinated?” There are multiple ways to capture this concept, but the simplest is to ascertain how correlated the signals from regions A and B are, and the stronger their correlation, the higher the functional association, or functional connection. Correlation is an operation that is summarized by values from -1 to +1. When two signals are perfectly related (which is never the case with noisy biological measurements), their correlation is +1; when they are in perfect opposition to one another (one is high when the other is low, and vice versa), their correlation is -1; when they are unrelated to each other, their correlation is 0 (this means that information about one of the signals tells us nothing about the other one, and vice versa).
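The three reference cases (+1, -1, and 0) can be checked with a short Python sketch. It assumes the standard Pearson correlation, which is the usual reading of “correlation” in this setting; the signal values are invented and far cleaner than any real measurement.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: covariance of x and y scaled by their spreads."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical signals from a reference region and three other regions.
region_a = [1, 2, 3, 4]
in_step = [2, 4, 6, 8]       # rises and falls together with region_a
opposed = [8, 6, 4, 2]       # high when region_a is low, and vice versa
unrelated = [1, -1, -1, 1]   # carries no linear information about region_a

r_in_step = pearson(region_a, in_step)
r_opposed = pearson(region_a, opposed)
r_unrelated = pearson(region_a, unrelated)
print(r_in_step, r_opposed, r_unrelated)
```

Note that nothing in the calculation asks whether the two regions are anatomically connected; only their signals enter.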

Figure 1. Functional connectivity measures the extent to which signals from two regions are in synchrony. Whether or not the regions are directly connected via an anatomical pathway is unimportant.

Let’s consider what I called the “two-step” property of the amygdala. Because this area is connected physically to approximately 40 percent of prefrontal subregions, it can influence a sizeable portion of this lobe in a direct manner, that is, via a single step (such as regions in the orbitofrontal cortex and the medial prefrontal cortex). But approximately 90 percent of prefrontal cortex can receive amygdala signals after a single additional connection within prefrontal cortex[1]. Thus, there are two-step pathways that join the amygdala with nearly all of prefrontal cortex. Consequently, the amygdala can engage in meaningful functional interactions with areas that are not supported by strong direct anatomical connections (such as the lateral prefrontal cortex), or even not connected at all.
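The two-step idea can be illustrated with a toy wiring diagram in Python. Everything below is hypothetical: the ten “prefrontal” nodes and their links were chosen by hand so that the fractions come out at 40 and 90 percent, mimicking the figures in the text rather than reproducing real anatomy.

```python
# Hypothetical toy wiring: node names and links are invented for illustration.
prefrontal = [f"PF{i}" for i in range(1, 11)]
links = {
    "amygdala": {"PF1", "PF2", "PF3", "PF4"},
    "PF1": {"PF5", "PF6"},
    "PF2": {"PF6", "PF7"},
    "PF3": {"PF8"},
    "PF4": {"PF9", "PF1"},
}

def reachable_within(source, steps):
    """Set of nodes reachable from `source` using at most `steps` connections."""
    frontier, seen = {source}, set()
    for _ in range(steps):
        # Follow every outgoing link from the current frontier, skipping known nodes.
        frontier = {t for n in frontier for t in links.get(n, set())} - seen
        seen |= frontier
    return seen - {source}

one_step = len(reachable_within("amygdala", 1) & set(prefrontal)) / len(prefrontal)
two_step = len(reachable_within("amygdala", 2) & set(prefrontal)) / len(prefrontal)
print(one_step, two_step)
```

A single extra connection inside the toy prefrontal cortex more than doubles the territory that amygdala signals can reach, which is the essence of the two-step property.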

The foregoing discussion is worth highlighting because it’s not how neuroscientists typically think. They tend to reason in a much more direct fashion, considering the influences of region A to be most applicable to the workings of a region B to which it is directly connected – a type of connection called monosynaptic. To be sure, a circuit involving A –> X –> B is more indirect than A –> B, and if the intermediate pathway involving X is very weak, the impact of A on B may be negligible. But the point here is that this needn’t be the case, and we should not discard this form of communication simply because it’s indirect (recall the discussion about network efficiency above).

It’s natural to anticipate a functional association between brain regions that are directly connected. Yet, the relationship between structural and functional connectivity is not always a simple one, which shouldn’t be surprising because the mapping between structure and function in an object as interwoven as the brain is staggeringly complex. A vivid example of structure-function dissociation is illustrated by adults born without the corpus callosum, which contains massive bundles of axonal extensions joining the two hemispheres. Although starkly different structurally relative to controls, individuals without the callosum exhibit very similar patterns of functional connectivity compared to normal individuals[2]. Thus, largely normal coordinated activity emerges in brains with dramatically altered structural connectivity, providing a clear example of how functional organization is driven by factors that extend beyond direct pathways.

The upshot is that to understand how behavior is instantiated in the brain, in addition to working out anatomy, it is necessary to elucidate the functional relationships between areas. Importantly, anatomical architectural features support the efficient communication of information even when strong direct fibers aren’t present, and undergird functional interactions that vary based on a host of factors.

An experiment further illustrating the above issues studied monkeys with functional MRI during a “rest” condition, when the animal was not performing an explicit task[3]. The researchers observed robust signal correlation (the signals went up and down together) between the amygdala and several regions that aren’t connected to it (as far as we know). They asked, too, whether functional connectivity is more related to direct (monosynaptic) pathways or connectivity via multiple steps (polysynaptic) by undertaking graph analysis. Are there efficient routes of travel between regions even when they aren’t directly connected? To address this question quantitatively, they estimated a graph measure called communicability (related to the concept of efficiency discussed previously), and found that amygdala functional connectivity was more closely related to their measure of communicability than what would be expected by only considering monosynaptic pathways. In other words, polysynaptic, multi-step routes should be acknowledged. In fact, their finding shows that to understand the relationship between signals in the amygdala and those of any other brain region, it’s important to consider all pathways that can bridge them.
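To give a feel for what a measure like communicability captures, the Python sketch below uses a common definition from network science (due to Estrada and Hatano), in which walks of every length between two nodes are counted and longer walks are weighted less, by 1/k! for a walk of k steps. Whether the monkey study used exactly this variant is not stated here, so the definition, the three-node graph, and the function name are assumptions of this example.

```python
def communicability(adj, terms=20):
    """Truncated series sum over k of A^k / k!, approximating the matrix exponential of A."""
    n = len(adj)
    # k = 0 term: the identity matrix.
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]  # holds A^k / k! as k grows
    for k in range(1, terms):
        power = [[sum(power[i][m] * adj[m][j] for m in range(n)) / k
                  for j in range(n)] for i in range(n)]
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical chain A - X - B: regions A and B have no direct link.
#        A  X  B
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]

G = communicability(adj)
print(round(G[0][2], 3))  # communicability between A and B despite no direct edge
```

Even though the entry for the pair A and B in the adjacency matrix is zero, their communicability is positive because the route through X is counted.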

[1] Averbeck and Seo (2008).

[2] Tyszka et al. (2011).

[3] Grayson et al. (2016).

The brain is not hierarchically organized: Is it a “small world” or actually a “tiny world”?

Engineers think of systems in terms of inputs and outputs. In a steam engine, heat (input) applied to water produces steam, and the force generated pushes a piston back and forth inside a cylinder; the pushing force is transformed into rotational force (output) that can be used for other purposes. Reasoning in terms of input-output relationships became even more commonplace with the invention of computers and the concept of a software program. Thus, it’s only natural to consider the brain in terms of the “inflow” and “outflow” of signals tied to sensory processing and motor acts. During sensory processing, energy of one kind or another is transduced, and action potentials reach the cortex, where they are further processed. During motor acts, activity from the cortex descends to the brainstem and spinal cord, eventually moving muscles. Information flows in for perception and flows out for action.

Let’s describe a substantially different view based on what I call functionally integrated systems. To do so, it helps to discuss six broad principles of brain organization. To anticipate, some of the consequences of the principles are as follows: the brain’s anatomical and functional architectures are highly non-modular; signal distribution and integration are the norm, allowing the confluence of information related to perception, cognition, emotion, motivation, and action; and, the functional architecture is composed of overlapping networks that are highly dynamic and context-sensitive[1].

Principle 1: Massive combinatorial anatomical connectivity

Dissecting anatomical connections is incredibly painstaking work. Chemical substances are injected at a specific location and, as they diffuse along axons, traces of the molecules are detected elsewhere. After diffusion stabilizes (in some cases, it takes weeks), tissue is sectioned in razor-thin slices that are further treated chemically and inspected, one by one. Because the slices are very thin, researchers focus on examining particular target regions. For example, one anatomist may make injections in a few sites in parietal cortex, and examine parts of lateral prefrontal cortex for staining that indicates the presence of an anatomical connection. Injection by injection, study by study, neuroanatomists have compiled enough information to provide a good idea of the pathways crisscrossing the brain.

Figure 1. A graph is a mathematical object that can represent arbitrary collections of elements (person, computer, genes), called nodes (circles), and their relationships, called edges (lines joining pairs of nodes).

Although anatomical knowledge of pathways (and their strengths) is incomplete, the overall picture is one of massive connectivity. This is made clearer when computational analyses are used to combine the findings across a large number of individual studies. A field of mathematics that comes in handy here is called graph theory, which has become popular in the last two decades under the more appealing term of “network science.” Graphs are very general abstract structures that can be used to formalize the interconnectivity of social, technological, or biological systems. They are defined by nodes and the links between them, called edges (Figure 1). A node represents a particular object: a person in a social group, a computer in a technological network, or a gene in a biological system. Edges indicate a relationship between the nodes: people who know each other, computers that are physically connected, or genes with related functions. So, in the case of the brain, areas can be represented by nodes, and the edges interlinking them represent pathways between them. (A so-called directed graph can be used if the direction of the pathways is known; for example, from A to B but not vice versa.)
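For readers who think in code, a graph is little more than a list of nodes plus a list of pairs. The Python sketch below uses made-up areas A, B, and C to show the difference between a directed graph (pathway direction known) and its undirected counterpart.

```python
# Hypothetical areas and pathways, for illustration only.
nodes = ["A", "B", "C"]
directed_edges = [("A", "B"), ("B", "C"), ("C", "B")]  # (source, target) pairs

# Directed graph: a pathway from A to B does not imply one from B to A.
successors = {n: set() for n in nodes}
for src, dst in directed_edges:
    successors[src].add(dst)

# Undirected graph: drop the direction and keep only "is there a pathway?".
undirected = {n: set() for n in nodes}
for src, dst in directed_edges:
    undirected[src].add(dst)
    undirected[dst].add(src)

print(successors)
print(undirected)
```

In the directed version, B cannot send signals to A; in the undirected version that information is lost and A and B are simply “connected.”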

Graph analysis demonstrates that brain regions are richly interconnected, a property of both cortical and subcortical regions. In the cortex, this property is not confined to the prefrontal cortex (which is often highlighted in this regard), but is observed for all lobes. Indeed, the overall picture is one of enormous connectivity, leading to combinatorial pathways between sectors. In other words, one can go from point A to point B in multiple ways, much like navigating a dense set of roads. Computational neuroanatomy has greatly refined our understanding of connectivity.

High global accessibility. Rumors spread more or less effectively depending on the pattern of communication. A rumor will spread faster and farther among a community of college students than among faculty members, assuming that the former are more highly interconnected than the latter. This intuition is formalized by a graph measure called efficiency, which captures information spread effectiveness across members of a network, even those who are least connected (in the social setting, the ones who know or communicate the least with other members). How about the brain? Recent studies suggest that its efficiency is very high. Signals have the potential to travel efficaciously across the entire organ, even between parts not near each other, and even between parts that are not directly connected; in this case, the connection is indirect, such as travelling through C, and possibly D, to get from A to B. The logic of the connectivity structure seems to point to a surprising property: physical distance matters little.
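One common way to define efficiency, used here as an assumption since the text does not spell out a formula, is the average of 1/d over all pairs of nodes, where d is the number of edges on the shortest path between them. The sketch below computes it for two hypothetical five-member communities; the function and variable names are illustrative.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def global_efficiency(graph):
    """Average of 1/d(i, j) over all ordered pairs of distinct nodes.
    Unreachable pairs contribute 0."""
    nodes = list(graph)
    n = len(nodes)
    total = 0.0
    for i in nodes:
        dist = shortest_path_lengths(graph, i)
        total += sum(1.0 / d for j, d in dist.items() if j != i)
    return total / (n * (n - 1))

# Two hypothetical communities (undirected, so each edge is listed both ways).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
everyone_talks = {i: [j for j in range(5) if j != i] for i in range(5)}

print(global_efficiency(chain))           # lower: a rumor must pass along the chain
print(global_efficiency(everyone_talks))  # 1.0: every pair is directly connected
```

Because the measure uses 1/d rather than d itself, poorly connected or even disconnected members still enter the average, which matches the intuition that efficiency accounts for the least connected members too.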

For many neuroscientists, this conclusion is surprising, if not counterintuitive. Their training favors a processing-is-local type of reasoning. After all, areas implement particular functions. That is to say, they are the proper computational units – or so the thinking goes (see chapter 4). This interpretation is reinforced by the knowledge that anatomical pathways are dominated by short-distance connections. In fact, 70% of all the projections to a given locus on the cortical sheet arise from within 1.5 to 2.5 mm (to give you an idea, parts of occipital cortex toward the back of the head are a good 15 cm away from the prefrontal cortex). Doesn’t this dictate that processing is local, or quasi-local? This is where math, and the understanding of graphs, helps sharpen our thinking.

In a 1998 paper entitled “Collective dynamics of ‘small-world’ networks” (cited tens of thousands of times in the scientific literature), Duncan Watts and Steven Strogatz showed that systems made of locally-clustered nodes (those that are connected to nearby nodes), but that also have a small number of random connections (which link arbitrary pairs of nodes), allow all nodes to be accessible within a small number of connectivity steps[2]. Starting at any arbitrary node, one can reach another, no matter which one, by traversing a few edges. Helping make the paper a veritable sensation, they called this property “small-world”. The strength of their approach was to show that this is a hallmark of graphs with such a connectivity pattern, irrespective of the type of data at hand (social, technological, or biological). Watts and Strogatz emphasized that the arrangement in question – what’s called network topology – allows for enhanced signal-propagation speed, computational power, and synchronizability between parts. The paper was a game changer in how one thinks of interconnected systems[3].
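The effect of a few random connections can be seen in a small simulation. The sketch below is a simplification and not the procedure of the 1998 paper: it adds random shortcuts to a ring of locally connected nodes instead of rewiring existing edges. The network size, the number of shortcuts, and the random seed are arbitrary choices made for illustration.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors on a ring (k even)."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph

def add_shortcuts(graph, count, rng):
    """Return a copy with `count` extra edges between random pairs of unlinked nodes."""
    new = {node: set(nbrs) for node, nbrs in graph.items()}
    nodes = list(new)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in new[a]:
            new[a].add(b)
            new[b].add(a)
            added += 1
    return new

def average_path_length(graph):
    """Mean number of edges on the shortest path, over all reachable pairs."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(0)                      # fixed seed so the run is repeatable
lattice = ring_lattice(100, 4)              # purely local connectivity
small_world = add_shortcuts(lattice, 10, rng)

print(average_path_length(lattice))      # about 12.9 steps on average
print(average_path_length(small_world))  # fewer steps once shortcuts exist
```

Only 10 edges are added to the 200 already present, so connectivity remains overwhelmingly local, yet the typical number of steps between nodes falls.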

In the 2000s, different research groups proposed that the cerebral cortex is organized as a small world. If correct, this view means that signal transmission between parts of the cortex can be achieved via a modest number of paths connecting them. It turns out that the brain is more interconnected than would be necessary for it to be a small world[4]. That is to say, there are more pathways interconnecting regions than the minimum needed to attain efficient communicability. So, while it’s true that local connectivity predominates within the cortex, there are enough medium- and long-range connections – in fact, more than the “minimum” required – for information to spread around remarkably well.

Connectivity core (“rich club”). A central reason the brain is not a small world is that it contains a subgroup of regions that is very highly interconnected. The details are still being worked out, not least because knowledge of anatomical connectivity is incomplete, especially in humans.

In 2010, the computer scientists Dharmendra Modha and Raghavendra Singh gathered data from over four hundred anatomical tracing studies of the macaque brain[5]. Unlike most investigations, which have focused on the cortex, they included data on subcortical pathways, too (Figure 2). Their computational analyses uncovered a “tightly integrated core circuit” with several properties: (i) it is a set of regions that is far more tightly integrated (that is, more densely connected) than the overall brain; (ii) information likely spreads more swiftly within the core than through the overall brain; and (iii) brain communication relies heavily on signals being communicated via the core. The proposed core circuit was distributed throughout the brain; it wasn’t just in the prefrontal cortex, a sector often underscored for its integrative capabilities, or some other anatomically well-defined territory. Instead, the regions were found in all cortical lobes, as well as subcortical areas such as the thalamus, striatum, and amygdala.

Figure 2. Massive interconnectivity between all brain sectors. Computational analysis of anatomical connectivity by collating pathways (lines) from hundreds of studies. To improve clarity, pathways with a common origin or destination are bundled together (otherwise the background would be essentially black given the density of connections). Figure from Modha and Singh (2010).

In another study, a group of neuroanatomists and physicists collaborated to describe formal properties of the monkey cortex[6]. They discovered a set of 17 brain regions across parietal, temporal, and frontal cortex that is heavily interconnected. For these areas, 92% of the connections that could potentially exist between region pairs have indeed been documented in published studies. So, in this core group of areas, nearly every one of them can talk directly to all others, a remarkable property. In a graph, when a subset of its nodes is considerably better connected than others, it is sometimes referred to as a “rich club,” in allusion to the idea that in many societies a group of wealthy individuals tends to be disproportionately influential.
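The percentage quoted above is a connection density: the number of documented connections divided by the number that could exist. The sketch below computes density for a hypothetical toy network with a fully connected core, and then works through the arithmetic for 17 regions. Counting connections as directed (A to B separately from B to A) is an assumption made here, not something the text specifies.

```python
def density(graph, members=None):
    """Fraction of possible directed connections present among `members`
    (default: all nodes in the graph)."""
    members = set(graph) if members is None else set(members)
    n = len(members)
    present = sum(1 for a in members for b in graph[a] if b in members and b != a)
    return present / (n * (n - 1))

# Hypothetical toy network: nodes 0-3 form a fully connected core;
# nodes 4-7 each exchange connections with a single core node.
graph = {i: set() for i in range(8)}
for a in range(4):
    for b in range(4):
        if a != b:
            graph[a].add(b)
for periphery, core in zip(range(4, 8), range(4)):
    graph[periphery].add(core)
    graph[core].add(periphery)

print(density(graph))                    # whole network: about 0.36
print(density(graph, members=range(4)))  # core only: 1.0

# Arithmetic for 17 regions, assuming directed connections:
possible = 17 * 16               # 272 ordered pairs of regions
print(round(0.92 * possible))    # roughly 250 documented connections
```

A subset whose internal density is far above that of the network as a whole is the signature of a core.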

Computational analysis of anatomical pathways has been instrumental in unravelling properties of the brain’s large-scale architecture. We now have a vastly more complete and broader view of how different parts are linked with each other. At the same time, we must acknowledge that the current picture is rather incomplete. For one, computational studies frequently focus on cortical pathways. As such, they are cortico-centric, reflecting a bias of many neuroscientists who tend to neglect the subcortex when investigating connectional properties of the brain. In sum, the theoretical insights by network scientists about “small worlds” demonstrated that signals can influence distal elements of a system even when physical connections are fairly sparse. But cerebral pathways vastly exceed what it takes to be a small world. Instead, what we find is a “tiny world.”

[1] The ideas in this chapter are developed more technically elsewhere (Pessoa, 2014, 2017).

[2] Watts and Strogatz (1998).

[3] Another very influential paper was published soon after by Barabási and Albert (1999). The work was followed by an enormous amount of research in the subsequent years.

[4] Particularly useful here is the work by Kennedy and collaborators. For discussion of mouse and primate data, see Gămănuţ et al. (2018).

[5] Modha and Singh (2010). Modha, D. S., & Singh, R. (2010). Network architecture of the long-distance pathways in the macaque brain. Proceedings of the National Academy of Sciences, 107(30), 13485-13490.

[6] Markov et al. (2013). Markov, N. T., Ercsey-Ravasz, M., Van Essen, D. C., Knoblauch, K., Toroczkai, Z., & Kennedy, H. (2013). Cortical high-density counterstream architectures. Science, 342(6158).

The “reptilian” / triune brain: The origins of a misguided idea

The misguided idea of the “reptilian brain”. Figure from the excellent review by Ann Butler. Butler A B (2009). Triune Brain Concept: A Comparative Evolutionary Perspective. In: Squire LR (ed.) Encyclopedia of Neuroscience, volume 9, pp. 1185-1193. Oxford: Academic Press.

One of the presentations at the 1881 International Medical Congress in London was by Friedrich Goltz, a professor of physiology at the University of Strasbourg. Like several of his contemporaries, Goltz was interested in the localization of function in the brain. He not only published several influential papers on the problem, but attracted widespread attention by exhibiting dogs with brain lesions at meetings throughout Europe. His presentations were quite a spectacle. He would take the lectern and bring a dog with him to demonstrate an impaired or spared behavior that he wanted to discuss. Or, he would open his suitcase and produce the skull of a dog with the remnants of its brain. In some cases, a separate panel of internationally acclaimed scientists would even evaluate the lesion and report their assessment to the scientific community.

In some of his studies, Goltz would remove the entire cortical surface of a dog’s brain, and let the animal recover. The now decorticated animal would survive, though it would generally not initiate action and would remain still. Goltz showed that animals with an excised cortex still exhibited uncontrolled “rage” reactions, leading to the conclusion that the territory is not necessary for the production of emotional expressions. But if the cortex wasn’t needed, the implication was that subcortical areas were involved. That emotion was a subcortical affair was entirely consistent with nineteenth century thinking.

Victorian England and the beast within

In the conclusion of The Descent of Man, Charles Darwin wrote in 1871 that “the indelible stamp of his lowly origin” could still be discerned in the human mind, with the implied consequence that it was necessary to suppress the “beast within” – at least at times. This notion was hardly original, of course, and in the Western world can be traced back to at least ancient Greece. At Darwin’s time, with emotion being considered primitive and reason the more advanced faculty, “true intelligence” was viewed as residing in cortical areas, most notably in the frontal lobe, while emotion was viewed as residing in the basement, the lowly brainstem.

The decades following the publication of Darwin’s Origin of Species (in 1859) were a time of much theorizing not only in biology but in the social sciences, too. Herbert Spencer and others applied key concepts of biological evolutionary theory to social issues, including culture and ethics. Hierarchy was at the core of this way of thinking. For the survival of evolved societies, it was necessary to legitimize a hierarchical governing structure, as well as a sense of self-control at the level of the individual – it was argued[1]. These ideas, in turn, had a deep impact on neurology, the medical specialty that characterizes the consequences of brain damage on survival and behavior. John Hughlings Jackson, to this day the most influential English neurologist, embraced a hierarchical view of brain organization rooted in a logic of evolution as a process of the gradual accrual of more complex structures atop more primitive ones. What’s more, “higher” centers in the cortex bear down on “lower” centers underneath, and any release from this control could make even the most civilized human act more like his primitive ancestors[2]. This stratified scheme was also enshrined in Sigmund Freud’s framework of the id (the lower level) and the super-ego (the higher level). (Freud also speculated that the ego played an in-between role between the other two.) Interestingly, Freud was initially trained as a clinical neurologist and was a great admirer of Jackson’s work.

Against this backdrop, it’s not surprising that brain scientists would search for the neural basis of emotion in subcortical territories, while viewing “rational thinking” as the province of the cerebral cortex, especially the frontal lobe.

The “reptilian” brain

In 1896 the German anatomist Ludwig Edinger published The Anatomy of the Central Nervous System of Man and other Vertebrates. The book, which established Edinger’s reputation as the founder of comparative neuroanatomy, described the evolution of the forebrain as a sequence of additions, each of which established new brain parts that introduced new functions.

Edinger viewed the forebrain as containing an “old encephalon” found in all vertebrates. On top of the old encephalon, there was the “new encephalon,” a sector only more prominent in mammals. In one of the most memorable passages of his treatise, Edinger illustrates his concept by asking the reader to imagine precisely inserting a reptilian brain into that of a marsupial (a “simple” mammal). When he superimposed them, the difference between the two was his new encephalon. He then ventures that, in the brain of the cat, the old encephalon “persists unchanged underneath the very important” new encephalon[3]. Put differently, the part that was present before is left unaltered. Based on his coarse analysis of morphological features, his suggestion was reasonable. But to a substantial degree, his ideas were very much in line with the notion of brain evolution as progress toward the human brain – à la old Aristotle and the scala naturae. Given the comprehensive scope of Edinger’s analysis across vertebrates, his views had a lasting impact and shaped the course of research for the subsequent decades.

More than a century later, knowledge about the brains of vertebrates has expanded by leaps and bounds. Yet, old thinking dies hard. Antiquated views of brain evolution continue to influence, if only implicitly, neuroscience. As an example, bear in mind that most frameworks of brain organization are heavily centered on the cortex. These descriptions view “newer” cortex as controlling subcortical regions, which are assumed to be (relatively) unchanged throughout eons of evolution. Modern research on brain anatomy from a comparative viewpoint indicates, in contrast, that brain evolution is better understood in terms of the reorganization of large-scale connectional systems (Figure 2). These ideas are developed extensively in [4].

Figure 2. The basic architecture of the brain is shared across all vertebrates. Extensive anatomical connectivity between sectors of the brain is observed in birds, reptiles, and mammals. The “pallium” corresponds to “cortex” in mammals. From: Pessoa, L., Medina, L., Hof, P. R., & Desfilis, E. (2019). Neural architecture of the vertebrate brain: Implications for the interaction between emotion and cognition. Neuroscience & Biobehavioral Reviews, 107, 296-312.

[1] Edinger (1910, p. 446).

[2] See Parvizi (2009) for an accessible discussion.

[3] For discussion, see Finger (1994, p. 271).

[4] Pessoa, L., Medina, L., Hof, P. R., & Desfilis, E. (2019). Neural architecture of the vertebrate brain: Implications for the interaction between emotion and cognition. Neuroscience & Biobehavioral Reviews, 107, 296-312.

What do brain areas do? They are inherently multifunctional

We’ll start with the simplest formulation, namely by assuming a one-to-one mapping between an area and its function. We’re assuming for the moment that we can come up with, and agree on, a set of criteria that defines what an area is. Maybe it’s what Brodmann defined early in the twentieth century. (A great source for many of the ideas discussed here is Passingham, R. E., Stephan, K. E., & Kötter, R. (2002). The anatomical basis of functional localization in the cortex. Nature Reviews Neuroscience, 3(8), 606-616.)

Figure 1. Structure-function mapping in the brain. The mapping from structure to function is many-to-many. Abbreviations: A1, … , A4: areas 1 to 4; amyg: amygdala; F1, … , F4: functions 1 to 4. Figure from: Pessoa, L. (2014). Understanding brain networks and brain organization. Physics of Life Reviews, 11(3), 400-435.

For example, we could say that the function of primary visual cortex is visual perception, or perhaps a more basic visual mechanism, such as detecting “edges” (sharp light-to-dark transitions) in images. The same type of description can be applied to other sensory (auditory, olfactory, and so on) and motor areas of the brain. This exercise becomes considerably less straightforward for areas that are not sensory or motor, as their workings become much more difficult to determine and describe. Nevertheless, in theory, we can imagine extending the idea to all parts of the brain. The result of this endeavor would be a list of area-function pairs: L = {(A1,F1), (A2,F2),…, (An,Fn)}, where areas A implement functions F.

To date, no such list has been systematically generated. However, current knowledge indicates that this strategy would not yield a simple area-function list. What may start as a simple (A1,F1) pair, as research progresses, gradually is revised and grows to include a list of functions, such that area A1 participates in a series of functions F1, F2,…, Fk. From initially proposing that the area implements a specific function, as additional studies accumulate, we come to see that it participates in multiple ones. In other words, from a basic one-to-one A1 → F1 mapping, the picture evolves to a one-to-many mapping: A1 → {F1, F2,…, Fk} (Figure 1).

Consider this example. Starting in the 1930s, lesion studies in monkeys suggested that the prefrontal cortex implements “working memory,” such as the ability to keep in mind a phone number for several seconds before dialing it. As research focusing on this part of the brain ramped up, the list of functions grew to include many cognitive operations, and the prefrontal cortex became central to our understanding of what is called executive function. In fact, today, the list is not limited to cognitive processes, but includes contributions to emotion and motivation. The prefrontal cortex is thus multifaceted. One may object that this sector is “too large” and that it naturally would be expected to participate in multiple processes. While this is a valid interjection, the argument holds for “small areas,” too. For example, take the amygdala, a region often associated with handling negative or aversive information. However, the amygdala also participates in the processing of appetitive items (and this multi-functionality applies even to amygdala subnuclei).

Let’s consider the structure-function (A → F) mapping further from the perspective of the mental functions: where in the brain is a given function F carried out? In experiments with functional MRI, tasks that impose cognitive challenges engage multiple areas of frontal and parietal cortex; for example, tasks requiring participants to selectively pay attention to certain stimuli among many and answer questions about the ones that are relevant (in a screen containing blue and red objects, are there more rectangles or circles that are blue?). These regions are important for paying attention and selecting information that may be further interrogated. Such attentional control regions are observed in circumscribed sectors of frontal and parietal cortex. Thus, multiple individual regions are capable of carrying out a mental function, an instance of a many-to-one mapping: {A1 or A2, …, or Aj} → F1. The explicit use of “or” here indicates that, say, A1 is capable of implementing F1, but so are A2, and so on[1]. Now, together, if brain regions participate in many functions and functions can be carried out by many regions, the ensuing structure-function mapping will be many-to-many. Needless to say, the study of systems with this property will be considerably more challenging than systems with a one-to-one organization (Figure 1). (For a related case, consider a situation where a gene contributes to many traits or physiological processes; conversely, traits or physiological processes depend on large sets of genes.)
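The difference between one-to-one, one-to-many, many-to-one, and many-to-many mappings can be stated compactly as a data structure. The following sketch stores which functions each area participates in, inverts the table to ask which areas can carry out a given function, and classifies the result. The labels A1 to A3 and F1 to F3 follow the notation of the text, but the particular assignments are hypothetical.

```python
from collections import defaultdict

# Illustrative area -> functions table; the assignments are made up.
area_to_functions = {
    "A1": {"F1", "F2"},
    "A2": {"F1", "F3"},
    "A3": {"F2"},
}

def invert(mapping):
    """Turn an area -> functions table into a function -> areas table."""
    inverse = defaultdict(set)
    for area, functions in mapping.items():
        for function in functions:
            inverse[function].add(area)
    return dict(inverse)

def mapping_type(mapping):
    """Classify the structure-function mapping."""
    area_has_many_functions = any(len(f) > 1 for f in mapping.values())
    function_has_many_areas = any(len(a) > 1 for a in invert(mapping).values())
    if area_has_many_functions and function_has_many_areas:
        return "many-to-many"
    if area_has_many_functions:
        return "one-to-many"
    if function_has_many_areas:
        return "many-to-one"
    return "one-to-one"

print(sorted(invert(area_to_functions)["F1"]))       # ['A1', 'A2']
print(mapping_type(area_to_functions))                # many-to-many
print(mapping_type({"A1": {"F1"}, "A2": {"F2"}}))     # one-to-one
```

A table like this captures the "or" reading only (each listed area is capable of the function); the case where areas jointly implement a function would need a different representation.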

Structure-function relationships can be defined at multiple levels, from the precise (for instance, primary visual cortex is concerned with detecting object borders) to the abstract (for instance, primary visual cortex is concerned with visual perception). Accordingly, structure-function relationships will depend on the granularity in question. Some researchers have suggested that, at some level of description, a brain region does not have more than one function; at the “proper” one, it will have a single function[2]. In contrast, the central idea here is that the one-to-one framework, even if implicitly accepted or adopted by neuroscientists, is an oversimplification that hampers progress in understanding the mind and the brain.

Brain areas are multifaceted

If brain areas don’t implement single processes, how should we characterize them? Instead of focusing on a single “summary function,” it is better to describe an area’s functional repertoire: across a possibly large range of functions, to what extent does an area participate in each of them? No consensus has emerged about how to do this, although below we’ll discuss some early results. The basic idea is simple. Coffee growers around the world think of flavor the same way: via a profile or palette. For example, Brazilian coffee is popular because it is very chocolaty, nutty, and with light acidity, to mention three attributes.

Research with animals utilizes electrophysiological recordings to measure neuronal responses to varied stimuli. The work is meticulous and painstaking because, until recently, the vast majority of studies recorded from just a single (or very few) electrode(s), in a single brain area. Setting up a project, a researcher thus decides what processes to investigate at what precise location of the cortex or subcortex; for example, probing classical conditioning in the amygdala. Having elected to do so, the investigator inserts the electrode in multiple neighboring sites to determine the response characteristics of the cells in the area (newer techniques exist where grids of finely spaced electrodes can record from adjacent cells simultaneously)[3]. For some regions, researchers have catalogued cell response properties for decades; considering the broader published literature thus allows them to have a fairly comprehensive view. In particular, the work of mapping cell responses has been the mainstay of perception and action research, given that the stimulus variables of interest can be manipulated systematically; it is easy to precisely change the physical properties of a visual stimulus, for example. In this manner, the visual properties of cells across more than a dozen areas in occipital and temporal cortex have been studied. And several areas in parietal and frontal cortex have been explored to determine neuronal responses during the preparation and elicitation of movements.

It is thus possible to summarize the proportions of functional cell types in a brain region[4]. Consider, for example, two brain regions in visual cortex called V4 (visual area number 4) and MT (found in the middle temporal lobe). Approximately 85% of the cells in area MT show preference for the direction that a stimulus is moving (they respond more vigorously to rightward versus leftward motion, say), whereas only 5% of the cells in area V4 do so. In contrast, 50% of the cells in area V4 show a strong preference for the wavelength of the visual stimulus (related to a stimulus’s color), whereas no cells in area MT appear to do so. Finally, 75% of the cells in area MT are tuned to the orientation of a visual stimulus (the visual angle between the major elongation of a stimulus and a horizontal line), and 50% of the cells in area V4 are, too. If we call these three properties ds, ws, and os (for stimulus direction, wavelength, and orientation, respectively), we can summarize an area’s responses by the triplet (ds, ws, os), such that area MT can be described by (.85, 0, .75) and area V4 by (.05, .50, .50).
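Once each area is summarized by a triplet, the two profiles can be compared numerically. The sketch below uses the proportions given in the text for MT and V4. The two comparison measures (Euclidean distance and cosine similarity) are generic choices made for illustration, not measures proposed in the cited work.

```python
import math

# Proportions of cells tuned to (direction, wavelength, orientation), from the text.
profiles = {
    "MT": (0.85, 0.00, 0.75),
    "V4": (0.05, 0.50, 0.50),
}

def norm(u):
    """Length of a vector."""
    return math.sqrt(sum(a * a for a in u))

def euclidean(u, v):
    """Straight-line distance between two profiles."""
    return norm([a - b for a, b in zip(u, v)])

def cosine_similarity(u, v):
    """1.0 when two profiles have the same shape, 0.0 when they share no attribute."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))

mt, v4 = profiles["MT"], profiles["V4"]
print(round(euclidean(mt, v4), 2))          # 0.98
print(round(cosine_similarity(mt, v4), 2))  # 0.52
```

The same functions work unchanged if more attributes are added to each profile, which is the point of treating an area as a vector instead of a single label.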

This type of summary description can be potentially very rich, and immediately shifts the focus from thinking “this region computes X” to “this region participates in multiple processes.” At the same time, the approach prompts us to consider several thorny questions. In the example only three dimensions were used, each of which related to an attribute thought to be relevant – related to computing an object’s movement, color, and shape, respectively. But why stop at three features? Sure, we can add properties, but there is no guarantee that we will cover all of the “important” ones. In fact, at any given point in time, the attributes more likely reflect what researchers know and likely find interesting. This is one reason the framework becomes increasingly difficult for areas that aren’t chiefly sensory or motor; whereas sensorimotor attributes may be more intuitive, cognitive, emotional, and motivational dimensions are much less so – in fact, they are constantly debated by researchers! So, what set of properties should we consider for the regions of the prefrontal cortex that are involved in an array of mental processes? 

More fundamentally, we would have to know, or have a good way of guessing, the appropriate space of functions. Is there a small set of functions that describes all of mentation? Are mental functions like phonemes in a language? English has approximately 42 phonemes, the basic sounds that make up spoken words. Are there 42 functions that define the entire “space” of mental processes? How about 420? Although we don’t have answers to these fundamental questions[5], some form of multi-function, multi-dimensional description of an area’s capabilities is needed. A single-function description is like a straitjacket that needs to be shed. (For readers with a mathematical background, an analogy to basic elements like phonemes is a “basis set” that spans a subspace, like in linear algebra; or “basis functions” that can be used to reconstruct arbitrary signals, like in Fourier or wavelet analysis.)

The multi-function approach can be illustrated by considering human neuroimaging research, including functional MRI. Despite the obvious limitations imposed by studying participants lying on their backs (many feel sleepy and may even momentarily doze off; not to mention that we can’t ask them to walk around and “produce behaviors”), the ability to probe the brain non-invasively and harmlessly means that we can scrutinize a staggering range of mental processes, from perception and action to problem solving and morality. With the growth of this literature, which accelerated in earnest after the publication in 1992 of the first functional MRI studies, several data repositories have been created that combine the results of thousands of studies in a single place.

Figure 2. Multifunctionality. (A): Functional profile of a sample region. The radial plot includes 20 attributes, or “task domains.” The green line represents the degree of engagement of the area for each attribute. (B): Distribution of a measure of functional diversity across cortex. Warmer colors indicate higher diversity; cooler colors, less diversity.
Figure from: Anderson, M. L., Kinnison, J., & Pessoa, L. (2013). Describing functional diversity of brain regions and brain networks. Neuroimage, 73, 50-58.

In one study, we capitalized on this treasure trove of results to characterize the “functional profile” of regions across the brain. We chose twenty “task domains” suggested to encompass a broad range of mental processes, including those linked to perception, action, emotion, and cognition. By considering the entire database of available published studies, at each brain location, we generated a twenty-dimensional functional description indicating the relative degree of engagement of each of the twenty domain attributes (Figure 2). Essentially, we counted the number of times an activation was reported in that brain location, noting the task domain in question. For example, a study reporting stronger responses during a language task relative to a control task, would count toward the “language” domain, at the reported location. We found that brain regions are rather functionally diverse, and are engaged by tasks across many domains. But this didn’t mean that they respond uniformly; they have preferences, which are at times more pronounced. To understand how multi-functionality varied across the brain, we computed a measure that summarized functional diversity. A brain region engaged by tasks across multiple domains would have high diversity, whereas those engaged by tasks in only a few domains would have low diversity. Functional diversity varied across the brain (Figure 2), with some brain regions being recruited by a very diverse range of experimental conditions.
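The counting procedure and the diversity summary can be sketched in a few lines. The text does not specify the diversity measure, so the normalized Shannon entropy used below is an assumption chosen for illustration; the activation counts are hypothetical, and four task domains stand in for the twenty used in the study.

```python
import math

def functional_profile(counts):
    """Convert per-domain activation counts into proportions that sum to 1."""
    total = sum(counts.values())
    return {domain: c / total for domain, c in counts.items()}

def diversity(counts):
    """Normalized Shannon entropy of the profile:
    0 when a single domain accounts for all activations,
    1 when all domains are engaged equally."""
    profile = functional_profile(counts)
    entropy = sum(p * math.log(1 / p) for p in profile.values() if p > 0)
    return entropy / math.log(len(counts))

# Hypothetical activation counts at three brain locations.
selective = {"vision": 40, "language": 0, "emotion": 0, "memory": 0}
diverse = {"vision": 10, "language": 10, "emotion": 10, "memory": 10}
preferring = {"vision": 25, "language": 5, "emotion": 5, "memory": 5}

print(diversity(selective))   # 0.0: engaged by one domain only
print(diversity(diverse))     # 1.0, up to floating-point rounding
print(diversity(preferring))  # in between: diverse, but with a preference
```

The third location illustrates the point made above: a region can be engaged by every domain and still show a clear preference.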

The findings summarized in Figure 2 paint a picture of brain regions as functionally diverse, each with a certain style of computation. The goal here was to illustrate the multi-dimensional approach rather than to present a more definitive picture. For one, conclusions were entirely based on a single technique, which has relatively low spatial resolution. (In functional MRI, signal at each location pools together processing related to a very large number of neurons; a typical location, called a “voxel,” can easily contain millions of neurons.) The approach also doesn’t account for the confirmation bias present in the literature. For example, researchers often associate amygdala activation with emotion and are thus more likely to publish results reflecting this association, a tendency that will increase the association between the amygdala and the domain “emotion” (not to mention that investigators might mean different things when they say “emotion”). Finally, the study makes the assumption that the twenty-dimensional space of mental tasks is a reasonable decomposition. Many other breakdowns are possible, of course, and it might be even more informative to consider a collection of them at the same time (this would be like describing a coffee in terms of a given set of attributes but then using separate groups of attributes).

[1] When regions A1, A2 etc. jointly implement a function F, the situation is conceptually quite different from the scenario being described. We can think of the set of regions {A1, A2 , … } as a network of regions that, in combination, generates the function F.

[2] See discussion by Price and Friston (2005).

[3] Newer techniques, like two-photon imaging, allow the study of hundreds or even thousands of neurons simultaneously.

[4] Example in this paragraph discussed by Passingham et al. (2002).

[5] The book by Varela et al. (1990) offers one of the best, and most accessible, treatments of these issues.

Transient brain dynamics

Here, I illustrate simple ideas to understand multi-region dynamics. The goal is to describe the joint state of a set of brain regions, and how it evolves temporally.

Figure 1. Left: Time series data from 3 regions of interest. Right: State-space representation with temporal trajectories for tasks A and B.
From: Venkatesh, M., Jaja, J., & Pessoa, L. (2019). Brain dynamics and temporal trajectories during task and naturalistic processing. Neuroimage, 186, 410-423.

Imagine a system of n brain regions labeled R1, R2, …, Rn, each with an activation (or firing rate) strength that varies as a function of time, denoted x1(t), x2(t), and so on. We can group these activities into a vector x(t) = (x1(t), x2(t), …, xn(t)). Recall that a vector is simply an ordered set of values, such as x, y, and z in three dimensions. At time t1, the vector x(t1) specifies the state of the regions (that is, their activations) at time t1. By plotting how this vector moves as a function of time, it is possible to visualize the temporal evolution of the system as the behavior in question unfolds[1]. We can call the succession of states at t1, t2, etc., visited by the system a trajectory. Now, suppose an animal performs two tasks, A and B, and that we collect responses across three brain regions, at multiple time points. We can then generate a trajectory for each task (Figure 1). Each trajectory provides a potentially unique signature for the task in question[2]. We’ve created a four-dimensional representation of each task: it considers three locations in space (the regions where the signals were recorded from) and one dimension of time. Thus, each task is summarized in terms of responses across multiple brain locations and time. Of course, we can record from more than three places; that depends only on what our measuring technique allows us to do. If we record from n spatial locations then we’ll be dealing with an n+1 dimensional situation (the +1 comes from adding the dimension of time). Whereas we can’t plot that on a piece of paper, fortunately the mathematics is the same, so this poses no problems for the data analysis.
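As a minimal sketch of the idea, the code below stacks the time series of three regions into one state vector per time point, giving a trajectory for each task, and then measures how far apart the two trajectories are at each moment. The signal values are invented for illustration, and the function names are not taken from the cited study.

```python
import math

def trajectory(region_signals):
    """Stack per-region time series into a list of state vectors,
    one vector per time point."""
    return list(zip(*region_signals))

def distance_over_time(traj_a, traj_b):
    """Euclidean distance between two trajectories at each time point."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(state_a, state_b)))
        for state_a, state_b in zip(traj_a, traj_b)
    ]

# Invented activations for three regions at five time points, for two tasks.
task_a = trajectory([
    [0, 1, 2, 3, 4],   # region 1
    [0, 1, 0, 1, 0],   # region 2
    [1, 1, 1, 1, 1],   # region 3
])
task_b = trajectory([
    [0, 1, 2, 3, 4],   # region 1 behaves the same in both tasks
    [0, 0, 0, 0, 0],   # region 2 stays silent
    [1, 2, 3, 4, 5],   # region 3 ramps up
])

print(task_a[0])                           # (0, 0, 1): joint state at the first time point
print(distance_over_time(task_a, task_b))  # starts at 0.0 and grows as the tasks diverge
```

Region 1 alone cannot tell the two tasks apart here; the separation only shows up in the joint state, which is why the trajectory is described over all regions at once. Nothing in the code depends on there being exactly three regions.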

Thinking in terms of spatiotemporal trajectories brings with it multiple features. The object of interest – the trajectory – is spatially distributed and, of course, dynamic. It also encourages a process-oriented framework, instead of trying to figure out how a brain region responds to a certain stimulus. The process view also changes the typical focus on billiard-ball causation – the white ball hits the black ball, or region A excites region B. Experimentally, a central goal then becomes estimating trajectories robustly from available data. Some readers may feel that, yes, trajectories are fine, but aren’t we merely describing the system but not explaining it? Why is the trajectory of task A different from that of task B, for example? Without a doubt, a trajectory is not the be-all and end-all of the story. Deciphering how it comes about is ultimately the goal, which will require more elaborate models, and here computational models of brain function will be key. In other words, what kind of system, and what kind of interactions among system elements generate similar trajectories, given similar inputs and conditions?

[1] Rabinovich et al. (2008); Buonomano and Maass (2009).

[2] The proximity of trajectories depends on the dimensionality of the system in question (which is usually unknown) and the dimensionality of the space where data are being considered (say, after dimensionality reduction). Naturally, points projected onto a lower-dimensional representation might be closer than in the original higher-dimensional space.