Sull'isomorfismo di Curry-Howard (1)

Nel corso della stesura di alcuni articoli riguardanti la programmazione funzionale e i suoi rapporti con il lambda-calcolo, mi sono reso conto del fatto che la letteratura in Italiano è a riguardo spesso troppo tecnica e non sufficientemente contestualizzata; ne risulta che alcuni aspetti della teoria del calcolo computazionale e della logica formale si trovano ad essere in ombra, a dispetto della loro importanza fondante. Questi appunti (questo timido esercizio di lettere e matematiche) vorrebbero dare uno “sguardo dall’alto” al cosiddetto Isomorfismo di Curry-Howard.

L’Isomorfismo di Curry-Howard consiste nell’osservazione che due classi di sistemi formali (i modelli di computazione e i sistemi di provabilità), che al tempo della scoperta erano considerate del tutto disgiunte, sono in realtà strutturalmente lo stesso oggetto.

Nel 1958 venne scoperto da Curry un isomorfismo fra la versione tipata del lambda-calcolo e il frammento implicativo della logica costruttiva di Brouwer, indipendentemente poi riscoperto da Howard dieci anni dopo. L’isomorfismo, da allora noto anche come corrispondenza di Curry-Howard, mette in relazione tipi del calcolo con gli enunciati della logica proposizionale e i termini con le prove: alla riduzione nel lambda-calcolo corrisponde l’eliminazione dei tagli nelle prove.

Da un punto di vista più informale si può dire che una prova è un programma, e la formula che prova è un tipo per il programma. Ovvero, il tipo di ritorno di una funzione è analogo ad un teorema, i cui assiomi sono corrispondenti ai tipi degli argomenti passati alla funzione, e il corpo stesso della funzione è la prova di quel teorema. Questo ha conseguenze importanti sulla programmazione logica, fornendo delle fondamenta solide su cui basarla: le dimostrazioni possono essere rappresentate come programmi (lambda-termini, in particolare), ovvero possono essere eseguite.

La possibilità di estendere l’isomorfismo a logiche sempre più espressive (calcolo dei predicati, logiche di ordine superiore) e corrispondenti calcoli, fanno di questi un prezioso strumento, sul quale si basano nuove applicazioni della logica in ambito informatico, come l’estrazione di programmi dalle corrispondenti prove di correttezza e terminazione. Più recentemente, un’interessante conseguenza dell’isomorfismo ha a che fare con la possibilità di parallelizzare efficientemente l’esecuzione dei programmi scritti in linguaggi funzionali; l’analisi di questa prospettiva è delegata ai prossimi capitoli.

Windows 7 – First impressions

Windows Seven has a tough task to accomplish: redeem Microsoft’s image from the disastrous experience of Windows Vista. What is called the “Vista flop” is actually a powerful proof of how fickle the market W7 is supposed to exploit is: Windows Vista has been perceived as a slow, inefficient and problem-generating shitload of desperation since the very beginning of its deployment, without a real or even clear technical basis to support such conclusions.

Indeed it has been an unlucky operating system: it has introduced some great brand-new features, but Microsoft’s urge – after having spent too much time and resources on XP SP2 and without a clear internal developing strategy – to quickly develop and release a new product which wasn’t obviously ready to be deployed to the actual market, bugged the entire launch strategy. The flop hasn’t been due, this has to be clear, to technical issues: Vista was technically more secure and stable than XP; plus I would dare say that Windows XP was at least as bugged as Vista when it was formerly released in 2001. Nonetheless, more than 50% of the world’s computer users eventually got to love XP so much that they still stick to it, no matter the release of a technically improved and prettier operating system (Vista, indeed).

So, you may wonder, if Vista was technically superior to XP, why do we call it a failure? Well, basically because all the little improvements came at the price of a dramatic loss in the system performances, which is a price that almost no-one is ready to pay. Indeed, this is the same reason Linux is a failure itself: ’cause most of the people are just not into all the niche technical features that turn on nerds so bad (SuperFetch, BitLocker,Transitional NTFS, optimized Kernel, ecc); what they want is a simple, easy, reliable and responsive operating system to work on. And whereas Linux is none of them, Vista is not any better. What, in my opinion, cut off its career has been a mixture of three main things: 1, all the expectations brewed in almost six years of gestation, 2, Vista’s feeling, and 3, bad marketing strategy. Basically, the market was waiting for a revolution that hasn’t come, users were disappointed by the huge leap forward in terms of system requirements (“Why the fuck should I install a thing that will slow my computer down?”) with no significant nor immediate advantage, and as a result… FAIL!

Let’s clear this up before going any further: I never installed Vista on my computer. This is mainly due to the (2) point: to me, Vista always felt like a slow-moving full-of-shit caravan, since it was called Longhorn down to the betas and to the finished product itself. I tried installing the SP2 version on my two year old laptop and it was struggling, whereas XP still worked quite well, fast and did pretty much the same things. So everything comes down to this: I’m proudly alazy user: I don’t upgrade because a new upgrade is available, but because it’s convenient to do so.

Microsoft, anyway, can still count on a market share of about 90% in the operating systems sector, but apparently something has changed in the general approach towards the whole market: eventually, we understood that another Windows Millennium won’t be forgiven; the capitalist economy is a savage and dangerous place to live, and what we learnt the hard way is that you cannot stubbornly impose stuff on your customers “just because”: eventually they’ll find something better and migrate to it. Microsoft isn’t a dictatorship and can’t just produce operating systems, developing solutions or anything else and expect people to buy them “just because” they’re Microsoft-branded. In order to keep the share, we have to produce the best tech solutions on the market. This is concurrency, and this is where, along with the .NET Framework, Silverlight, the sharp languages, the developing environments, Windows 7 comes in.

I had Windows 7 installed by a colleague in Microsoft during a meeting that was spectacularly boring me, so I suddenly decided to get rid of the useless Vista partition and try seven. I wasn’t, honestly, expecting much from this new son of the 6.1 kernel. A while back, before they made the announcement that W7 would’ve been based on the same kernel as its predecessors, I was expecting the OS branch to eventually rewrite the system from its very foundations (MinWin, Midori, Singularity), but they rather went for what looked like an open highway to another disaster.

Very prejudiced, I logged into my account after a short installation process. What immediately impressed me was the responsiveness I was getting from the system and, of course, the new taskbar. I started browsing a little bit the new options and the start menu, but in general I was being still cautiously sceptical.

The taskbar is most certainly cute, and well-organized. One click on the application’s icon will open the associated window or, in case or multiple instances, display dynamic miniatures of the windows themselves.

Right-clicking or sliding the pointer upwards on the same icon will instead result in a contextual menu, personalized for every application (Explorer shows the frequent-accessed folders, Notepad the recent documents, Media Player some tasks, etc); quite useful.

There have been some other major improvements in the GUI, but I would reckon that the taskbar is by far the most important, because it can work as a paradigm of the system as a whole: after the minute it takes to get used to it, you realize how fast it is. And this is not about videogame-like fluidity (I’m the wrong person to talk about that), but rather about the smooth sensation of lightness that this 25GB monster actually provides you with when you click on the “Internet Explorer” icon and the browser starts immediately.

I have been told by some people that the taskbar is “copied” from MacOS X; well, most of the freetards who love Linux and MacOS X at the same time are likely to say something like that and, predictably, it’s an utterly imbecile statement: if you look back to OS history, it’ll be clear that GUIs produced from different vendors in the same period are all similar to each other in the same way cars from the same era have similar performances and technologies. But I’ve never heard a Mercedes owner complaining because Audi has adopted the ABS, for instance, whereas freetards and mactards are always complaining about others “stealing” ideas from them (ideas? The wobbly windows? The idiotic cube? The uber confusing exposé? Please), because they clearly cannot see beyond the tip of their nose.

The very point is another one: we are getting everyday closer to the expressiveness that a 2-D GUI can provide. A keyboard-mouse operated interface is obviously bounded to these input peripherals as for the behaviours it can offer, and we won’t go much further with just these instruments. There’s been a lot of research in the past years around 3-D interfaces, which usually suffer from the same problem: 2-D interaction in a 2-D space. In the meantime a solution is found (touch surfaces, vocal commands and natural interactions in general) and we still are confined into the 2-Ds of modern GUIs, seven’s is now quite an effective and smooth one.

I am not going to take snapshots of all the new windows/features/effects that Windows 7 introduces, because in general the look and the general using logic is very similar to Vista. And you know why? Because Windows 7 is Vista; or better, Windows 7 is almost all of what Microsoft wanted Vista to be and was not: a reliable, quick, handy, modular, working, state-of-the-art, cutting-edge operating system. Seven is the final result of Vista’s developing process: after all, just a major refinement of an operating system that was released way too early.

It is very good, though. It’s in a sense like a massive SP3: it magically turns a potentially good OS into an actual good OS, which is not quite a trivial task.

A brief introduction to functional programming – pt.2: Data Immutability

Let’s consider the following:

1. x = x +1

which is a typical imperative assignment; as programmers, we’ve kind of gotten used to such statements. But if shown to a mathematician, he would quickly and smugly look at it and then comment “I don’t know any x  for which this is true.” End of story.

Indeed, the 1 statement sums up what imperative programming is all about: with that instruction you’re incrementing the memory location of a 1 value. This obviously leads to mutable states, because the referenced object is a cell of memory whose content is sensitive to being modified to any value at any time by any method, and more generally, variables’ values change over the time. Having mutable states allow methods to have side effects, which means that you can have a void method that doesn’t take any parameter, yet things happen behind the curtains when you call it. Hence, it matters, for instance, at which point of the program you call it, as well as how many times you call it, which thread you call it from and so forth.

In a pure functional programming language, instead, states are immutable. The paradigm behind functional programming says that everything can be solved creating and combining functions, which are intrinsically immutable (given the same input, they’ll return the same output, no matter when or how many times you invoke them). Functional programming is in many ways a mathematically intuitive programming style; in fact

2. y = x +1

is both a functional statement and a mathematical function. This is indeed the core concept of functional programming: values are results of function evaluations, rather than pure stored values. As a consequence, there can’t be any side effect since there’s no variable data. This property is called referential transparency.

As already stated, it’s important to understand that this abstract concept is very easy to implement in a real situation; obviously, in a functional language such as F# a statement like

let x = x + 1

is completely meaningless, because of what we just said. But also it’s important to understand that

let y = x + 1

would be meaningless without any x already declared with a value in the scope of the program. Hence, something like

let x = 1

let y = x + 1

would evaluate y as 2, no matter when called.

It’s clear that an F# statement like

let increment x = x + 1

is completely different from a C++ statement such as

x = x + 1;

or even

x++;

but, instead, much more similar to

int increment (int arg){

return arg++;

}

which is, indeed, a function (for simplicity I am assuming here that the type is int). Indeed, the keyword let in F# is not an equivalent of dim in Visual Basic or var in Pascal, which are strictly procedural languages. It doesn’t mean “assign to the memory address pointed by the reference on the left the value on the right,” but rather “create a function that evaluates the expression on the right by substituting the given arguments and then return the value.

At this point it’s nice to note how everything comes down to λ-expressions once again. In fact, the previous function can easily be expressed through a trivial λ as follows:

3. λx.+ x 1

More extensively, we can see how it’s possible to adopt a functional programming style in a typical C# class such as the following.

Problems here arise in two points: the public variables x and y are fully accessible and editable from outside the object, and the void method move operates through side-effects on the x and y variables. A simple way to turn that class into a functional class is by marking readonly the x and y fields, and, more importantly, by editing the move function as follows:

Every time we move the point, a new Point object is created and returned, and once it’s been created, it’s immutable. This is, essentially, a trivial application of a functional programming style in an imperative object-oriented language such as C#.

Philosophical Foundations of Artificial Intelligence (1)

Artificial intelligence is certainly one of the most interesting fields of modern science, since it involves many different issues and uses tools and insights from many fields, including computer science, psychology, philosophy, neuroscience, cognitive science, linguistics, ontology, operations research, economics, control theory, probability, optimization and logic.

Most AI books define it as “the study and design of intelligent agents,” although it’s clear that in order to reproduce intelligence through agents of any sort, it needs to be understood what the word “intelligence” actually denotes.

It’s evident that, since we define ourselves as intelligent (sometimes mistakenly) and since our intelligence is notoriously due to our brain, every discussion about intelligence can basically be reduced to a discussion about human brain. It’s nice to remember that, according to Princeton’s professor Jonathan Cohen “There are more synapses in the brain than stars in the galaxy” and “[The human brain is] the most complex device in the known universe“.

There are many different approaches to the AI-issue, depending on which sides or aspects one is interested in exploring. In this brief introduction, we will focus on an abstract yet fundamental question, which Alan Turing has already undertaken: “Can machines think?“. In order to make things easier, we need to analyze the question and, if possible, decompose it into sub-problems. What do we mean with “machine”? What does the word “think” mean?

machine is commonly defined as any algorithm-capable automaton; I think this definition is limiting, since it assumes that intelligence is an algorithmic process, so we will assume here that a “machine” is anything human beings can build and reproduce.

To ask what the verb “think” means is the same as asking what intelligence actually  is: we’re more or less at the same point where we started. A good hint to a possible solution to the problem comes from noticing that, apparently, we just assume that other people (intelligence is perforce anthropocentric) think only according to their behaviours. So the question might become “Can a machine act displaying intelligence?”

But what about the real ability to solve problems intelligently? We just formulated an exterior question, but we want an ontological answer, we want to know if a machine can actually feel like humans, if it can share with us the same intelligence structure. To put it simply, we’re asking whether a machine can have a mind.

The first serious attempt in order to answer to the question “Can machines think?” has been taken by Alan Turing in a famous 1950 paper, where he basically reduced this question to the exterior one we pointed out earlier. Trough thin deductions and fine reasoning he concluded that “if a machine acts as intelligently as human being, then it is as intelligent as a human being.”.

Marvin Minsky gave another, though not less relevant, answer to the issue in a 1943 paper, where he claimed that “if the nervous system obeys the laws of physics and chemistry, which we have every reason to suppose it does, then we ought to be able to reproduce the behavior of the nervous system with some physical device.”

Another fundamental contribution came from Alan Newell and Herbert Simon, which proposed that “symbol manipulation” was the essence of both human and machine intelligence. They wrote: “A formal symbol system has the necessary and sufficient means of general intelligent action.”
Another, actually relevant, version of this position has been formulated by Hubert Dreyfus, who called it “the psychological assumption”: “The mind can be viewed as a device operating on bits of information according to formal rules.”; Dreyfus criticized this assumption, claiming that it rests on two others: the epistemological and ontological assumptions. The first one is that “all activity (either by animate or inanimate objects) can be
formalised(mathematically) in the form of predictive rules or laws“. The ontological assumption is that “reality consists entirely of a set of mutually independent, atomic (indivisible) facts“. It’s because of the epistemological assumption that workers in the field argue that intelligence is the same as formal rule-following, and it’s because of the ontological one that they argue that human knowledge consists entirely of internal representations of reality.
According to Dreyfus, a context free psychology is a contradiction in terms, since we cannot be able to understand our own behavior in the same way as we understand objects.
Dreyfus’s arguments against this position are taken from the phenomenological and hermeneutical tradition (especially the work of Martin Heidegger). Heidegger argued that, contrary to the cognitivist views on which AI is based, our being is in fact highly context bound, which is why the two context-free assumptions are false. Dreyfus doesn’t deny that we can choose to see human (or any) activity as being ‘law governed’, in the same way that we can choose to see reality as consisting of indivisible atomic facts…if we wish. But it is a huge leap from that to state that because we want to or can see things in this way that it is therefore an objective fact that they are the case. In fact, Dreyfus argues that they are not (necessarily) the case, and that, therefore, any research program that assumes they are will quickly run into profound theoretical and practical problems. Therefore the current efforts of workers in the field are doomed to failure.

Another famous (despite its weakness) argument against symbol processing, is John Lucas‘ oen; he argued that “Gödel’s theorem seems to me to prove that mechanism is false, that is, that minds cannot be explained as machines.“ Roger Penrose “expanded” on this argument in his ridicolus 1989 book “The Emperor’s New Mind“, where he pathetically speculated that quantum mechanical processes in neurons’ structures called microtubules, gave humans this special advantage over machines.

Luckily, Douglas Hofstadter, in his pulitzer-awarded “Gödel, Escher, Bach” book, explains that these “Gödel-statements” always refer to the system itself, similar to the way the Epimenides paradox uses statements that refer to themselves, such as “this statement is false” or “I am lying”. But, of course, the Epimenides paradox applies to anything that makes statements, whether they are machines or humans, even Lucas. This shows that Lucas himself is subject to the same limits that he describes for machines, as are all people, and so Lucas’s argument is pointless.

Regarding the other question, it basically revolves around a position defined by John Searle as “strong AI“:

A physical symbol system can have a mind and mental states.

Searle distinguished this position from what he called “weak AI“:

A physical symbol system can act intelligently.

Searle introduced the terms to isolate strong AI from weak AI so he could focus on what he thought was the more interesting and debatable issue. He argued that even if we assume that we had a computer program that acted exactly like a human mind, there would still be a difficult philosophical question that needed to be answered.

Neither of Searle’s two positions are of great concern to AI research, since they do not directly answer the question “can a machine display general intelligence?” (unless it can also be shown that consciousness is necessary for intelligence). Turing wrote “I do not wish to give the impression that I think there is no mystery about consciousness … but I do not think these mysteries necessarily need to be solved before we can answer the question [of whether machines can think].” Bertand Russell and Peter Norvig agree: “Most AI researchers take the weak AI hypothesis for granted, and don’t care about the strong AI hypothesis.

For scientists the word “consciousness” refers to the familiar everyday experience of having a “thought in your head”, like a perception, a dream, an intention or a plan, and to the way we know something, or mean something or understand something. What is mysterious and fascinating is not so much what it is but rather how it works: how does a lump of fatty tissue and electricity give rise to this (familiar) experience of perceiving, meaning or thinking?

A brief introduction to Functional Programming – pt. 1

Overview: programming paradigms

If you are not totally inexperienced about computer programming, in which case reading this article would just plainly be a total loss of time, you certainly know that there are a lot of different programming languages that have been developed in different periods to supply different needs; each one of them carries a different view about the semantic world they cover. There is, indeed, an entire branch of computer science that studiesprogramming paradigms, as are these “views” called.

We don’t really need to review the story of computer programming back to hard wiring in this article, but a brief synopsis is needed in order to make things a bit clearer. So, after someone realized that wiring physically the programs wasn’t just going to work, computers started being programmed using binary code that represented control sequences fed to the computer CPU. That probably made programmers sweat a bit less, but still it was as comfortable as being stabbed. Shortly after, in the early 50s, assembly languages were invented in an attempt to make computer programming easier.

At the time symbolic labels and mnemonic keywords probably seemed like heaven, but still writing assembly is not exactly the most comfortable thing you can think of. It’s somehow like working in the sewers: you know that it’s an important work, and that someone’s to do it, but you really don’t want to know anything more, no matter how many documentaries Discovery Channel makes on it. Coping with the dirty secrets of complexness is not for everybody.

Indeed, it didn’t take much longer again before somebody else got tired of assembly’s complexity and tried to abstract it into what is known as imperative programming: a series of statements that express, step by step, exactly the procedure that should be followed to solve a problem. I don’t really want to dig into the issue, but procedural programming (as is it’s also called, with a particular reference to those languages capable of organizing groups of operations into blocks called “functions”) carried an immense number of mathematical and logical issues, some of which are really worth a read (i.e. Dijkstra’s GOTO statement article, compilation-related automaton theory, design patterns, data representing, ecc), and it’s still the mainly used programming technique.

Yes, because the next-step, id est Object-Oriented programming, is still procedural. By OO programming we mean languages where data, and methods that can manipulate the data, are packed together into single units called “objects” that represent a coherent and transparent set of information; in fact, external user can access the data is via the object’s “methods”, which are nothing else than well-fitted procedures. The revolution carried by OO programming doesn’t lie on a brand new programming paradigm itself, but rather on how programs are structured thanks this sort of overlay. In fact object-orientation is implementable, in different degrees, on every typed language. There are indeed masochistic implementations of assembly-OO-languages (HLA) as well as functional OO languages (Needle), pure OO languages (Ruby) ecc.

OO programming is an elegant way of expressing a program as an algebra. We define objects (classes) as a hierarchy of structured containers of data plus the operations available on the structured container. This way of defining programs allows for a much cleaner level of abstraction, since we do not manipulate data with “random” chunks of code (procedures) but the data only allows structured accesses to itself through its operators, which in turn become part of the data themselves. [GM]

After OO-programming, another little step toward another layer of abstraction has been given by code managing, which means that the program is executed under the management of a virtual machine, instead that directly on the computer’s CPU; wrapping the executable into a sandbox enhances its safety and security and allows the creation of monsters like garbage collectors, but usually at a price of a noticeable overhead (JIT compiling, intermediate languages and similia have been invented to increase efficiency of interpreted languages). It’s important to observe, though, that theoretically every language could be interpreted (executed into a sandbox) or compiled (and then executed directly on the CPU).

Managed code allows to define safer constructs, for example forbidding (let the Lord of Heaven be praised) pointer arithmetic (cit. Djikstra). By providing some restrictions on managed code, we can achieve compilers that guarantee very strong statically validated properties (see Singularity compiler) and very high level runtime primitives ranging from lambdas to reflection to quotations. [GM]

On the other side of the barricade (which means universities), another paradigm was being developed in the meantime: declarative programming. For a long time this has been considered just an academic exercise, but later it has gained popularity among specialist. A program written in a declarative language will express the logic of a computation without describing its control flow, focusing on (indeed) declaring what the program should accomplish, rather than describing how to go about accomplishing it.

Different needs

The relationship between languages and expressiveness is hence summarized in the following picture.

Where imperative languages allow the programmer to have a deeper control of what the machine’s actually doing, the declarative approach simplifies a lot the task of programming: we’ll see in the next paragraphs how this abstraction is obtained, what it’s useful for and which needs may supply.

What should be clear at this stage is that there’s no such thing as a “better language” as a general statement. There are more suitable languages for different purposes: nobody would enjoy writing a parser in assembly just as nobody would write a kernel in F# without being a total sandwich short of a picnic, but both things are possible.

Trends

As suggested by Anders Hejlsberg recently, there are three trends that broadly impact on computer programming today. The first is the tendency toward forms of more declarative programming, which is made clear by the wide diffusion of Domain Specific Languages, which are languages targeted to a specific context (XAML, XSLT, HTML, SQL, CSS, Makefiles and so forth). There’s a huge debate around what is and what isn’t a DSL, but here we’ll just observe that they play a relevant role in today’s programming scene.

Second comes the need for dynamic languages. According to Wikipedia, such languages are “high-level programming languages that execute at runtime many common behaviors that other languages might perform during compilation, if at all. These behaviors could include extension of the program, by adding new code, by extending objects and definitions, or by modifying the type system, all during program execution. These behaviors can be emulated in nearly any language of sufficient complexity, but dynamic languages provide direct tools to make use of them.

Last and definitely not least, is the need for more concurrency. As you may know, nowadays Moore’s law doesn’t apply to CPUs frequency any longer, but it rather does to the number of cores that processors have. This is an actual revolution the market is pushing, which is throwing into crisis the instruments that have been developed after more than half a century of time-sharing mono-processor multitasked programming. Anyone who has ever had the misfortune of coping with Java’s threading model, for instance, knows how complex is to develop multithreaded applications that perform real concurrency. There are countless problems due to shared memory locations and resources, side-effects, (dead)locks, and so on. Thanks to memory sharing issues, it’s impossible to parallelize efficiently different operations if they share variables or resources, because the broad and unavoidable use of mutexes would make them wait for each other exactly as it would happen in a sequential execution. Such solutions don’t really sound like viable possibilities in a market where CPUs with hundreds of cores are foreseen to be developed within the next five years.

Functional Programming and λ-calculus

So, straight to the core of this article, functional programming is possibly an answer to all the needs that computer programming is asking for today. FP is a particular kind of declarative programming where the program consists entirely of functions, in a mathematical sense. Hence, functional programs do not contain any assignment statement, since there aren’t variables and therefore, no side-effects (Although very few modern FP languages don’t allow for side-effects. Haskell is a notable exception, confining side-effects only to the restricted scope of monads. Languages like OCaML or F# allow wide use of mutability while making everything immutable by default. [GM]). So since expressions denote one single value, they can be dynamically evaluated at any time, freeing the programmer from the burden of prescribing one particular control flow; functional programs are hence said to be referentially transparent. We’ll see how all of this is possible further on this series of articles.

The theoretical framework in which FP can be put into is Lambda calculus, a formal system devised in the 30s by Alonzo Church to investigate functions, function application and recursion. It’s basically a very general programming language itself, made of a single transformation rule (variable substitution) and a single function definition schema. In spite of this simplicity, it does formalize every possible computable function.

Objects of study in Lambda calculus are λ-terms, which are anonymous functions that perform an expressed transformation to their arguments. A basic λ-term is this:


which represent an identity function. The following expression is a simple λ-term that expresses a function (λ) that takes x as a argument an returns its double (2x).


Obviously functions can be nested, like this:


That is the function of x that returns the function of y that returns the product of the sum of x and y and 2. Since all λ expressions have only one argument, they can be cascaded like in 3. Such a practice is called currying, not because it’s spicy, but because it’s named after Haskell Curry, whom we’ll meet again soon.

Evaluating λ-expressions is a simple task that reminds formal grammars sentence derivation:

When lambdas come in, the situation gets a little bit more interesting. (The parenthesis is used to “call” a function; when a λ-term is inserted into parenthesis that means that it’s to be evaluated).

The substitution we operated between 8 and 9 on the formal parameter is so called β-reduction, which is a particularly important operation (along with α- and η-conversion), because the meaning of lambda expressions is defined by how expressions can be reduced. It’s possible to apply recursively λ also to other expressions, as follows:

In an expression, each appearance of a variable is either bound or free, depending on whether such variable is an argument of λ or not. β-reduction of

Replaces every x that occurs free in E with y, where E is another well-formed λ-expression. In general, to make λ-terms less confusing, we can use the α-conversion, which is nothing more than a rule that allows us to rename parameters and that we implicitly used also in 3.

It comes quite natural to evaluate λ-terms using the same application order that we are used to normalize polynomials with, but there are actually several issues related to the order that can be chosen to reduce an expression. Basically there are two possibilities: always reduce the leftmost innermost redex (which is a sub-expression that can be reduced), or always reduce first the leftmost outmost redex. These two approaches are respectively called applicative and normative reduction. The former is usually more efficient, the latter accepts a larger class of programs. What matters here is that progressive reduction should lead to what it’s known as a normal form, that is a λ-expression that cannot be reduced any further. Thus,


Is the normal form of

Like human relationships, not every λ-expression has a normal form. Trying to reduce a non-reducible expression to a normal form will only lead to the same form again and again in an infinite loop. There is, anyway, no algorithm which takes as input two λ-expressions and outputs TRUE or FALSE depending on whether or not the two expressions are equivalent. This was historically the first problem for which undecidability could be proven. (There is a strong relationship between Russel’s Paradox and untyped λ-terms where something is applied to itself. [GM])

What is probably the most important result in λ-calculus, is the Church–Rosser theorem, which states that if there are two distinct reductions starting from the same lambda calculus term, then there exists a term that is reachable via a sequence of reductions from both redex. In other words, if there’s a normal form, there’s a strategy to find it and the result is unique; in formal terms, CR theorem is a proof of confluence. At a higher level, it means that evaluation (β-reduction) can be carried out in any order, even in parallel.

It’s important to notice that functional programming is not language-specific, because it’s a general theory (and functional programming is λ-calculus with a more palatable syntax). For instance, C++ or Java can be programmed in a functional way (an article about this will follow), although there are languages specifically designed for the task. What matters is that, on a more formal side, it’s been proven the isomorphism between Turing Machines and λ-calculus.

Λ-calculus give us a solid mathematical foundation, basing on which it’s possible to comprehend that functional programming is no more an academic exercise, but rather it’s the next natural step in computer programming.

—————————————————————————–

[GM] notes by Giuseppe Maggiore