The Misp Chronicles XII: The Lost Interpreter

William Taysom on Wed, 17 Jan 2007

How easy is it to become distracted? And once distracted to remain so for months if not years on end? Many have asked when I would conclude the Misp Chronicles. The answer is today.

As promised, the interpreter is 50 lines of Misp. Using the syntactic shorthand we’ve discussed, it becomes 44 somewhat clearer lines. Finally, we have a Scheme version of the interpreter of 72 lines. For the scheme version, I replace Misp’s if with ef. Why? Because Scheme already uses if.

Thank you for your patience. Though distracted I’ve been, I’m excited to tell you about the object of my distraction. You may recall that in my first Misp post, we saw a far off peak, Misp’s strange twin, a passion, a distraction, I call it Wisp.

Tags ,

The Misp Chronicles XI: Closure Currying

William Taysom on Mon, 27 Mar 2006


If you've ever met a functional programmer, you might think that his favorite food is curry. In some cases, this is quite likely. But in all cases, functional programmers are always making curry. They even do it without powder. Instead they use closures.

What's a Curry?

A curry is a special way of defining functions. Functional curries are only accidentally related to Thai and Indian cuisine. Functional curries come from Haskell, not the programming language, not originally, rather the man Haskell B. Curry. And not him either, really, because Christopher Strachey coined the term and Frege was currying long before Haskell. Others were currying long before that and I'm sure you've curried too since comes up in a elementary algebra.

When I first learned algebra, I was told that a ^ (b * c) is the same as (a ^ b) ^ c. I couldn't tell you why that's true because they don't teach math in school (or science for that matter) instead they teach dogma. I did poorly in my mathematical dogma classes so I needed remedial education: I majored in math. Math requires logic. I'm not a very logical person, so I got my Masters in Philosophy. Philosophy was far too tricky, so I got a job programming computers. Computers, as we all know, are remarkably stupid. You have to explain everything, and they're inclined to misunderstand. It's easy to mistake their ignorance for malice. Or in the immortal words of Cleveland the Janitor, "It's a dumb beast you be foolin' with. It don't never sleep."

Algebra class curring is the process of converting an expressions of the form a ^ (b * c) to (a ^ b) ^ c. As applied to functional programming (and Misp in particular), its the process of converting an expression of the form {|a b| c} to {|a| {|b| c}}. Let's consider a specific example. The function which forwards directly to pair is {|a b| (pair a b)}. The curried version is {|a| {|b| (pair a b)}}. This difference, though small, has practical uses. Rather than building a pair all at once, you can supply part at a time: the head first and the tail later. From what I hear, Python is adding this spiffy trick.

Let's see an example of curried pairing in action:

(({|a| {|b| (pair a b)}} 'serpents-head) 'lions-tail)

This expression evaluates to (serpents-head . lions-tail): it makes a chimera. How so? Let's look at the steps. We start with a curried pair. After add a serpents head, the resulting mixture is a potion of hydradic growth. When applied to a lions tail. The lion becomes a chimera. (We still need to add the goat head, but you get the idea.) Using the environment building transformation we talked about previously, we can make all the steps explicit:

({|curried-pair| ({|hydradic-growth| (hydradic-growth 'lions-tail)} (curried-pair 'serpents-head))} {|a| {|b| (pair a b)}})

Whoa. After reading that, I need a break.

Sugar Break

Improving the environment on mountain of Misp is wonderful service oriented activity. But it's pretty hard for a regular guy. It's a job for Context Free Man, and I'm not up to the challenge. Building the environment is a balancing act like balancing parenthesis. The rule for balanced parenthesis is simple:

paren-wrapped ::= ( yummy-goodness )

As you add yummy goodness, the open and close parenthesis get further and further apart. It's the context-free-separation-disaster, and we need a super hero to take it on.

The disaster gets worse when you try to enrich the Misp environment:

rich-environment ::= ({|object-name| yummy-goodness} object-value)

See how the name of an object gets separated from its definition as you add yummy goodness. That's crazy-go-nuts! It's so crazy that Lisps resort to macros to combat the problem. In the Misp world, hiring Macro Man is out the the question: he's too expensive. All we can afford is some more meta-level syntactic sugar. It's empty calories, but Misp has been scare on programmable substance from the start. Misp proper doesn't have strings or numbers or files or IO of any kind -- just symbols and pairs. We have ways of writing down symbols and pairs, but it's all meta-language. Meta-language is cheap since you can't program with it. It won't break the bank if we add one more bit. Here's the rule:

Instead of: ({|object-name| yummy-goodness} object-value)
We write: object-name = object-value; yummy-goodness

If the expression takes more than one line, we can (and will) leave off the semicolon. The sugared chimeric formula is:


curried-pair = {|a| {|b| (pair a b)}}
hydradic-growth = curried-pair 'serpents-head
partial-chimera = hydradic-growth 'lions-tail

Syntactic sugar does a body good.

The Closure Conundrum

Curry is yummy, but making it is tricky. See curried-pair is function which returns a function such as hydradic-growth. hydradic-growth has the effect of causing its argument to sprout a serpent's head by pairing it with serpents-head. Thus our partial-chimera looks like (serpents-head . lions-tail). So hydradic-growth does the same thing as the function {|b| (pair 'serpents-head b)}. It's as though the a in curried-pair was replaced with 'serpents-head. That's a natural way to think about it, but it isn't what Lisps do. Why? Well suppose that the inner body of curried-pair was complex with many occurrences of a. Replacing every a would require scanning the whole body. If the body contained conditionals, then you might end up doing a lot of work that no one ever sees. Lisps work differently.

Instead of replacing every a, Lisps usually make a table to represent the environment. In hydradic-growth, the symbol a is paired with the symbol serpents-head. Most Lisps have no way of representing these so called closures. Since we need to talk about them, we should have a way of writing them. We'll use a Misp expression for it! We'll represent hydradic-growth with the expression (__closure__ ((a . serpents-head)) (b) (pair a b)). Notice that the body of the closure is the same as the body of curried-pair. We use the list of pairs ((a . serpents-head)) to show that a refers to serpents-head.

Finally, when we apply a closure hydradic-growth 'lions-tail each time we come across a variable we first check if its an argument, and if not we see if it has an associated value in the environment. That's all there is to it. And now we're ready to write a Misp interpreter in Misp.

Tags

The Misp Chronicles X: Environmental Functions

William Taysom on Tue, 21 Feb 2006


Functions in Lisps do all sorts of amazing things: some are truly surprising. One of the most common uses is to give objects names. Since Misp doesn't have any other way of naming, this technique is absolutely essential.

Applying Functions: A Review

A function in Misp has two parts: a head and a body. The head is a list of symbols or a single symbol. These symbols are assigned values when a function is applied. The body of a function is any Misp expression. When you evaluate the body, the symbols in the head refer to their associated values.

Consider the self pairing function (fn (x) (pair x x)). The head is (x). The body is (pair x x). To apply it with 'hello, we write ((fn (x) (pair x x)) 'hello). First we associate x with the symbol hello. Then we evaluate (pair x x) by looking up each symbol and applying when were done. pair is a primitive, so we move on to the other two terms. They are both x. Since x is associated with hello, we pair hello with hello to get (hello . hello). Hello!

Syntactic Sugar: Soul Sweetener

When a programmer is born into the world, he is given a lifetime supply of parenthesis. Unfortunately I wasn't in the Lisp line. I've already used up most of mine. In an effort to avoid an untimely end, I'll use a block syntax like Ruby for functions.







BlandSugaryArgsReturns
(fn (x) (pair x x)){|x|(pair x x)}'hi(hi . hi)
(fn (x y z) z){|x y z| z}'a 'bnil
(fn (ls) ((fn (x . xs) xs) . ls)){|ls|({|x . xs| xs} . ls)}'(a b c)(b c)
(fn xs xs){|.xs| xs}'a 'b 'c(a b c)

The second example shows that z gets bound to nil when there is no corresponding argument. The third example shows that a function can be used inside of a function.

Improving the Environment

It would be great if we could give functions names instead of explicitly defining them whenever we use them. For instance, (list 'a 'b 'c) is clearer than ({|.xs| xs} 'a 'b 'c). Misp only has one way of assigning meaning to symbols: function application. An assignment of symbols to values is called an environment or a context. The association created in function application extends Misp's environment thereby enriching the expressivity of the language.

To give the symbol list a meaning in the expression (list 'a 'b 'c), list must be the argument to a function {|list|(list 'a 'b 'c)}. What happens when we evaluate ({|list|(list 'a 'b 'c)} {|.xs| xs})? We extend the environment so that symbol list refers to {|.xs| xs}. When we evaluate the body (list 'a 'b 'c), we look up list in the environment. Since list refers to {|.xs| xs}, the result of the body is (a b c).

Suppose that instead of building a the list (a b c), we want to append a and b to the front of any list, (c d e) for example. Assuming we have an environment where the symbol list refers to {.xs|xs}, we say (list 'a 'b . '(c d e)). How do we generalize from (c d e) to any given list ls? For starters, we replace '(c d e) with a symbol ls. Then we make (list 'a 'b . ls) the body of the function {|ls| (list 'a 'b . ls)}. That's it except we're assuming the definition of list. To make our assumption explicit we say ({|list|{|ls|(list 'a 'b . ls)}} {.xs| xs}). The function application simply extends the environment so that list is defined in the inner function {|ls|(list 'a 'b . ls)}.

Though simple, clear, and intuitive, Lisp didn't build environments in this way for a long time. Scheme was the first Lisp-like language to do it, Common Lisp followed in the '80s. What took so long? What's the fuss about? Next time: The Closure Conundrum.

Tags

The Misp Chronicles IX: fn

William Taysom on Sat, 11 Feb 2006


Functions are Misp's final and most important construct. They have so many uses that it will take several episodes to explore their mysteries. Today, we'll cover how to define functions and how to invoke them.

Functions are created with the fn keyword. Other Lisps use lambda in tribute to the Lambda Calculus. Unfortunately, the word "lambda" makes me think of llamas, which are quite upsetting, so we'll use fn instead. A function has two parts. The head is a list of symbols that are bound when the function is invoked. The body is an expression which uses those symbols. Consider the function (fn (x) (pair x x)). The subexpression (x) is its head or parameter list. The subexpression (pair x x) is the the body.

We invoke a function by making it the first element in a list. The subsequent items are used as arguments. For example, ((fn (x) (pair x x)) 'hello) takes an argument x and returns x paired with itself. In this invocation, 'hello is bound to x. This means that x refers to 'hello when the body of the function is evaluated. Every time we come across x we know to replace it with 'hello. Thus (pair x x) means the same thing as (pair 'hello 'hello) which evaluates to (hello . hello).

When describing the Lambda Calculus, we usually say that 'hello is substituted for all (free) occurrences of x in (pair x x). Lisps see the world differently. Instead of replacing x upfront, we wait. We make a note that x ought to be replaced with 'hello, but we don't actually do the replacement until its needed. The collection of notes saying what ought to be replaced is called an environment. We'll talk a lot more about environments in the future.

For today, we'll introduce some shorthand for functions. Experienced Lisp programmers say, "after a while you don't notice the parenthesis." I'm not one of them. With all the rough terrain ahead, I don't want us overburdened by parens. I'd rather trade simplicity for clarity. We'll adopt Ruby like block syntax for our function definitions.

Let's start with ((fn (x) (pair x x)) 'hello). Instead of writing (fn ...), let's use curly braces {...}. Instead of grouping arguments with parenthesis (x), let's group them with vertical bars |x|. These are little changes, but I think it helps: ({|x|(pair x x)} 'hello). As another example, consider this function (fn (x y) y) which returns its second argument. It becomes {|x y| y}.

This shorthand leaves a two sticky cases. The expression ((fn (x . xs) xs) 'a 'b 'c) ends the parameter list with the symbol xs instead of nil. (Recall that (x y) is short for (x . (y . nil)).) What do we do now? Scheme binds x to the first argument a and binds xs to the rest of the arguments (b c). Therefore, ((fn (x . xs) xs) 'a 'b 'c) evaluates to (b c). The shorthand for this must wait because we've stumbled across a curious hole. Intrepid explorers we will see what there is to see.

Recall that (tl '(a b c)) is (b c). We just saw a function that is remarkably similar. Can we go all the way? Note that ((fn (x . xs) xs) 'a 'b 'c) is the same as ((fn (x . xs) xs) . '(a b c)). We need to generalize from '(a b c) to any old list. We're in luck, functions are good at this. We just replace '(a b c) by a symbol ls, bind ls by a function, and apply '(a b c).

Behold ((fn (ls) ((fn (x . xs) xs) . ls)) '(a b c)). The expression evaluates to (b c) just like (tl '(a b c)). The equality isn't an accident: it works in general. We've defined a function that does the same thing as tl without mentioning tl. So tl is not an irreducible primitive in Misp. We can replace tl with the function (fn (ls) ((fn (x . xs) xs) . ls)). Similarly, we can replace hd with the function (fn (ls) ((fn (x . xs) x) . ls)). fn just ate hd and tl. The function construct is a cannibal god. It demands the sacrifice of primitives.

Tags

Nine in Nine

William Taysom on Mon, 06 Feb 2006


To those following the Misp chronicles, let me assure you that the series has not been cancelled midseason. Production on episode nine was initially delayed due to an emergency trip to Britannia. With Lord British restored to sovereignty, I've been preparing our equipment for the journey into the majestic and perilous territory of functions. Functions have many uses, and are used in many languages. Today, I thought I'd show you some examples. We'll see the anonymous function for adding two numbers defined in nine languages. But wait there's more, we'll apply the function to four and five yielding nine in nine.

An anonymous functions have several important properties:

  1. Anonymous functions don't necessarily have names.
  2. Anonymous functions can be generated at runtime.
  3. Anonymous functions can be passed as arguments to other functions.
  4. Names used in anonymous functions are looked up lexically.

The last point could use some further elaboration. Looking up a name "lexically" means that you determine its referent using a specific context (or scope). A context is an assignment of names to values. The lexical context is the context where an object is defined. A function that captures such a context is called a closure. In practical terms, using lexical scope means that you can tell the identity of an identifier by looking at your source code. If a name is used in a function, first you check if it's local or a parameter. If not, then you look to the enclosing block. If the enclosing block doesn't define it, then you keep looking outwards until you reach the global top-level. Some languages have simple scope hierarchies: C has a fixed number of levels (it's either three or four). Today's nine languages all support many levels. That said, let's bring out the contestants.

Today's Contestants

Java bursted out of Sun in 1994. A systems language cleverly retargeted and marketed for the web, Java soon became the de-facto standard object oriented language. (C++ doesn't count, "C++ pretends to provide an object-oriented data model, C++ programmers pretend to respect it, and everyone pretends that the code will work," Guy Steele coauthor of the Java Language Specification.) Though Java doesn't have anonymous functions per se, it does have anonymous inner classes. Again Guy Steele, "when anonymous inner classes were introduced into Java, we had a full implementation that made them act like closures. We got push-back from users." Despite the push-back, from the right base class, function-like inner classes are readily obtained.

JavaScript first appeared in Netscape Navigator at the end of 1995. Infamous for popups, cross-browser incompatibility, and being mistaken for Java, JavaScript has cleaned up its act. With the emergence of web-standards, JavaScript's semantics was refined by the ECMA-262 spec earning JavaScript it's official title: ECMAScript. Standing alone as a popular scripting language with a formal spec, ECMAScript was realized as ActionScript in Macromedia Flash and even found its way into Adobe Photoshop. Professionally, JavaScript has been my primary language for the passed year as I've been employed in mutating Mozilla, Navigator's open source child, for research purposes.

Slate is a new Smalltalk-80 descendant in development since 2003. Inspired by CLOS (the Common Lisp Object System) and Sun's research language Self, Slate is an attempt to redeem the promises of Smalltalk that were never realized. Only the future can tell if Slate will succeed.

Ruby, brain child of Yukihiro Matsumoto, was conceived in February of 1993 by the then unemployed Matz. After a two year gestation, Ruby became public in 1995. True to its design philosophy, "we need to focus on humans, on how humans care about doing programming or operating the application of the machines. We are the masters. They are the slaves," Matz produced an industrial quality gem. Excellent for everyday programming tasks of all sizes, Ruby is the most practical programming language I know. My only complaint is a consequence of its greatest virtue. Says Matz, "Actually, I'm trying to make Ruby natural, not simple." Natural it is, simple it is not: Mixins are Modules, Procs aren't Blocks, and custom instances are Eigenclasses.

Scheme: "Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary." First designed in 1975 as a cleaner Lisp, Scheme has had five major revisions since then. If Common Lisp is at the bottom of the language lattice: the great repository of all language features, then Scheme is at the top: remove anything and all you have left is a bunch of parenthesis.

Misp is a minimal Lisp like language that I invented on November 11 of last last year. Misp is a minimal by only having symbols and pairs. Due to its simplicity, a Misp interpreter can be implemented with any of today's languages in just about a hundred lines. My Misp meta-interpreter is exactly fifty lines.

OCaml is an impure strongly typed functional programming language that performs. Creator Xavier Leroy, "OCaml delivers at least 50% of the performance of a decent C compiler." OCaml is the 1996 child of the then year old Caml. Both were developed by INRIA (the French National Institute for Research in Computer Science and Control). OCaml extends Caml with Object Oriented constructs. Caml itself is impure because it has some imperative primitives.

Haskell, the premiere pure functional language, was created by committee in 1987 to be a language that's as appropriate for academic papers as it is for compilation on a computer. I know I've read much more Haskell than I've written. Haskell looks more like mathematical notation than a programming language. Paul Hudak, "we provided [our Haskell prototype] without explaining that it was a program, and based on preconceptions from their past experience, they had studied [it] under the assumption that it was a mixture of requirements specification and top level design. They were convinced it was incomplete because it did not address issues such as data structure design and execution order."

wl is my favorite language. In fact, wl is my, Will's, language. In frantic development (I don't sleep), I'm hoping to have a public release ready by the middle of 2010.

Show me the code!

Now that everyone's introduced, the game to write an idiomatic program that applies an anonymous function for adding two numbers to the numbers four and five. Notes follow the results.












LanguageCodeLengthCompared to Misp
Java
((Integer)(new Function() {
public Object call(int x, int y) {
return new Integer(x + y);
}
}).call(4, 5)).intValue();
122508%
JavaScript
(function(x, y) { return x + y; })(4, 5);
41171%
Slate
[| :x :y | x + y] applyWith: 4 with: 5.
39162%
Ruby
lambda {|x, y| x + y}.call 4, 5
32133%
Scheme
((lambda (x y) (+ x y)) 4 5)
28116%
Misp
((fn (x y) (+ x y)) 4 5)
24100%
OCaml
(fun x y -> x + y) 4 5;;
24100%
Haskell
(\ x y -> x + y) 4 5
2083%
wl
{v + w} 4 5
1146%

The Java version assumes the Function class with several varieties of call. call returns an Object in order to allow all kinds of return values for functions of two ints. Since Java 1.5 has a stronger type system, one might be able to simplify this code.

Though syntactically similar to Java, JavaScript's biggest difference is the function keyword for defining anonymous functions. I use semicolons ; here because it's the preferred JS style; however, in my own code, I omit them whenever possible.

Slate uses a block syntax very much like Smalltalk's. A colon : before an identifier in the header indicates that the variable is an argument. If there were no colon, you are declaring a local variable. I chose to send the applyWith:with: message instead of the roughly equivalent applyTo: because applyTo: would require that I put the arguments into a list first.

Ruby is inspired by Smalltalk and it shows. The lambda is necessary because Blocks are not Procs: blocks must be the argument to a method, so the lambda method is used to capture one. Ruby makes parenthesis around arguments optional. I exercise the option whenever I can. Some people say that after writing enough Lisp, you disregard the parenthesis; you stop seeing them. I disregard them, so I see them as noise.

Notice that both Scheme and Misp use prefix syntax for the sum +. Strictly speaking, Misp doesn't have numbers. Moreover, symbols in Misp are completely atomic: there's no way to discover the letters used to write them. Therefore, you couldn't really implement a sum function like this even if you wanted to.

Since Haskell and OCaml are functional languages, it's no surprise that the function definition syntax is very elegant. It's not a typo, OCaml does use two semicolons ;; to end a statement. The backslash \ in Haskell is meant to suggest the Greek letter lambda.

In wl, you don't have to name the arguments to a function. If you don't, then v refers to the first argument and w refers to the second argument. Other arguments are accessible by index through vs. Implicit binding has great uses. All of the languages (except Misp) support implicit binding in one way or another. In wl, it is conventional to use v and w for short functions in which more descriptive names wouldn't do much good. If had been set on naming the variables, we could have written {x y -> x + y} 4 5.

Tags

The Misp Chronicles VIII: If

William Taysom on Thu, 12 Jan 2006


So far our Misp adventures have focused on the construction, destruction, and comparing of structures. Our final two primitives take us passed simple structures into the realm of computation. Today, we will add the conditional if to our toolbox.

The conditional if takes three arguments: an antecedent, a consequent, and an alternative. The antecedent is a test expression. It's evaluated before the consequent or the alternative. If result is nil, the alternative is evaluated and becomes the value of the conditional; otherwise, the consequent is evaluated. Consider a few examples:

(if nil 'nope 'yes) => yes
(if (eq nil nil) 'yes 'nope) => yes
((if (eq nil 'x) quote pair) (if (eq 'a 'b) 'c 'd) (if (eq 'e 'e) (pair 'g nil) 'h)) => (d g)

Notice that the last example contains three conditionals. Since (eq nil 'x) is nil, the alternative pair of the first conditional is evaluated. As one of our primitives, pair evaluates to itself. The antecedent (eq 'a 'b) of the second conditional is also nil. Its alternative 'd evaluates to the symbol d. So the value of the second conditional is d. The antecedent (eq 'e 'e) of the third conditional is not nil. So its consequent (pair 'g nil) is used. After having resolved the three conditionals, we have the operator pair, the symbol d, and the pair (g). Applying pair to d and (g) gives us the final answer of (d g).

Conditionals are simple and straightforward. Misp's final construct will appear simple, but is far from straightforward. The final construct is a more general conditional. With it, having a primitive if is redundant. This final construct is a powerful form of goto: it's a goto which takes arguments. Arguments allow you to give symbols special meanings inside the goto. This potent goto is known as abstraction since arguments abstract (or parameterize) some part of a wrapped expression. Function definition is another name for abstraction. In Misp we'll use the symbol fn to create abstractions. You must also know that abstraction has the infamous title lambda: the ultimate goto.

Tags

The Misp Chronicles VII: Seven of Nine

William Taysom on Wed, 04 Jan 2006


Back from the holidays and having a cold, I'm back to describing the minimal Lisp which we have come to call Misp.

Over the last seven episodes, we've looked at seven of Misp's constructs. In the next few episodes, we'll examine that last two. Then we'll take the final step of implementing a Misp meta-interpretor. Before pressing on, let's review the first seven constructs we've used so far.

Syntax

Misp has two kinds of objects: symbols and pairs. We use strings to represent symbols. Here are some symbols:


look bunny RaPiD voluntary-scuba-diving fabric $++%++ perlification nil

We use parenthesis and dots to represent pairs:


(head . tail)

We adopt two rules for abbreviating constructs involving pairs:

Nil Hiding: (head) is short for (head . nil).

Tail Folding: (head tail . etc) is short for (head . (tail . etc)).

We use the shorthand rules to make Misp easier to write and read.

Semantics

Languages give meaning to structure. The structure a language uses can be anything: sound, shapes, hand motions. In Misp, it's symbols and pairs.

In a programming language, determining the meaning of an expression is the same as running a program. Misp is a pure language: when you run a Misp program it returns a result without having any other meaningful side effects.

Misp has a generic rule for evaluating expressions. Formulating the rule precisely amount to writing an interpreter for Misp. Later in the series, we will write an interpreter for Misp using Misp.

The meaning of a pair depends on its head. Once you know the meaning of the head. You know how to use the tail to determine the meaning of the whole. The meaning of a symbol depends on context. Regardless of context nine symbols have special meaning. Today, we review how seven of the nine special symbols are interpreted.

nil evaluates to itself. nil is used to represent a "no" answer, falsehood, the empty list, or anything generally vacuous.
nil => nil

quote. Quoting an expression causes it to be interpreted literally. Use quote to get a hold of the literal structure of an expression.
Notation: 'expression is short for (quote expression).
'hi => hi
'(a . pair) => (a . pair)
'(what (is up)) => (what (is up))
''double-quote => 'double-quote
'(a nested 'quote) => (a nested 'quote)

eq tests if things are identical. Two symbols are identical if they have the same letters. No two quoted pairs are identical.
(eq 'hello 'world) => nil
(eq 'hello 'hello) => true
(eq nil nil) => true
(eq nil (eq 'hello 'world)) => true
(eq '(pair) '(pair)) => nil

atom tests if something is a symbol.
(atom nil) => true
(atom 'true) => true
(atom '(pair)) => nil
(atom '(pair of ((pair) . (pair)))) => nil

pair creates a pair of things.
(pair nil nil) => (nil)
(pair 'pair nil) => (pair)
(pair 'hello (pair 'world nil)) => (hello world)
(pair (pair 'a 'b) (pair 'c 'd)) => ((a . b) . (c . d))

hd gets the head of a pair.
(hd '(head . tail)) => head
(hd (pair 'head 'tail)) => head
(hd (hd (pair (pair 'a 'b) (pair 'c 'd)))) => a

tl gets the tail of a pair.
(tl '(head . tail)) => tail
(tl (pair 'head 'tail)) => tail
(tl (tl (pair (pair 'a 'b) (pair 'c 'd)))) => d

Tags

The Misp Chronicles VI: Pair, Hd, and Tl

William Taysom on Sat, 10 Dec 2005


Now that we can tell pairs from symbols, let's add a way to make pairs and to look inside them. Historically, the three functions for doing it are called 'cons', 'car', and 'cdr' where 'cons' is constructs pars, 'car' returns the head, and 'cdr' returns the tail. Why 'car' and 'cdr'? They were names of assembly language macros for the IBM 704 where Lisp was first implemented. With all do respect to IBM, I prefer hd and tl. I pick pair over cons simply because I'm biased against velar stops. Furthermore, pears are yummy. Bet you can't say that about conses? Call them what you will, here's how they work.

To make a pair of things you say (pair 'of 'things). The result of evaluating (pair 'of 'things) is (of . things). To get the head of a pair say (hd '(the-head the-tail)). Here, the answer is the-head. To get the tail say (tl '(the-head the-tail)). The answer is (the-tail). Remember, (the-head the-tail) is short for (the-head . (the-tail . nil)). So (tl '(the-tail)) is nil. What's (hd 'just-try-to-take-my-head)? There's no answer. Symbols aren't pairs so they don't have heads. The answer isn't yes or no, it just doesn't make any sense. If I were to ask you the question, "Are you still studying blacksmithing?" Both "yes" or "no" seem funny if you've never studied blacksmithing. Same thing with (hd 'just-try-to-take-my-head). Even Common Lisp and Scheme disagree about the answer. If you ask Common Lisp about nil by saying (car nil), it says nil. If you ask Scheme, it by saying (car ()), it says huh? Specifically, the Scheme I use says, Error in car: expected type pair, got '()'. Since no two Lisps can agree, I'm not going to say what Misp does. To make a meta-interpreter you don't need to specify. The meta-interpreter will do whatever the underlying hd or tl operator does. Take your pick.

Tags

The Misp Chronicles V: Atom

William Taysom on Fri, 09 Dec 2005


Misp has two kinds of objects: symbol and pairs. Our second function tells the difference between them. When you ask (atom '(hog wash)) the answer is nil since (hog wash) is a pair. On the other hand, if you ask (atom 'hog-wash) the answer is true since symbols are atomic. Even though the strings "(hog wash)" and "hog-wash" resemble each other, in Misp they mean completely different things. Since nil is a symbol, (atom nil) is true. When the answer to a question is not nil, Misp uses something besides nil to represent it. What does it use? The exact value doesn't matter. I'm able to write my Misp meta-interpreter without ever needing to know. If forced to pick, I'd use the symbol true. Common Lisp uses T and Scheme uses #t (which is not a symbol). I prefer true, but I'm not going count it as one of my necessary constructs since I get along fine without knowing what it is.

Tags

The Misp Chronicles IV: Eq

William Taysom on Thu, 08 Dec 2005


Like any normal language, Lisp provides a way to talk about itself. In English, you can say that the word "word" is a word with four letters and the third letter is "r". Lisp works the same way except since symbols are atomic, we can only say whether or not they are the same. We use the symbol eq to ask the question. So (eq 'eq 'happy-go-lucka-day) asks the question, "Is the symbol eq the same as the symbol happy-go-lucka-day?" We can tell that they are different since their representations are different strings. In Misp, the symbol nil means "no." Therefore, this statement is true (eq (eq 'eq 'happy-go-lucka-day) nil). Notice two things. First off, you can nest expressions in Misp. The meaning of the whole expression depends on the meaning of its parts just as the meaning of a whole English sentence depends on the meaning of its words and constituent phrases. Secondly, I didn't forget to quote nil. In Misp the meaning of nil is nil. In other words, (eq 'nil nil). This is also true in Common Lisp. It is not true in Scheme. By default nil doesn't have any meaning. Scheme uses () for the empty list and #f to mean "no." Neither () nor #f are symbols in Scheme: () is null and #f is a boolean. Unlike symbols, pairs are constructed: no two are eq. Thus (eq '(a . pair) '(a . pair)) is nil.

Tags

The Misp Chronicles III: Quote

William Taysom on Wed, 07 Dec 2005


Having a way of writing things great; but, scratches on a page or black and white dots on a screen are not a language unless those scratches and pixels mean something: unless they have an interpretation. As a start, being able to recognize dots as letters, letters as strings, and strings as symbols, amounts to a primitive language. The language is literal. It has no intrinsic meaning beyond the structure of its symbols. The structure can be relatively simple like what we've seen here or it can be as complex as XML.

Unlike XML, programming languages let you do more than just structure things. A programming language gives expressions meanings and interpretations. Often, the meaning of an expression is a computation. For example, in a language with arithmetic operators, the interpretation of 7 + 2 is 9. Misp doesn't have arithmetic operators, it only has symbols and pairs. Misp gives nine symbols special meanings. Today, we will consider the first of these symbols: quote.

Quoting an expression makes it so that we treat the expression as a literal. Let's look at the example (quote hello). This expression contains two pairs and three symbols. See them? Try looking at the expression without any abbreviations (quote . (hello . nil)). A Misp interpreter determines the meaning of pair by looking at its head. In our example, the head is quote. When Misp sees quote, it knows it should interpret the expression literally. To find the literal component, it looks for the head of the second pair in the list: hello in our example. Thus, the interpretation of (quote hello) is the symbol hello.

Since quote is so useful in Lisps, it has a special shorthand. You can drop the outermost parenthesis and write a single quotation mark at the front. We can reduce (quote hello) to 'hello. The meaning of 'hello is hello. Similarly, the meaning of '(hello nurse) is (hello nurse). What's the shorthand for (quote (quote quote)) and what's it's meaning? The shorthand is ''quote. It's meaning is 'quote. Using the shorthand notation for quotes makes determining their meaning easy: just drop the quote at the front.

Tags

The Misp Chronicles II: Writing

William Taysom on Tue, 06 Dec 2005


Like all Lisps, Misp starts with symbols. Symbols are atomic, like points. They have no internal structure, only relational structure. Their relational structure is as simple as can be: two symbols are either the same or different. For convenience, we use words, strings of characters, to represent symbols. So hello-sailor, what?, and !@$%^&* are all representations of symbols. (We'll stick to strings without whitespace.) Remember that these strings are only representations of symbols, not symbols themselves. What does that mean? Well, both what? and hello-sailor have the letter 'a' in them; their associated symbols don't have any letters. All we know about the symbols is that they're different since what? and hello-sailor are different strings. Flipwise, since hello-sailor is the same string as hello-sailor, they represent the same symbol. We single out one symbol as being special. That symbol is nil. Later we'll show how nil is special because it has a specific meaning in Misp programs.

Since you can't do much in a world with primitive points, Misp needs a way to construct things out of those points. The simplest construction is to just make pairs of things. We write pairs by putting a dot between the two parts and wrapping the whole thing in parenthesis. For example, the pair of hello and world is (hello . world). Since we use dots and parenthesis in writing a pair, we won't use them as symbols. Same goes for parenthesis.

When we make pairs that have nil on their right side, we can leave off both the dot and nil. For example, in place of (surprise! . nil) we can write (surprise!). Similarly, when we pair something with a pair, we can leave off the dot and the the parenthesis around the tail. For example, in place of (fantastic . (surprise!)), we can write (fantastic surprise!). Or in place of (has . (a . (dangling . tail))), we can write (has a dangling . tail). Once we start writing a lot of things, these conventions will become very useful.

Why write (eq . ((hd . ((hd . (env . nil)) . nil)) . (key . nil))) when (eq (hd (hd env)) key) is simpler and just as unambiguous?

Tags

Minimal Lisp: Misp

William Taysom on Sun, 04 Dec 2005


Since the beginning, Lisp was meant as a language capable of interpreting itself. Today we're going to look at a minimal Lisp dialect, Misp. Like other demonstration Lisps, Misp is very simple. Misp has five built-in functions, three special forms, and one singled out constant. With those nine constructs, you can implement a Misp meta-interpreter in just fifty lines.

All of you Muses readers know that I have a preference for writing extended articles, what my coworkers call "long textual documents." I've learned that said long textual documents don't often get too far. Likewise, extended articles are not the standard for pages like Cubicle Muses. Therefore, I've resolved to publish the Misp Chronicles serially. Each installment will explore one principle of Misp: just enough to whet the appetite. The series will culminate in a Misp meta-interpreter implemented in exactly fifty lines of Misp code. To kick things off, I'll tell you a little bit about what Misp is.

Semantically, Misp is a trimmed down Scheme. Misp has lexical scope, multiple argument anonymous functions, and allows arbitrary expressions in head position (unlike Common Lisp). Misp differs from Scheme in that nil evaluates to false in an if expression. Scheme uses the special value #f for this purpose, Common Lisp uses nil.

As you would expect from any Lisp, Misp has symbols, first class functions, and builds data structures by pairing things together. Misp is purely functional and has no top-level: function application is the only way to extend the environment. This makes implementing things a bit tricky. However, just tacking on a top-level would besmirch Misp's elegance with a cantankerous cancer of kludgity.

Tweaking Misp into another Lisp would be of no use. Once you reach the summit, you can climb no higher. What's that? In the distance, through the haze, a tall dark shadow looms in the North: another peak. Cold, forbidding, the pinnacle is lost in a wisp of cloud, a brushstroke on a twilight canvas. It is the anti-Lisp. Lisp's polar opposite. It has no functions. It needs no functions. Rather, this coLisp is all environments. It has first class environments: environment after environment, in environment, of environment, through environment.

What a strange far-off place! I wonder what it's like there. I wonder how to get there. Perhaps we'll go someday. For now, enjoy Misp's tree-lined slopes. Eventually, we'll scale the meta-interpreter heights and one day we may leave the vast kingdom of Lisp for that other, distant, unexplored wilderness.

Tags
plants