Helping ordinary people create extraordinary websites!
GET OUR NEWSLETTER
Your Email:
 

Higher Order Functions

By Jonathan Bartlett
2005-05-13


Building functions at runtime

In Scheme, lambda creates a function. However, functions in Scheme are "bigger" concepts than in many other languages.

To understand the difference, it is first necessary to understand how Scheme views variables. In Scheme, variables are grouped together in environments, of which there are two kinds -- the local environment and the global environment. The global environment consists of all of the global variables of a function; Scheme handles the global environment essentially the same as most other programming languages.

In addition to the global environment, Scheme has numerous local environments. A local environment is the collection of local variables and formal parameters that are visible at a given point in the program. But the local environment is more than just the names of the variables -- it's also the storage/values to which they refer. Unlike local environments in other programming languages, local environments in Scheme can be saved for later use.

So how do you save a local environment; what does that have to do with functions? In Scheme, when a function is defined, if it is defined within a local environment, that local environment permanently attaches itself to the function definition. Therefore the function has access to all of the variables that were active and in scope when the function was created, even if the function that created those variables has since returned. Let's look at some examples of functions with attached local environments.

Listing 10. Examples of functions declared within local environments

;Environment test is a function taking one argument.  It will then return

;a function which remembers that argument
(define make-echo-function
(lambda (x)
;The local environment now consists of "x"

;Create a function which takes no parameters and return it
(lambda ()
;This function returns the x defined in the enclosing
;local environment
x)))

(define echo1 (make-echo-function 1))

(display (echo1)) ;should display 1
(newline)

(define echo2 (make-echo-function 2))

(display (echo2)) ;should display 2
(newline)

(display (echo1)) ;should display 1
(newline)
First notice that echo1 and echo2 are functions. Now echo1 always gives back 1 even though it is not using any global variables and is not being passed any parameters. That's because when you create a function, the local environment is saved. In this case, the local environment consists of x. Therefore, unlike functions in languages like C, the variables created on each call of make-echo-function survive even past the point where make-echo-function ends! As long as the function returned from make-echo-function survives, the local environment will survive too.

In fact, when you define echo2, you now have two different local environments for make-echo-function that remain active. The variable names are the same, but the values in those variables are different. The computer knows which value to use because the environment is tied to the function when it is created by lambda. This strange fusion of environments and functions is called a closure.

Let's look at a slightly more complicated closure. You will create a function that starts at a given number and then, every time it is called, returns one number higher.

Listing 11. Number counter program to illustrate function-generating functions

(define make-counter

(lambda (curval)
(lambda ()
(set! curval (+ curval 1))
curval)))

(define my-counter (make-counter 0))

(display (my-counter)) ;writes 1
(newline)

(display (my-counter)) ;writes 2
(newline)

(define my-other-counter (make-counter 25))

(display (my-other-counter)) ;writes 26
(newline)

(display (my-counter)) ;writes 3
(newline)

(display (my-other-counter)) ;writes 27
(newline)
These types of functions can be useful for creating functions that maintain state from call to call. Combined with function parameters, this is a powerful technique for programming. You can make the functions passed as parameters even more powerful because Scheme closures allow you to ship pieces of state (the local environment) with the function. In programming languages such as C, any state that might be used in a function passed as an argument would either need to exist in a global variable or would need to be passed explicitly.

The point of having a function parameter within a procedure is to extend it beyond what was originally envisioned by the programmer. But if you have to explicitly pass all data you want used in your function, then you have already defeated its purpose. (I will discuss a solution to this problem later, but even then the solution isn't nearly as elegant as using closures in Scheme to attach a local environment to a function.)

When you put a function definition within an environment, what you are doing in essence is creating a template for a function. Every time the lambda form is evaluated, the template is used to create a new function using the current local environment to fill in the unknown slots.

Callback functions make closures quite handy. It is rare for the person who writes the API to know anything about the type of data that you are handling. Therefore, the writer doesn't know what kind of data needs to be passed to your callback function to make it work.

With closures, you can pack all of the data you want directly into a callback function. Let's say for instance, that an API wants a callback function of no parameters when it builds a button that will be called every time that button is pushed. That might seem problematic, especially if you have multiple buttons sharing a callback function. If the API isn't passing any data, how will you know which button was pushed?

With closures, you can define your callback function within an environment containing any information needed to make the callback effective. The API which calls receives and calls the callback doesn't need to know about any of it -- it just looks like a normal function to the API.

You can also use closures to create functions that create specialized versions of other functions. Let's say you have a function called filter-evens that takes a list and returns all even numbers from that list. The function might look like this:

Listing 12. Filtering even numbers from a list

(define filter-evens

(lambda (the-list)
(let
(
;This will hold the filtered elements
(new-list '()))
;Cycle through the elements
(for-each
(lambda (entry)
;if it's even, add it to new-list
(if (even? entry)
(set! new-list (cons entry new-list))))
the-list)
;return new-list (note that this will be in
;reverse order that the items were in the
;original list)
new-list)))

(display (filter-evens '(1 2 3 4 5 6 7 8 9))) ;should display (8 6 4 2)
(newline)
Notice first that the function you define and pass to for-each is used as a closure. new-list is defined in the enclosing environment. new-list is going to be a different value each time filter-evens is called, but you will need that same value for each iteration of for-each in order to build the list.

Filtering even numbers is a nice exercise, but if you want to filter for something else -- like odd numbers, numbers below 20, or even filter on non-numeric lists (such as all addresses in the state of Oklahoma) -- you will basically have to re-code the algorithm even though each instance of the filtering algorithm would be largely similar. So, based on the criteria you discussed previously, the code that does the filtering selection would present a great opportunity to make a function parameter.

You can break out the filtering selection into its own function. This function will return true if the element is in the list and false otherwise. It can then be made a parameter to the function to allow for alternative filters. Here is how it would look:

Listing 13. A generic filter function

(define filter

(lambda (filter-func the-list)
(let
(
;This will hold the filtered elements
(new-list '()))
;Cycle through the elements
(for-each
(lambda (entry)
;if it matches the filter function
;add it to the result list
(if (filter-func entry)
(set! new-list (cons entry new-list))))
the-list)
;return new-list (note that this will be in
;reverse order that the items were in the
;original list)
new-list)))

(define number-list '(1 2 3 4 5 6 7 8 9))

(display (filter even? number-list)) ;should display (8 6 4 2)
(newline)

(display (filter odd? number-list)) ;should display (9 7 5 3 1)
(newline)

(display (filter (lambda (x) (< x 5)) number-list)) ;should display (4 3 2 1)
(newline)
You now have a pretty useful function. Sometimes though, it may be useful to pre-package a filter function so you don't have to write it out every time you use it. This is especially true if the function parameter is a long lambda expression. In order to help create new, specialized filtering functions, you can define a new function that will generate new filtering functions. Let's call this function make-filter (Listing 14):

Listing 14. A filter-generating function

(define make-filter

(lambda (comparison)
(lambda (the-list)
(filter comparison the-list))))

(define filter-evens (make-filter even?))
(define filter-odds (make-filter odd?))
(define filter-under20 (make-filter (lambda (x) (< x 20))))
Although these examples are quite trivial, when you have more complicated filtering procedures, make-filter can save a lot of time and errors.

As mentioned, in the C programming language closures are not directly supported. C-language functions have no data associated with them when they are defined. You can have static variables, but you can only have one value in the static variable at a time. With closures, each closure has its own environment attached, allowing the programmer to have multiple instances of the function each with its own local environment.

Under the hood, Scheme closures are a structure containing two pointers -- one to the function itself and one to the environment. That allows you to simulate a closure in C by having a struct with two pointers -- one a function pointer and the other a void pointer. You use a void pointer so that you don't have to worry about what the environment looks like. You give the programmer the freedom to choose.

The function pointer is declared in a very general way in order to support a large number of functions without casting. The closure structure looks like this:

Listing 15. Closure definition in C

typedef void * (*generic_function)(void *, ...);

typedef struct {
generic_function function;
void *environment;
} closure;
The generic_function type was created in order to have the most generic calling interface to be used in a wide variety of situations. It also makes it easy to typecast other function types to be used within the closure. The environment is a void pointer, again for ease of use in casting. The parts of the program that create and consume the environment are responsible for knowing it's layout. The closure struct is only responsible for holding the pointer. Some APIs for closures define an entire marshalling system for creating and using closures (see the Resources section for a link), but this is all that's necessary for a basic implementation.

Let's look at a C version of your counter program to see how all of this works together:

Listing 16. Counter closure in C

#include <stdio.h>

#include <stdlib.h>

/* Closure definitions */
typedef void *(*generic_function)(void *p, ...);
typedef struct {
generic_function function;
void *environment;
} closure;

int nextval(void *environment);

/* This is the function that creates the closure */
closure make_counter(int startval)
{
closure c;

/* The data is very simple. You are just storing an int,
so all you need is a pointer to an int.
*/
int *value = malloc(sizeof(int));
*value = startval;

/* Setup the closure */
c.function = (generic_function)nextval;
c.environment = value;

/* Return the closure */
return c;
}

/* This is the function that is used for the closure */
int nextval(void *environment)
{
/* convert the environment data back into the form used
* by your functions
*/
int *value = environment;

/* Increment */
(*value)++;

/* Return the result */
return (*value);
}

int main()
{
/* Create the two closures */
closure my_counter = make_counter(2);
closure my_other_counter = make_counter(3);

/* Run the closures */
printf("The next value is %d\n", ((generic_function)my_counter.function)
(my_counter.environment))
printf("The next value is %d\n", ((generic_function)my_other_counter.function)
(my_other_counter.environment));
printf("The next value is %d\n", ((generic_function)my_counter.function)
(my_counter.environment));

return 0;
}
We're missing some clean-up here, but if you use garbage collection as explained in the article "Inside Memory Management" (see Resources), it won't be an issue.

Many API's perform closures differently than described here. Instead of defining a closure structure which contains both the function pointer and the environment pointer, whenever a closure is needed the function pointer and the environment pointer are passed separately in the function call. Either way, faking closures in C gets more difficult to manage as the environments being passed become more complex. The Scheme language is set up to do all the work for you, so why not take advantage of that?

Next, let's take a look at the relationship between closures in Scheme and objects in object-oriented languages.

Tutorial Pages:
» Using functions for such higher order purposes as arguments, function-generating functions, and anonymous functions
» Creating anonymous functions
» Functions as function arguments
» Using functions as arguments
» Building functions at runtime
» Functions and object-oriented programming
» Of the highest order
» Resources


First published by IBM DeveloperWorks


 | Bookmark
Related Tutorials:
» How to Install PHP 5 on Linux
» How to Install Apache 2 on Linux
» How to Install MySQL 5.0 on Linux
» SMB Caching
» Mound --Bind
» Tar Wild Card Interpretation

Advertise with Us!


Tutorials Scripts Web Hosting Developer Manuals
Resources