You know what I think is fun? Before I answer that, let me place some additional emphasis on the fact that I find the following fun. Since you're reading this post I'm hoping you too will have some fun while you're here because, after all, learning is fun. Sorry that came out so lame, please don't leave.
With that out of the way, what I find fun is finding ways to accommodate features into a language whose design philosophy does not inherently allow for these features. Don't call this feature envy, call it feature curiosity.
This whole post started with Haskell. For those of you not familiar with Haskell, it's a wonderful functional programming language with quite a few interesting features you're not likely to find in modern programming languages. One of these features, and the basis for this post, is lazy evaluation of function arguments. Unlike Haskell, C# evaluates arguments eagerly. But why can't we have our cake and eat it too? Let's see if we can...
Eager Evaluation
Eager evaluation means that arguments are evaluated prior to their use in a function. Given that, a few things are obvious:
- Arguments are evaluated whether or not they are actually used.
- The evaluation of an argument may not finish, effectively halting the program.
- If the argument is used more than once, it is only evaluated once.
A quick example includes a function which adds together two integers, returning the result...
int Add(int x, int y)
{
return x + y;
}
... and an invocation of this function:
int result = Add(2 + 1, 1);
Although extremely straightforward, the invocation of the Add function includes a subtlety you may take for granted. What value for x are we passing to Add? The obvious answer is 3. But I was actually asking a leading question, so I also would have accepted; "Under which evaluation strategy?" In C#, our only option in the example above (without modifying the code) is to pass 3 under call by value semantics. To summarize a rather short paragraph, the expression 2 + 1 is evaluated prior to the invocation of the Add function. Let's take a look at the flip side:
Lazy Evaluation
Lazy evaluation means that arguments are not evaluated prior to their use in a function. The value of an argument is evaluated within the function as it is needed. Again, a few things are obvious:
- An argument may or may not be evaluated.
- An argument may have to be evaluated each time it is used within the function.
To get a feel for how lazy evaluation works, think of both 2 + 1 and 1 as expressions not evaluated at the call-site, and then substitute these expressions for occurrences of x and y in Add:
int Add(...)
{
return (2 + 1) + (1);
}
A constant expression may not properly represent the lazy nature of a programming language such as Haskell, but its simplification helps get the point across. A more accurate example will show the value of an argument as the result of a more complex expression, such as a function call.
We can further subdivide this category into the call-by-name and call-by-need evaluation strategies (and irrelevantly even further). The main difference between these two strategies is defined by how each implements #2 above. The call-by-name evaluation strategy evaluates some expression each time it used, whereas the call-by-need strategy memoizes some lazily evaluated expression to avoid redundant calculations. For a refresher on memoization, you can read this post from my archives.
Interestingly enough, C# does have a special case in which arguments are evaluated with call-by-name semantics. Later on I'll cover this special case along with a custom implementation. In the meantime, see if you can identify this special case without reading ahead.
Lazy Evaluation Strategies in C#
Now that we know a few lazy evaluation strategies, we can implement them in C#. Call-by-name has a straightforward implementation once we identify a wrapper class to aid in delaying the evaluation of an expression:
public class LazyExpression<T>
{
Func<T> thunk;
public LazyExpression(Func<T> Thunk)
{
thunk = Thunk;
}
public T Evaluate()
{
return thunk();
}
}
The LazyExpression class quite simply contains an expression that does not get evaluated until the Evaluate method is invoked. For those of you not familiar with the term thunk, it is used in this context to represent a delayed computation. We can create instances of LazyExpressions like such:
var x1 = new LazyExpression<int>(() => 2 + 1);
var y1 = new LazyExpression<int>(() => 1);
Although similar to our previous examples, our constant expressions now make up the body of two functions and are not evaluated until the Evaluate method is invoked. A slight modification to our Add function allows for the use of these LazyExpression instances:
int Add(LazyExpression<int> x, LazyExpression<int> y)
{
return x.Evaluate() + y.Evaluate();
}
Although the Add function is completely unnecessary, I wanted to deliver some type of bind operation to complete the monadic trio.
Before you get pedantic, please note that this may not adhere to a specific implementations definition of call-by-name, but my goal of clearly and concisely delivering the concept has been accomplished. Did you happen to figure our where C# uses this strategy yet? If not, be patient a bit longer.
Call-by-need is similar in implementation to call-by-name; we just need to memoize the result of some lazy expression and store it for later use:
public class LazyMemoizedExpression<T>
{
bool thunked;
T value;
Func<T> thunk;
public LazyMemoizedExpression(Func<T> Thunk)
{
thunked = false;
thunk = Thunk;
}
public T Evaluate()
{
if(!thunked)
{
value = thunk();
thunked = true;
}
return value;
}
}
With the addition of a generic type member and a flag to keep track of whether or not the expression has been evaluated, we have implemented the call-by-need evaluation strategy (please note that this implementation is not thread safe). Take for example the function Double which takes one parameter and returns its value doubled.
int Double(LazyMemoizedExpression<int> x)
{
return x.Evaluate() + x.Evaluate();
}
You can immediately see the benefit in memoization. Since the difference between these two evaluations strategies is completely defined within the classes themselves, it would be smart to code against an abstraction such as an interface. Since this interface is both off topic and trivial, I'll leave it out of this post.
With the obvious performance improvement call-by-need evaluation has over call-by-name, you may be questioning the usefulness of the latter. Take for example a conditional expression driving the iteration of a loop. Since the value of the conditional determines whether or not to keep looping, it's important to re-evaluate the conditional upon each iteration. If the conditional was represented by a lazily evaluated parameter, it would be vital to the correctness of your program to ensure call-by-name evaluation.
Did you figure out where call-by-name evaluation current exists in C#. If you haven't, take the following for example:
if (CheapFunction() || ExpensiveFunction())
{
...
}
In the event CheapFunction returns true, ExpensiveFunction will never get invoked due to short-circuit evaluation, which is a specialized implementation of call-by-name evaluation. The OR operator does not require the right-hand side to be evaluated in the event the left-hand side evaluates to true. Given our LazyExpression class from before, we can implement our own conditional operators and even short-circuit evaluation:
public static class LazyBoolExpression
{
public static bool And(LazyExpression<bool> LHS, LazyExpression<bool> RHS)
{
return LHS.Evaluate() && RHS.Evaluate();
}
public static bool Or(LazyExpression<bool> LHS, LazyExpression<bool> RHS)
{
return LHS.Evaluate() == true ? true : RHS.Evaluate();
}
}
The implementation of the And operator is nothing more than a simple binding function, but our Or function benefits from the lazy nature of the LazyExpression class to implement short-circuit evaluation:
var exp1 = new LazyExpression<bool>(() => true);
var exp2 = new LazyExpression<bool>(() => true || false);
if (LazyBoolExpression.And(exp1, exp2))
{
Console.WriteLine("lazy and");
}
if (LazyBoolExpression.Or(exp1, exp2))
{
Console.WriteLine("lazy or");
}
Are you likely to use what I've written about today in your daily grind? Probably not. And that's fine with me, that wasn't a goal of mine anyways. There's a big world outside of the C# bubble, and it's fun when those worlds collide.
Perhaps it would be fun to take these lazy expression classes and turn them into futures. I do think that would be fun, but fun for another day...