Confession time. I am a true fan of functional programming and its concepts. Maybe that’s why I was curious lately how the prominent .NET compilers deal with tail recursion. So I started investigating.
TL;DR:
It’s complicated.
Despite the runtime supporting TCO with a special IL instruction
, tail-recursive function calls are never optimized nor requested to do so by the C# compiler. So in general, if you recurse too deep, expect the infamous StackOverflow
exception. That is in contrast with F# where any tail calls are always eliminated. Both compilers have their own reasons for behaving differently.
Ready for some insights? Read on!
Tail Call Optimization
As you may know, source code targeting the .NET runtime isn’t directly compiled into native machine code as understood by the CPU. Rather, the compiler emits dynamic libraries containing so called MSIL (Microsoft Intermediate Language) instructions. Those assembly-like instructions are later translated into real machine code by the CLR (Common Language Runtime). Code optimizations are performed at each stage by both the compiler and the CLR.
One of those optimizations is tail call optimization. A tail call is a function call occuring as the last instruction of a function before returning. If this call is a recursive call to the function itself, the function is tail-recursive. This makes a difference because one problem with recursion in general is that the current execution context, or stack frame, must be preserved for when the recursive call returns. Therefore, a new stack frame is allocated for each recursive call, running out of space if the recursion gets too deep at some point - the infamous StackOverflow
(left side in the picture). This can be avoided in tail recursive functions by effectively transforming the recursion into an iterative loop, or, with (non-recursive) tail calls in general, by converting the tail call to a cross-method jump reusing the current stack frame (right side in the picture). This is referred to as Tail Call Optimization (TCO) or Tail Call Elimination.
Performing TCO can be requested from the CLR by emitting a special IL instruction
. The F# compiler makes use of this for tail calls and additionally transforms tail-recursive functions into iterative loops by itself. The C# compiler does neither.
What? Let’s have a look.
The tail.
opcode
The following C# code declares and calls a very simple tail-recursive function summing up all numbers until n
. I compiled it with dotnet build -c Release
and examined the resulting IL for the function.
static int Sum(int n, int acc = 0)
{
if (n <= 0)
return acc;
return Sum(n - 1, n + acc);
}
Sum(30_000);
.method assembly hidebysig static
int32 '<<Main>$>g__Sum|0_0' (
int32 n,
[opt] int32 acc
) cil managed
{
...
IL_0000: ldarg.0
IL_0001: ldc.i4.0
IL_0002: bgt.s IL_0006
IL_0004: ldarg.1
IL_0005: ret
IL_0006: ldarg.0
IL_0007: ldc.i4.1
IL_0008: sub
IL_0009: ldarg.0
IL_000a: ldarg.1
IL_000b: add
IL_000c: call int32 Program::'<<Main>$>g__Sum|0_0'(int32, int32)
IL_0011: ret
}
Even without much knowledge about IL we are quickly able to recognize the recursive call
at IL_000c
as it remained unchanged without any attempts for TCO made by the C# compiler. And indeed, when we execute our little program we are greeted with a StackOverflow
exception. Ouch.
$ ./tco.exe
Stack overflow.
Repeat 32119 times:
--------------------------------
at Program.<<Main>$>g__Sum|0_0(Int32, Int32)
--------------------------------
at Program.<Main>$(System.String[])
What the compiler could have done here to get around that problem - apart from re-writing the function as an iterative loop itself - is asking the CLR to eliminate the tail call by emitting the tail.
opcode I already mentioned above. The instruction could have been placed directly above the recursive call
instruction.
Let’s play compiler and smuggle that in ourselves (I used dnSpy
for editing the IL and writing it back to the DLL)
.method assembly hidebysig static
int32 '<<Main>$>g__Sum|0_0' (
int32 n,
[opt] int32 acc
) cil managed
{
...
IL_0000: ldarg.0
IL_0001: ldc.i4.0
IL_0002: bgt.s IL_0006
IL_0004: ldarg.1
IL_0005: ret
IL_0006: ldarg.0
IL_0007: ldc.i4.1
IL_0008: sub
IL_0009: ldarg.0
IL_000a: ldarg.1
IL_000b: add
IL_000c: tail.
IL_000e: call int32 Program::'<<Main>$>g__Sum|0_0'(int32, int32)
IL_0013: ret
}
You see the small but important difference at IL_000c
? Now, when running the program again - it works. The CLR respected the tail.
instruction and successfully eliminated the tail call for us upon execution.
But why wasn’t the compiler emitting the instruction in the first place then if this would’ve helped in avoiding a program crash?
Well, before we address this question and see why the C# compiler is so adamant about not performing tail call optimizations or having them performed, let’s take a look at how the F# compiler compares.
Wait, it’s all loops? Always have been!
Here’s the same code snippet as above but written in F# with the resulting IL produced by the F# compiler for the sum
function.
let rec sum n acc =
if n <= 0 then
acc
else
sum (n - 1) (n + acc)
sum 30_000 0 |> ignore
.method public static
int32 sum (
int32 n,
int32 acc
) cil managed
{
...
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldc.i4.0
IL_0003: bgt.s IL_0007
IL_0005: ldarg.1
IL_0006: ret
IL_0007: ldarg.0
IL_0008: ldc.i4.1
IL_0009: sub
IL_000a: ldarg.0
IL_000b: ldarg.1
IL_000c: add
IL_000d: starg.s acc
IL_000f: starg.s n
IL_0011: br.s IL_0000
}
No recursive call this time. The F# compiler effectively transformed the tail-recursive function into a classic iterative loop instead. As a consequence the program does not crash on execution due to running out of stack space.
As a bonus, we can show with the following little example of a non-recursive tail call that the F# compiler also emits the tail.
instruction to request TCO from the CLR.
.method public static
!!b f<a, b> (
class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<!!a, !!b> g,
!!a n
) cil managed
{
...
IL_0000: ldarg.0
IL_0001: ldarg.1
IL_0002: tail.
IL_0004: callvirt instance !1 class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<!!a, !!b>::Invoke(!0)
IL_0009: ret
}
Time to clear things up on why these compilers behave oh so differently!
Choose your poison
While C# and F# are both general-purpose, multi-paradigm programming languages, …
-
… from the F# point of view
as a functional-first language, iterative loops are discouraged for mutating state in favor of recursion. And when recursion is the way to go, it has to be optimized in order to produce robust and efficient programs. So F# wants you to use recursion, therefore, the compiler guarantees TCO.
-
… from the C# point of view
as a more imperative language, iterative loops are conceptionally fine and (optimizing) deep recursion makes debugging harder as stack traces are destroyed. Thus, C# wants you to write loops instead and the compiler never wants TCO to happen.
So in the end it boils down to a question of principles and influences. Regardless of what camp you’re in, I believe it is important for .NET developers to have an understanding about the behavior of the compiler regarding TCO in their favorite languages to write robust and performant programs.
Hope that article helped you with that!
Objektkultur is a german software service provider, partner of Microsoft and expert in .net development