Dmitrii Nosov

Categories

  • csharp
  • reflection
  • il
  • performance

Problem

Do you want to invoke an action a million times? Delegates are slow because calls through them generally cannot be inlined. A call that is not inlined has overhead: jumps and redundant data movement. Sure, you can inline manually or replace delegates with methods, but that takes a lot of boilerplate code.

However, there is a trick. Dynamic methods (System.Reflection.Emit.DynamicMethod) let you generate IL code at runtime. You can emit a call instruction that targets the delegate's MethodInfo directly, and the JIT compiler may inline that call (or not: it depends on the size of the method and the compiler settings).

Dynamic methods

Instance Action

Let’s create a method that calls an Action<int>.

First, create instance:

// typeof(object?) does not compile: typeof cannot take a nullable reference type
DynamicMethod meth = new("CallFun", typeof(void), new[] {typeof(int), typeof(object)});


Second, obtain an ILGenerator:

ILGenerator il = meth.GetILGenerator();


And finally, emit IL code:

il.Emit(OpCodes.Ldarg_1);               // load instance
il.Emit(OpCodes.Ldarg_0);               // load int argument
il.Emit(OpCodes.Call, original.Method); // call MethodInfo of delegate
il.Emit(OpCodes.Ret);                   // return;


Let’s invoke our dynamic method:

Action<int,object?> generated = meth.CreateDelegate<Action<int,object?>>();
generated(7653, original.Target);
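
Put together, the steps above can be sketched as one small program. The capturing lambda and the `received` variable are illustrative names that are not part of the original snippets. The constructor overload that takes a Module and `skipVisibility: true` is an assumption made here so that the non-public, compiler-generated lambda method can be called; whether you need it may depend on your runtime.

```csharp
using System;
using System.Reflection.Emit;

int received = 0;
// A capturing lambda: the compiler emits it as an instance method on a closure class.
Action<int> original = value => received = value;

// Parameters are (int argument, object instance).
// Assumption: the Module overload with skipVisibility = true is used so the
// dynamic method may call the non-public compiler-generated method.
DynamicMethod meth = new("CallFun", typeof(void),
    new[] { typeof(int), typeof(object) }, original.Method.Module, true);

ILGenerator il = meth.GetILGenerator();
il.Emit(OpCodes.Ldarg_1);                // load instance (the delegate target)
il.Emit(OpCodes.Ldarg_0);                // load int argument
il.Emit(OpCodes.Call, original.Method);  // direct call to the lambda's method
il.Emit(OpCodes.Ret);

Action<int, object?> generated = meth.CreateDelegate<Action<int, object?>>();
generated(7653, original.Target);
Console.WriteLine(received);
```

The generated delegate bypasses `original` entirely: it receives the closure object as an ordinary argument and calls the compiled lambda body directly.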

Static Action

If you got an error in the previous section, that’s probably because the delegate points to a static method. The compiler emits all anonymous methods as instance methods (even if they have the static modifier), but local functions and normal methods can be static.

You can check this with original.Method.IsStatic.
To call a static method you don’t need an instance, so remove the second argument:

DynamicMethod meth = new("CallFun", typeof(void), new[] {typeof(int)});
ILGenerator il = meth.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);               // load int argument
il.Emit(OpCodes.Call, original.Method); // call MethodInfo of delegate
il.Emit(OpCodes.Ret);                   // return;
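
A minimal end-to-end sketch of the static case follows. It assumes the delegate is created from a method group that points at a normal static method; `Console.WriteLine(int)` is picked here only for illustration.

```csharp
using System;
using System.Reflection.Emit;

// A method group that points at a normal static method: Console.WriteLine(int).
Action<int> original = Console.WriteLine;
if (!original.Method.IsStatic)
    throw new InvalidOperationException("Expected a static target method.");

// Only the int argument is needed: a static method has no instance.
DynamicMethod meth = new("CallFun", typeof(void), new[] { typeof(int) });

ILGenerator il = meth.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);                // load int argument
il.Emit(OpCodes.Call, original.Method);  // call the static method directly
il.Emit(OpCodes.Ret);

Action<int> generated = meth.CreateDelegate<Action<int>>();
generated(7653);
```

Because the target method is public, no visibility-related constructor overload is needed in this sketch.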

Loops

The methods above are not very efficient: you just create another delegate that calls the method. If you want to invoke the original delegate in a loop efficiently, emit the loop inside the DynamicMethod.

You can write the code you need in C# and use the Rider IL viewer to see which IL you have to emit.

The IL for the loop for (int i = 0; i < end; i++) {} looks like this:

// set end to 1000
IL_0000: ldc.i4       1000
IL_0005: stloc.0

// set i to 0
IL_0006: ldc.i4.0
IL_0007: stloc.1

// jump to the loop test
IL_0008: br.s         IL_0018
// start of loop
    // loop body
    // ...

    // i++
    IL_0014: ldloc.1      // i
    IL_0015: ldc.i4.1
    IL_0016: add
    IL_0017: stloc.1      // i

    // if (i < end) goto start of loop
    IL_0018: ldloc.1      // i
    IL_0019: ldloc.0      // end
    IL_001a: blt.s        IL_000a
// end of loop
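
To make the shape of that listing easier to follow, here is the same control flow written with goto in C#. This is only an illustration of the branch structure (test at the bottom, one conditional jump per iteration); the `CountIterations` helper and its counter are hypothetical and not part of the original code.

```csharp
using System;

// Same control flow as the IL listing: jump to the test first, then loop back.
int CountIterations(int end)
{
    int count = 0;
    int i = 0;
    goto Test;                 // br.s: jump to the loop test
Body:
    count++;                   // loop body
    i++;                       // ldloc, ldc.i4.1, add, stloc
Test:
    if (i < end) goto Body;    // blt.s: back to the start of the loop
    return count;
}

Console.WriteLine(CountIterations(1000));
```

When emitting this with ILGenerator, each label in the sketch corresponds to one DefineLabel/MarkLabel pair.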


So, our loop method will be:

// setup
Action<int> original = i => Console.WriteLine($"[{i}] Hello!");
// the arguments are: (iterations, instance)
// the iterations argument is not an argument of the original delegate
DynamicMethod meth = new("CallFun", typeof(void), new[] {typeof(int), typeof(object)});
ILGenerator il = meth.GetILGenerator();

// init i
il.DeclareLocal(typeof(int));           // declare i
il.Emit(OpCodes.Ldc_I4_0);              // load 0 as int
il.Emit(OpCodes.Stloc_0);               // store to i
// define loop labels
Label testLabel = il.DefineLabel();
Label execLabel = il.DefineLabel();

// loop start
il.Emit(OpCodes.Br, testLabel);
il.MarkLabel(execLabel);
// loop body
il.Emit(OpCodes.Ldarg_1);               // load instance
il.Emit(OpCodes.Ldloc_0);               // load i
il.Emit(OpCodes.Call, original.Method); // call
// i++
il.Emit(OpCodes.Ldloc_0);               // load i
il.Emit(OpCodes.Ldc_I4_1);              // load 1 as int
il.Emit(OpCodes.Add);
il.Emit(OpCodes.Stloc_0);               // store to i
// loop test (if (i < iterations) goto exec;)
il.MarkLabel(testLabel);
il.Emit(OpCodes.Ldloc_0);               // load i
il.Emit(OpCodes.Ldarg_0);               // load iterations
il.Emit(OpCodes.Blt, execLabel);        // goto exec if i < iterations

// return
il.Emit(OpCodes.Ret);

// complete and invoke
Action<int,object?> generated = meth.CreateDelegate<Action<int,object?>>();
generated(10, original.Target);

Batches

Check instruction support before execution! Many processors don't support all of the instruction sets.

The JIT compiler doesn’t auto-vectorize loops the way C++ compilers do, but you can vectorize code manually using System.Runtime.Intrinsics:

int iterations = 1000;
int batchSize = 4;
int end = iterations & ~(batchSize-1); // round down to a multiple of batchSize (must be a power of two)
for (int i = 0; i < end; i+=batchSize) {
    // simd code
}
for (int i = end; i < iterations; i++) {
    // simple code
}
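
As a concrete sketch of this batching pattern, the snippet below multiplies every element of an int array by 3, using Sse41.MultiplyLow for full batches and plain code for the tail. The `MultiplyBy3` helper and the use of MemoryMarshal.Cast (to avoid unsafe pointers) are assumptions made for illustration. If SSE4.1 is not supported, the whole array goes through the scalar loop.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

void MultiplyBy3(int[] values)
{
    const int batchSize = 4; // four 32-bit ints per Vector128<int>
    // Without SSE4.1 the batched part is empty and the tail loop does all the work.
    int end = Sse41.IsSupported ? values.Length & ~(batchSize - 1) : 0;

    if (Sse41.IsSupported)
    {
        Vector128<int> three = Vector128.Create(3);
        // Reinterpret the first 'end' ints as vectors of four ints each.
        Span<Vector128<int>> batches =
            MemoryMarshal.Cast<int, Vector128<int>>(values.AsSpan(0, end));
        for (int i = 0; i < end; i += batchSize)
        {
            // simd code
            batches[i / batchSize] = Sse41.MultiplyLow(three, batches[i / batchSize]);
        }
    }
    for (int i = end; i < values.Length; i++)
    {
        // simple code
        values[i] *= 3;
    }
}

int[] data = new int[10];
for (int i = 0; i < data.Length; i++) data[i] = i;
MultiplyBy3(data);
Console.WriteLine(string.Join(",", data));
```

With 10 elements, two batches of four are processed as vectors and the last two elements go through the simple loop.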

Benchmarks

Formula:

  • simple: *v = 3 * *v
  • Sse41 : *(Vector128<int>*)v = Sse41.MultiplyLow(i->extraData, *(Vector128<int>*)v)

Setup:

  • lib: BenchmarkDotNet
  • cpu: AMD FX(tm)-8300
  • logical cores: 8
  • physical cores: 4
  • OS: Arch Linux

This scale is logarithmic