How slow is LINQ?

A real-life example of LINQ queries optimization.

Oleksandr Redka
C# Programming

--

Note: given article is written for .Net 5 and does not include performance improvements from .Net 6 and .Net 7. Article will be adjusted to latest .Net version soon.

Disclaimer

Write clean code. Measure. Optimize.
Optimization of LINQ is necessary only when it is the root of the problem.

If you make an optimization and don’t measure to confirm the performance increase, all you know for certain is that you’ve made your code harder to read.
Martin Fowler

Introduction

Project

I’m working on an open-source pet project called QueryNinja that provides dynamic query building. Changes defined in this article are related to version v1.1.0 and will be applied in future releases. The main execution path includes creating the query using Asp.Net Core ModelBinding and translating the query to Expression Trees.

LINQ vs. ‘for’ loop

‘For’ loops is robust and straightforward. No heap allocations are needed to execute the loop. Also, accessing the Array or List element by the index is a constant time operation.

LINQ has two sources of additional memory allocation:

However, LINQ is much easier to read and maintain in the long run. Compare these two code pieces that will give Ids of all customers that have at least two orders of more than 1000$ in total.

LINQ version of the query
‘For’ loop example

We can see that the ‘For’ option is dramatically more complicated than LINQ.

The Problem

As an owner of the QueryNinja, I assume that Performance is one of the main features. I’m aimed at minimization the impact that my package makes on the application performance.

Let’s focus on the model binding part.

The goal is to deserialize query parameters into a Query Components list and then join them in a single Query instance. Let’s take a look at the part of QueryNinjaModelBinder class.

Create IQueryComponent for all suitable query string parameters

A few points to mention here:

  • QueryNinjaExtensions is a part of the Extensibility model that allows adding own Query Components.
  • IQueryComponentFactory knows how to create IQueryComponent from query string parameter.
  • This method is also an Iterator.

Optimization

As a starting point, I’ve already created several benchmarks with BenchmarkDotNet. They cover several parts of my package's main execution path but let’s focus on the ModelBinderBenchmarks.

ModelBinder benchmark

Benchmark measures the whole model binding process. A few notes here:

  • ‘Scenario.Context’ has mocked ModelBindingContext containing prepared Query Parameters and other information needed during ModelBinding.
  • Benchmark has multiple scenarios, but we will use only one of them that focuses on the GetQueryComponents method.

I will create a copy of the ModelBinder and make him a baseline for our tests. This is how our benchmark is looking now:

Benchmarks with Baseline

Let’s refactor our GetQueryComponents method using ‘For’ loops instead of LINQ. One small change at a time is a rule of thumb for optimizations. So, I’ve replaced FirstOrDefault with the ‘For’ loop and removed the Iterator part.

GetQueryComponents with ‘For’ loop

It’s not a surprise as the method gained a few code lines and got a bit messy because of indexes. However, let’s take a look at the benchmarking results.

We saved about 11% of the execution time and got rid of more than 20% of allocated memory! As a result of reduced allocations, we lowered the average GC calls for Gen0 and Gen1.

This is a definite sign of a good performance change. Let’s see how our results will change if we will remove all LINQ from ModelBinding.

Things measured:

  • 10 Filter Query Parameters
  • 10 OrderingRule Query Parameters
  • 10 Select Query Parameters
  • 30 mixed Query Parameters
  • No Query Parameters
ModelBinder Benchmark with all senarios

A summary on benchmarking:

  • Decreased allocation up to 50% for complicated cases.
  • Fewer GC invocations take place as well.
  • Empty Query takes a bit more time to execute for some reason.

If you are interested to see other changes related to performance improvements, visit the Pull-Request page.

Conclusion

Simple LINQ queries can have performance similar to ‘For’ lops. However, they require much more allocations to operate.

On a large scale, allocations will lead to more frequent GC calls and performance decrease across all application layers.

LINQ is the best option for most cases. However, for performance-critical path usage of loops might be a better decision.

Looking back at the disclaimer, we need to find a way to measure our actual outcome and compare it with expected before getting rid of LINQ.

References

I’m not affiliated with any of the tools except QueryNinja. All recommendations here are tools that I’m using in my work.

Source Code

Articles

Tools Used

--

--