LINQ is the first thing to come along in quite an while that has had a significant impact on how I write and architect code. I really like the power and flexibility of its list processing and how it can improve the clarity and understanding of the purpose of code. That said, it’s not all lilies and roses. How you write LINQ code can make a significant impact on the performance, especially where databases are involved.
Consider the following query. “data” is our input structure, containing the web form entry fields, and we’re searching the database for matching records. Since this is a search form, we don’t know which fields will have input data, so we need to test for either null or matching values.
var intermediate = db.Search.Where(p =>
(data.permit == null || p.permit.Contains(data.permit)) &&
(data.apn == null || p.apn.Contains(data.apn)) &&
(data.addrno == null || p.addrno.Contains(data.addrno)) &&
(data.prefix == null || p.prefix.Contains(data.prefix)) &&
(data.street == null || p.street.Contains(data.street)) &&
(data.comm == null || p.comm.Contains(data.comm))
).OrderBy(o => o.permit).Take(1000);
// execute sql
results.resultlist = intermediate.ToList();
This is the common sort of search query you’ll see when you have a pile of inputs and any or all of them could contain values. For each parameter in our input data structure “data” we test for either null or a match by contains. Matching on ‘Contains’ rather than explicit equals means that we get a ‘fuzzy’ match on our search. The benefit of this LINQ statement is that it’s compact, clear and relatively easy to see what’s going on.
Here’s the same logical construct, but done in a very different style:
IQueryable<JSONPermitLookupSearchResult> intermediate = db.Search.Where(p => 1 == 1);
if (null != data.permit)
{
intermediate = intermediate.Where(p => p.permit.Contains(data.permit));
}
if (null != data.apn)
{
intermediate = intermediate.Where(p => p.apn.Contains(data.apn));
}
if (null != data.addrno)
{
intermediate = intermediate.Where(p => p.addrno.Contains(data.addrno));
}
if (null != data.prefix)
{
intermediate = intermediate.Where(p => p.prefix.Contains(data.prefix));
}
if (null != data.street)
{
intermediate = intermediate.Where(p => p.street.Contains(data.street));
}
if (null != data.comm)
{
intermediate = intermediate.Where(p => p.comm.Contains(data.comm));
}
intermediate.OrderBy(o => o.permit).Take(1000);
// execute sql
results.resultlist = intermediate.ToList();
It’s much more difficult to follow what’s going on here, even though both code fragments do the same thing. In this case, each input parameter is tested for a value, and if the parameter has a value the parameter is included in the LINQ statement. Each parameter is included separately with an individual call to the .Where() method, and parameters without values are not included in the LINQ statement.
The question is, which one performs better? Using SQL Profiler, I can capture the SQL generated for each LINQ statement, and then run them individually and look at the statistics for each. Given a single input value to search for, the first LINQ query produces the following SQL Statement:
exec sp_executesql N'SELECT
[Extent1].[permit] AS [permit],
[Extent1].[person] AS [person],
[Extent1].[apn] AS [apn],
[Extent1].[casetype] AS [casetype],
[Extent1].[permittype] AS [permittype],
[Extent1].[description] AS [description],
[Extent1].[addrno] AS [addrno],
[Extent1].[prefix] AS [prefix],
[Extent1].[street] AS [street],
[Extent1].[unit] AS [unit],
[Extent1].[comm] AS [comm],
[Extent1].[status] AS [status],
[Extent1].[statusdate] AS [statusdate]
FROM [dbo].[vw_PermitLookupSearch_2] AS [Extent1]
WHERE ((@p__linq__0 IS NULL) OR ([Extent1].[permit] LIKE @p__linq__1 ESCAPE N''~''))
AND ((@p__linq__2 IS NULL) OR ([Extent1].[apn] LIKE @p__linq__3 ESCAPE N''~''))
AND ((@p__linq__4 IS NULL) OR ([Extent1].[addrno] LIKE @p__linq__5 ESCAPE N''~''))
AND ((@p__linq__6 IS NULL) OR ([Extent1].[prefix] LIKE @p__linq__7 ESCAPE N''~''))
AND ((@p__linq__8 IS NULL) OR ([Extent1].[street] LIKE @p__linq__9 ESCAPE N''~''))
AND ((@p__linq__10 IS NULL) OR ([Extent1].[comm] LIKE @p__linq__11 ESCAPE N''~''))',
N'@p__linq__0 nvarchar(4000),@p__linq__1 nvarchar(4000),@p__linq__2 nvarchar(4000),
@p__linq__3 nvarchar(4000),@p__linq__4 nvarchar(4000),@p__linq__5 nvarchar(4000),
@p__linq__6 nvarchar(4000),@p__linq__7 nvarchar(4000),@p__linq__8 nvarchar(4000),
@p__linq__9 nvarchar(4000),@p__linq__10 nvarchar(4000),@p__linq__11 nvarchar(4000)',
@p__linq__0=N'S990008L',@p__linq__1=N'%S990008L%',@p__linq__2=NULL,@p__linq__3=NULL,
@p__linq__4=NULL,@p__linq__5=NULL,@p__linq__6=NULL,@p__linq__7=NULL,@p__linq__8=NULL,
@p__linq__9=NULL,@p__linq__10=NULL,@p__linq__11=NULL
It’s long and a bit messy, but since the computer’s handling everything I don’t particularly care. Here’s the important bit:
Time Statistics
Client processing time 1
Total execution time 1141
Wait time on server replies 1140
With clean buffers, the total execution time is 1141 milliseconds for this query.
The second LINQ query, the one assembled from pieces and parts instead of one single LINQ statement, surprisingly produces a much simpler SQL query:
exec sp_executesql N'SELECT
[Extent1].[permit] AS [permit],
[Extent1].[person] AS [person],
[Extent1].[apn] AS [apn],
[Extent1].[casetype] AS [casetype],
[Extent1].[permittype] AS [permittype],
[Extent1].[description] AS [description],
[Extent1].[addrno] AS [addrno],
[Extent1].[prefix] AS [prefix],
[Extent1].[street] AS [street],
[Extent1].[unit] AS [unit],
[Extent1].[comm] AS [comm],
[Extent1].[status] AS [status],
[Extent1].[statusdate] AS [statusdate]
FROM [dbo].[vw_PermitLookupSearch_2] AS [Extent1]
WHERE [Extent1].[permit] LIKE @p__linq__0 ESCAPE N''~''',N'@p__linq__0 nvarchar(4000)',
@p__linq__0=N'%S990008L%'
When run with clean buffers, the results are equally surprising:
Time Statistics
Client processing time 0
Total execution time 862
Wait time on server replies 862
This query completed in 862 milliseconds.
The conclusion? The complex, messy, manually assembled from bits and pieces LINQ statement performed significantly better on the database than the simple, easy to understand LINQ statement. At a 1141 milliseconds versus 862 milliseconds, the performance of the second LINQ statement is noticably better on the client.
The difference is in how the LINQ process generated the SQL statement, and it’s easy to see that assembling LINQ statements to only include the necessary parameters can produce much simpler and faster SQL queries.
Read Full Post »