Recently I got the chance to do some query tuning for a client. The client had contacted us because their users were experiencing particularly slow performance running reports from their BI Dashboard. Their internal teams thoroughly reviewed their network and SSRS packages and couldn’t find anything untoward. When they dug deeper into one particularly slow report, they realised that this report was quite heavily SQL-based and they asked us to investigate. After reviewing the server activity and CPU / memory performance and not finding anything to be concerned about, I turned my attention to the main query within the report and found two really juicy targets for refactoring: LEFT OUTER JOINs and DATETIME conversions in the WHERE clause.
SARG stands for Searchable ARGuments, which in the SQL world refers to conditions which can be evaluated via indexes. It can often be helpful to have indexes over filtering columns, i.e. columns that frequently appear in WHERE clauses which helps SQL server optimise the query path. However, if you attempt to convert or transform these column inside the WHERE clause then you prevent SQL Server from being able to use an index, and it has to compare each row individually.
Below is an example with a DATETIME conversion and the comparative query plans.
use [AdventureWorks2012]; go -- CREATE INDEX to help SQL Server search for OrderDate create nonclustered index idx_orderDate1 on Sales.SalesOrderHeader (OrderDate, CustomerID); -- Query 1: non-SARGable Filter on OrderDate select CustomerID from Sales.SalesOrderHeader where DATEPART(yy, OrderDate) >= '2008'; -- Query 2: improving performance by removing the DATEPART() transformation select CustomerID from Sales.SalesOrderHeader where OrderDate >= '2008-01-01';
Below is the query plan for these two “equivalent” queries. Notice that Query 2 is three times more efficient than Query 1, and the benefit is magnified in larger, more complex queries:
Optimising Joins by Removing LEFT OUTER JOINs
This one surprised me a little. I hadn’t really considered the relative efficiency of different join methods before. Obviously a CROSS JOIN, or cartesian product, is massively data-intensive – but I hadn’t really stopped and thought about comparing an INNER JOIN with an OUTER JOIN before. So I was perhaps a little lucky to stumble on this while tuning the problem query and I decided to experiment. Below is an example of a LEFT OUTER JOIN, rewritten into and equivalent INNER JOIN:
USE [AdventureWorks2012]; go -- Query 1: -- SELECT all customers who have never placed an order select CustomerID from Sales.Customer except select CustomerID from Sales.SalesOrderHeader; -- Query 2: -- Compare all customer with SalesOrder, -- returning those Customers who have never placed an order select C.CustomerID, H.OrderDate, H.CustomerID from Sales.Customer as C left outer join Sales.SalesOrderHeader as H ON H.CustomerID = C.CustomerID where H.OrderDate IS NULL;
In this case, notice that Query 1 (INNER JOIN) is two times more efficient than Query 2 (LEFT OUTER JOIN):
Reflecting on this, it isn’t too surprising as the LEFT OUTER JOIN has to “juggle” more data before eventually collapsing to the equivalent result. But I thought it was interesting.