Today’s anonymous submitter worked for a “large, US-based, e-commerce company.” This particular company was, some time back, looking to save money, and like so many companies do, that meant hiring offshore contractors.
Now, I want to stress, there’s certainly nothing magical about national borders which turns software engineers into incompetents. The reality is simply that contractors never have their client’s best interests at heart; they only want to be good enough to complete their contract. This gets multiplied by the contracting firm’s desire to maximize their profits by keeping their contractors as booked as possible. And it gets further multiplied by the remoteness and siloing of the interaction, especially across timezones. Often, the customer sends out requirements, and three months later gets a finished feature, with no more contact than that, and it never goes well.
All that said, let’s look at some SQL Server code. It’s long, so we’ll take it in chunks.
-- ===============================================================================
-- Author : Ignacius Ignoramus
-- Create date: 04-12-2020
-- Description: SP of Getting Discrepancy of Allocation Reconciliation Snapshot
-- ===============================================================================
That the comment reinforces that this is an “SP”, aka stored procedure, is already not my favorite thing to see. The description is certainly made up of words, and I think I get the gist.
ALTER PROCEDURE [dbo].[Discrepency]
(
@startDate DATETIME,
@endDate DATETIME
)
AS
BEGIN
Nothing much to see here; we’re clearly going to run a query over a date range. That’s fine and common.
DECLARE @tblReturn TABLE
(
intOrderItemId INT
)
Hmm. T-SQL lets you define table variables, which are exactly what they sound like. It’s a local variable in this procedure, that acts like a table. You can insert/update/delete/query it. The vague name is a little sketch, and the fact that it holds only one field also makes me go “hmmm”, but this isn’t bad.
DECLARE @tblReturn1 TABLE
(
intOrderItemId INT
)
Uh oh.
DECLARE @tblReturn2 TABLE
(
intOrderItemId INT
)
Oh no.
DECLARE @tblReturn3 TABLE
(
intOrderItemId INT
)
Oh no no no.
DECLARE @tblReturn4 TABLE
(
intOrderItemId INT
)
This doesn’t bode well.
So they’ve declared five variables called tblReturn, that all hold the same data structure.
What happens next? This next block is gonna be long.
INSERT INTO @tblReturn --(intOrderItemId) VALUES (@_ordersToBeAllocated)
/* OrderItemsPlaced */
select
intOrderItemId
from CompanyDatabase..Orders o
inner join CompanyDatabase..OrderItems oi on oi.intOrderId = o.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
AND intOrderItemId Not In
(
/* _itemsOnBackorder */
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..Orders o on o.intOrderId = oi.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
and oi.strstatus='backordered'
)
AND intOrderItemId Not In
(
/* _itemsOnHold */
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..Orders o on o.intOrderId = oi.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
and o.strstatus='ONHOLD'
and oi.strStatus <> 'BACKORDERED'
)
AND intOrderItemId Not In
(
/* _itemsOnReview */
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..Orders o on o.intOrderId = oi.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
and o.strstatus='REVIEW'
and oi.strStatus <> 'BACKORDERED'
)
AND intOrderItemId Not In
(
/*_itemsOnPending*/
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..Orders o on o.intOrderId = oi.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
and o.strstatus='PENDING'
and oi.strStatus <> 'BACKORDERED'
)
AND intOrderItemId Not In
(
/*_itemsCancelled */
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..Orders o on o.intOrderId = oi.intOrderId
where o.dtmTimeStamp between @startDate and @endDate
and oi.strstatus='CANCELLED'
)
We insert into @tblReturn the result of a query, and this query relies heavily on a big pile of subqueries to decide if a record should be included in the output, but these subqueries all query the same tables as the root query. I’m fairly certain this could be a simple join with a pretty readable where clause, but I’m also not going to sit here and rewrite it right now; we’ve got a lot more query to look at.
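For a sense of what that join might look like, here is a minimal sketch, not the submitter's code. It assumes intOrderItemId uniquely identifies an order item, that each item belongs to exactly one order, that both strStatus columns are never NULL, and that the collation is case-insensitive (the original mixes 'backordered' and 'BACKORDERED'). The @Orders and @OrderItems table variables and their sample rows are hypothetical stand-ins for the real tables, so the block runs on its own.

```sql
-- Hypothetical stand-ins for CompanyDatabase..Orders and CompanyDatabase..OrderItems,
-- trimmed to the columns the original query touches.
DECLARE @Orders TABLE (
    intOrderId   INT PRIMARY KEY,
    dtmTimeStamp DATETIME NOT NULL,
    strStatus    VARCHAR(20) NOT NULL
);
DECLARE @OrderItems TABLE (
    intOrderItemId INT PRIMARY KEY,
    intOrderId     INT NOT NULL,
    strStatus      VARCHAR(20) NOT NULL
);

-- Sample date range (made up for illustration).
DECLARE @startDate DATETIME = '20200401',
        @endDate   DATETIME = '20200430';

INSERT INTO @Orders (intOrderId, dtmTimeStamp, strStatus) VALUES
    (1, '20200410', 'NEW'),
    (2, '20200411', 'ONHOLD'),
    (3, '20200301', 'NEW');          -- outside the date range

INSERT INTO @OrderItems (intOrderItemId, intOrderId, strStatus) VALUES
    (10, 1, 'NEW'),
    (11, 1, 'BACKORDERED'),
    (12, 1, 'CANCELLED'),
    (20, 2, 'NEW'),                  -- parent order is on hold
    (30, 3, 'NEW');                  -- parent order is out of range

-- One join, one WHERE clause: the five NOT IN subqueries collapse into
-- two status filters, because they all read the same two tables.
SELECT oi.intOrderItemId
FROM @Orders o
INNER JOIN @OrderItems oi ON oi.intOrderId = o.intOrderId
WHERE o.dtmTimeStamp BETWEEN @startDate AND @endDate
  AND oi.strStatus NOT IN ('BACKORDERED', 'CANCELLED')
  AND o.strStatus  NOT IN ('ONHOLD', 'REVIEW', 'PENDING');
```

With these sample rows, only item 10 comes back. If either status column allows NULL, the NOT IN predicates here would drop rows the original keeps, so that assumption is worth checking before trusting the rewrite.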
INSERT INTO @tblReturn1
/* _backOrderItemsReleased */
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..orders o on o.intorderid = oi.intorderid
where oi.intOrderItemid in (
select intRecordID
from CompanyDatabase..StatusChangeLog
where strRecordType = 'OrderItem'
and strOldStatus in ('BACKORDERED')
and strNewStatus in ('NEW', 'RECYCLED')
and dtmTimeStamp between @startDate and @endDate
)
and o.dtmTimeStamp < @startDate
UNION
(
/*_pendingHoldItemsReleased*/
select intOrderItemId
from CompanyDatabase..OrderItems oi
inner join CompanyDatabase..orders o on o.intorderid = oi.intorderid
where oi.intOrderID in (
select intRecordID
from CompanyDatabase..StatusChangeLog
where strRecordType = 'Order'
and strOldStatus in ('REVIEW', 'ONHOLD', 'PENDING')
and strNewStatus in ('NEW', 'PROCESSING')
and dtmTimeStamp between @startDate and @endDate
)
and o.dtmTimeStamp < @startDate
)
UNION
/* _reallocationsowingtonostock */
(
select oi.intOrderItemID
from CompanyDatabase.dbo.StatusChangeLog
inner join CompanyDatabase.dbo.OrderItems oi on oi.intOrderItemID = CompanyDatabase.dbo.StatusChangeLog.intRecordID
inner join CompanyDatabase.dbo.Orders o on o.intOrderId = oi.intOrderId
where strOldStatus = 'RECYCLED' and strNewStatus = 'ALLOCATED'
and CompanyDatabase.dbo.StatusChangeLog.dtmTimestamp > @endDate and
strRecordType = 'OrderItem'
and intRecordId in
(
select intRecordId from CompanyDatabase.dbo.StatusChangeLog
where strOldStatus = 'ALLOCATED' and strNewStatus = 'RECYCLED'
and strRecordType = 'OrderItem'
and CompanyDatabase.dbo.StatusChangeLog.dtmTimestamp between @startDate and @endDate
)
)
Okay, just some unions with more subquery filtering. More of the same. It’s the next one that makes this special.
INSERT INTO @tblReturn2
SELECT intOrderItemId FROM @tblReturn
UNION
SELECT intOrderItemId FROM @tblReturn1
Ah, here’s the stuff. This is just bonkers. If the goal is to combine the results of these queries into a single table, you could just insert into one table the whole time.
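A minimal sketch of the "one table" version: a single table variable filled by one INSERT whose SELECT uses UNION. The name @candidates and the literal IDs are made up for illustration; each VALUES list stands in for one of the real queries above.

```sql
-- Hypothetical single table variable replacing @tblReturn, @tblReturn1 and @tblReturn2.
DECLARE @candidates TABLE (intOrderItemId INT NOT NULL PRIMARY KEY);

INSERT INTO @candidates (intOrderItemId)
-- Stand-in for the "OrderItemsPlaced" query.
SELECT placed.intOrderItemId
FROM (VALUES (1), (2), (3)) AS placed (intOrderItemId)
UNION
-- Stand-in for the "released" queries; 3 overlaps with the first set.
SELECT released.intOrderItemId
FROM (VALUES (3), (4)) AS released (intOrderItemId);

-- UNION already removed the duplicate 3, so the primary key is not violated.
SELECT intOrderItemId FROM @candidates ORDER BY intOrderItemId;
```

The final SELECT returns 1, 2, 3 and 4, the same combined set the procedure builds with three table variables and two extra copies.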
But we know that there are five of these tables, so why are we only going through the first two to combine them at this point?
INSERT INTO @tblReturn3
/* _factoryAllocation*/
select
oi.intOrderItemId
from CompanyDatabase..Shipments s
inner join CompanyDatabase..ShipmentItems si on si.intShipmentID = s.intShipmentID
inner join Common.CompanyDatabase.Stores stores on stores.intStoreID = s.intLocationID
inner join CompanyDatabase..OrderItems oi on oi.intOrderItemId = si.intOrderItemId
inner join CompanyDatabase..Orders o on o.intOrderId = s.intOrderId
where s.dtmTimestamp >= @endDate
and stores.strLocationType = 'FACTORY'
UNION
(
/*_storeAllocations*/
select oi.intOrderItemId
from CompanyDatabase..Shipments s
inner join CompanyDatabase..ShipmentItems si on si.intShipmentID = s.intShipmentID
inner join Common.CompanyDatabase.Stores stores on stores.intStoreID = s.intLocationID
inner join CompanyDatabase..OrderItems oi on oi.intOrderItemId = si.intOrderItemId
inner join CompanyDatabase..Orders o on o.intOrderId = s.intOrderId
where s.dtmTimestamp >= @endDate
and stores.strLocationType <> 'FACTORY'
)
UNION
(
/* _ordersWithAllocationProblems */
select oi.intOrderItemId
from CompanyDatabase.dbo.StatusChangeLog
inner join CompanyDatabase.dbo.OrderItems oi on oi.intOrderItemID = CompanyDatabase.dbo.StatusChangeLog.intRecordID
inner join CompanyDatabase.dbo.Orders o on o.intOrderId = oi.intOrderId
where strRecordType = 'orderitem'
and strNewStatus = 'PROBLEM'
and strOldStatus = 'NEW'
and CompanyDatabase.dbo.StatusChangeLog.dtmTimestamp > @endDate
and o.dtmTimestamp < @endDate
)
Okay, @tblReturn3 is more of the same. Nothing more to really add.
INSERT INTO @tblReturn4
SELECT intOrderItemId FROM @tblReturn2 WHERE
intOrderItemId NOT IN(SELECT intOrderItemId FROM @tblReturn3 )
Ooh, but here we see something a bit different: we’re taking the set difference between @tblReturn2 and @tblReturn3. This would almost make sense if there weren’t already set operations in T-SQL which would handle all of this.
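The set operation in question is presumably EXCEPT, which returns the distinct rows of the left query that do not appear in the right one. A minimal sketch with made-up IDs follows; @combined and @allocated are hypothetical stand-ins for @tblReturn2 and @tblReturn3.

```sql
-- Hypothetical stand-ins for @tblReturn2 and @tblReturn3.
DECLARE @combined  TABLE (intOrderItemId INT NOT NULL);
DECLARE @allocated TABLE (intOrderItemId INT NOT NULL);

INSERT INTO @combined  (intOrderItemId) VALUES (1), (2), (3), (4);
INSERT INTO @allocated (intOrderItemId) VALUES (2), (4), (5);

-- Rows in @combined that have no match in @allocated: 1 and 3.
SELECT intOrderItemId FROM @combined
EXCEPT
SELECT intOrderItemId FROM @allocated;
```

One difference from the NOT IN version: EXCEPT treats NULLs as equal to each other and de-duplicates its output, while NOT IN returns no rows at all if the subquery yields a NULL. With a non-nullable ID column, as assumed here, the two agree on which IDs survive.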
Which brings us, finally, to the last query in the whole thing:
SELECT
o.intOrderId
,oi.intOrderItemId
,o.dtmDate
,oi.strDescription
,o.strFirstName + o.strLastName AS 'Name'
,o.strEmail
,o.strBillingCountry
,o.strShippingCountry
FROM CompanyDatabase.dbo.OrderItems oi
INNER JOIN CompanyDatabase.dbo.Orders o on o.intOrderId = oi.intOrderId
WHERE oi.intOrderItemId IN (SELECT intOrderItemId FROM @tblReturn4)
END
At the end of all this, I’ve determined a few things.
First, the developer responsible didn’t understand table variables. Second, they definitely didn’t understand joins. Third, they had no sense of the overall workflow of this query and just sorta fumbled through until they got results that the client said were okay.
And somehow, this pile of trash made it through a code review by internal architects and got deployed to production, where it promptly became the worst performing query in their application. Correction: the worst performing query thus far.
