Robert was diagnosing a problem in a reporting module. The application code ran a fairly simple query- SELECT field1, field2, field3 FROM report_table
– so he foolishly assumed that it would be easy to understand the problem. Of course, the “table” driving the report wasn’t actually a table, it was a view in the database.
Most of our readers are familiar with how views work, but for those who have had been corrupted by NoSQL databases: database views are great- take a query you run often, and create it as an object in the database:
CREATE VIEW my_report
AS
SELECT t1.someField as someField, t2.someOtherField as someOtherField
FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id
Now you can query SELECT * FROM my_report WHERE someField > 5
.
Like I said: great! Well, usually great. Well, sometimes great. Well, like anything else, with great power comes great responsibility.
Robert dug into the definition of the view, only to find that the tables it queried were themselves views. And those were in turn, also views. All in all, there were nineteen layers of nested views. The top level query he was trying to debug had no real relation to the underlying data, because 19 layers of abstraction had been injected between the report and the actual data. Even better- many of these nested views queried the same tables, so data was being split up and rejoined together in non-obvious and complex ways.
The view that caused Robert to reach out to us was this:
ALTER VIEW [LSFDR].[v_ControlDate]
AS
SELECT
GETDATE() AS controlDate
--GETDATE() - 7 AS controlDate
This query is simply invoking a built-in function which returns today’s date. Why not just call the function? We can see that once upon a time, it did offset the date by seven days, making the control date a week earlier. So I suppose there’s some readability in mytable m INNER JOIN v_ControlDate cd ON m.transactionDate > cd.controlDate
, but that readability also hides the meaning of control date.
That’s the fundamental problem of abstraction. We lose details and meaning, and end up with 19 layers of stuff to puzzle through. A more proper solution may have been to actually implement this as a function, not a view- FROM mytable m WHERE m.transactionDate > getControlDate()
. At least here, it’s clear that I’m invoking a function, instead of hiding it deep inside of a view called from a view called from a view.
In any case, I’d argue that the actual code we’re looking at isn’t the true WTF. I don’t like this view, and I wouldn’t implement it this way, but it doesn’t make me go “WTF?” The context the view exists in, on the other hand, absolutely does. 19 layers! Is this a database or a Russian Honey Cake?
The report, of course, didn’t have any requirements defining its data. Instead, the users had worked with the software team to gradually tweak the output over time until it gave them what they believed they wanted. This meant actually changing the views to be something comprehensible and maintainable wasn’t a viable option- changes could break the report in surprising and non-obvious ways. So Robert was compelled to suffer through and make the minimally invasive changes required to fix the view and get the output looking like what the users wanted.
The real WTF? The easiest fix was to create another view, and join it in. Problems compound themselves over time.
Keep the plebs out of prod. Restrict NuGet feed privileges with ProGet. Learn more.
Source: Read MoreÂ