Chris‘s company has an unusual deployment. They had a MySQL database hosted on Cloud Provider A. They hired a web development company, which wanted to host their website on Cloud Provider B. Someone said, “Yeah, this makes sense,” and wrote the web dev company a sizable check. They app was built, tested, and released, and everyone was happy.
Everyone was happy until the first bills came in. They expected the data load for the entire month to be in the gigabytes range, based on their userbase and expected workloads. But for some reason, the data transfer was many terabytes, blowing up their operational budget for the year in a single month.
Chris fired up a traffic monitor and saw that, yes, huge piles of data were getting shipped around with every request. Well, not every request. Every insert operation ended up retrieving a huge pile of data. A little more research was able to find the culprit:
<span class="hljs-keyword">SELECT</span> last_insert_id() <span class="hljs-keyword">FROM</span> some_table_name
The last_insert_id
function is a useful one- it returns the last autogenerated ID in your transaction. So you can INSERT
, and then check what ID was assigned to the inserted record. Great. But the way it’s meant to be used is like so: SELECT last_insert_id()
. Note the lack of a FROM
clause.
By adding the FROM
, what the developers were actually saying were “grab all rows from this table, and select the last_insert_id
once for each one of them”. The value of last_insert_id()
just got repeated once for each row, and there were a lot of rows. Many millions. So every time a user inserted a row into most tables, the database sent back a single number, repeated millions and millions of times. Each INSERT
operation caused a 30MB reply. And when you have high enough traffic, that adds up quickly.
On a technical level, it was an easy fix. On a practical one, it took six weeks to coordinate with the web dev company and their hosting setup to make the change, test the change, and deploy the change. Two of those weeks were simply spent convincing the company that yes, this was in fact happening, and yes, it was in fact their fault.
Source: Read MoreÂ