SQL Dumbass

Fighting dumbasses, one query at a time…

Lets join on two fields, wait, lets send those two fields to a functino to scrub them first, then join

Clients db is runing slow.  I hear about it thru the normal grapevine.  I look into Spotlight to see what is going on.  Sure enough, seems like its getting hit hard.  But before i can investigate more, it clears itself up. 
Few minutes later, slow again.  Clears up again.  Something up…
So start digging.  Notice a job is always running when slowness is reported.  Start asking questions.  Turns out some new code was rolled just the other day, that is called from this job. 
Lets look…
First proc i look into has an odd query.  Seems we are joining an invoice_no field between tables, but we dont know if this data has been scrubbed, so its being sent to a function to scrub it.
Wow, nice code there… lets do a table scan… and this is done in multiple places, in multiple procs, which are all called from this job, that runs every minute… wooohooo

rollback.  rewrite.

(I know what you are saying now… how did this get by me?  well, we are not part of the code review process.  We just roll it.  ARGH!!)

I need the install key

I’m minding my business.  Working on this and that.  And behind me, near me, i hear someone asking for the install key for sql 2005.    I think to ask whats going on, but don’t and get engaged again in what i was doing.
Later that day, i hear someone else asking for the same thing.  Again, i get distracted with real work.
Later even, i overhear someone else asking for it via email and IM.  Then i finaly get an IM from someone, and in passing, they ask for a key…  I ask them why they need it?  Has your install prompted you for the key?  Im my memory, ive never put in a key… I know i have a key on the cd i have at home…  but my memory is fuzy.  ITs been a while since i installed it myself… So i ask the silly question.  Has the installation prompted you for a key?
The dumbfounded response was, “No, I have not yet tried the installation.  I just assumed…”
how many man hours spent on this today?

Change control process and roll documents

Developer creates a roll document.  Listing out all objects that need to be rolled.
When its time to roll to production, we review the document with multiple groups, assuring ourselves and our process that we have covered all the bases.  DBA takes roll document, and performs a sync between UAT environment and Production, syncing only the objects that are listed in the roll document.
Fast forward 1 day….
Production support realizes, thru a belated testing, that some functionality is not present.  Tempers flare.  Developers are blamed.  DBA’s brought in too help troubleshoot the problem….
Fast forward some time….
The roll doc that the DBA used was not the correct version.  While on the group call, the developer made last minute changes to the roll doc.  These changes never made it to the one the DBA used.  Some objects were missed. 
Yeah.  Another dollar.  Another broken process that is blamed on someone other than the creators of the problem.  Makes me want to truncate all data, and delete all users.  Fix that one…

Database log is full…hey database log is full… hey, hey… did you hear me??!!?!?

Database server has little space on 2 drives.  Database is rather large.  Backups can bee way too big.  We use Litespeed to shrink the database backups down.  Even so, backup files are large, compared to the size available.
Something happens and the log file grows way too big.  Backups are kept for 6 days, and hard drive fills up.  Of course this happens Sunday around 10am…
I spend some time removing files, freeing up space.  Enough to get the server limping to Monday.
Monday, i start freeing up space again.  Truncating the log file.  Getting the db ready for the day. 

Production support individual puts in a ticket.  Then stops by my desk.  Tells me, guess what…the log file is full, could i look into it. 

What do i do?  I snap at him.  Inform him ive been working on it all morning… and was working on it Sunday….
Then i go back to cleaning up the mess….

I love replication. I love senseless tickets.

We have a system that has some simple replication going. Used mainly for reporting.  Some data is gathered from 4 dbs, and stored in a reporting database.  Some of the things that are stored are quantities of different doc types.  These are displayed on the main menu.  For some reason, some of the menus are from the real db, while others are from the report db.  This can lead to a disconnect in the values, when replication has been turned off…

Fast forward.  System needs an upgrade of the product and db.  So replication is disabled.  Upgrade proceeds.  This takes a day plus of multiple man hours.  Luckily, the DBA is not needed.  So he goes on to other tasks. 
A day or so passes, and production support individual creates a ticket that the numbers are off… please investigate.  This finally reaches the DBA, and the DBA is reminded that replication was disabled, and never reenabled after the upgrade.  For some reason, this step is always skipped by the upgraders…
DBA reminds that the upgraders need to tell him when they are done, and replication can be enabled.
DBA reenables the replication.  Asks prod support to review to verify accurateness.  While DBA is requesting review… guess who puts in another ticket… yup… prod support guy lets us know again… numbers are off…

Oh, and this is about the 10th time this has happened.  Exact sequence.  Each time, DBA requests further knowledge of completion, so replication can be reenabled.  Never occurs.  Repeat.

Please help me… something needs to be done…but im not giving you any more information…

Production Support person creates a ticket for DBA.  Ticket tells DBA that system needs some process looked into and fixed.  Ticket is received by DBA at 5:45pm.  Ticket is a level 3 priority.  Production support person is long gone home.  Poor DBA looks at ticket, thinking to himself/herself, I wonder which DB Server this ticket refers too… which client it refers too… 
DBA happens to support 13 clients, some of which have 10+ DB servers. 
Thanks Prod Support person… this ticket will have to wait further information from you… completely counteracting the ticketing system and any hope of a timely solution from the DBA. 

Please Fix

The other day I get a frantic email from a developer. Seems that our team was asked to migrate a table from production to test yesterday and in doing so we did not preserve an identity column. As such, their load process was failing. So, rather than notify us that we had made a mistake, the developer decides to play “Junior DBA” and re-add the identity himself. And after it ran for about two and a half hours an error message appeared:

The log file for database ‘databasename’ is full. Back up the transaction log for the database to free up some log space.

At the end of the email were two words: Please fix.

Okay, we should be able to do that. Of course, we have other issues now. See, the log drive for this box was about 70Gb. He did not just fill up his log, he filled up the disk. Do you know what happens to other databases that cannot use the disk to write their logs? Yeah, I know you do. Too bad he did not. Of course, this junior DBA figured that with the database in simple recovery mode that the log would not grow, and assumed we should be able to fix things in about two minutes.

Why the hell not call us at the start? As a result, the server was out of service for a few hours as we tried to get things back in order. All the while he complains that if we had done things right to begin with, none of this would have happened. My thinking? Yeah, if we had never hired you to begin with, then none of this would have happened.

Anyway, this snowballs into more problems. Now people want to know why we are not monitoring the available free space on our disk drives. Huh? We are! We knew the drive had filled up! “Well, why didn’t you do anything?” Do what? Go to the box and start killing spids? I am not the one that kicked this thing off, you did. How about having some understanding of the basic technology you are being asked to work with? Once we saw the issue we starting working to fix it. Any idea how long it takes to undo a transaction that filled up a 70Gb drive? It takes more than a few minutes.