T/SQL Tuesday–A Day in the Life

Erin Stellato (blog|twitter) handed out this month’s T-SQL Tuesday assignment, which was to blog about what a typical day is like for a DBA. Since I have been wanting to start blogging for a long time and this seemed like an easy assignment, I took up the challenge and tracked a typical day for me last week.

First, a little background: I work on an agile development team for Autodesk, a software company that has recently rolled out a new cloud-based software product in a market where hosted solutions are not the norm. I’m responsible for setting up and maintaining our environment as well as assisting with development efforts. Of course my life would not be complete if I didn’t also get to spot areas where security could be tightened up, design improved or configurations tweaked. I also run the PASS user group for Southeast Michigan, and I am a PASS regional mentor for the Heartland region.

I use the time before our morning scrum meeting to finish up a couple of things from the day before. We’ve added some new servers to our environment, and most of them are only accessible via IP address. This morning I go about editing my hosts file in an effort to make it simpler for me to use Remote Desktop when that’s necessary. SQL Configuration Manager will handle the rest.

Next I meet up with the rest of my small sub-team to do a quick recap of the prior day and discuss today’s goals. On everyone’s mind this morning is some work toward supporting a move from our current hardware setup to hosting our environment on Amazon Web Services (AWS.) AWS does not support clustering or native disk-based replication for high availability, and our product creates too many databases to use features like replication or DB mirroring (since launching the product in March, we now manage over a thousand databases and expect to continue sustained growth) to cover our need to maintain availability. After evaluating some products from other companies, we’ve decided to test some home-grown strategies. Our sysadmin has created some iSCSI targets on our AWS node and we’re going to try putting our DBs on it, then moving that storage to another AWS node and bringing them up there. It’s the first step toward making our own high availability solution in such a way that’s less vulnerable to the recent high-profile AWS outages.

I notice on this particular day that I have two interviews scheduled. I’m the sole DBA on the team right now, and we’re looking to hire another so I have been interviewing a lot of candidates lately. My first interview is scheduled shortly after the scrum meeting. This is the first job I have held where I am in a gatekeeper role as far as interviews go. It’s a good deal of stress, knowing that you hold the key to someone’s future in your hands. Brent Ozar (blog|twitter) had a great article on interviewing, and his weekly newsletter has linked to some other great resources on interview techniques and how to find the right candidate. As part of my first-round interview questions, I always include a couple of myths from Paul Randal (blog|twitter). For this candidate, that meant that after discussing locking, I ask about lock escalation. I’m always interested to see if folks will try to tell me that locks escalate from row level to page level. (They don’t.) I even had a candidate once tell me that locks escalated from row to page to table to a database level lock. I’m happy to report that my morning candidate was able to correctly answer, which is always satisfying. It shows that they’ve had some exposure to Paul’s blog, or that they’ve read the BOL entries on lock escalation carefully at some point. It is by no means a litmus test of whether a candidate moves to the next level. I also try to give every candidate some time to tell me about the things they like and are proud of in their job. Expressing their finest moments gives the candidate a chance to shine and show me something my questions might have missed.

With the interview finished, it’s about 11:30 and time for lunch. I’ve taken up a health kick and started eating salad for lunch almost every day at work. This means I eat at my desk so I can browse the web while I eat, because eating salad is very monotonous for me and I need something to distract me from the repetitive stab-munch-stab of food that is tolerable, but not terribly enjoyable. So while munching I read an article on sparse columns in SQL Server that someone linked me to. (I apologize for not noting who it was at the time.) Sparse columns are an interesting new feature and something worth considering if you’re getting pressure to use a NoSQL platform because of its schema flexibility.

I also take the opportunity during lunch to correct a tweet that comes out about my local user group event, scheduled for Thursday. I keep the MetroTwit client up and running most of the day, watching for interesting stuff to crop up. Someone tweeted that the meeting is tonight (Tuesday) and Tim Ford (blog|twitter), our speaker, caught it and did a virtual double-take, immediately DMing me that he could not make it tonight. Luckily I caught it and corrected it almost immediately, so the bad info wasn’t out there influencing anyone’s plans for the evening for more than a couple of minutes. If you’ve read Tim’s blog by now, you know that we had to cancel the meeting due to hardware problems where he works, which was sad for everyone involved. We’re working toward a reschedule.

After lunch, the junior sysadmin tells me he has moved the iSCSI target drives from one of our AWS instances to another one, and he would like me to check it out and start up SQL. I check the drives, which are all there and writable, so I fire up SQL. Uh-oh. I have 1800 DBs on this server, and every last one of them failed to start up. When I check why, it seems the folder containing my transaction logs has disappeared. Talking with the junior sysadmin, it seems he saw a folder called “logs” taking up a lot of space, so he decided they were not needed. UGH! This is a load test system, so I haven’t set up regular DB backups. Well, maybe I can use the attach_rebuild_log feature of CREATE DATABASE..FOR ATTACH to get the log rebuilt. A quick script later to loop through all the DBs, and …. Nope! Looks like most of the DBs were left in a transactionally inconsistent state and so the log cannot be rebuilt. I’ll have to resort to yet another blog post from Paul where he details the process of trying to recover a DB when this has happened. Luckily these are just test databases, and a little data loss won’t hurt this environment. However, another issue has cropped up with one of our production DBs, so this will have to wait.

At around 1:30 one of our developers who has support responsibilities notifies me that we have an issue with a client’s production DB. The developer and I work through a script that will fix their database. We work until about 3pm on the script, when I have another interview scheduled. This candidate doesn’t get the escalation question right, explaining to me that lock escalation goes from row/page/table. Sometimes I correct this misconception if the candidate seems otherwise qualified, and there are times when I just let it go and move on. I had a candidate insist that she didn’t have a lot of experience with “theoretical” questions but was great with day-to-day stuff. I had to wonder what that means, since lock escalation is by no means theoretical and probably happens on your database on a pretty regular basis unless you’ve disabled it.

At 4pm, I contact my boss and ask him to move our scheduled appointment out until I can deploy the bug fix script the developer and I had been working on. We deploy it to a test environment, and there we verify that the app no longer delivers an error and the issue seems resolved, so we immediately deploy to production as well.

At 4:20 I go to my boss’s office for the first of 3 meetings to give him some info on things he might need to do while I am on vacation. We spend the next 40 minutes discussing various ways to troubleshoot our production cluster, including using SQLSentry Performance Advisor as well as sp_whoisactive. We spend the majority of time looking at the various features of SQL Sentry PA and getting used to its interface, and probably only 10 – 15 minutes on sp_whoisactive since it is a much more straightforward.

With that done, it’s quitting time! I’m looking forward to my vacation.

As of this writing, we still have an unfilled DBA position on my team. If you’re interested in working in an exciting environment with great benefits and are in the Southeast Michigan or Toronto area, send me an e-mail or leave a comment below.