Segmenting Environments in SQL Server

In any environment with multiple DBAs and a heavy workload, mistakes are going to happen. When they happen on the production system, bad things follow. Depending on what your production system does, whole shifts of people could be idled while the problem is corrected.

Everybody makes mistakes. Even the most careful people can run a query against the wrong server because they are tired and overworked, and we don’t always have the option to sleep or step away from the computer. The best way to help avoid mistakes is to block the avenues that lead to them. Below I outline a few ideas for segregating your DEV and PROD environments, and I welcome any comments with additional suggestions.

Separate accounts for production access

The accounts your DBAs (or anyone else with elevated access) use to reach production servers should be different from, and should not overlap with, the accounts they use for other environments (both should be Windows domain accounts). This keeps the AD account you log into your PC with from having direct access to production servers; you have to take a second step (Run As…) in order to connect to prod.
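If you want to spot-check whether everyday accounts have crept onto production anyway, something along these lines works. It’s a rough sketch only: the CORP domain and the “-adm” naming convention for production-only accounts are made up, so substitute your own.

-- Run on the production instance. The CORP domain and the '-adm' suffix for
-- production-only admin accounts are hypothetical; adjust to your convention.
SELECT name, type_desc, create_date
FROM sys.server_principals
WHERE type IN ('U', 'G')          -- Windows logins and Windows groups
  AND name LIKE 'CORP\%'
  AND name NOT LIKE '%-adm';      -- anything returned is an everyday account with direct access

Separate accounts afford some protection on their own, but I suggest pairing them with…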

Access production servers via RDP only

Whether you RDP directly into the production box or into an intermediate jump box, this offers protection against those who might choose to leave a Run As… session of SSMS (or Visual Studio) open on their desktop at all times for quick access to production. A firewall rule that blocks TCP access to the SQL ports but allows the RDP port should help lock this down tight.

Separate Service Accounts

Best practice is to have a separate service account for each instance of SQL Server running in your environment. I have yet to work for a company that does this, but at a minimum the accounts used for your production servers should be different from the ones used for your other tiers. This helps in a couple of ways. First, if you’re installing a new dev or test tier server and bungle the password a couple of times, you don’t lock out your production service account. Second, it keeps a SQL service in another tier, such as a SQL Agent job or replication (which also runs through SQL Agent jobs), from accidentally targeting production.
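If you’re not sure what each instance’s services are running as, you can ask the instance itself; the DMV below showed up around SQL Server 2008 R2 SP1, so older builds will need SQL Server Configuration Manager instead.

-- Which account is each SQL Server service on this instance running under?
SELECT servicename, service_account, status_desc
FROM sys.dm_server_services;

If the same service account shows up on a dev instance and a production instance, that’s exactly the overlap you’re trying to get rid of.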

The ultimate lockdown: HOSTS files/Network Segmentation

You can take things a step further and prevent any connections at all from your other tiers to production by adjusting the HOSTS file entries on your dev/QA/preprod servers so that connections aimed at production servers are re-routed into a black hole. In a small environment, this might be all that is necessary. In a larger environment, it is much easier to set up separate VLANs and implement rules preventing traffic initiation from your lower tiers to production. Either way, this stops someone from setting up a linked server connection to production from a lower tier and mucking about with data that way. Not that anyone would ever do that.
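On the dev/QA side, the HOSTS entries might look something like this; the server names and the 0.0.0.0 black-hole address are purely illustrative.

# C:\Windows\System32\drivers\etc\hosts on a dev or QA server
# Anything aimed at the production SQL Servers goes nowhere
0.0.0.0    prodsql01
0.0.0.0    prodsql01.corp.example.com

Keep in mind that connections made by IP address sail right past a HOSTS entry, which is one more reason the VLAN approach wins in larger shops.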

Considerations

Introducing this level of prevention in your environment can cause headaches beyond just the hurdles needed to access production. What about when you want to compare schemas between production and development? Or run data comparisons? Isn’t this just a huge pain?

Yeah, it’s a pain. But it doesn’t have to be a huge one. Some schema comparison tools will let you run comparisons against backup files instead of live databases, and some will also create schema snapshots that you can archive and compare offline. Data comparisons get a little trickier, but I would argue that these hurdles leave your environment much less vulnerable to an oops, and oopses can be VERY costly.

Less is More

Sometimes it really is true.  Let me tell you a story about a really powerful server.  We’ll call him Bob.  Bob is impressive.  RAM? Yeah, he’s got that, to the tune of a terabyte.  He has no problems with premature ejection of data from his cache.  Brains?  He’s like a rocket scientist on steroids, with 8 physical processors each sporting 10 cores.  Truly a dizzying intellect.  All those brains aren’t very useful without some data to back them up, so he snuggled up to a SAN sporting more spindles than you can shake a stick at.  Yes, you might say Bob was pretty happy with himself.  He was more than ready to take on the workload from our demanding SQL Server (let’s call her Sheila), thinking he wouldn’t break a sweat.

He was right.  Heck, he had the performance counters to prove it: He never went above 50% CPU.  Sheila, on the other hand, wasn’t happy.  She wanted more CPU, but try as he might, Bob just couldn’t seem to get her to take his massive CPU offering.  You see, Sheila was a perfect 10.0  — her @@Version, that is.  Windows took the 80 logical cores Bob had to offer and split them up into 2 processor K-Groups, each with 40 cores.  Since Sheila was so old school, she just wasn’t ready to be involved in a multiple K-Group relationship.  Something had to be done.  Bob’s peers in his “cluster” of friends suggested coaxing Sheila into 10.5 territory, where she’d be more open to doubling up on K-Groups, but Bob knew that trying to bring Sheila more current would likely mean she’d be leaving him for different hardware, and he just couldn’t stand the thought of losing her.
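If you want to check whether your own Bob and Sheila are having this same conversation, a couple of quick queries will show what the instance can actually schedule work on:

SELECT @@VERSION;  -- 10.0 means SQL Server 2008, which cannot span multiple K-Groups

-- Schedulers SQL Server can actually use. If this count is lower than the
-- logical processor count Windows reports, some of that CPU is out of reach.
SELECT COUNT(*) AS visible_online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';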

So Bob racked his BIOS looking for an answer to his problems.  It was there that he found the answer.  He had to be gentler with Sheila.  Rather than showing her everything he had to offer, why not tone it down a little?  In his BIOS he activated a feature to disable processor cores, telling each processor to behave as if it only had 8 cores instead of 10.  With only 64 cores visible to Windows, it saw no need to split them into K-Groups.  Ironically, by showing her less, Sheila was able to see more of what Bob had to offer.  Bob and Sheila would have more time together after all.  He might be sweating a little bit more now, but the sweaty part of the relationship is often the most fun.

Sometimes, Fewer is Better

Bob’s first big task with Sheila, given their new and empowered relationship status, was to see how much of an improvement all of this available brain power would have on Sheila’s CHECKTABLE performance.  So she queued up her largest database and got to work.  Something strange happened, though.  When presented with even more processing power, Sheila had problems bringing it all to bear and delivering the kind of performance she knew she was capable of with Bob.  Firing up her perfmon toolbox, she could see she was getting a dismal rate of throughput from all the shiny spindles Bob was so friendly with.  But it wasn’t the spindles that were the problem.  Latency was low – less than 5ms on average – so the spindles were delivering whenever they were asked.  They were delivering on target, too, with the right block size and alignment.  What could be the problem?

Maybe she just needed to try more things at once?  Instead of doing a single CHECKTABLE, she fired up multiple sessions of DBCC CHECKTABLE to try to spice things up.  It worked – disk throughput improved – but it was a bit unwieldy.  Bob had a script to refresh the SAN volumes presented to Sheila, attach them, and kick off a series of CHECKTABLE commands for each database.  When that script was done, a report was e-mailed to the DBA team detailing the results.  It would take a lot of work to revamp that process to spin up multiple sessions, aggregate the data accurately, and handle the timing of the volume refresh.  What’s a girl to do?  Checking this database was taking her 4 DAYS, and – love your data or not – that is clearly too much time with a static data set.  After 4 days, the SAN spindles ached to refresh their volumes, and Bob was wondering what was going on – Sheila was just not demanding CPU like he expected.  In a desperate attempt to save their relationship, Sheila started down the road of parallel operations, even going so far as to do some fancy partition switching to break up her largest tables into smaller chunks to process in parallel.
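The multiple-session experiment boils down to generating one CHECKTABLE per table and splitting the output across however many sessions you care to babysit. Here is a rough sketch; the database name is a stand-in, and the ordering simply hands out the biggest tables first so the longest checks aren’t the last ones still running.

USE SomeRestoredCopy;  -- stand-in name for the freshly refreshed copy being checked
GO
SELECT 'DBCC CHECKTABLE (''' + s.name + '.' + t.name
       + ''') WITH NO_INFOMSGS, ALL_ERRORMSGS;' AS cmd
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
JOIN (SELECT object_id, SUM(rows) AS row_cnt
      FROM sys.partitions
      WHERE index_id IN (0, 1)              -- heap or clustered index rows only
      GROUP BY object_id) AS p
    ON p.object_id = t.object_id
ORDER BY p.row_cnt DESC;                    -- biggest tables first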

Partition switching was a novel idea, but she wasn’t sure it would guarantee consistency.  After all, she was changing the schema ever so slightly, which meant the data she was analyzing wasn’t really the same as the data in production.  To sanity check, her DBA got in touch with Paul Randal (blog | twitter), who was kind enough to share some free advice, including a pointer to this blog post detailing some experiments he did with CHECKDB and MAXDOP.  It turns out that trying to bring all 64 cores to bear during a CHECKTABLE is not that effective.  So Bob and Sheila conducted some experiments together, and slowly but surely they found her sweet spot.  It turned out to be a pretty small sweet spot: she was happiest at MAXDOP 4, where she saw between a fifteen- and twenty-fold improvement in throughput.  Instead of 4 days, she was able to confirm her consistency in less than 2, enough of an improvement that her crazy partition-switching plans were shelved.
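On a 10.x instance there is no MAXDOP option on the DBCC command itself (that arrived in much later versions), so the knob to turn is the instance-wide setting, which DBCC CHECKTABLE honors. A sketch, assuming the instance is dedicated to these consistency checks so an instance-wide change is acceptable:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;  -- Sheila's sweet spot; test to find your own
RECONFIGURE;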

Bob and Sheila learned an important lesson about their relationship this week.  Quality really does top quantity, and sometimes you have to stop trying to go full bore if you want to have a better experience.

Got an opinion on this post?  Share it below!  If you thought “Wow, this is the kind of work I’d like to be doing!” and are interested in working for a company that wins awards for employee engagement, retention, and enrichment, drop me a line.  We’re looking for a DBA to add to our team.  We also have openings for .NET programmers. I love my job and the people I work with.  I’ll bet you would, too.

The Cab Ride–Part 6

Back in July, Gill Rowley (blog | twitter) blogged about the year that had unfolded for four folks who happened to share a cab back to the airport after Summit — one of those four being yours truly.  Looking back on it, it has been one heck of a year.  Erin, Gill, and I each changed jobs (Erin and Gill’s announcements).  Jes changed jobs twice (first to Blue Door Consulting and then to Brent Ozar PLF).  I had thought Jes was just the outlier of the group (if you’ve met Jes, you definitely know she’s a recipe-wielding, marathon-running, coffee-guzzling, SQL-sharpening outlier), but that has not proved to be the case.

In January of this year I joined Autodesk, a software company primarily known for engineering products like AutoCAD.  It was an amazing opportunity, and it came with a hefty pay increase and some of the best fringe benefits I’ve seen in Michigan.  Alas, though, it was not meant to be.  Within a few months, my department had taken on a new direction – a completely different RDBMS – and I was left with a really tough choice to make: follow that new direction and continue my career there, or begin looking for other options.

Recently, I saw Buck Woody (LinkedIn | Twitter) live-tweeting some professional development advice from a seminar he was giving at the University of Washington.  One of the things he mentioned was having goals for your career.  It reminded me of one of my favorite presentations from the 2010 PASS Summit, where Buck gave a lightning talk entitled “Your career is your fault.”  At the time, it caused me to examine my performance and my attitude toward my job and take some action, but I fell short of really putting thought into where I wanted my career to go.  Once I figured out what my goals were, the choice I had to make was pretty clear – even if it meant leaving a pretty amazing place.

When I formulated my career goals I had a couple of things in mind.  First, I wanted to be somewhere I could learn a lot and share that learning with others.  Second, I wanted to be in one of two positions: either working toward a leadership position or working as a consultant.  Both of those options ended up being available to me.  I did a lot of soul searching and had many discussions with my family, and we finally came to the decision that the travel demands of the consulting life while the kids were still so young (5 and 9) would put too much stress on us all.  So I did something I never thought I would do: I declined an offer from an amazing company, working with people whose names are instantly recognizable in the community.  That’s the bad news.

The good news is that I accepted an offer to work at a local company (Plex Systems) with some amazing people who are relative unknowns in the national community, but where I will feel like Keanu Reeves in Johnny Mnemonic with all the heavy knowledge that’s going to be downloaded to my brain.  There are also leadership opportunities there, so I’ll have plenty to keep me busy and room to grow.  For now, at least, I have postponed living the glamorous life of a consultant in order to lead the fulfilling life of a family man.

I really want to thank Buck Woody, though.  His lightning talk came at a pivotal point for me and really connected.  I’ve carried that tagline – “Your career is your fault” – with me since then, and I try to revisit it occasionally to see whether I should be praised for my career so far or blamed for it… either way, it’s my fault after all.  If you’re not thinking about your career that way, as something you plan instead of something that simply unfolds, I would encourage you to start.  Maybe you’ll be an outlier, too!