Fixing problems Part 1: Attitude Adjustment required

April 17th, 2009

As DBAs, software developers, Homo sapiens, and lovers, we have to solve problems.  There is a common misconception that support is for the more junior folks on a team and thus being good at it is a sign of “being a little baby”.  While support is a great way to learn a software ecosystem and organization, and thus “grow” a junior person into a senior one, a lot of problem-solving falls to the rock star programmers or wizard DBAs in most organizations.

Yet some of these people aren’t any good at it.  In fact, they are awful.  I’ve seen people that are very good at doing very hard technical things - creating something from nothing, thinking of all the things that could go wrong, refactoring and integrating a large subsystem, etc. - fail at simply fixing a problem with an existing system. You can take your rock star and send them off to fix a support issue and they will return with a confused look, eight hours of wasted effort, and an STD.  Working with some truly gifted problem solvers I’ve witnessed some differences in Attitude, Practices, and Skills that separate the junior and senior problem solvers.  Let’s start with Attitude.

First, there IS a problem

Never deny that there is a problem.  If someone is at your desk, on the phone, flooding your email with red exclamation points, or outside your house knocking on the window, there is clearly a problem.  The problem might be that they don’t understand something and the problem might not be your fault, but the issue should be treated with respect as a real problem if they took the time to contact you.  Don’t deny it or argue.  Why would you deny it anyway?  Because you see support as a negative.

Don’t see support as a negative shameful thing or a junior task

As long as things keep changing, there will always be problems.  (This next part is hard to put down in writing) If you aren’t writing bugs you aren’t writing software.  If you aren’t changing systems you aren’t working.  A support issue is not an insult, a bug is not a breakup.  Yes, you should make sure that you don’t write infinite loops, and yes you should make sure that you test the latest SQL Server upgrade before you install it in production at 10 am on the last day of the month.  But in most organizations there is a constant to and fro of creating new things and then fixing issues that crop up with them, so don’t pretend as if a problem is an anomaly.  True root cause and issue prevention are topics for another post, but don’t act surprised that software systems don’t always work as expected.

Confidence

In my home office I have a fortune cookie taped to one of my monitors that says: “You have the ability to analyze and solve any problem”, and over time I have started to believe it (by looking at it 27 times a minute).  The fact is that most people that are good at support are good because they believe that given enough time they could fix any issue.  ANY issue.  Given 20 years and enough coffee they could learn C/C++, reverse engineer SQL Server, learn about cross-platform multi-threading, and fix that bug.

The ability to not freak out and lose it

Something is broken but don’t be scared (it’s just moving electrons for the highest bidder).  The fact that someone is at your desk and not someone else’s is a good thing, so don’t panic and freak out and yell things that you’ll regret later (Aside: yelling is always regretted - the only thing I’ve not regretted yelling is “OMG IT’S MILEY”). Don’t blame people or come off as condescending; assume that they are at your desk because they know you can fix it, not because they think you caused it.  Figuring out root cause and a long-term solution are separate things; as are fixing and blaming.  You are in charge of fixing for now, so just focus on that.  Besides, over time a calm person is going to be relied upon more while those who freak out will only end up on reality shows. (They don’t make reality TV shows about guys sitting at keyboards fixing complex technical issues - YET)

Next post - Practices that improve your ability to fix complex technical issues.

peopleware

Review of Yammer, a private corporate Twitter application

April 16th, 2009

So a few months ago a bunch of folks at work (led by cutting-edge Antonio Yon) installed Yammer and now nobody is using it anymore. Why?

First, Yammer is pretty much a private Twitter for use in a corporate environment; its basic features are:

  • Basic twitter bits - follow people, short messages, etc.
  • Org chart build-out features (you tell Yammer your boss and who works for you and it invites them and builds out your corporate structure)
  • Can easily form groups such as QA, Development, The Party for the Overthrow of QA, Party Planning Committee, Party Pooping Committee
  • TweetDeck-like UI or web interface, also BlackBerry, etc.
  • More features here: Yammer Features

Good things

  • You can say things you can’t over Twitter such as “Lunch is here” or “Client XYZ makes me want to cut off my fingers” or even “I wish I was a lumberjack”
  • You can ask very very context-heavy questions - if you create a group for your team, it is basically an open IRC channel in which you can ask questions like “in this stored procedure, what does this mean?”

Problems

  • Yammer now can export their data to recruiters if/when they go out of business
  • Yammer is an immature software implementation of a good idea. The 2nd day that 15 people signed up for it we all got 15×15 emails notifying us that each person was following us - heinous shameful bug (Cartesian FAIL).
  • The moment a “Vice President of Anything” joins, the conversation cleans up and gets more useless or starts getting directed at that person (A VP of anything is more likely to join Yammer than Twitter given its org chart viral features)
  • The verb of Yammer is “Yam” which, let’s admit it, sounds gross

Why it didn’t work here

  • People used it to say stupid things like “Holy crap look at that bird”
  • Given that some people are on Twitter and other social networking sites, the fact that it is yet another client makes people get very frustrated if the context isn’t rich
  • It spammed us
  • The client crashed occasionally
  • I’m not sure, but I think it got somebody pregnant

If Twitter or some other service supported and could be trusted with this functionality I think that having a private one integrated would be a good idea given:

  • Twitter isn’t bought by Google (At this point Google knows everything but my safe word: “Knight Rider”)
  • It is obvious that you are talking in a private channel
  • You could easily install it locally like various other social bits - most companies don’t run a wiki externally, they install Sharepoint or MediaWiki on a spare server
  • The “corporate twitter” is either supported by management (and your boss sends out updates and posts stuff there) or is completely outside the corporate realm (just your dev/DBA team uses it as a private IM group).

tools

Exception (Mis)Handling

March 24th, 2009

Exception handling was originally created to give developers a way to separate out error handling so that it wouldn’t clutter up and distract from the core functionality of the code. When done well, exception handling can provide a clean way to instrument and separate truly exceptional conditions from the core flow of your methods as well as a way to prevent nasty crashes and untraceable bugs. When done badly, exception handling can be misused for decision making, masking errors and bugs, and distracting developers from doing real work.

Go read these when you have time, and you have time if you are reading this – admit it.
Chapter 19 of CLR via C# and Guidelines: Exception Handling.

There is not a lot that can be said about .NET exception handling that isn’t in the resources above, but I’ve picked up on a common anti-pattern that stems from a misunderstanding of what exception handling is. I call it the “safety-net catch”, and it goes a little something like this:

public bool Import()
{
    bool isSuccess = false;

    try
    {
        if (SomethingIsWrong())
            throw new ApplicationException("Start panic sequence now.");

        isSuccess = DoSomething(WithThis, AndThis);

        foreach (string databaseThing in theDatabase)
        {
            if (isSuccess) isSuccess = DoSomethingUseful(databaseThing);
        }
    }
    catch (Exception ex)
    {
        PublishException(ex);
        isSuccess = false;
    }
    return isSuccess;
}

In this example, DoSomethingUseful, DoSomething, and most likely PublishException all have this same pattern of a ‘catch all’ at the end. We aren’t getting much benefit from this style of exception handling, as it merely serves to make sure that this method always returns isSuccess so that execution continues despite any failure.

I think it would be easier on everyone if this code was changed to simply not handle exceptions that it can’t actually handle. The top level threads of your application should have a “catch all” that publishes the exception, and then displays a nice message for the user in the case of a web application. So if you want to fail, just let it fail if you can’t recover from it. The contract of the method Import above doesn’t say that it doesn’t throw exceptions – if it can’t do its job it should throw or allow an exception to bubble up.

There are cases where you don’t want the code to break out in the case of any exception, but these are rare. In the case of row level handling where we are parsing a file and you don’t want one bad row to ruin the whole file, you can simply move exception handling down to that row-level import piece. In this case non-exception-based mechanisms can be used, or you can throw a custom exception type or an exception with well understood semantics like InvalidOperationException or ArgumentException.

Please do:

  • Throw an exception in a method if you can’t do the method’s job. If you don’t know what the method should do because it does 23 things, 18 of which can happen if it isn’t given a valid value for AccountNumber, refactor until you have 23 methods, one of which throws ArgumentException when it gets an invalid AccountNumber.
  • Only do a ‘catch’ if you are going to handle it (maybe the syntax should say handle). Just publishing or setting a variable doesn’t count in the pattern above.
  • If you for some reason catch and rethrow (like if you want to add to the exception object), use a bare throw and not throw origEx, so that the original stack trace is kept.
  • Remember that your most common ‘exceptional condition’ might be a database timeout or other database exception (System.Data.SqlClient.SqlException) and is not an ApplicationException but most likely can’t be recovered from anyway.

Please don’t do this:

try
{
    bc.Save(o);
    DB.UpdateProcessFlag(conn, Util.GetInt32(dr, "Key"), 1);
}
catch
{
    //Error on save, so mark this one as errored.
    DB.UpdateProcessFlag(conn, Util.GetInt32(dr, "Key"), -1);
}

There are many reasons that this .Save() could fail, and a SqlException or a NullReferenceException (on dr) is going to be badly mishandled here. If you find yourself feeling like you need to use the try/catch mechanism for this sort of stuff, perhaps take a step back and think about how to handle it in terms of design.

And please don’t do this unless you like hearing the sound of me dying inside. (Use the TryParse pattern instead)

try { o.Something = -aNumba; }
catch { o.Something = 0;  }
try { amt = Util.GetDecimal(dr, "FieldName"); }
catch { amt = 0; }

if (amt == 0) return;

.net, antipattern, c#

Triggers Part 3: FAQ and FOP continued

March 12th, 2009

How many triggers should you have per table?

Ideally zero. If you have any, there should be one. There is no guarantee on the ordering of trigger firings; they normally fire based on their age, with newly-added triggers firing last. And if you have two triggers that both run on update and both update the table, each one’s update fires the other and you get into a recursion situation.
If you do have more than one, you can pin which trigger fires first and which fires last via a call to sp_settriggerorder, but ordering alone does not stop triggers from re-firing each other; to avoid the recursion you have to guard or rewrite the triggers, or merge them into one.


-- Test trigger recursion

if object_id('TestTriggerRecursion', N'U') > 0
    drop table TestTriggerRecursion

if object_id('tr_TestTriggerRecursionUpdatedDate', N'TR') > 0
    drop trigger dbo.tr_TestTriggerRecursionUpdatedDate

if object_id('tr_TestTriggerRecursionUpdatedBy', N'TR') > 0
    drop trigger dbo.tr_TestTriggerRecursionUpdatedBy
go

create table TestTriggerRecursion
(
    KeyID int primary key clustered
    , Payload varchar(max) NOT NULL
    , LastUpdatedDate datetime NULL
    , LastUpdatedBy sysname NULL
)
go

create trigger dbo.tr_TestTriggerRecursionUpdatedDate on dbo.TestTriggerRecursion
for insert, update
as

print 'tr_TestTriggerRecursionUpdatedDate ran'
update t set LastUpdatedDate = getdate()
from INSERTED i
join TestTriggerRecursion t on t.KeyID = i.KeyID

go

create trigger dbo.tr_TestTriggerRecursionUpdatedBy on dbo.TestTriggerRecursion
for insert, update
as

print 'tr_TestTriggerRecursionUpdatedBy ran'
update t set LastUpdatedBy = SUSER_NAME()
from INSERTED i
join TestTriggerRecursion t on t.KeyID = i.KeyID

go

-- delete from TestTriggerRecursion
insert into TestTriggerRecursion values (1, 'test1', null, null)
insert into TestTriggerRecursion values (2, 'test2', null, null)
insert into TestTriggerRecursion values (3, 'test3', null, null)

select * from TestTriggerRecursion

-- fails with
/*

(0 row(s) affected)
Msg 217, Level 16, State 1, Procedure tr_TestTriggerRecursionUpdatedBy, Line 6
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

*/

-- exec sp_settriggerorder @triggername = 'dbo.tr_TestTriggerRecursionUpdatedBy', @order = 'First', @stmttype = 'UPDATE'
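
If rewriting is the route you take, one possible guard is sketched below. It assumes the table and triggers from the script above exist, and it uses TRIGGER_NESTLEVEL(@@PROCID) to bail out when this same trigger is already on the call stack. Treat it as a sketch of the idea rather than the one true fix; merging both triggers into one is often simpler.

```sql
-- Sketch only: alters the first trigger from the script above to add a recursion guard.
alter trigger dbo.tr_TestTriggerRecursionUpdatedDate on dbo.TestTriggerRecursion
for insert, update
as

-- TRIGGER_NESTLEVEL(@@PROCID) counts how many times THIS trigger is on the call stack.
-- More than 1 means it was re-fired by an update that started inside this trigger.
if trigger_nestlevel(@@procid) > 1
    return

print 'tr_TestTriggerRecursionUpdatedDate ran'
update t set LastUpdatedDate = getdate()
from INSERTED i
join TestTriggerRecursion t on t.KeyID = i.KeyID

go
```

With the guard in one of the two triggers the ping-pong stops after a couple of levels instead of running into the nesting limit of 32.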

How can I get a list of triggers and which tables they apply to?

select so.name as TableName
, st.name as TriggerName
from sys.triggers st
join sys.objects so on so.object_id = st.parent_id

What order do constraints and triggers run?

Constraints, then triggers (for AFTER triggers; INSTEAD OF triggers run before constraints are checked). The DML operation and the trigger code have separate execution plans, so poor code that causes recompiles inside a trigger does not cause outside stored procedures to change. With statement-level compilation/caching in 2005 this is even more true. Constraint code is independent of DML operations as well.

Can you say “only fire this trigger if THIS column is updated”?

IF UPDATE(Field1) OR UPDATE(Field2)
BEGIN

END
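
Worth knowing: UPDATE(column) only tells you the column was in the SET list (and it is always true on an insert), not that the value changed. A minimal sketch of checking for a real change by comparing the inserted and deleted tables follows. The table, trigger, and column names are made up for illustration, and it assumes the key column itself is never updated.

```sql
-- Hypothetical table and trigger names, for illustration only.
create table dbo.TestColumnChange
(
    KeyID int primary key clustered
    , Field1 varchar(50) NULL
    , Field1ChangedDate datetime NULL
)
go

create trigger dbo.tr_TestColumnChange_Update on dbo.TestColumnChange
for update
as
if @@ROWCOUNT = 0
    return

-- UPDATE(Field1) is true whenever Field1 appears in the SET list,
-- even if the statement wrote back the same value.
if update(Field1)
begin
    update t set Field1ChangedDate = getdate()
    from INSERTED i
    join DELETED d on d.KeyID = i.KeyID
    join dbo.TestColumnChange t on t.KeyID = i.KeyID
    -- keep only rows where the value really changed (NULL-safe comparison)
    where i.Field1 <> d.Field1
       or (i.Field1 is null and d.Field1 is not null)
       or (i.Field1 is not null and d.Field1 is null)
end
go

insert into dbo.TestColumnChange values (1, 'a', null)
update dbo.TestColumnChange set Field1 = 'a' where KeyID = 1  -- same value written back
update dbo.TestColumnChange set Field1 = 'b' where KeyID = 1  -- real change
select * from dbo.TestColumnChange
```

The first update passes the UPDATE(Field1) test but matches no rows in the comparison, so only the second one stamps Field1ChangedDate.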

Do triggers try to run ON UPDATE even if no rows have been affected?

Yes, which is why you should always bail if no rows are affected like so:

IF @@ROWCOUNT = 0
RETURN

As you see below, your entire trigger will fire even on an update that affects zero rows.


-- Test trigger update behavior with zero rows
if object_id('TestTriggerUpdateBehavior', N'U') > 0
    drop table TestTriggerUpdateBehavior

if object_id('tr_TestTriggerUpdateBehaviorUpdatedDate', N'TR') > 0
    drop trigger dbo.tr_TestTriggerUpdateBehaviorUpdatedDate

create table TestTriggerUpdateBehavior
(
    KeyID int primary key clustered
    , Payload varchar(max) NOT NULL
    , LastUpdatedDate datetime NULL
    , LastUpdatedBy sysname NULL
)
go

create trigger dbo.tr_TestTriggerUpdateBehaviorUpdatedDate on dbo.TestTriggerUpdateBehavior
for update
as

print 'tr_TestTriggerUpdateBehaviorUpdatedDate ran'
update t set LastUpdatedDate = getdate()
from INSERTED i
join TestTriggerUpdateBehavior t on t.KeyID = i.KeyID

go

insert into TestTriggerUpdateBehavior values (1, 'test1', null, null)

update TestTriggerUpdateBehavior set Payload = 'testImpossibleUpdate'
where 1 = 2

select * from TestTriggerUpdateBehavior

So, when should you use triggers?

You should use triggers:

  • When you have a clear understanding of how they work
  • You have no other option
  • You have performance tested your code thoroughly
  • You have informed your storage folks, DBA folks, and your mother
  • You have prayed about it

In all honesty I’ve only seen a few clean uses for triggers:

  1. “audit trigger”: Audit mechanism for straight up insert/delete/update calls. You have tableA, and you want to log all changes to auditTableA - a “copy trigger” does this quite well.
  2. “refactor trigger”: A temporary bridge between two phases of a database refactor project. You are migrating data from schema A to schema B, but that last pesky bit of code hasn’t been changed. This release changed 80% of the code and put in a trigger to maintain the data or log to new tables for later testing the remaining 20%. The discipline required to push through the 20% and remove the trigger is rare, so this is sometimes dangerous.
  3. “trap trigger”: In a crisis, log where updates are coming from to a specific table. Remove trigger quickly thereafter.
  4. “evil trigger”: A trigger created for evil.
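
For the first of these, a minimal sketch of a “copy trigger” might look like the following. TableA and AuditTableA are the hypothetical names from the list above, the column layout is invented for illustration, and for updates it logs only the new values.

```sql
-- Hypothetical tables for illustration; not from a real schema.
create table dbo.TableA
(
    KeyID int primary key clustered
    , Payload varchar(100) NOT NULL
)

create table dbo.AuditTableA
(
    AuditDate datetime NOT NULL
    , AuditUser sysname NOT NULL
    , ActionType char(1) NOT NULL   -- I = insert, U = update (new values), D = delete (old values)
    , KeyID int NOT NULL
    , Payload varchar(100) NOT NULL
)
go

create trigger dbo.tr_TableA_Audit on dbo.TableA
for insert, update, delete
as
if @@ROWCOUNT = 0
    return

set nocount on

-- New values: INSERTED has rows for inserts and updates.
-- Rows in DELETED at the same time mean the statement was an update.
insert into dbo.AuditTableA (AuditDate, AuditUser, ActionType, KeyID, Payload)
select getdate(), suser_sname()
    , case when exists (select 1 from DELETED) then 'U' else 'I' end
    , i.KeyID, i.Payload
from INSERTED i

-- Old values: logged only for deletes (DELETED has rows, INSERTED has none).
insert into dbo.AuditTableA (AuditDate, AuditUser, ActionType, KeyID, Payload)
select getdate(), suser_sname(), 'D', d.KeyID, d.Payload
from DELETED d
where not exists (select 1 from INSERTED)
go

insert into dbo.TableA values (1, 'first')
update dbo.TableA set Payload = 'second' where KeyID = 1
delete from dbo.TableA where KeyID = 1

select * from dbo.AuditTableA
```

Both inserts are set-based, so a statement that touches 57 rows writes 57 audit rows in one firing.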

Where can I find more information about triggers?

Books online: Triggers.

sqlserver

Triggers Part 2: Facts and Frequently-Occurring-Problems (FOP)

March 12th, 2009

When are triggers fired, and how many times?

For your standard missionary position trigger, they are fired once per statement, not once per row. So if you have an update statement that affects 57 rows, then the DML AFTER trigger fires *once* but the deleted and inserted magic tables have 57 rows in them. One common anti-pattern is not handling multiple rows in the inserted/deleted tables.  Code to prove the firing behavior of triggers:

-- Test trigger firing cardinality
if object_id('TestTriggerFiring', N'U') > 0
    drop table TestTriggerFiring

if object_id('TestTriggerAudit', N'U') > 0
    drop table TestTriggerAudit

if object_id('tr_TestTriggerFiring', N'TR') > 0
    drop trigger dbo.tr_TestTriggerFiring

create table TestTriggerFiring
(
    KeyID int primary key clustered
    , Payload varchar(max) NOT NULL
    , LastUpdatedDate datetime NULL
    , LastUpdatedBy sysname NULL
)

create table TestTriggerAudit
(
    TriggerName sysname
    , DateFired datetime NOT NULL
    , ActionType varchar(10) NOT NULL
    , KeyID int NOT NULL
    , Stamp uniqueidentifier
)

go

create trigger dbo.tr_TestTriggerFiring on dbo.TestTriggerFiring
for insert, update
as

declare @MyGuid uniqueidentifier
set @MyGuid = newid()

insert into TestTriggerAudit (TriggerName, DateFired, ActionType, KeyID, Stamp)
select 'tr_TestTriggerFiring', getdate(), 'inserted', i.KeyID, @MyGuid
from INSERTED i

go

insert into TestTriggerFiring
select top 3 object_id, name, null, null
from sys.objects

go

insert into TestTriggerFiring values (-1, 'test4', null, null)
waitfor delay '00:00:01'
go
insert into TestTriggerFiring values (-2, 'test5', null, null)
waitfor delay '00:00:01'
go
insert into TestTriggerFiring values (-3, 'test6', null, null)
waitfor delay '00:00:01'
go

select * from TestTriggerAudit
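
One way to read the audit rows, assuming the script above has just been run: group them by Stamp, since the trigger generates one GUID per firing. The query below is only a sketch over the TestTriggerAudit table defined above.

```sql
-- Each distinct Stamp is one firing of the trigger; RowsInFiring is how many
-- rows the INSERTED table held for that firing.
select Stamp
    , min(DateFired) as DateFired
    , count(*) as RowsInFiring
from TestTriggerAudit
group by Stamp
order by min(DateFired)
```

The three-row insert from sys.objects should show up as a single firing with three rows, and each single-row insert as its own firing.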

What are the inserted/deleted magic tables? How do they work?

The inserted and deleted tables hold the data being changed. Depending on what you are doing they hold different data:

Insert
Inserted – holds the new data
Deleted – is empty

Update
Inserted – holds new data
Deleted – holds old data

Delete
Deleted – data being deleted
Inserted – is empty

The inserted and deleted tables are not indexed, so take care not to query them in ways that would need an index.

Given the logic above, if possible it is normally cleaner to not combine an insert/update trigger. If you have to, the below is my template for doing so:

IF OBJECT_ID('tr_Example_Update', 'TR') > 0
    DROP TRIGGER dbo.tr_Example_Update
GO

CREATE TRIGGER dbo.tr_Example_Update ON dbo.Test
FOR UPDATE, INSERT
AS 

IF @@ROWCOUNT = 0
RETURN

SET NOCOUNT ON
IF UPDATE(Field1) OR UPDATE(Field2)
BEGIN
    -- Rows in DELETED mean this firing came from an update
    IF EXISTS (SELECT 1 FROM DELETED)
    BEGIN
        PRINT 'Update logic'
    END
    ELSE
    BEGIN
        PRINT 'Insert logic'
    END
END

SET NOCOUNT OFF
GO

Are triggers fired on bulk insert?

Not by default, but you can turn them on. See: Controlling Trigger Execution When Bulk Importing Data
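
As a rough sketch, BULK INSERT takes a FIRE_TRIGGERS option for this. The file path and the comma-delimited layout below are hypothetical, and the target is the TestTriggerFiring table from the script above, so the file would need four fields per row.

```sql
-- Hypothetical file path and format, for illustration only.
bulk insert dbo.TestTriggerFiring
from 'C:\import\rows.txt'
with
(
    fieldterminator = ','
    , rowterminator = '\n'
    , fire_triggers   -- without this option the insert triggers on the table do not run
)
```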

Tune in for Triggers Part 3: FAQ and FOP continued

sqlserver

Triggers Part 1: Introduction to madness, plus whores

March 12th, 2009

According to the standard developer canon about databases you should avoid triggers like you avoid wearing your “I love goto statements” t-shirt at the company Christmas party (again).  Knowledge of triggers is verboten, almost as bad as saying you love cursors.  Why are they such a bad idea, and why does the DBA let the air out of your Oldsmobile Firenza tires whenever you write one?  Let’s explore these ideas, and leave you being grown up and driving an Oldsmobile for another day.

First, a brief overview of triggers if you aren’t familiar with them at all (lucky):

Triggers are blocks of T-SQL that are run upon the firing of certain events within SQL Server, the most common one being an insert, update, or delete to rows in a table. They can also fire upon login, the creation of tables, and other system events. Triggers have certain constraints that they have to live with, such as not being able to return results or perform certain operations such as CREATE DATABASE, and they are fed only certain input - @@ROWCOUNT, the inserted and deleted tables, the ambient transaction. They are typically complained at and about by DBAs and developers as being slow and hard to deal with.

Where do triggers come from?

When a whore and a demon love each other they make a baby, and that baby is a trigger with a cursor in it that handles its own transactions and savepoints whilst trying to send email via a call to the smtp service using xp_cmdshell.

Are triggers in the original relational model?

No, but neither are isolation levels, recovery models, or stored procedures. But never fear, very little that you have heard of is in the original relational model unless you married your sister and eat German chocolate bars in which case you are familiar with joined relations and relvars.

Why are triggers given such a bad name - aren’t they just a certain type of stored procedure that happens at a certain time?

Triggers are given a bad name for a few reasons:

What they are typically used for: They are quite powerful and it is easy to overuse them for things that are normally better supported by the DBMS in other places. The classic example of this is constraint or foreign key enforcement - a trigger can stop the insert/update from occurring or check the updated data right before the modification. Business logic can live in a trigger, so that it runs “just in time”. Another common pattern is to use a trigger to keep two separate databases in sync – every update to one causes an insert into another across linked server, etc. Triggers are dark-alleyed shortcuts and there are normally better paths to build the functionality you want.

How hard they are to work with:  Most developers don’t know how to properly code a trigger - there I said it.  In addition, they are added T-SQL surface area that doesn’t show up in most tracing mechanisms for laymen (Activity Monitor, syscomments, indirectly in execution plans, a lot of source control systems, etc).

They are slow: Triggers are inside a transaction doing an insert, update, delete.  This is the essential core of the database system - very close to the DBMS kernel in terms of your application usage.  Adding a trigger that takes half a second can bring your system to a messy halt.  In addition, in the past triggers populated the inserted and deleted tables via building a view from reading the transaction log.  This led many DBAs to see non-sequential behavior in their storage mechanisms that were built for sequential writes, leading to even greater slowdowns.  As of SQL Server 2005 triggers are implemented via row versioning, which makes use of tempdb instead of reading the log, but DBAs never forget.  In either event, the behavior of triggers is simply different than other types of statements, leading DBAs to be even more unhappy - and they are pretty much unhappy in DisneyWorld.

Tune in for Triggers Part 2: Facts and Frequently-Occurring-Problems (FOP)

sqlserver

Read / Write by server and database

March 10th, 2009

Modified from Jason Massie’s post, here are queries to tell you your read/write percentages per server and per database.  Interesting results if you are maintaining multiple databases/applications and are acting under the assumption that they behave similarly:

-- Whole server: reads and writes as a percentage of all index operations
SELECT
    CAST(SUM(user_seeks + user_scans + user_lookups) AS decimal)
    / NULLIF(CAST(SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups) AS decimal), 0)
    * 100 AS ReadPercent
    , CAST(SUM(user_updates) AS decimal)
    / NULLIF(CAST(SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups) AS decimal), 0)
    * 100 AS WritePercent
FROM sys.dm_db_index_usage_stats

-- Per database
SELECT db.name
    , CAST(SUM(user_seeks + user_scans + user_lookups) AS decimal)
    / CAST(SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups) AS decimal)
    * 100 AS ReadPercent
    , CAST(SUM(user_updates) AS decimal)
    / CAST(SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups) AS decimal)
    * 100 AS WritePercent
FROM sys.dm_db_index_usage_stats t
JOIN sys.databases db on db.database_id = t.database_id
WHERE (user_updates + user_seeks + user_scans + user_lookups) > 0
GROUP BY db.name
ORDER BY db.name

sqlserver

Out of space error when moving tempdb

February 26th, 2009

Quirk in SQL Server to do with sizing tempdb that I ran into today: when you move tempdb, SQL Server checks the current location to see if there is enough space, not the new location.

From: Configuring Database Files for Optimal Performance

The below script will move TempDB from its current location to a folder on the T drive. Change the drive letter and folder location to suit your system. The script only uses a 1gb file size because of an odd behavior in SQL Server that checks the current file location to see if there’s enough space - instead of checking the new file location. If the user specifies a 100gb TempDB data file on the T drive (which does have 110gb of free space), SQL Server checks the current location (C) for 100gb of free space. If that space doesn’t exist, the script will fail. Therefore, use a small 1gb file size first, then after SQL Server restarts, alter the file to be the full desired size.
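
The script itself isn’t reproduced in the quote above. A minimal sketch of the move-first, resize-later approach it describes might look like this, assuming the default logical file names tempdev and templog and a hypothetical T:\MSSQL\DATA folder. It changes one property per MODIFY FILE and only asks for the full size after the restart, so the space check never runs against the old drive.

```sql
-- Check the logical names and current locations first (size is stored in 8 KB pages)
select name, physical_name, size * 8 / 1024 as SizeMB
from sys.master_files
where database_id = db_id('tempdb')
go

-- Step 1: point tempdb at the new folder. T:\MSSQL\DATA is a hypothetical path.
alter database tempdb
    modify file (name = tempdev, filename = 'T:\MSSQL\DATA\tempdb.mdf')
alter database tempdb
    modify file (name = templog, filename = 'T:\MSSQL\DATA\templog.ldf')
go

-- Step 2: restart the SQL Server service so tempdb is recreated in the new location.

-- Step 3: after the restart, grow the data file to the size you actually want.
alter database tempdb
    modify file (name = tempdev, size = 100GB)
go
```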

sqlserver

Barista

February 24th, 2009

Coding for fun not profit.

#!/d/perl/bin/perl.exe
# (void*)
# If I ever own a cafe it will be called
# 0xBADDECAF or just 0xCAFE, but what else could it be?

while (<>) {
   my @words = split;
   foreach my $word (@words) {
      next if $word =~ /[^a-fA-F]/;
      next if $seen{$word};
      $seen{$word} = 1;
   }
}
foreach my $word (sort (keys %seen)) {
   print $word, " ";
}

tate$ hexword books/bibleKJV.txt books/dictionary96.txt books/warAndPeace.txt
A AB Abbe Abda Add Added B BE Baca Bad Be C D De Dead Ebed F Feed a ab abaca abb
e accede ace ad add added b babe bad bade be bead bed bee beef c cab cd ce cf d
da daff de dead deaf deed e ebb efface f facade face faced fade faded fed fee fe
ed

That’s it? Hmmm. Faded beef cafe? Feed da dead cafe? I guess I should just face da dead beef, I will never own a cafe.

poetry

School

February 24th, 2009
[tate]$ cat school.pl
while($time)
{
        study until $summer;
        goto sleep unless $school;
        my $time_here = undef and exists $the{exit};

        do {
        	bless $all
        	, accept $all
        	, connect $all
        	, join $all
        } until last BREATH;

        while(left) {
                sin;
                bless@ed;
        }
}
return ??;
[tate]$ perl -c school.pl
school.pl syntax OK

poetry
