DBA

Have you checked your datatypes lately?

Earlier this week I was working through an issue that I’ve seen one too many times lately and thought Id write about it.  For various reasons *cough* Legacy *cough*  I often seem to run across columns in tables that are defined with what would appear to be the wrong data type or worse, multiple data elements stuffed into one column!

In this case the table held various account codes in a char(06) column.  These codes were things like ‘001’, ‘002’,’A’, ‘B’ etc

Each account code was for use in a different system.  so, on system X it was an integer and system Y the codes were character based, etc

This data was used to join many different tables.  Initially these joins were small and straightforward with predicates restricting the code table results to only include the proper data for the system being used.

pseudo code looks roughly like this

UPDATE DATA

SET DATADESC = CODE.MORE_DATADESC

FROM DATA

JOIN CODE

ON DATA.CODE_KEY = CODE.CODE_KEY

JOIN METADATA

ON META_KEY = CODE.CODE_KEY

AND DATESTAMP > = ‘9999-12-31’

WHERE DATA IS NULL

AND CODE.DATATYPE = ‘ACCT’

This is the execution plan from a typical update of 150Million rows

typical_plan

Notice the Metadata table and Codes table are joined before the data table so the resulting codes data is only going to include the 16 rows of smallint data which passes the implicit convert to smallint

As the rowcounts increased and joins changed the implicit data type conversions continued to work until one day they didn’t.  Instead they started returning this error message

Msg 245, Level 16, State 1, Line 2
Conversion failed when converting the varchar value ‘B     ‘ to data type smallint”

After some digging it turns out the optimizer had started using a different plan where it joined the tables together in a different order.

Heres that execution plan of the same update

atypical_plan

Notice the Metadata table join is reordered and happens after the codes table is joined to the data table.  When the data table is joined to the code table the error above occurs as character data in codes cant be converted to smallint during the join

To overcome this problem there are a couple of choices (admittedly there are more than a couple so this list isnt inclusive)

Best –> Change the source table to account for having different source systems account code types in different columns with the correct datatypes.  Benefits are straightforward, the BIG drawback is the extra work for system redesign

Better –> Change the queries to use an explicit conversion on the joined columns changing the datatype on the data table from smallint to char(06).  The benefit is the small amount of change needed, the drawback is the extra bytes involved per row joined

Good OK –> Add a query hint to force the join order. The benefit is the small amount of change needed, the drawback is in my mind we’re now forcing the optimizer to take a different path than it would if left to its own.  I try to avoid hints unless there arent other options and in this case there are.

Other things that could be considered as solutions include temp tables, CTE’s, etc. all of which were more change then we wanted to pursue testing.

The 3 most important words for a DBA

Occasionally I’m asked what the most important thing for a DBA to know is.  It seems some people want to know what 1 thing to focus on in order to be successful long term as a DBA.  I’ve pondered this question for quite a long time and I’ve given various answers over the years.

Reality intervenes

Ive come to the conclusion that there really is no one special thing or “skill” that its important for a DBA to know.  Sure, there are lots of qualities that tend to make a successful DBA.  Things like attention to detail, thirst for knowledge, and an uncanny love of BACON…  The problem is as huge as the SQL Server product is becoming it is also becoming increasingly difficult to be an expert in everything.  I am actually of the opinion that it is impossible to be an expert in everything SQL Server.

Knowing what you dont know

The most important thing for a DBA to know can be summed up in 3 simple words ” I Dont Know”  Why is that important?  Because, knowing what you don’t know, and being able to admit it to those around you is by far the most valuable skill that you can poses as a DBA at any skill level.

When in doubt its always good to remember that its ok to say “I don’t know” The Corollary to this statement is of course “Ill find out, or find someone who does know”.  Ive seen many small issues blow up into large problems over the years because someone didn’t really understand what was going on and they were afraid to let their coworkers (or boss) know that they don’t understand.  In many cases if we’ll just learn to say “I don’t know” some further pain can easily be avoided.

Go to Top