Collation in SQL Server and fixing collation issues.

Collation refers to a set of rules that determines how your data is sorted and compared. It’s very often important with regards to internazionalization, e.g. how do you sort japanese kanji?

COLLATE is not a SQL command, it’s a SQL clause.

 

Collation defines how you sort and compare string values.It’s generally a good idea to have a single, unique collation used throughout your database – don’t use different collations within a single table or database.

For example, it defines how to deal with

  • accents (äàa etc)
  • case (Aa)

Note: – Different languages will have different sort orders.

Normally we have 4 type of sensitivity on SQL Server (Case, Width, Accent, kanatype) .  If you want to force any one of the sensitivity as goven above then you have to use COLLATE.

Here, I would like to explain about “Case Sensitive”…

One small question for you..

How will you validate the data based on the Case sensitive on the column / expression ?

i.e: “A” is differs from “a” correct ?

If so, How will you do that ?

Here, you have one small workarround…

DECLARE @A      VARCHAR(10)

SELECT @A = 'a'

IF (@A = 'A')

PRINT 'Match'

ELSE

PRINT 'No Match'

Normally the result should be No Match correct ?

But, you will get the result as Match . Because, The sql server engine conpares the expression as Case-Insensitive

So, How will you force to validate Case sensitive ?

DECLARE @A      VARCHAR(10)

SELECT @A = 'a'
IF (@A = 'A' COLLATE SQL_Latin1_General_CP1_CS_AS)

PRINT 'Match'

ELSE

PRINT 'No Match'

Now, you will get the result as No Match

Here, I have validated the expression with Case sensitive collation (SQL_Latin1_General_CP1_CS_AS)

Case Sensitivity

If A and a, B and b, etc. are treated in the same way, then it is case-insensitive. A computer treats A and a differently because it uses ASCII code to differentiate the input. The ASCII value of A is 65, while a is 97. The ASCII value of B is 66 and b is 98.

Accent Sensitivity

If a and A, o and O are treated in the same way, then it is accent-insensitive. A computer treats a and A differently because it uses ASCII code for differentiating the input. The ASCII value of a is 97 and A 225. The ASCII value of o is 111 and O is 243.

Kana Sensitivity

When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive.

Width Sensitivity

When a single-byte character (half-width) and the same character when represented as a double-byte character (full-width) are treated differently then it is width sensitive.

 

You can force which collation by using the COLLATE clause. (Generic way)

select R.UserName from DB1.dbo.Users R

join DB2.dbo.SY01400 U on U.USERNAME collate DATABASE_DEFAULT = R.UserName collate DATABASE_DEFAULT

Collation conflicts are common when joining tables between two databases or servers, especially if the version of the DB is different.

You can check what collations each column in your table(s) has by using this query:

SELECT
    col.name, col.collation_name
FROM 
    sys.columns col
WHERE
    object_id = OBJECT_ID('YourTableName')

What is the use of COLLATE in sql server?

Collation controls the way string values are sorted. The default collation sorts alphabetically using the standard Latin character set. Other collations will sort in different orders.

 
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s