SQL/NoSQL Public Diary: 2013

Saturday, December 21, 2013

Does clustered Index Store Records in Physical Order???

Does Clustered Index Store Records in Physical Order???

I don't know about you but frankly speaking I heard from everyone that "Clustered Index Physically sorts the data inside Table" .Today I'm going to do some R & D and will see that it's true or just a myth.

To do so I'm going to create a database called DBTEST and then I'll create a table as TestTable with
column ID with primary key and name with char. As we all know that by default primary key creates clustered index on that table. Below is the script for same.

IF EXISTS (SELECT name FROM master.dbo.sysdatabases WHERE name = N'DBTEST')
DROP DATABASE[DBTEST]

CREATE DATABASE [DBTEST]
CREATE TABLE TESTTABLE(ID INT PRIMARY KEY, NAME CHAR(30))
INSERT INTO TESTTABLE VALUES(11,'AMIT'),(12,'BILLU')

Now using DBCC IND will get the pageID as below:
DBCC IND ('DBTEST','TESTTABLE', -1);

Now using DBCC PAGE command we will see the page information as below:

DBCC TRACEON(3604)
DBCC PAGE('DBTEST',1,154,2)

Look at the result in below screen shot:

Now I'm going to insert some more records and as per our assumption is should sort the data in physical order as it has clustered index on it.

INSERT INTO TESTTABLE VALUES(1,'AMIT'),(6,'BILLU'),(3,'SACHIN'),(2,'SUJAN')

Now using DBCC IND we will see the pageID as below:

DBCC IND ('DBTEST','TESTTABLE', -1);

Now using DBCC PAGE command we will see the page information as below:

DBCC TRACEON(3604)
DBCC PAGE('DBTEST',1,154,2)

From this screen shot we can see that the location of records has not been change. I mean now the row 4 and 5 has the data (11,'AMIT'),(12,'BILLU') and the location is still 96 and 137.

So,The short answer to this myth is there is something called "Row offset Array" in a PAGE which manages the order of rows in a PAGE and data resides in order in which they are dumped into memory.

Lets see the memory dump for same.

From above screen shot it is clear that the records are still in the order in which they are dumped.

Please comment here If I'm missing anything over here as I also don't have much idea about Internal memory structure of SQL Server.

Thanks!!!!!

Friday, September 27, 2013

Google 15th B'day..HighScore!!!!

Google 15^th Birthday Game High Score:

Let’s do some Fun....

Go to www.google.com.

Click to play option and then Right Click and select inspect element. As below Figure

After selecting inspect element a new window with source code will appear. Now click on Source tab:

Now click on the corner tab as shown in below fig:

After clicking it will be expanded as shown below. Now select bday13.js.

Now click on stick pointed by blue circle in below figure .The game will be automatically paused and the control will automatically goes to line no 1744. Or just go line no 1744 Select that line. After selecting line 1744 go to left hand side in closure tab and search for S: Please see the image below

Change the value of S: by "100".

And click on resume button as below:

it will be continue. I have just taken a screen shot at 1724 that does not mean it will not go further.

Thanks!!!

Thursday, September 26, 2013

IDENTITY Column Property

IDENTITY Column:

We can implement identifier columns by using the IDENTITY property. This enables us to specify both an identity number for the first row inserted into the table (Identity Seed ) and an increment (Identity Increment) to be added to the seed to determine successive identity numbers.

When values are inserted into a table that has an identifier column, the Database Engine automatically generates the next identity value by adding the increment to the seed.

If the table is already created and you want to add identifier then the identity numbers are added to the existing rows of the table, with the seed and increment values applied in the order that those rows were originally inserted. Identity numbers are also generated for any new rows that are added.

Now we are going to create a table with an identity column.

CREATE TABLE IDENTIFIER (ROLL_NO INT IDENTITY(1,1),NAME CHAR(12),ADDRESS VARCHAR(23))

INSERT INTO IDENTIFIER VALUES('VIKASH','GHAZIABAD'), ('AMIT','GHAZIABAD'),('VIKRAM','NOIDA'),('AJAY','DELHI')

Inserting value using row terminator (,) will only work in 2008 version or later if you are using SQL server 2005 then it will not work and you have to write whole statement.

Notice one think that we are not explicitly providing value for identity column in the insert statement. If we will do so we will get such a beautiful error as follows:

So from error it is clear that to do so we must have INDENTITY_INSERT ON.

How to do that we will see later in this article. First let’s see the output for the table which we have created that is IDENTIFIER.

SELECT * FROM IDENTIFIER

A table can have only one column defined with the IDENTITY property, and that column must be defined by using a decimal, int, numeric, smallint, bigint, or tinyint data type.

The seed and increment can be specified. The default value for both is 1.

The OBJECTPROPERTY function can be used to determine whether a table has an IDENTITY column, and the COLUMNPROPERTY function can be used to determine the name of the IDENTITY column.

Now let see how we can set identity_insert On. We can use the command

SET IDENTITY_INSERT [IDENTIFIER] ON

Now let’s see the tabel records

SELECT * FROM IDENTIFIER

It is good practice that you should set OFF it after using this. i.e.

SET IDENTITY_INSERT [IDENTIFIER] OFF.

Note:

If an identifier column exists for a table that has frequent deletions, gaps can occur between identity values. Deleted identity values are not reused. To avoid such gaps, do not use the IDENTITY property. Instead, you can create a trigger that determines a new identifier value, based on existing values in the identifier column, as rows are inserted.

To reuse the identity value you can also truncate the table but you never want to do this except some special cases.

This is one major difference between DELETE and TRUNCATE that if we will perform DELETE on identity column the Deleted identity values are not reused but if we will perform TRUNCATE then it will reuse the value .

In below figure first we have deleted the record for Roll_no=3. After that we inserted a new record. See the output in figure:

Let’s see the output after truncating and then inserting the records into table.

Now insert new records in same table and see the values of Identity column.

insert into identifier values('Ajit','NEW DELHI'),('ajay','NEW DELHI'),('hAPPY','NEW DELHI')

SELECT * FROM IDENTIFIER

So, this was all about identity column.

Please write in comment box to share your knowledge about IDENTITY Property.

Thanks!!!!

Wednesday, September 25, 2013

Ranking Functions

Ranking Functions

Ranking functions return a ranking value for each row in a partition. Depending on the function that is used, some rows might receive the same value as other rows.

T-SQL provides the following ranking functions:

1) RANK()

2) DENSE_RANK()

3) ROW_NUMBER()

4) NTILE()

Now let’s first discuss about Rank Function:

The Rank() Function:

It assigns rank to each record starting with 1. If it encounters two or more records to have the same ORDER BY <columns> values, it is said to be a tie and all these records get the same rank.

When it finds a tie it assigns the same rank to the all the records in the tie, but still keeps incrementing the record counter so the next record will get a new rank which would be (previous rank + no of records in the current tie + 1).

Syntax: RANK () OVER ([<partition_by_clause>] <order_by_clause>)

OVER clause is required in all the ranking functions and with that you specify the partitioning and ordering of records before the ranking functions are evaluated. If you don't specify it, you will get an error similar to "Incorrect syntax near 'ROW_NUMBER', expected 'OVER'." The Over () clause allows you to combine aggregated and non aggregated data.

PARTITION BY clause is not mandatory and if you don't specify it all the records of the result-set will be considered as a part of single record group or a single partition and then ranking functions are applied.

Example: In below figure we have 4 records in the current tie and the previous rank is 4. So, next Rank will be 4+4+1= 9.

We have a table Rank_test in our Database. First let’s see all the records from this table as follows using select statement:

Table schema : CREATE TABLE Rank_Test (EmpId INT IDENTITY(1,1), NAME VARCHAR(23), salary MONEY);

We have 10 records in this table.

Now let’s use the Rank() Function and see the output:

SELECT *, RANK() OVER(ORDER BY salary) AS Rank_Display FROM dbo.Rank_Test

The Dense_Rank() Function:

The Dense_Rank() Function is similar to Rank() function except when tied values are present in result set. The difference between the RANK and DENSE_RANK functions is in how values are assigned to rows following a tie. The Dense_Rank () count ties and uses sequential number sequencing (no skipping, no gaps). Dense ranking closes the number gap caused by multiple row having the same rank number. Confused?????

Let’s see the below figure then it will be more clear.

SELECT *, RANK() OVER(ORDER BY salary) AS Rank_Function,

DENSE_RANK()OVER(ORDER BY salary) AS Dense_Ranking FROM dbo.Rank_Test

Syntax:

DENSE_RANK () OVER ([<partition_by_clause>] <order_by_clause>)

From above figure it is clear that there is no gap in case of dense_rank. But in case of rank() some rank are skipped because of tie.

The Row_Number() Function:

Unlike Rank () and Dense_rank(), the Row_Number() function ignores ties in the result set. Because of this, the numbering sequence values of ROW_NUMBER( ) are all continuous and unique ( no ties means no skipped numbers and no repeat values).

Syntax: ROW_NUMBER () OVER ([<partition_by_clause>] <order_by_clause>)

Let’s take a view at below figure:

SELECT *, RANK() OVER(ORDER BY salary) AS Rank_Function,

DENSE_RANK()OVER(ORDER BY salary) AS Dense_Ranking,

ROW_NUMBER()OVER(ORDER BY salary) AS ROW_NUMBERR FROM dbo.Rank_Test

So, this is how we can use the function ROW_NUMBER ().

The NTILE( ) Function:

The NTILE ( ) is a SQL function that distributes the rows into a specific number of groups. It ignores values, including ties. All records are handed in order and assigned to their respective groups.

For NTILE( ) you must specify a lower number of groups/tiles/pieces than your record count. If you are tiling 50 records then highest number of groups you can have is 50.

If groups don’t divide it in equal with NTILE () then the lowest group have fewer members. For example if you have 10 records and you perform an NTILE (3) then the first group will have 4 records and the last group will have three records, you can see this in below figure.

SELECT *, RANK() OVER(ORDER BY salary) AS Rank_Function,

DENSE_RANK()OVER(ORDER BY salary) AS Dense_Ranking,

ROW_NUMBER()OVER(ORDER BY salary) AS ROW_NUMBERR ,

NTILE(3)OVER(ORDER BY salary) AS Tiling_data FROM dbo.Rank_Test

In this figure you have the result set for all four function numbered as 1, 2 3 and 4.

Now we have completed all four functions.

One common interview questions is asked to candidate and that is “ How to find the n’th highest salary”. Although there are so many ways you can find it and that are using TOP or using Sub queries etc. Let’s see how can we solve this problem using Rankin function.

Suppose we want to know the 5^th highest salary. To do that we will use Dense_Rank() function . Let’s see the query for same:

SELECT * FROM (SELECT *, DENSE_RANK() OVER(ORDER BY salary)AS Highest_Salary FROM dbo.Rank_Test)

AS RankS WHERE Highest_Salary=5

You can also create a store procedure for same. Every time you just need to give the value for n and you will get the result.

We can create a store procedure as follows:

CREATE PROCEDURE HIGH_SALARY @NUM INT

BEGIN

SELECT * FROM (SELECT *, DENSE_RANK() OVER(ORDER BY salary)AS Highest_Salary

FROM dbo.Rank_Test) AS RankS WHERE Highest_Salary=@NUM

END

Now Lets execute our store procedure that is HIGH_SALARY. Now we just have to give the value for n. See the below figure for same.

Thanks For reading this article.

Tuesday, September 24, 2013

Configure Database Mail in SQL Server

How to Configure Database Mail in SQL Server

Database Mail is one of the interesting features of MS SQL Server. Database mail is not configured to be “on” by default. You must enable Database mail before you can configure and use the capabilities it provides. There are three different ways you can enable database mail in SQL Server, which are as follows:

The Database Mail Configuration Wizard

This is a graphical wizard, it can be launched from SSMS that will enable database mail and configure the basic capabilities of the service. This is what we will use in this article.

sp_configure

Apart from SSMS you can also configure database mail via store procedure that is the sp_configure system to enable or disable database mail. You specify the option „Database Mail XPs‟ (an advanced option in sp_configure) to 1 to enable database mail, or 0 to disable database mail.

Policy Based Management

In SQL Server explorer under management node you can see “Policy Based Management”. You can specify the Facet “@DatabaseMailEnabled” and set the value to True, then add this facet configuration to a policy, and then apply that policy to the specific SQL Server you want to work with. This feature is available to 2008 and later versions.

How to Enable Database Mail SSMS

To get started with database mail, connect to SQL Server Management Studio (SSMS), and then expand the Management folder, and right-click on the Database Mail entry. Select the option to “Configure Database Mail”, and the database mail configuration wizard will launch, displaying an introduction to the wizard. Click next, and you will see the options available in the wizard, as below.

As we know this is the first time the database mail configuration wizard has been run, so you will need to keep the option to set up Database Mail.

Now Click Next, and a new window will appear as below:

You will then be presented the New Profile dialog. Enter a profile name and description, as below:

And then click on the Add button a new window will appear as below. Now we have to fill the account name and description as below:

The server name field may be either the DNS name or the IP address of the SMTP mail server you wish to use for this specific mail account. Here we are using Gmail account so we will have to give Gmail server name that is “smtp.gmail.com”. The default port number is 25. The port which we are using for Gmail is “587”. Then fill up the authentication using your Gmail id and password.

Ø You can also use Windows Authentication which will come to the SMTP server as the service account used by the SQL Server Database Engine service (mssqlservr.exe).

Ø Anonymous authentication means no authentication.

Now Click Ok button and a new window will come up as below:

You can create additional accounts in the wizard here to be associated with the mail profile i.e. Admin_vimal. If the first account fails to connect to the mail server and send email for some reason, the second account in the list will be used. You can have multiple accounts associated with a single mail profile. But for now we will only use a single SMTP account. Click on Next Button.

A new window will come with the option to make the mail profile you have created a “public” profile. Public” profile means allow all authorized database mail users to send email using this mail profile. Another option is private. In Private you would have to associate the profile with a user id from the msdb database that is system database. Since this is the first profile, make it the default public profile

Now we are finished with configuring our initial database profile. Now click Next Button. Now you will be presented with the “Configure System Parameters” dialog, which allows you to change system-wide settings for database mail.

The fields are easy to understand. The one which may not clear to us is logging level. There are three possible logging levels – Normal, extended, or verbose.

Normal: It logs only errors.

Extended: It logs errors, warnings and Informational Messages

Verbose: If you are in position to troubleshooting a problem or want to see as much information about database mail as possible, set the logging to verbose. You can then review the database mail logs in SSMS (or thesysmail_event_log view in the msdb database)

Default logging level is extended. For this article, accept the defaults, click Next, and then Finish to create the mail profile and account specified on the profile page.

Click to Finish button.

We have successfully configured the database mail. Now what???? The Next Steps, Once the wizard is complete; there are a few steps to confirm that the database mail is properly configured. First, we will send a test email. Then, we will review the logs to see the basic information about that test email that was recorded by database mail.

on “Send Test E-mail..” after that fill up the window as below:

Now click to “Send Test E-mail”

Review the Database Mail log:

Connect to msdb database and run the below statements to view the logs.

SELECT * FROM sysmail_log

SELECT * FROM dbo.sysmail_profile

SELECT * FROM dbo.sysmail_sentitems

SELECT * FROM dbo.sysmail_send_retries

SELECT * FROM dbo.sysmail_mailitems etc....

Or you can also see it via SSMS by clicking “view database mail log” as below.

Thanks for reading this article!!!