Thursday, February 27, 2014

What Actually Happens Inside SQL Server When a simple select Query is Executed?

Hi All... In this blog I'm going to explain that "What Actually Happens inside SQL Server When a simple select Query is Submitted?"

When you execute any query via SSMS or through application within a second we get the result but do you really know how things work behind the scene?????



Suppose a user want to book a movie ticket. So, First he will see the number of seats available. To do so he will fill the request form to full fill his request.

So from here the STORY begins:

He has filled the request form to see the number of available seats. And click on "OK".


Now, application will communicates with the Database Engine, the application programming interfaces (APIs) exposed by the protocol layer formats the communication using a Microsoft-defined format called a tabular data stream (TDS) packet. The SQL Server Network Interface (SNI) protocol layer on both the server and client computers encapsulates the TDS packet inside a standard communication protocol, such as TCP/IP or Named Pipes or shared memory or VDI.

In simple words his request puted into an envelope. So that No one else can see that request. Security reasons :)

The request goes to SQL server relational engine. Relational engine have 3 main components:
    1)  CMD Parser
    2)  Query Optimizer
    3)  Query Executor

First The cmd parser processes T-SQL language events sent to SQL Server. It checks for proper syntax and spelling of keywords. If you type SEELCT instead of SELECT, then parsing stops and SQL Server returns an error to the query source. After a query is parsed, a binding component performs name resolution to convert the object names into their unique object ID values.Here the table name alias are replaced with actual value.
After the parsing and binding is done, the command is converted into an internal format that can be operated on. This internal format is known as a query tree. If object name can’t be resolved, an error is immediately raised that identifies where the error occurred.

Then it search for execution plan in plan cache if it already exist then it will be reused via Query executor. 

If it does not exist in Plan cache then the query tree will be passed to query optimizer.

The Query Optimizer determines what are the possible ways to execute query which will be best.

In other words, the optimizer figures out how best to implement the request represented by the T-SQL query you submitted. It decides if it can access the data through indexes, what types of joins to use and much more. The decisions made by the optimizer are based on what it calculates to be the cost of a given execution plan, in terms of the required CPU processing and I/O. Hence, this is a cost-based plan.

It compiles the entire command batch and optimizes queries that are optimizable. The query optimization 
and compilation result in an execution plan.

The Query Optimizer also uses pruning heuristics to ensure that optimizing a query doesn’t take longer than required to simply choose a plan and execute it. Suppose If optimizer spends more than a minute to select a best plan then you can think the performance of that query.."More than worst".. :( To avoid this pruning heuristics algorithm is used.

                                                                                        Image Source: Quest Software (Kevin Kline)


Query execution:
Once the optimizer has generated an execution plan, or retrieved one from cache, the action switches to the storage engine, which usually executes the query according to the plan. It will store the newly generated plan in plan cache for future use. The relational engine communicate to storage engine via OLE DB.the storage engine is composed of three main areas:
     Access methods, 
    Locking and transaction services, 
   Utility commands

Access Methods:
Now SQL Server needs to locate data, it calls the access methods code, which sets up and requests scans of data pages and index pages and prepares the OLE DB rowsets to return to the relational engine.

The access methods code contains components to open a table, retrieve qualified data, and update data. It doesn’t actually retrieve the pages; instead, it makes the request to the buffer manager, which ultimately serves up the page in its cache or reads it to cache from disk. When the scan starts, a look-ahead mechanism qualifies the rows or index entries on a page. The retrieving of rows that meet specified criteria is
known as a qualified retrieval. The access methods code is used not only for SELECT statements but also for qualified UPDATE and DELETE statements.

So if the data is not available in data cache then data will brings to disk and we will go ahead. Sql server do not work directly with disk in case of data file, it does all changes in cache. In case of log file it directly write to disk. 

So, finally rowsets will be returns to relational engine and the required data sent back to the user using same network protocol.

So, the four important steps which are performed inside sql server are as follows:

----->----Parse-------->----Bind-->--------Optimize--->-------Execute

Please correct me in case of missing things....Comments and questions are welcome..... :)

Thanks You!!!!!!

References: 
SQL Server Internals
SQL Server Execution Plans










Sunday, February 2, 2014

How To Recycle Error Log in SQL Server

SQL SERVER –How To Recycle Error Log

In this short post I am going to talk about Error logs which SQL Server maintains and how can we manage it.The SQL Server error log contains user-defined events and certain system events. We use this error log to troubleshoot problems related to SQL Server. The SQL Server Error Log is a great place to find information about what's happening on your database server.

To view the SQL Server error log:

1. In Object Explorer, expand a server, expand Management, and then expand SQL Server Logs.2. Right-click a log and click View SQL Server Log.

IMAGE: 1

MS SQL Server uses 7 files to store these messages. The first file serves as the current error log file, it records every new message to this. The other 6 files are archived files, and they contain previous messages.

Every time when SQL Server is restarted, it recycles the files. It means that after restarting the system the current log file is new log file which is created after restarting the system and the current log file which was there before restarting the system becomes Archive #1 and Archive #1 becomes Archive #2 and so on.. and the oldest log file ("Archive #6") is deleted.

If you want to increase the number of error log files you can change it to any number between 6 and 99. The way to do it is by right-clicking the "SQL Server Logs" folder in SSMS and choosing "Configure". A window will appear as follows:


IMAGE:2

Now check to the option and go ahead with your number.

The real problem comes when you are working with production databases where they do not restart the system everyday so the current log file size increases and It takes so much time when we try open that. The default location of the log file is : "C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Log" (for a SQL Server 2008 R2 instance).

To make the error log file shorter we have two other options either we can set up Sql server agent job which will do the recycling based on the scheduled time or we can manually do it via system store procedure i.e. : sys.sp_cycle_errorlog.

Now to do it manually we can run the system procedure sys.sp_cycle_errorlog as follows:

Open a new Query window and run :

EXEC sys.sp_cycle_errorlog



Now you can see the new current log file has a new time stamp in below screen shot. You can match it with first image. This is how we can recycle it without restarting the system.


Thanks for reading this article. Suggestion and most welcome!!!!