Saturday, 13 June 2015

Performance Tuning - Statistics - The heart of Performance.

Last post we have seen how to read excecution plan .It was brief  post about execution plan.you may have question in your mind how this execution plan is created ?On what basis SQL Server engine generate  plan?  how many plan gets generated  and which plan SQL Engine select?
   I will try to make it as simple as .There should be something  in sql server which act as meta data to take decision  .Yes there is .That something is called as Statistics. 
  Now next question can come to in your mind how it gets created and who create it?
   Statistics get created in following cases and Sql server engine automatically creates it.
1.When we create any index .
2. Running query using where  or join clause so individual statistics will be created first time .
 It will be updated later point if auto_update_statistics is ON.

E.g.
CREATE TABLE  CREDITCARD
  (
     [Credit_Card_id] [INT] NOT NULL,
     [Name]           [VARCHAR](100) NULL,
     [Limit]          [INT] NULL,
     [Valid_from]     [INT] NULL,
     [Valid_To]       [INT] NULL,
 
  )

  Go
 select * from CREDITCARD
  Now we have created table and seeing statistics are created are not.
  You will see no statistics are created yet.

Lets run below query ,

SELECT *
  FROM   CREDITCARD
  WHERE  Credit_Card_id = 1
         AND Name = 'test'
         AND Limit = 1
         AND Valid_from = 20
         AND Valid_To = 30
Now if you refresh statistics you will see five statistics object are created .Lets see how to see  
what data is there in that statistics .


We will use DBCC   SHOW_STATISTICS to see data inside statistics. 

Note : For demo I am considering only one statistics object.Name of your statistic will be different as it creates dynamically.

  DBCC SHOW_STATISTICS('CREDITCARD',  _WA_Sys_00000001_0425A276)



Now we are seeing statistics but no data is there.Now lets do some insert.Lets  insert data and run once above command.

                insert into CREDITCARD 
               select 1,'Shrikant',25000,1,22
  
              Go
           SELECT *
                         FROM   CREDITCARD
                        WHERE  Credit_Card_id = 1
                       AND Name = 'test'
                       AND Limit = 1
                      AND Valid_from = 20
                    AND Valid_To = 30 
  
  DBCC SHOW_STATISTICS('CREDITCARD', _WA_Sys_00000001_0425A276).
Now you will see very important information.
  • The number of rows and pages occupied by a table's data
  • The time that statistics were last updated
  • The average length of keys in a column
  • Histograms showing the distribution of data in a column
  • String summaries that are used when performing LIKE queries on character data


In this third row is very important .In this we can see how many distinct values are there for that column .Ther is one column  called EQ_ROW which is very useful for calculating plan.
Lets go bit inside.I am inserting a 10000 record in above table.

  declare @lclId int =1
  while 1= 1
  begin
  if @lclId>10000
  break
  
  insert into CREDITCARD 
  select  @lclId,'Shrikant',25000,1,22
  set @lclId +=1
    end 
Now If you run following select statement and DBBC command
Go
  SELECT *
  FROM   CREDITCARD
  WHERE   
           Name = 'Shrikant'
         AND Limit = 25000
         AND Valid_from = 1
         AND Valid_To = 22  and Credit_Card_id =1
  GO
  DBCC SHOW_STATISTICS('CREDITCARD',_WA_Sys_00000001_0425A276 )
  
you can see vaues for update and EQ_Row.EQ row is used to calculate estimated row count.

Note : You can manually update statistics by using sp_updatestats procedure for entire database.
If you want to update statistics for a table in above case you can use

UPDATE STATISTICS CREDITCARD

If you want to update statistics for a specific statistics in above case you can use

UPDATE STATISTICS CREDITCARD     _WA_Sys_00000001_0425A276;


In next post we will try to understand how statistics are useful for performance tuning.
  

Saturday, 6 June 2015

Sql Tunining - Understanding execution Plan in Simple Way

  Last post we have seen how to clear data pages ,index pages and execution plan from buffer.Now next step in performance tuning is to understand execution plan.you may have question like how the execution plan is  generated? who generate it?how it is important? or you may have lot of question which I have not mentioned here.  Execution plan is visual representation of the operation performed by database engine in order to return data required by the query.If you cam read and understand execution plans then you can better understand how you query is executing internally.

How it will helpful to increase performance as I don't know how to read it ?
              I will answer this question like suppose you have one query and you have executed  and execution plan is generated .You have execution plan too.Reading execution is bit difficult first time but after some time you will find it easy. You may have question from which side I will start reading it? You always need too start from right side.Logic is simple we will have  few input  query ,which will merge at last and provide result  .Yo can see few different terms like merge join ,inner join,nested loop,RID look up,Stream Aggregated etc all this term are called operator in execution plan .So you will see what are my input queries which are generating result,which query is taking more time  and how better I can  tune query to get performance.
         Lets try to understand simple execution Plan,
Script :

CREATE TABLE  TEST3
  (
     [Credit_Card_id] [INT] NOT NULL,
     [Name]           [VARCHAR](100) NULL,
     [Limit]          [INT] NULL,
     [Valid_from]     [INT] NULL,
     [Valid_To]       [INT] NULL,
    
  )

select * from Test3

We have created script and lets run it ans see execution plan,

You can see above execution plan  and try to understand how it looks like.It shows two operator ie. Select and Table Scan.From where you will start reading it ? Off course you will start from right .you can understand there is table scan on Test3 table and finally select is running and giving output.Lets try to understand bit complex execution plan.

Script :

SELECT *
FROM   test3 Test1
       INNER JOIN test3 test2
               ON Test1.Credit_Card_id = test2.Credit_Card_id

  This query has two input as we are joining  two table .Lets  understand it with help of execution plan




 In above plan you can see three operator Select,Table Scan,Hash Join .We can clearly see there are two input Test1 and Test2  each one is contributing 13% so total time for forming input is 26%.
Now you can see 73% of time is for hash join which is because our input tables are not having any index if we create index on this table we can avoid hash join and gain performance.
Lets create index  and will see what will happen,


CREATE TABLE  TEST3
  (
     [Credit_Card_id] [INT] NOT NULL,
     [Name]           [VARCHAR](100) NULL,
     [Limit]          [INT] NULL,
     [Valid_from]     [INT] NULL,
     [Valid_To]       [INT] NULL,
   PRIMARY KEY CLUSTERED ( [Credit_Card_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
  )

Run below query,
SELECT *
FROM   test3 Test1
       INNER JOIN test3 test2
               ON Test1.Credit_Card_id = test2.Credit_Card_id

Execution Plan,


.



In above plan you can see comparison cost is 0% So we have increase performance of the query.Please let me know  if you have doubt.Please comment if you like it.


Friday, 5 June 2015

Sql Tuning Basic - Part 1

In previous post we have seen why performance is important for any organization  and how it related with fame and name in market.Now lets start our technical discussion from this post onwards.
Before start performance tuning concept we need to know some thing about  Buffer Pool.
             SQL Server buffer pool , also called an SQL Server cache ,is  a place in system memory that is used for table and index data pages as they are modified or read from disk also it contains execution plane .As this is basic post I will try to make as simple as I can do.So when we are doing performance testing we must clear  buffer cache.One of my friend ask me what will happen if we will not clear buffer pool.Answers was " It can mislead your testing as data (pages)  are in buffer so it will not fetch from disk and it will choose execution plan from cache  so you will see performance but it is not ".
 He asked me  again

How will you clear buffer pool i,e Index/Data pages and execution plan from memory?

Ans -->
      It is very simple , Sql server has DBCC command ,We will use following command

              DBCC dropcleanbuffers

    Note : This command is very useful it will help us to do our performance  testing without shut-down or restarting server.
     If your more intrested to know what is there in my buffer at that moment and if you want see it .
You can use following DMV which will give all detail .

        select  *  From sys.dm_os_buffer_descriptors

 If you run above command you will lose your data /index pages and execution plan from buffer pool.That what we want .Now you are ready to start your performance testing which will give you accurate result on test and production if it is clone.
  There is one more command ideally used before clearing cache its CheckPoint

CHECKPOINT :
   Writes all dirty pages for the current database to disk.Dirty pages are data pages that have been entered into the buffer cache and modified ,but not yet written to disk.Checkpoints save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk.

Checkpoint
Go
DBCC DropCleanBuffers

Please use above above command before comparing stats ie. before and after.Next post we will start actual performance tuning.Please comment if you like this post.



Sql Tunning - Introduction

    Some of my colleague and friends asked me to write post  on performance tuning .I was busy on some development project so I didn't get time to write a post.  Performance is one measure on which we rate anything, suppose I have developed product it has all feature which satisfy any business need.It gives accurate prediction . I want to sell that product  in million dollar  but  If I say for accurate prediction you only need to wait .Then buyer will straight forward say "Sorry  not interested " .Now days life is fast no one wants to wait .They need result on the fly so if your product is so good but very low performance then  no one will buy it.
   So performance is key factor for any organization to get fame in the market.Suppose  company developed product  and sold to local  customer.Starting time customer was very happy with result ,response and services from the company.Now customer expands business to all over  India .So company will charge more money from them  for support.They will happily give it with some negotiation.As the business expands customer will start facing issue with response i.e. performance.They will say initial time we was getting response in millisecond now we need to wait  for  1 minute .So response time is increased by 60 - 100% .Then situation will become very worse for company .System was originally tested on database whose size varies between  1 - 2 GB. Now data size growing drastically i.e. it is flowing in TB  . So response time what there are getting is correct only interesting thing everything is working as expected .It will work as expected if they have proper deadlock management and blocking mechanism.
   Thing is how you face your customer now.They will force you give us solution to resolve performance issue .Now they are more dependant on you .So this is time to get fame and name in the market.Making customer happy is always healthier for company financial growth.
   This is time to think something bigger which you have not thought while system was developed.How will you make customer happy in this case , if they got good response  from application they will be happy.
  This is just introductory post I will start technical things  from next post which we can use to increase performance .