Learn MsSql Server
Monday, 16 November 2015
Tuesday, 1 September 2015
What is heap table?
Heap table:-
A table without cluster index is called Heap table. Now you are thinking why we are talking about this.
So as SQL developer we should aware and In next post I will explain which is related to maintenance how to Rebuild a heap table.
Now our next question is how to determine all the heap table from a database
so below is simplest query to determine all the heap tables from a database
SELECT T.Name ‘Heaptable’
FROM sys.indexes I
INNER JOIN sys.tables T
ON I.object_id = T.object_id
WHERE I.type = 0 AND T.type = ‘U’
I ran this query on Adventureworks database and 2 tables with heap as shown in below figure
What is extent?
Extents are basic unit in which space is allocated to table or indexes.An extent is 8 contiguous pages .Each page is having size of 8k.
Saturday, 13 June 2015
Performance Tuning - Statistics - The heart of Performance.
Last post we have seen how to read excecution plan .It was brief post about execution plan.you may have question in your mind how this execution plan is created ?On what basis SQL Server engine generate plan? how many plan gets generated and which plan SQL Engine select?
I will try to make it as simple as .There should be something in sql server which act as meta data to take decision .Yes there is .That something is called as Statistics.
Now next question can come to in your mind how it gets created and who create it?
Statistics get created in following cases and Sql server engine automatically creates it.
1.When we create any index .
2. Running query using where or join clause so individual statistics will be created first time .
It will be updated later point if auto_update_statistics is ON.
E.g.
CREATE TABLE CREDITCARD
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
)
Go
select * from CREDITCARD
Now we have created table and seeing statistics are created are not.
You will see no statistics are created yet.
Lets run below query ,
SELECT *
FROM CREDITCARD
WHERE Credit_Card_id = 1
AND Name = 'test'
AND Limit = 1
AND Valid_from = 20
AND Valid_To = 30
I will try to make it as simple as .There should be something in sql server which act as meta data to take decision .Yes there is .That something is called as Statistics.
Now next question can come to in your mind how it gets created and who create it?
Statistics get created in following cases and Sql server engine automatically creates it.
1.When we create any index .
2. Running query using where or join clause so individual statistics will be created first time .
It will be updated later point if auto_update_statistics is ON.
E.g.
CREATE TABLE CREDITCARD
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
)
Go
select * from CREDITCARD
Now we have created table and seeing statistics are created are not.
You will see no statistics are created yet.
Lets run below query ,
SELECT *
FROM CREDITCARD
WHERE Credit_Card_id = 1
AND Name = 'test'
AND Limit = 1
AND Valid_from = 20
AND Valid_To = 30
Now if you refresh statistics you will see five statistics object are created .Lets see how to see
what data is there in that statistics .
We will use DBCC SHOW_STATISTICS to see data inside statistics.
Note : For demo I am considering only one statistics object.Name of your statistic will be different as it creates dynamically.
DBCC SHOW_STATISTICS('CREDITCARD', _WA_Sys_00000001_0425A276)
Now we are seeing statistics but no data is there.Now lets do some insert.Lets insert data and run once above command.
insert into CREDITCARD
select 1,'Shrikant',25000,1,22
Go
SELECT *
FROM CREDITCARD
WHERE Credit_Card_id = 1
AND Name = 'test'
AND Limit = 1
AND Valid_from = 20
AND Valid_To = 30
DBCC SHOW_STATISTICS('CREDITCARD', _WA_Sys_00000001_0425A276).
Now you will see very important information.
- The number of rows and pages occupied by a table's data
- The time that statistics were last updated
- The average length of keys in a column
- Histograms showing the distribution of data in a column
- String summaries that are used when performing LIKE queries on character data
In this third row is very important .In this we can see how many distinct values are there for that column .Ther is one column called EQ_ROW which is very useful for calculating plan.
Lets go bit inside.I am inserting a 10000 record in above table.
declare @lclId int =1
while 1= 1
begin
if @lclId>10000
break
insert into CREDITCARD
select @lclId,'Shrikant',25000,1,22
set @lclId +=1
end
Now If you run following select statement and DBBC command
Go
SELECT *
FROM CREDITCARD
WHERE
Name = 'Shrikant'
AND Limit = 25000
AND Valid_from = 1
AND Valid_To = 22 and Credit_Card_id =1
GO
DBCC SHOW_STATISTICS('CREDITCARD',_WA_Sys_00000001_0425A276 )
you can see vaues for update and EQ_Row.EQ row is used to calculate estimated row count.
Note : You can manually update statistics by using sp_updatestats procedure for entire database.
If you want to update statistics for a table in above case you can use
UPDATE STATISTICS CREDITCARD
If you want to update statistics for a specific statistics in above case you can use
UPDATE STATISTICS CREDITCARD _WA_Sys_00000001_0425A276;
In next post we will try to understand how statistics are useful for performance tuning.
Saturday, 6 June 2015
Sql Tunining - Understanding execution Plan in Simple Way
Last post we have seen how to clear data pages ,index pages and execution plan from buffer.Now next step in performance tuning is to understand execution plan.you may have question like how the execution plan is generated? who generate it?how it is important? or you may have lot of question which I have not mentioned here. Execution plan is visual representation of the operation performed by database engine in order to return data required by the query.If you cam read and understand execution plans then you can better understand how you query is executing internally.
How it will helpful to increase performance as I don't know how to read it ?
I will answer this question like suppose you have one query and you have executed and execution plan is generated .You have execution plan too.Reading execution is bit difficult first time but after some time you will find it easy. You may have question from which side I will start reading it? You always need too start from right side.Logic is simple we will have few input query ,which will merge at last and provide result .Yo can see few different terms like merge join ,inner join,nested loop,RID look up,Stream Aggregated etc all this term are called operator in execution plan .So you will see what are my input queries which are generating result,which query is taking more time and how better I can tune query to get performance.
Lets try to understand simple execution Plan,
Script :
CREATE TABLE TEST3
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
)
select * from Test3
We have created script and lets run it ans see execution plan,
You can see above execution plan and try to understand how it looks like.It shows two operator ie. Select and Table Scan.From where you will start reading it ? Off course you will start from right .you can understand there is table scan on Test3 table and finally select is running and giving output.Lets try to understand bit complex execution plan.
Script :
SELECT *
FROM test3 Test1
INNER JOIN test3 test2
ON Test1.Credit_Card_id = test2.Credit_Card_id
This query has two input as we are joining two table .Lets understand it with help of execution plan
In above plan you can see three operator Select,Table Scan,Hash Join .We can clearly see there are two input Test1 and Test2 each one is contributing 13% so total time for forming input is 26%.
Now you can see 73% of time is for hash join which is because our input tables are not having any index if we create index on this table we can avoid hash join and gain performance.
Lets create index and will see what will happen,
CREATE TABLE TEST3
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
PRIMARY KEY CLUSTERED ( [Credit_Card_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
Run below query,
SELECT *
FROM test3 Test1
INNER JOIN test3 test2
ON Test1.Credit_Card_id = test2.Credit_Card_id
.
In above plan you can see comparison cost is 0% So we have increase performance of the query.Please let me know if you have doubt.Please comment if you like it.
How it will helpful to increase performance as I don't know how to read it ?
I will answer this question like suppose you have one query and you have executed and execution plan is generated .You have execution plan too.Reading execution is bit difficult first time but after some time you will find it easy. You may have question from which side I will start reading it? You always need too start from right side.Logic is simple we will have few input query ,which will merge at last and provide result .Yo can see few different terms like merge join ,inner join,nested loop,RID look up,Stream Aggregated etc all this term are called operator in execution plan .So you will see what are my input queries which are generating result,which query is taking more time and how better I can tune query to get performance.
Lets try to understand simple execution Plan,
Script :
CREATE TABLE TEST3
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
)
select * from Test3
We have created script and lets run it ans see execution plan,
You can see above execution plan and try to understand how it looks like.It shows two operator ie. Select and Table Scan.From where you will start reading it ? Off course you will start from right .you can understand there is table scan on Test3 table and finally select is running and giving output.Lets try to understand bit complex execution plan.
Script :
SELECT *
FROM test3 Test1
INNER JOIN test3 test2
ON Test1.Credit_Card_id = test2.Credit_Card_id
This query has two input as we are joining two table .Lets understand it with help of execution plan
In above plan you can see three operator Select,Table Scan,Hash Join .We can clearly see there are two input Test1 and Test2 each one is contributing 13% so total time for forming input is 26%.
Now you can see 73% of time is for hash join which is because our input tables are not having any index if we create index on this table we can avoid hash join and gain performance.
Lets create index and will see what will happen,
CREATE TABLE TEST3
(
[Credit_Card_id] [INT] NOT NULL,
[Name] [VARCHAR](100) NULL,
[Limit] [INT] NULL,
[Valid_from] [INT] NULL,
[Valid_To] [INT] NULL,
PRIMARY KEY CLUSTERED ( [Credit_Card_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
Run below query,
SELECT *
FROM test3 Test1
INNER JOIN test3 test2
ON Test1.Credit_Card_id = test2.Credit_Card_id
Execution Plan,
.
In above plan you can see comparison cost is 0% So we have increase performance of the query.Please let me know if you have doubt.Please comment if you like it.
Friday, 5 June 2015
Sql Tuning Basic - Part 1
In previous post we have seen why performance is important for any organization and how it related with fame and name in market.Now lets start our technical discussion from this post onwards.
Before start performance tuning concept we need to know some thing about Buffer Pool.
SQL Server buffer pool , also called an SQL Server cache ,is a place in system memory that is used for table and index data pages as they are modified or read from disk also it contains execution plane .As this is basic post I will try to make as simple as I can do.So when we are doing performance testing we must clear buffer cache.One of my friend ask me what will happen if we will not clear buffer pool.Answers was " It can mislead your testing as data (pages) are in buffer so it will not fetch from disk and it will choose execution plan from cache so you will see performance but it is not ".
He asked me again
How will you clear buffer pool i,e Index/Data pages and execution plan from memory?
Ans -->
It is very simple , Sql server has DBCC command ,We will use following command
DBCC dropcleanbuffers
Note : This command is very useful it will help us to do our performance testing without shut-down or restarting server.
If your more intrested to know what is there in my buffer at that moment and if you want see it .
You can use following DMV which will give all detail .
select * From sys.dm_os_buffer_descriptors
If you run above command you will lose your data /index pages and execution plan from buffer pool.That what we want .Now you are ready to start your performance testing which will give you accurate result on test and production if it is clone.
There is one more command ideally used before clearing cache its CheckPoint
CHECKPOINT :
Writes all dirty pages for the current database to disk.Dirty pages are data pages that have been entered into the buffer cache and modified ,but not yet written to disk.Checkpoints save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk.
Checkpoint
Go
DBCC DropCleanBuffers
Please use above above command before comparing stats ie. before and after.Next post we will start actual performance tuning.Please comment if you like this post.
Before start performance tuning concept we need to know some thing about Buffer Pool.
SQL Server buffer pool , also called an SQL Server cache ,is a place in system memory that is used for table and index data pages as they are modified or read from disk also it contains execution plane .As this is basic post I will try to make as simple as I can do.So when we are doing performance testing we must clear buffer cache.One of my friend ask me what will happen if we will not clear buffer pool.Answers was " It can mislead your testing as data (pages) are in buffer so it will not fetch from disk and it will choose execution plan from cache so you will see performance but it is not ".
He asked me again
How will you clear buffer pool i,e Index/Data pages and execution plan from memory?
Ans -->
It is very simple , Sql server has DBCC command ,We will use following command
DBCC dropcleanbuffers
Note : This command is very useful it will help us to do our performance testing without shut-down or restarting server.
If your more intrested to know what is there in my buffer at that moment and if you want see it .
You can use following DMV which will give all detail .
select * From sys.dm_os_buffer_descriptors
If you run above command you will lose your data /index pages and execution plan from buffer pool.That what we want .Now you are ready to start your performance testing which will give you accurate result on test and production if it is clone.
There is one more command ideally used before clearing cache its CheckPoint
CHECKPOINT :
Writes all dirty pages for the current database to disk.Dirty pages are data pages that have been entered into the buffer cache and modified ,but not yet written to disk.Checkpoints save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk.
Checkpoint
Go
DBCC DropCleanBuffers
Please use above above command before comparing stats ie. before and after.Next post we will start actual performance tuning.Please comment if you like this post.
Sql Tunning - Introduction
Some of my colleague and friends asked me to write post on performance tuning .I was busy on some development project so I didn't get time to write a post. Performance is one measure on which we rate anything, suppose I have developed product it has all feature which satisfy any business need.It gives accurate prediction . I want to sell that product in million dollar but If I say for accurate prediction you only need to wait .Then buyer will straight forward say "Sorry not interested " .Now days life is fast no one wants to wait .They need result on the fly so if your product is so good but very low performance then no one will buy it.
So performance is key factor for any organization to get fame in the market.Suppose company developed product and sold to local customer.Starting time customer was very happy with result ,response and services from the company.Now customer expands business to all over India .So company will charge more money from them for support.They will happily give it with some negotiation.As the business expands customer will start facing issue with response i.e. performance.They will say initial time we was getting response in millisecond now we need to wait for 1 minute .So response time is increased by 60 - 100% .Then situation will become very worse for company .System was originally tested on database whose size varies between 1 - 2 GB. Now data size growing drastically i.e. it is flowing in TB . So response time what there are getting is correct only interesting thing everything is working as expected .It will work as expected if they have proper deadlock management and blocking mechanism.
Thing is how you face your customer now.They will force you give us solution to resolve performance issue .Now they are more dependant on you .So this is time to get fame and name in the market.Making customer happy is always healthier for company financial growth.
This is time to think something bigger which you have not thought while system was developed.How will you make customer happy in this case , if they got good response from application they will be happy.
This is just introductory post I will start technical things from next post which we can use to increase performance .
So performance is key factor for any organization to get fame in the market.Suppose company developed product and sold to local customer.Starting time customer was very happy with result ,response and services from the company.Now customer expands business to all over India .So company will charge more money from them for support.They will happily give it with some negotiation.As the business expands customer will start facing issue with response i.e. performance.They will say initial time we was getting response in millisecond now we need to wait for 1 minute .So response time is increased by 60 - 100% .Then situation will become very worse for company .System was originally tested on database whose size varies between 1 - 2 GB. Now data size growing drastically i.e. it is flowing in TB . So response time what there are getting is correct only interesting thing everything is working as expected .It will work as expected if they have proper deadlock management and blocking mechanism.
Thing is how you face your customer now.They will force you give us solution to resolve performance issue .Now they are more dependant on you .So this is time to get fame and name in the market.Making customer happy is always healthier for company financial growth.
This is time to think something bigger which you have not thought while system was developed.How will you make customer happy in this case , if they got good response from application they will be happy.
This is just introductory post I will start technical things from next post which we can use to increase performance .
Subscribe to:
Posts (Atom)