Missing dll file: Mitigate slow I/O during Checkpoint?

Hi,
We have a moderately sized (100GB) database which is running on a
dedicated SQL Server. Our application is a mixture of OLTP and OLAP,
and we've recently begun an effort to speed up the performance by
cleaning up some poorly written queries, reconfiguring indexes, etc.
One of the things we've noticed while running Performance Monitor is
that when SQL Server issues a checkpoint (every 80 seconds or so during
peak production times), the write queue on the disk array containing our
data files goes from an average length of about 0.2 to about 100 and
stays there for the duration of the checkpoint, typically about 15
seconds or so. During this time queries which normally execute very
quickly (50ms or so), will sometimes require a second or more to
complete.
We opened a case with Microsoft, and the first thing they had us do was
to move our log files to a different disk array, but that didn't seem to
have any effect. We still haven't moved the log files for tempdb,
master, etc., but I wouldn't think those would matter nearly as much as
the one for our datafile. Subsequent suggestions from them don't seem
to have helped either.
Is this normal behavior? Are we I/O bound and simply require a faster
disk subsystem? Is there something in our hardware configuration which
can/should be changed to help out? Are there any SQL configuration
settings we could change to either reduce or eliminate the problem? Any
suggestions would be greatly appreciated.
Our server configuration is as follows:
Dell 2650
Dual 2.6GHz Xeon
2GB RAM
(3) 73GB 10K drives RAID 5 (OS & Data File Logs)
Powervault 220 w/14 36GB 15K drives RAID 5 (Data Files & SQL System
Logs)
Windows Server 2003
SQL 2000 Enterprise w/SP3
Thanks,
Chris
If responding via e-mail, place a period between my first and last name.The only way I have ever been able to solve the checkpoint IO bottleneck was
to install SQL to a SAN with a decent sized write cache. Fibre channel is
full duplex so the writes do not eat all the bus bandwidth. A large cache
can absorb a checkpoint and write it to the disks in the background.
Typical SCSI controllers with 64MB or 128 MB of cache will always be
overwhelmed in a production environment by a sufficiently large checkpoint.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
"Chris Birk" <chrisbirk@.gmail.com> wrote in message
news:Xns961A7E02A1B16chrisbirkgmailcom@.2
07.46.248.16...
> Hi,
> We have a moderately sized (100GB) database which is running on a
> dedicated SQL Server. Our application is a mixture of OLTP and OLAP,
> and we've recently begun an effort to speed up the performance by
> cleaning up some poorly written queries, reconfiguring indexes, etc.
> One of the things we've noticed while running Performance Monitor is
> that when SQL Server issues a checkpoint (every 80 seconds or so during
> peak production times), the write queue on the disk array containing our
> data files goes from an average length of about 0.2 to about 100 and
> stays there for the duration of the checkpoint, typically about 15
> seconds or so. During this time queries which normally execute very
> quickly (50ms or so), will sometimes require a second or more to
> complete.
> We opened a case with Microsoft, and the first thing they had us do was
> to move our log files to a different disk array, but that didn't seem to
> have any effect. We still haven't moved the log files for tempdb,
> master, etc., but I wouldn't think those would matter nearly as much as
> the one for our datafile. Subsequent suggestions from them don't seem
> to have helped either.
> Is this normal behavior? Are we I/O bound and simply require a faster
> disk subsystem? Is there something in our hardware configuration which
> can/should be changed to help out? Are there any SQL configuration
> settings we could change to either reduce or eliminate the problem? Any
> suggestions would be greatly appreciated.
> Our server configuration is as follows:
> Dell 2650
> Dual 2.6GHz Xeon
> 2GB RAM
> (3) 73GB 10K drives RAID 5 (OS & Data File Logs)
> Powervault 220 w/14 36GB 15K drives RAID 5 (Data Files & SQL System
> Logs)
> Windows Server 2003
> SQL 2000 Enterprise w/SP3
> Thanks,
> Chris
> --
> If responding via e-mail, place a period between my first and last name.|||It appears you application is very write-intensive. Consider placing data
files on RAID 10.
Hope this helps.
Dan Guzman
SQL Server MVP
"Chris Birk" <chrisbirk@.gmail.com> wrote in message
news:Xns961A7E02A1B16chrisbirkgmailcom@.2
07.46.248.16...
> Hi,
> We have a moderately sized (100GB) database which is running on a
> dedicated SQL Server. Our application is a mixture of OLTP and OLAP,
> and we've recently begun an effort to speed up the performance by
> cleaning up some poorly written queries, reconfiguring indexes, etc.
> One of the things we've noticed while running Performance Monitor is
> that when SQL Server issues a checkpoint (every 80 seconds or so during
> peak production times), the write queue on the disk array containing our
> data files goes from an average length of about 0.2 to about 100 and
> stays there for the duration of the checkpoint, typically about 15
> seconds or so. During this time queries which normally execute very
> quickly (50ms or so), will sometimes require a second or more to
> complete.
> We opened a case with Microsoft, and the first thing they had us do was
> to move our log files to a different disk array, but that didn't seem to
> have any effect. We still haven't moved the log files for tempdb,
> master, etc., but I wouldn't think those would matter nearly as much as
> the one for our datafile. Subsequent suggestions from them don't seem
> to have helped either.
> Is this normal behavior? Are we I/O bound and simply require a faster
> disk subsystem? Is there something in our hardware configuration which
> can/should be changed to help out? Are there any SQL configuration
> settings we could change to either reduce or eliminate the problem? Any
> suggestions would be greatly appreciated.
> Our server configuration is as follows:
> Dell 2650
> Dual 2.6GHz Xeon
> 2GB RAM
> (3) 73GB 10K drives RAID 5 (OS & Data File Logs)
> Powervault 220 w/14 36GB 15K drives RAID 5 (Data Files & SQL System
> Logs)
> Windows Server 2003
> SQL 2000 Enterprise w/SP3
> Thanks,
> Chris
> --
> If responding via e-mail, place a period between my first and last name.

Monday, March 19, 2012

Mitigate slow I/O during Checkpoint?

No comments:

Post a Comment

Missing dll file

Blog Archive

About Me