I’ve been working on performance issues in the production database this week, taking it from a “server” level instead of database/table/index level and also adding some query changes. The database is working better, and I am continuing to keep it heading in that direction.

Adding OPTION (MAXDOP 1) to certain queries has increased performance (sometimes dramatically), while SERVER Degrees of Parallelism setting changes has contributed to an overall performance gain.

RESULTS:

By keeping an eye on the WAIT statistics in the database, I have been able to see evidence of performance improvements. The percentage shown is the amount of wait time for the Wait Type compared to Total Wait. It is suggested that the CXPACKET wait should account for less than 10% of wait time in a tuned-up system.

Wait Type June 3rd June 4th June 5th
CXPACKET 29.7% 11.8% 8.9%
PAGEIOLATCH 16.7% 7.0% 6.3%

Total for These 46.4% 19.8% 15.2%
OLEDB 25.9% 72% 77.4%

As you can see, the OLEDB wait has increased quite a bit over this time period. That’s because of the decrease in CXPACKET and PAGEIOLATCH waits in comparison. OLEDB is next on the list, and refers to our database waiting on other databases to give it information or other databases waiting on our database to be ready to accept more data.

DETAILS:

Within the last week, I’ve been researching and working on the performance issues we sometimes experience in our database. What I found when examining the history on various waits was that we had a couple things holding us up: CXPACKET and PAGEIOLATCH_ waits were much higher than they should be. We also had very high Average Disk Queue Length rates and high number of Context Switches/second.

CXPACKET:  These waittypes are all involved in parallel query execution. These waittypes indicate that the SPID is waiting on a parallel process to complete or start.

Context Switches/Second: The average rate per second at which context switches among threads on the computer. High activity rates can result from inefficient hardware or poorly designed applications.

PAGEIOLATCH_: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode. Long waits may indicate problems with the disk subsystem.

Average Disk Queue Length:  The average number of write requests that were queued for the selected disk during the sample interval. High disk queue length are caused by a lot of insert/update activity, as well as reindexing or anything else that changes database objects.

CXPACKET:

The first issue I’ve tackled has been the CXPACKET waits. When CXPACKET rates are high and Context Switches/Second are high, it means that queries are trying to take all of the processing resources they can get their hands on for use in parallel processing, and the system is giving each request everything she’s got. The system context will break apart threads and send it to various processors for work (we have 4), work on them a bit, and switch the threads around again and keep chugging. If other queries are waiting, it will try to throw them into the mix, too, but they have to (CXPACKET) wait in line to get broke apart and thrown to the various CPUs as they are freed up. This type of parallel processing seems very good on the surface, but in a system that is more OLTP (like ours), this causes traffic jams, leading to waits that are felt by people on the other end looking for data to pop up.

To address this, I’ve changed the server settings for parallel processing and set maximum degrees of parallelism for certain queries. Limiting parallel processing improves performance.

SERVER:

Our Max Degree of Parallelism setting on the server was set to 0 (unlimited). This means that any query for which a certain time threshold is expected in its plan can grab as much parallel processes to get things done as it thinks it needs. This setting has been changed to 2 (we have 4 processors). Allowing some degree of parallelism allows for faster indexing, etc, but limiting it to 2 has allowed two processors to stay freed up for other things.

The default Cost Threshold for Parallelism was set to 5. This setting refers to the amount of time expected by an execution plan that, when exceeded, calls for parallel processing. This setting has been changed to 80. This is kind of high, but we have other inefficiencies that require it. This setting allows us to keep the max degree of parallelism set to 2 while limiting the number of queries that will actually use that parallelism.

QUERIES:

I have added at the end of certain queries, below the ORDER BY statement, the following option:

SELECT FIELD
FROM TABLE
WHERE FIELD=VALUE
OPTION (MAXDOP 1) –<–That’s the option

This query hint specifies for the query that the Maximum Degree of Parallelism (MAXDOP) is set to 1. This option can be used for SELECT, UPDATE, DELETE, and INSERT statements. It overrides the server setting explained above. I’m watching performance to determine others that may need this option, as well.

PAGEIOLATCH:

PageIOLatch wait times have also decreased as a result of these changes. There is less contention and latches occuring during reads and writes, so these types of waits have decreased, as well. Our disk subsystem leaves a lot to be desired, but these changes have increased the efficiency with which I/O tasks operate.