FeedAgg.com Logo
Your Account | Sign In | Sign Up

Add Feed | Search | Home | Help | Contact | Blog

Feed: Coldfusion Muse - MS SQL Server - AggScore: 73.9



Summary: ColdFusion Muse - MS SQL Server


Musings and Other Things from CF Guru Mark Kruger

Hanging Jrun Threads and MS SQL Parallelism


Recently one of our systems started misbehaving. In this system we had 2 ColdFusion 8 servers connecting to a single MSSQL 2005 server. All the hardware was quite good - plenty of RAM, Fast disks, moderate traffic etc. The system had been in place for some time. But (and isn't this often the case) we moved a new design in place with some changes to the query code and suddenly our well-behaved system started acting like a sugar-laden toddler in the cereal aisle.

Watching the "running request" counter for ColdFusion I noticed that it was slowly accumulating requests. When that happens (threads slowing building up over time) you usually have to prepare for some frustrating troubleshooting ahead. When a server is "crashing" you can often pinpoint the error. Crashing servers tend to suddenly fill up running requests and the request queue and the log files will generally have some clues occurring right around that time. But this was different. In this case the request count climbed slowly and was seemingly random. And these threads did not show up in the list of "active requests" in the CF monitor either. Aha! I thought. This is my old networking issue! You might not remember this but a few years ago we discovered that auto-synching ports can sometimes cause phantom connections to hang on a DB intensive application (see this post).

But a quick checkup of network settings showed that this was not the case. Network connectivity was excellent and both DB and the 2 servers were connected through the same Cisco switch. So it was on to the database. Why the DB? Why not scour through JVM settings and fiddle with CF request settings? For one thing, 80% of the time it's not CF or the DB but some combination of the 2 (bad query writing, resource constrained DB, drivers etc). In this case the 2 common denominators were the database and the new code - but I believed the DB was our lowest hanging fruit.

Processor Usage

Sure enough a check of the database showed processor usage that did not look normal. Wait a minute Muse... don't you have any baseline numbers for that assumption? Nope, not at this point. I'm letting my experience guide me. When you have 4 cores and 2 of them are at a flat lined 50% you generally know something is wrong. In fact a quick check of the accumulating requests on CF showed a 20-25% per thread correlation. In other words, each of my hanging threads was using 20% or so of one core on the SQL server. Once it was hung that thread continued to use 20% of one SQL server core in perpetuity until CF was restarted.

The funny thing was that under regular load the DB processors was extremely underutilized until one of these threads was produced. The DB processors would stay at between 1 and 5 percent most of the time - practically idle. When one of these "special threads" came along, one proc out of the 4 of would "jump up" to 20 or 25 percent but the rest would idle along as before.

Finding the Problem

We tried a great many things. We patched and hot fixed, shrunk and optimized files, added and removed indexing etc - all of which was helpful and necessary, but none of which permanently "fixed" our problem). Finally, I was looking at the "activity monitor" in MSSQL05. The activity monitor "process info" view shows a list of connections along with some extra data, process ID, database, status etc. If you double click on an item in the row you will see the currently running query or task. You have to sort of "get lucky" to see it since most of them fly by pretty fast.

In any case I was watching this view (refreshing every 10 seconds) while there were no hanging threads and suddenly I saw something that made me scratch my head. A process ID was duplicated about 3 or 4 times. Each of the duplications had a "wait time" and a "wait type" of CXPACKET. So this process ID was spawning multiple threads under a single ID. And the wait time made me think that this might be our offending process. Looking at the processor utilization I noticed that sure enough, I had a 20% utilization on one core. Going back to my CF servers my suspicions were confirmed. We had a hanging thread on one of the servers - so this CXPACKET thing required some more investigation.

First however, I thought I might try to mitigate the problem from within the activity monitor by killing off this process ID. If I was successful I would have a new mitigation technique that would not involve any potential user disruption, with the exception of whoever was running the query that was locking up these threads (and they were probably tapping their fingers on the desk waiting anyway). So I tried the "kill process" button from the activity monitor, but I had to kill all of them individually and I couldn't catch them all before they re-spawned - or maybe I'm just too old. Turning to SQL Studio I ran the query KILL 55 (where 55 was the process ID in question). That did the trick and it was indeed a magic bullet. As soon as I "killed" that process ID - all the sub-processes were terminated as well. My CF server dropped the hanging thread and SQL Processor usage went back down to normal.

The Fix

Ok so now what? I could hire a temp to sit in front of the activity monitor all day and kill off any process ID with a CXPACKET Wait type that correlated to a CF hanging thread. I could probably write a complicated SQL script to find these threads and terminate them (I kind of liked that idea actually). In the end I chose to do a little research into CXPACKET wait types. I was fortunate to stumble onto this post by Pinal Dave. It turns out that a CXPACKET wait is related to parallelism. Now parallelism is how MS SQL chops up the work load of a query and makes full uses of your processors to get the work done. Much like cfthread splits work out and then joins it back together, SQL splits the work out and then an "organizing" thread "waits" for all the individual threads to complete. Once they have all completed it assembles the data for return to the client. Make sure and read the full article as well as the comment by Jonathan Kehayias at the bottom - excellent stuff!

In any case my SQL server was suffering from not being able to reassemble threads from this division of labor. I'm not sure why that might be (I have some ideas) but the long and short of it is that attempts at parallelism for query execution were causing hanging Jrun threads on my CF server. Following Pinal's guide (and a couple of MS resource pages) I tried setting the max degree to 2 and the threshold to 20, 25, 30... looking for a "sweet spot" where most of my queries would execute without parallelism, reserving it for the report or aggregation queries in the admin section of this site. Unfortunately that didn't work. The issue here was likely a specific query with some new joins in it that was always going to trigger parallelism and ofen fail to complete - causing our hanging thread issue.

Finally, I set the "max degree" to 1. Doing this meant that there would be a 1 to 1 relationship between threads, process IDs and queries. In other words, a given query would never use more than one core execution thread. Now you might think this is problematic because it doesn't make full use of SQL's tuning engine. Technically you are right. I would only say that in a typical web application the query traffic generally consists of dozens of very short queries where parallelism would actually add additional time to the process. So in a typical web application you lose very little by minimizing the degree of parallelism. And indeed that appeared to be the case in our web application. Our CXPACKET waits, hanging threads and egregious processor usage all went away and things have been functioning smoothly since then.

The Aftermath

The Muse knows his readers well. Some of you want to hammer me about not fixing the real problem - that specific query in the code. Not to worry. Using SQL's performance dashboard we teased out the worst offenders and set our ColdFusion developers to analyzing the code. But I suspect the version of SQL or something about the hardware, hyperthreading or NUMA to be a more likely culprit. I have never seen SQL's execution planner cause a problem when it turns to parallel execution before. Still - it's always a good idea to fine tune that query code.

Date Published: Nov 18, 2011 - 12:59 pm



Fun With SQL Server 2008 Login Properties


Most of you probably know you can run an instance of SQL server as the "local system" account and it works fine. You can also run an instance of SQL server as a domain or a local user account and (if the permissions are set correctly) that is also fine. But if you have never actually installed MSSQL Server 2008 you may not know of a change in how the installation routine "suggests" you run SQL server. The 2008 install really wants an account to run under. It no longer uses "local system" as the default account. Instead the account area is blank. You can, of course, specify the local system account but it's no longer obvious. So what sometimes happens is that the install user scratches his head and then uses what he knows.

[More]
Date Published: Jan 24, 2011 - 5:03 pm



Processor Usage in MSSQL


Here's a useful query for showing how much processor a given database is consuming compared to other DBs. The query returns a perctentage. Note, this is a percentage of the overall processor usage of MSSQL. If MSSQL is using 10% of your server proc usage and a given DB is using 50%, you should realize that the DB is, in reality, using 5% of the capacity - no cause for alarm. If, however, a SQL server is at or near capacity (70% or above) and a given DB is using the majority of the proc cycles, then of course, that DB is a good candidate for upgrading to a dedicated server (or at least one with more horsepower). Of cours, this assumes that you have examined the schema, indexing and caching to insure it scalable to begin with.

declare @tot AS decimal(18,0) select @tot = sum(cpu) from sysprocesses select t2.name as dbname, CAST(((sum(cpu)/@tot)*100) AS decimal(18,1)) as PercentUsage, sum(cpu) as totalCPU from sysprocesses t1 join sysdatabases t2 on t1.dbid=t2.dbid group by t2.name order by sum(cpu) desc
Date Published: Nov 26, 2010 - 10:05 am


Data Truncation Error: Migrating MySQL to MSSQL


I have one more tip as a follow up to my previous post on Migrating Between MySQL and MSSQL. It has to do with the dreaded "data trunction error". If you have used MSSQL you may have seen this error crop up from time to time. It is a common error and very easy to remedy. The error occurs when you have a character field with a length that is too short for the size of the string you are trying to insert. Check out this example....

[More]
Date Published: Jul 23, 2009 - 5:41 pm


Migrating Between MySQL to MSSQL


I recently did an emergency stint of troubleshooting for a site owner (a designer who owned a complex ColdFusion site) who was hit with the HTML injection issue on his site. He had done a good deal of work on his own and cleaned up the HTML as best he could. He was busy moving the sites to a more secure environment (a better hosting company, no more FTP, intrusion detection and solid VPN support). He had managed to travel a long way down the migration path before he ran into trouble. His new environment used MSSQL and his old environment used MySQL.

Now I love MSSQL and I think it is a wonderful choice (price notwithstanding), but had he contacted me before he decided to go this route I would have suggested that he stick with MySQL for the sake of compatibility. Unfortunately he had already "flipped the switch" before I got there and so there was a lot of "on the fly" changes to make just to get his site working correctly again. One of the biggest issues had to do with his choice for migrating the actual data. He had chosen to use an export tool to move the MySQL data into an Microsoft Access file. He then used Microsoft Access "upsize" wizard to send the data to the MSSQL server. The biggest flaw with this approach is that it resulted in missing dates which were not translated correctly from MySQL to Access to MSSQL. So we had to re-export the data in to SQL dumps, modify them and then run them against MSSQL.

The date problem is not a typical incompatibility with MSSQL, but there are several we ran into that we had to account for. Here they are in random order:

[More]
Date Published: Jul 13, 2009 - 6:02 pm


Cached Plans and Static Variables


Regarding Static Variables in SQL Statements

In my last post I indicated that even static variables passed to SQL statements should be bound using Cfqueryparam. My understanding was that the DB server could only create cached plans if all the variables in the statement were bound - so I believed that a statement like the following: SELECT fname, lname FROM users WHERE active = 1 ...Could not benefit from the execution plan cache. In the comments of the previous post a number of people disputed this idea, saying that if the variable is static it will cause the execution plan to be cached. Now, Chris Secord has given me a tip on how to prove that I am wrong.

[More]
Date Published: Dec 16, 2008 - 10:12 am


Migrating to MSSQL 2005 and UNION Queries


For the most part, migrating your ColdFusion site from Microsoft's SQL 2000 server to SQL 2005 is a snap. Import the databases from 2000 to 2005, re-point your data sources to the new instance using the ColdFusion Administrator and you are done. No muss, no fuss. There is very little query code that you will need to change. Sometimes you needn't change anything in your ColdFusion code at all. Here is one that I found recently however that you may run across - especially if you are a fan of UNION queries. Here's the skinny.

[More]
Date Published: Jul 18, 2008 - 10:51 am


Clustered Indexes Revisited


In my last post I talked about using a clustered index on some column other than the primary key. There are cases where this makes sense and it can have positive impact on performance. Recently however, CF Webtools own Jason Troy pointed out a consequence of altering your clustered index. You may recall that a "clustered" index really means that the actual data in the table will stored in the sort order specified by the index. Consider this example of a table called "emailer":

[More]
Date Published: Jun 03, 2008 - 11:15 am


Clustered Indexes Mia Culpa


I have for years espoused the benefits of clustered indexes on MS SQL. Unlike a regular index a "clustered" index represents the actual sort order of the table. It is, therefore, the fastest available type of index. It is my view that some thought should be given to which columns are added to the clustered index. Please note, any indexing plan should include performance metrics coupled with experience. Please don't think I am recommending wholesale changes to any given schema. Having said that, a misunderstanding or misuse of indexing is the one of the most common cause of performance related problems. Now back to our discussion of clustered indexing.

[More]
Date Published: May 28, 2008 - 2:59 pm


SSIS and DTS - Each Has Uses


You probably know that Microsoft replaced the venerable DTS with something called "SSIS" - which I gather stands for "SQL Server Integration Services". SSIS is immensely powerful and comes with a full featured scripting language and development environment that uses Visual Studio. Practically any kind of data migration and transformation is possible with SSIS. Unfortunately SSIS is also dizzyingly more complicated than the tried and true "Data Transformatin Services" (DTS). In SSIS I have trouble simply finding the list of tables and columns let alone doing transformations. For simple, one time migration tasks it is like using a 5 horsepower tiller to plow up your house plants.

Recently I was moving large datasets from an MS SQL 2000 (32bit) server to an MS SQL 2005 (64 bit) server and discovered that the SSIS package was importing dates incorrectly. It was somehow transforming them into completely different dates (probably due to a format difference or a difference in the way dates are stored). In addition the SSIS wizard did not automatically check the box for "enable identity insert". You might recall that DTS by default checks this flag for any table using the Identity feature. If you create your tables with the Identity property set ahead of time the DTS import will automatically work correctly without the need to edit the import properties of each table. In SSIS however, I have to go into the properties of each table in the wizard and specifically check the "enable identity insert" checkbox.

My brute force solution to these irritaing issues with SSIS is simple. Instead of "importing" using SSIS I "export" using DTS. This is my rule of thumb (at least until I can get SSIS to sing a new tune) - If you are doing straight forward migrations from SQL 2000 to SQL 2005 I recommend that you stick with DTS and keep SSIS for more complex integration needs. If you are interested in integrating the DTS wizard directly into the Server Management Studio, read on:

[More]
Date Published: May 27, 2008 - 6:42 pm


The Dreaded Mismatched Column or Data Type Error Revisited


This annoying error that occurs when using JDBC and MS SQL has been around for a while. The JDBC system in Coldfusion caches information about the tables you access. If you change a table (add a column for example) the column ids get out of whack and they no longer match the correct column. For Example, the first time you run a query you might get back the following:

[More]
Date Published: Apr 17, 2008 - 2:01 pm


Creating Views With CF Query: DDL Follies


Most queries in your Coldfusion code do one of four things - Select, Insert, Update or Delete. Maybe you did not know that, given the proper permissions, you can do just about anything that can be done on the DB server from within a query. You can backup and restore, drop users, even execute shell commands. That's why you should never create a datasource using the SA user. Instead you should define what you want a datasource to do and create a user for that purpose. Still, sometimes it is useful to be able to do other things using Coldfusion and Cfquery.

For example, I have a generic table with rows that look like "col1, col2" that holds form data. In this particular application the customer creates custom forms to collect data from specific clients. All the forms look different. One might have fullname, address, city, Postal code, and the next one might see first name, last name zip. When the data is submitted it is put in col1, col2, col3. But he has a reporting tool that allows him to query tables from the database and run reports for his customer. What can we do to make it easier for him to report? Surely "select col1, col2" isn't going to do it. The answer is to use T-SQL Data Definition Language (DDL) to create a view for each customer.

[More]
Date Published: Nov 14, 2007 - 6:35 pm


Using Rowcount in T-SQL


Here's a tip for limiting the number of rows returned from a query. Now I know you think "that's easy - just use TOP." Ah... but what if TOP is not an option? Consider this example. You have a reporting stored procedure that you want to use to return multiple results sets to your Coldfusion Page. Easy - right?

[More]
Date Published: Sep 07, 2007 - 1:45 pm


Handling Variable Form Data in a Stored Procedure


Lots of projects have a requirement that interaction with a database must be done using stored procedures only. Stored procedures are generally quite easy to write, but there are some things that are slightly more difficult that using a straight CFQUERY. For example, perhaps you have seen code that handles a search form. You might see a query that looks something like this: SELECT * FROM USERS WHERE userID IS NOT NULL AND UserName = '#form.username#' AND Address = '#form.address#' Please note, I'm use a UDF called "isEmpty" that simply trims the string and checks the length. Also keep in mind that I'm not adding the required Cfqueryparam to save room. How would you duplicate this code in a T-SQL stored procedure?

[More]
Date Published: Aug 13, 2007 - 9:49 pm


DTS' Other Uses


One of the nicest things about MS SQL is DTS. If you are a reader who has a visceral reaction to anything nice being said about Microsoft you should grab a paper bag so that you don't hyperventilate as we go forward. Shallow breaths... shallow breaths.... ok - ready? Here we go. I suppose that most developers are exposed to DTS (Data Transformation Services) as an import-export mechanism for Microsoft SQL Server. The most common use is during deployment, rehosting or setting up a development environment. There are a host of other things for which you can use DTS that perhaps you hadn't thought of. For example:

  • Moving Data from one DB platform or format to another - For example, with 2 ODBC connections to 2 Oracle servers you could import and export data from one Oracle server to another. I know that those of you with experience in Oracle will gasp with horror at this though - but migrating data around in Oracle is unnecessarily difficult (as is just about everything else about Oracle). You could transfer data from a DBase file to an access file. You could move information from a proprietary platform running on Cobol (as long as you had a driver) into a flat file or into MSSQL. It's a very nifty mechanism with all sorts of possibilities.
  • File Drop Import - We use DTS to "pickup" a file who's name may not be consistent. It is even possible to FTP a file for import.
  • Data Column Validation - With a minimal amount of scripting you can check values prior to importing them. So, for example, you can verify if a field is a number or is parsable into a date, or is populated or not.
  • Complex SQL Tasks - You can add SQL task and even COM tasks to a package extending it well beyond just transferring data.

For example, we have a task that imports stock data. It is set to run every weekday. On days that are holidays it would run and sometimes create problems with the Bid and Ask. We needed a way to check and see if it was a holiday. If it was a holiday we wanted to terminate the process. Here is what our DTS Guru came up with. Using the package designer, at the beginning of the package add an ActiveX task that looks like this (this is the generic version). Function Main() Dim cn Dim rs Dim Flag Set cn = CreateObject("ADODB.Command") Set rs = CreateObject("ADODB.Recordset") cn.ActiveConnection = "Provider=sqloledb;Data Source=database;Initial Catalog=database;User Id=username;Password=password;" cn.CommandType = 1 'adCmdText cn.ActiveConnection.CursorLocation = 3 'adUseClient cn.CommandTimeout = 60 'set to 1 min cn.CommandText = "select * from [table]" Set rs = cn.Execute() If rs.RecordCount > 0 Then Flag = "F" Else Flag = "S" End If Set rs = Nothing Set cn = Nothing 'msgbox "Result=" & Flag If Flag = "S" Then Main = DTSTaskExecResult_Success Else Main = DTSTaskExecResult_Failure End If End Function Make sure and set the task to "continue on success". The last few lines tell the story. If "Main" is set to the constant DTSTaskExecResult_Failure then the task will not continue. The result is a DTS task that is "smarter". We are able to put the execution of the task in the hands of the stakeholder of the site who only needs to be sure and maintain his table of trading holidays.

Date Published: Mar 29, 2007 - 5:01 pm


 
Visitor Rating: 7 (1) (Rate)

Story Clicks: 472

Feed Views: 756

Lenses (Add|?)

Comments (Log in to add)

Feed Details
Date Added: 01/17/2009
Date Approved: 01/17/2009
By: Anonymous
Search FeedAgg.com




3600 mp6940 serv 1.676 seconds to generate.