Interesting SQL Server Mirroring Problem
by bhartsock on Sep.29, 2009, under Uncategorized
I have been trying to narrow down a SQL Server connection timeout problem, that only occurs when a database is mirrored. I have yet to be able to really figure out the cause, but here is the code that describes the problem.
[TestFixture] public class SqlMirroringTests { const string connectionString = "Data Source=some_server;Initial Catalog=some_db; User ID=*****;Password=*****;"; private static Random rand = new Random(); private ArrayList times = ArrayList.Synchronized(new ArrayList()); private ArrayList erroredTimes = ArrayList.Synchronized(new ArrayList()); private void DbScenario(int identifier) { var sleepTime = Random(); WriteInfo(identifier, "Sleeping " + sleepTime + "ms"); Thread.Sleep(sleepTime); using (var conn = new SqlConnection(connectionString)) { var stopWatch = new Stopwatch(); stopWatch.Start(); bool isErrored = false; try { conn.Open(); } catch(Exception e) { Console.WriteLine(e); isErrored = true; throw e; } finally { var timespan = stopWatch.Elapsed; WriteInfo(identifier, "Open time: " + timespan); times.Add(timespan); if (isErrored) { erroredTimes.Add(timespan); } } //We have to actually query for something to get the error, and it needs to be a decent chunk of data var command = conn.CreateCommand(); command.CommandText = "select * from <Insert any table here with some data>"; command.ExecuteReader(); conn.Close(); } WriteInfo(identifier, "DONE"); } [Test] public void SlamDatabase() { var exceptions = new List<Exception>(); var action = new Action<int>(DbScenario); var results = Enumerable.Range(0, 250) .Select(i => action.BeginInvoke(i, null, null)) .ToList(); results.ForEach(ar => { try { action.EndInvoke(ar); } catch (Exception e) { exceptions.Add(e); } }); Console.WriteLine("Number of Exceptions: " + exceptions.Count); WriteTimeResults("All times - ", times); WriteTimeResults("Error times - ", erroredTimes); } private void WriteTimeResults(string prefix, ArrayList results) { Console.WriteLine(prefix + "Max Open time: " + results.Cast<TimeSpan>().Max()); Console.WriteLine(prefix + "Avg Open time: " + results.Cast<TimeSpan>().Average(t => t.TotalSeconds)); } private int Random() { return rand.Next(1000); } private void WriteInfo(object state, string message) { Console.WriteLine("Thread " + state.ToString() + " - " + message); } }
It throws two possible errors:
System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. at System.Data.ProviderBase.DbConnectionPool.GetConnection(DbConnection owningObject) at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection) at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory) at System.Data.SqlClient.SqlConnection.Open() at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory) at System.Data.SqlClient.SqlConnection.Open() at System.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject) at System.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject) at System.Data.ProviderBase.DbConnectionPool.GetConnection(DbConnection owningObject) at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection) at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory) at System.Data.SqlClient.SqlConnection.Open()
And …
System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. at System.Data.ProviderBase.DbConnectionPool.GetConnection(DbConnection owningObject) at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection) at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory) at System.Data.SqlClient.SqlConnection.Open()
The odd thing is. Here are the results:
Number of Exceptions: 118 All times - Max Open time: 00:00:02.7301659 All times - Avg Open time: 0.0934956888 Error times - Max Open time: 00:00:02.7301659 Error times - Avg Open time: 0.0855163728813559
118 errors out of 250! After one happens, a ton of others happen. I am guessing the pool get’s jacked up and starts looking at the mirror to login temporarily or something, but I can’t figure out why. Increasing the Connect Timeout resolves the problem, but I HATE increasing timeouts. The funny thing is, according to my results, the average timed out connection is .08s. That 625x less than the connect timeout.
Finally, to make matters even more confusing, if you look at my connection string, I didn’t setup the connection for failover. Although I believe that failover still works, except on the initial connection.
Any ideas would be greatly appreciated.
November 3rd, 2009 on 2:01 pm
Brian,
We are having the same timout problem with a mirrored SQL 2008 server. Were you ever able to resolve this issue or did you simply set the connection timeout to 50 seconds?
Thanks!
January 25th, 2010 on 8:02 am
We are also having the same problem. I however have been unable to set the timeout up high enough. We are getting CXPacket messages in Activity monitor. Also we are on 64bit SQL 2008.
I’ve read about intra-query parrallel thread deadlocks. But the solution there (Add an index) isn’t working for me as I don’t know what index to add. I think I’m going to rewrite my query to use a function with cursors instead…..maybe
February 1st, 2010 on 2:02 pm
“I am guessing the pool get’s jacked up and starts looking at the mirror to login temporarily or something, but I can’t figure out why”
Does this mean that disabling mirroring fixes the problem?
February 4th, 2010 on 9:37 am
Yes. Disabling mirroring fixes the problem, but leaves you without HA.
March 31st, 2010 on 10:26 am
Disabling mirroring means what? Just removing “failover partner” from connection string? Or really remove the mirror?
Did you try the latest sqlncli.msi? e.g. From Post SP1 Cumulative Hotfix 7?