Some team members and I spent the better part of a few hours today researching a really weird bug we were seeing with our data access code built on NHibernate.
The basic symptoms were a duplicate key violation on a unique column in a table. This only happened under decent load, so reproducing in dev was pretty hard. To make matters more interested, it only happened with NHibernate 2.0. We started looking at MySQL logs and saw the following.
INSERT (value, unique_value) some_table VALUES ('some_value', 'some_unique_value');
-- Auto-increment ID column gets set to 10
SELECT LAST_INSERT_ID()
-- Returns 9
UPDATE unique_value='some_unique_value' and value='other_value' WHERE id=9
-- Duplicate key exception because unique constraint is violated
Yes, SELECT LAST_INSERT_ID() was returning the wrong value. After digging through the NHibernate source code, it became pretty apparent what was happening. We weren’t using transactions (because the DB is MyISAM) and we found the following code.
// (Some ConnectionManager code)
public void AfterTransaction()
{
if (IsAfterTransactionRelease)
{
AggressiveRelease();
}
else if (IsAggressiveRelease && batcher.HasOpenResources)
{
log.Info("forcing batcher resource cleanup on transaction completion; forgot to close ScrollableResults/Enumerable?");
batcher.CloseCommands();
AggressiveRelease();
}
else if (IsOnCloseRelease)
{
// log a message about potential connection leaks
log.Debug(
"transaction completed on session with on_close connection release mode; be sure to close the session to release ADO.Net resources!");
}
transaction = null;
}
private bool IsAggressiveRelease
{
get
{
if (connectionReleaseMode == ConnectionReleaseMode.AfterTransaction)
{
return !IsInActiveTransaction;
}
return false;
}
}
//Some AbstractSessionImpl code
protected void AfterOperation(bool success)
{
using (new SessionIdLoggingContext(SessionId))
{
if (!ConnectionManager.IsInActiveTransaction)
{
ConnectionManager.AfterNonTransactionalQuery(success);
}
}
}
//And finally the AdoTransaction code
public bool IsActive
{
get { return begun && !rolledBack && !committed; }
}
Basically, NHibernate will quasi-create an implicit transaction if it isn’t explicitly created. What’s weird is the scope of the transaction is just a single query. As you can see in the above code, I never called BeginTransaction() so tx.IsActive returns false, which indirectly makes the AbstractSessionImpl class think the transaction is over, which in turn indirectly triggers the ConnectionManager code to release the connection if the ConnectionReleaseMode is set to after transaction (which it is by default).
Whew!
Some outstanding questions though are why did this start happening with 2.0 and not 1.2? My guess is some code was refactored with how the last insert ID is retrieved, and in 1.2 it was impossible for it to have different connections. With 2.0, it is pretty easy to have different connection objects if concurrent operations are occurring.
Read more on ConnectionReleaseMode’s here.
(I hope this post made sense as I am super tired and have been learning NHibernate source code rapidly for the past few hours)