Monday, January 9, 2012

My encounter with a small bug in Hibernate

The problem

At work, I needed to use entities with @DiscriminatorColumn inheritance. This means that all types are kept in the same table, and the value in the discriminator column shows which type a given row belongs to. It's not the recommended way to handle inheritance, but for various reasons we needed it. In development I was using a local PostgreSQL database. When I tried to store these entities, I got strange errors saying that the entity could not be stored because an entity with that ID was already in the database. That was odd, because I was trying to store a brand new entity. The test case that shows the error is very short, so I'll include it here:
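// Parent entity (ParentEntity.java)
import static javax.persistence.DiscriminatorType.INTEGER;
import static javax.persistence.GenerationType.IDENTITY;
import static javax.persistence.InheritanceType.SINGLE_TABLE;

import javax.persistence.DiscriminatorColumn;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Inheritance;

@Entity
@Inheritance(strategy = SINGLE_TABLE)
@DiscriminatorColumn(name = "CLASS_ID", discriminatorType = INTEGER)
public abstract class ParentEntity {
  // The ID is generated by the database on insert
  @Id
  @GeneratedValue(strategy = IDENTITY)
  private Long id;
}

// Child entity with discriminator (InheritingEntity.java)
import javax.persistence.DiscriminatorValue;
import javax.persistence.Entity;

@Entity
@DiscriminatorValue("1")
public class InheritingEntity extends ParentEntity {
}

// Test (PersistChildEntitiesWithDiscriminatorTest.java)
import org.hibernate.Session;
import org.hibernate.testing.junit4.BaseCoreFunctionalTestCase;
import org.junit.Test;

public class PersistChildEntitiesWithDiscriminatorTest extends BaseCoreFunctionalTestCase {

  // Register the mapped classes with the test session factory
  @Override
  protected Class<?>[] getAnnotatedClasses() {
    return new Class<?>[] { ParentEntity.class, InheritingEntity.class };
  }

  @Test
  public void shouldPersistTwoEntities() {
    Session session = openSession();
    session.beginTransaction();
    InheritingEntity child1 = new InheritingEntity();
    InheritingEntity child2 = new InheritingEntity();
    session.save(child1);
    // On PostgreSQL the second save fails with a duplicate ID error
    session.save(child2);
    session.getTransaction().rollback();
  }
}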

The cause

This test throws an exception on the second save, but only on PostgreSQL. Why is that? Well, when you save a new entity with an IDENTITY-generated ID, Hibernate issues the SQL insert to the database immediately. Other statements, like updates, are held back and sent to the database on flush or commit. Inserts of new entities are not held back, and there is a reason for that: when we save a new entity, Hibernate needs to assign an ID to it, and that ID comes from the database. Most databases return a ResultSet with one row and one column after the insert, containing the newly assigned ID. PostgreSQL behaves a bit differently: it returns the whole inserted row (with the ID filled in, of course). In most cases this still works, because the ID is the first column of that row, so when Hibernate takes the value from the first column of the first row, it gets the right one. With discriminator classes, however, the ID is not the first column; the discriminator is. The first insert looks correct only by coincidence: the discriminator value is 1, and the ID assigned to child1 is also 1. When we then try to store child2, Hibernate reads 1 from the discriminator column again, tries to use it as the ID, and complains that another entity with that ID already exists.
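To make the collision concrete, here is a small simulation of it. This is not Hibernate code: the class FirstColumnIdDemo and its methods are made up for illustration, the returned row is modelled as an ordered map, and the column name "id" and the column order (discriminator first) are assumptions based on the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FirstColumnIdDemo {

    // Simulates the row handed back after an insert into the single table.
    // Assumption: the discriminator column comes before the ID column.
    static Map<String, Object> returnedRow(long generatedId) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("class_id", 1);       // discriminator value of InheritingEntity
        row.put("id", generatedId);   // the ID the database actually assigned
        return row;
    }

    // What "take the first column of the first row" amounts to.
    static long idFromFirstColumn(Map<String, Object> row) {
        Object first = row.values().iterator().next();
        return ((Number) first).longValue();
    }

    public static void main(String[] args) {
        // First insert: database assigned 1, first column is also 1.
        System.out.println(idFromFirstColumn(returnedRow(1))); // prints 1
        // Second insert: database assigned 2, but the first column is still 1.
        System.out.println(idFromFirstColumn(returnedRow(2))); // prints 1
    }
}
```

Both calls yield 1, which is exactly the situation where the second entity appears to have the same ID as the first.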

The solution

So there was a bug in Hibernate. Could I fix it? I asked myself this question, and the only way to answer it was to try ;) So I forked the Hibernate repository (yes, it's on GitHub!) and... I was quite overwhelmed by the mass of code there. The first challenge was opening it in my IDE with all the subprojects and their interdependencies configured correctly. Thankfully there is a Gradle task that creates project files for IntelliJ IDEA, the IDE I'm a happy user of. The next task was configuring the Hibernate tests to use my PostgreSQL database, which turned out to be quite easy after one or two emails on the hibernate-dev list. Then I had to change the code that assigns IDs to entities so that it does not always take the ID from the first column of the first row, but in some cases from a column with a given name. For that I needed the name of the column that holds the IDs, which I got with a little help from other developers on the dev list.
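The idea behind the change can be sketched with plain JDBC. This is only an illustration of the approach, not the actual patch: the class GeneratedIdReader, the method readGeneratedId and the convention of passing null for "no column name known" are all hypothetical. Only ResultSet.next(), getLong(int) and getLong(String) are real JDBC calls.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

final class GeneratedIdReader {

    private GeneratedIdReader() {
    }

    /**
     * Reads the generated ID from the result set returned after an insert.
     * If idColumnName is given, the ID is looked up by that name, which keeps
     * working when the database returns the whole row. If it is null, the
     * method falls back to the first column, as before.
     */
    static long readGeneratedId(ResultSet generatedKeys, String idColumnName)
            throws SQLException {
        if (!generatedKeys.next()) {
            throw new SQLException("The database returned no generated key");
        }
        return idColumnName != null
                ? generatedKeys.getLong(idColumnName) // lookup by column name
                : generatedKeys.getLong(1);           // first column, first row
    }
}
```

With a result set obtained from Statement.getGeneratedKeys(), a call such as readGeneratedId(keys, "id") would pick the ID column even if the discriminator comes first, assuming the ID column is really named "id".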

The contribution

Then I committed the fix to my fork on GitHub, opened a pull request, got some comments, fixed the file formatting... We'll see if it's accepted. UPDATE: It has been accepted :)