Monday, May 31, 2010

Transaction isolation level

Recently I read a very good explanation of transaction isolation levels in "Real World Java EE Patterns. Rethinking Best Practices" by Adam Bien.

Transactions were introduced for two reasons: to make complex operations atomic (all or nothing) and to make it possible to use resources concurrently. Without transactions, two users (or more precisely actors, since these can be parts of the system and not necessarily live users) can read and write the same resource simultaneously, and changes made by one of them can be overwritten and lost when the other one saves.

Consider the following example document:

Cheesecake bottom
250g crunchy biscuits
100g butter

Cheesecake top
900g cottage cheese
250g mascarpone cheese
1 lemon

Imagine one user fetches the document while another user fetches the same document. Both users have the same data, and both start editing it. The first one likes a sweeter cake bottom, so he adds "2 spoons of sugar" to the recipe and writes the changed recipe back to the database. In the database it now looks like this:

Cheesecake bottom
250g crunchy biscuits
100g butter
2 spoons of sugar

Cheesecake top
900g cottage cheese
250g mascarpone cheese
1 lemon

Meanwhile, the second user adds "2 eggs" to the cheesecake top recipe and saves his version, which is based on the original document:

Cheesecake bottom
250g crunchy biscuits
100g butter

Cheesecake top
900g cottage cheese
250g mascarpone cheese
1 lemon
2 eggs

The change made by the first user is lost. In some cases this can lead to inconsistent data.
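
The lost update can be reproduced in a few lines of Java. This is only a sketch: RecipeStore and LostUpdateDemo are names I made up, and a plain in-memory list stands in for the database, so nothing here is a real persistence API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the database; holds the whole recipe as lines.
class RecipeStore {
    private List<String> document = new ArrayList<>(List.of(
            "Cheesecake bottom", "250g crunchy biscuits", "100g butter",
            "Cheesecake top", "900g cottage cheese", "250g mascarpone cheese", "1 lemon"));

    // Every reader gets a private copy, like a user loading the document.
    List<String> load() {
        return new ArrayList<>(document);
    }

    // Saving replaces the whole stored document with the caller's copy.
    void save(List<String> edited) {
        document = new ArrayList<>(edited);
    }
}

class LostUpdateDemo {
    static List<String> run() {
        RecipeStore store = new RecipeStore();

        // Both users load the same data before anyone saves.
        List<String> firstUserCopy = store.load();
        List<String> secondUserCopy = store.load();

        // The first user adds sugar to the bottom (after "100g butter") and saves.
        firstUserCopy.add(3, "2 spoons of sugar");
        store.save(firstUserCopy);

        // The second user adds eggs to his stale copy and saves over the first change.
        secondUserCopy.add("2 eggs");
        store.save(secondUserCopy);

        return store.load();
    }

    public static void main(String[] args) {
        // The printed recipe contains the eggs but not the sugar.
        System.out.println(run());
    }
}
```

Nothing in this store stops the second save, which is exactly the problem that isolation is meant to address.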

That is why transactions were introduced. A transaction isolates changes made by one user from changes made by another. There are four levels of transaction isolation.
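
In JDBC the four levels described below map to constants on java.sql.Connection. A small sketch of how one might request a level: the IsolationLevels class and its method names are my own, and the Connection is assumed to come from whatever driver or data source you already use.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper; only the Connection constants and methods are real JDBC API.
class IsolationLevels {
    // Maps the level names used in this post to the standard JDBC constants.
    static int jdbcConstant(String name) {
        switch (name) {
            case "Serializable":
                return Connection.TRANSACTION_SERIALIZABLE;
            case "Repeatable reads":
                return Connection.TRANSACTION_REPEATABLE_READ;
            case "Read committed":
                return Connection.TRANSACTION_READ_COMMITTED;
            case "Read uncommitted":
                return Connection.TRANSACTION_READ_UNCOMMITTED;
            default:
                throw new IllegalArgumentException("Unknown isolation level: " + name);
        }
    }

    // Requests the level on an already opened connection before any work is done.
    static void useLevel(Connection connection, String name) throws SQLException {
        connection.setAutoCommit(false); // group statements into one transaction
        connection.setTransactionIsolation(jdbcConstant(name));
    }
}
```

Not every database supports every level. DatabaseMetaData.supportsTransactionIsolationLevel(int) can be used to check before asking for one.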

Serializable

In its simplest, lock-based form, a transaction locks all the resources it needs. If a user wants to read or write a resource, it is locked and no one else can read or write it until the transaction ends. This level of isolation can easily lead to deadlocks:
  1. User Ann opens a transaction. Ann needs resource A, so the transaction locks it.
  2. User John opens another transaction. John needs resource B, so the transaction locks it.
  3. Ann needs resource B, but it is locked, so Ann's transaction has to wait until it is released.
  4. John needs resource A, but it is locked. John's transaction waits until it is released. Both users wait endlessly.

If John didn't need A, he would finish his transaction, B would be released, and Ann could finish her transaction too.
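
The Ann and John scenario can be sketched with two plain Java locks standing in for the database locks on A and B. This is an illustration under my own assumptions, not how a database does it: DeadlockDemo is a made-up name, and instead of waiting endlessly each thread gives up after a 200 ms timeout so the program terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; two in-memory locks play the role of resources A and B.
class DeadlockDemo {
    // Returns {annGotB, johnGotA}.
    static boolean[] run() {
        ReentrantLock resourceA = new ReentrantLock();
        ReentrantLock resourceB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTriedSecond = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // Ann locks A then wants B; John locks B then wants A.
        Thread ann = new Thread(() ->
                gotSecond[0] = lockBoth(resourceA, resourceB, bothHoldFirst, bothTriedSecond));
        Thread john = new Thread(() ->
                gotSecond[1] = lockBoth(resourceB, resourceA, bothHoldFirst, bothTriedSecond));
        ann.start();
        john.start();
        try {
            ann.join();
            john.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return gotSecond;
    }

    private static boolean lockBoth(ReentrantLock first, ReentrantLock second,
                                    CountDownLatch bothHoldFirst, CountDownLatch bothTriedSecond) {
        first.lock();
        try {
            // Wait until the other user also holds his first resource (steps 1 and 2).
            bothHoldFirst.countDown();
            bothHoldFirst.await();

            // Steps 3 and 4: ask for the other resource, but give up after a timeout.
            boolean acquired = second.tryLock(200, TimeUnit.MILLISECONDS);

            // Keep the first lock until both attempts have finished.
            bothTriedSecond.countDown();
            bothTriedSecond.await();
            if (acquired) {
                second.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("Ann got B: " + result[0] + ", John got A: " + result[1]);
    }
}
```

Neither user gets the second resource. With lock() instead of tryLock() both threads would block forever, which is the deadlock from the list above.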

Repeatable reads

Guarantees that the same query returns the same results when executed again within one transaction, even if another transaction modifies the data in the meantime. The exception is inserts: new rows can appear in the query result (so-called phantom reads). If another transaction deletes or modifies existing rows, those changes are not visible.
  1. Ann opens a transaction. She reads the names of the java4people organizers: "Stawicki" and "Gruchała".
  2. Bob opens a transaction and deletes "Stawicki". Bob commits the changes and closes the transaction.
  3. Ann reads the names again. She gets "Stawicki" and "Gruchała".

If Bob had added a name, Ann would have read it too.
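
Here is a toy Java model of the behaviour described above. It is a sketch under my own assumptions and not how any real database implements this level: RepeatableReadDemo, Table and ReaderTx are made-up names, the reading transaction simply remembers every row it has already seen, and "NewOrganizer" is a placeholder for whatever name Bob might add.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical toy model; class names are illustrative only.
class RepeatableReadDemo {
    static final String FIRST = "Stawicki";
    static final String SECOND = "Grucha\u0142a"; // Unicode escape keeps the source ASCII-only

    // Committed rows shared by all transactions.
    static class Table {
        final Set<String> committed = new LinkedHashSet<>();
    }

    // A reading transaction that remembers every row it has already seen.
    static class ReaderTx {
        private final Table table;
        private final Set<String> seen = new LinkedHashSet<>();

        ReaderTx(Table table) {
            this.table = table;
        }

        Set<String> readNames() {
            // Rows read earlier stay in the result even if they were deleted since;
            // rows committed in the meantime are added to it.
            seen.addAll(table.committed);
            return new LinkedHashSet<>(seen);
        }
    }

    static Set<String> run() {
        Table table = new Table();
        table.committed.add(FIRST);
        table.committed.add(SECOND);

        ReaderTx ann = new ReaderTx(table);
        ann.readNames();                      // step 1: Ann reads both names

        table.committed.remove(FIRST);        // step 2: Bob deletes and commits
        table.committed.add("NewOrganizer");  // placeholder: Bob also adds a name

        return ann.readNames();               // step 3: Ann reads again
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Ann's second read still contains the deleted name and also picks up the newly added one.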

Read committed

The same query can return different results even within one transaction if another transaction makes changes and commits them.
  1. Ann opens a transaction and reads the names of the java4people organizers: "Stawicki" and "Gruchała".
  2. Bob opens a transaction and deletes "Stawicki".
  3. Ann reads the names again. She gets "Stawicki" and "Gruchała".
  4. Bob commits his transaction.
  5. Ann reads the names again (still in the same transaction). She gets "Gruchała".
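
The five steps above can be played through in a toy Java model. Again this is only a sketch with made-up names (ReadCommittedDemo, Table, WriterTx): the writer keeps its deletes in a private buffer until commit, and readers only ever look at committed rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model; class names are illustrative only.
class ReadCommittedDemo {
    static final String FIRST = "Stawicki";
    static final String SECOND = "Grucha\u0142a"; // Unicode escape keeps the source ASCII-only

    static class Table {
        private final Set<String> committed = new LinkedHashSet<>();

        // Readers at this level see committed rows only.
        Set<String> readCommitted() {
            return new LinkedHashSet<>(committed);
        }
    }

    // A writing transaction that buffers its deletes until commit.
    static class WriterTx {
        private final Table table;
        private final Set<String> pendingDeletes = new LinkedHashSet<>();

        WriterTx(Table table) {
            this.table = table;
        }

        void delete(String name) {
            pendingDeletes.add(name); // not visible to anyone else yet
        }

        void commit() {
            table.committed.removeAll(pendingDeletes);
            pendingDeletes.clear();
        }
    }

    // Returns what Ann sees in steps 1, 3 and 5.
    static List<Set<String>> run() {
        Table table = new Table();
        table.committed.add(FIRST);
        table.committed.add(SECOND);

        List<Set<String>> annReads = new ArrayList<>();
        annReads.add(table.readCommitted()); // step 1

        WriterTx bob = new WriterTx(table);
        bob.delete(FIRST);                   // step 2
        annReads.add(table.readCommitted()); // step 3: delete not committed yet

        bob.commit();                        // step 4
        annReads.add(table.readCommitted()); // step 5: delete is now visible
        return annReads;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```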

Read uncommitted

Almost like having no isolation at all. It only gives us atomic operations: a user can roll back all the changes made in a transaction. Changes are visible to other transactions even before commit (so-called dirty reads). Of course, if the transaction that made the changes is rolled back, the changes are no longer visible.
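
A dirty read can be shown with the same kind of toy Java model, reusing Ann, Bob and the organizer names from the earlier examples. This is a sketch with made-up names (DirtyReadDemo, Table, WriterTx): the writer changes the shared rows in place and keeps a copy of the old state only so that it can roll back.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model; class names are illustrative only.
class DirtyReadDemo {
    static final String FIRST = "Stawicki";
    static final String SECOND = "Grucha\u0142a"; // Unicode escape keeps the source ASCII-only

    static class Table {
        private final Set<String> rows = new LinkedHashSet<>();

        // Readers see whatever is in the table right now, committed or not.
        Set<String> readUncommitted() {
            return new LinkedHashSet<>(rows);
        }
    }

    // A writing transaction that changes rows in place and can undo everything.
    static class WriterTx {
        private final Table table;
        private final Set<String> before; // state to restore on rollback

        WriterTx(Table table) {
            this.table = table;
            this.before = new LinkedHashSet<>(table.rows);
        }

        void delete(String name) {
            table.rows.remove(name); // visible to others immediately
        }

        void rollback() {
            table.rows.clear();
            table.rows.addAll(before);
        }
    }

    // Returns what Ann sees before Bob's delete, after it, and after his rollback.
    static List<Set<String>> run() {
        Table table = new Table();
        table.rows.add(FIRST);
        table.rows.add(SECOND);

        List<Set<String>> annReads = new ArrayList<>();
        annReads.add(table.readUncommitted());

        WriterTx bob = new WriterTx(table);
        bob.delete(FIRST);
        annReads.add(table.readUncommitted()); // dirty read: uncommitted delete is visible

        bob.rollback();
        annReads.add(table.readUncommitted()); // the deleted name is back
        return annReads;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Ann's second read reflects a change that, in the end, never happened.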

Saturday, May 15, 2010

After GeeCON 2010

This year GeeCON took place in Poznań, which is much closer to Szczecin than Cracow. I expected it to be as good as the year before; maybe I expected too much. I don't mean it was bad, but the previous year was so great that maybe it was hard to keep that level. Or maybe I didn't choose the right presentations. There were three tracks this year, so the choice was sometimes difficult.

Like the previous year, there was a "University Day" before the conference. University Day is a day of workshops on various topics. I chose Gradle, like many other people. Too many, unfortunately. Hans said there were too many attendees to run exercises for everyone, so only he coded live. I missed coding myself.

The first interesting concept I heard about was Object Teams. It is a new approach to modularizing programs, based on entities, collaborations and roles. All three look pretty much like Java classes. In OOP there are objects, which are data and methods. We can use objects to modularize our applications, but the connections between modules often become complicated. The creators of Object Teams concluded that modularization and OOP are not about the modules/objects themselves but about the connections between them. So there are entities, which represent data; collaborations, which represent operations on data; and roles. In a collaboration, each entity plays some role. Everything looks pretty and neat, and I wonder if it is going to become popular in the coming years. I must admit I hadn't heard about it before.

A quite interesting presentation was the one about JSF 2.0 by Ed Burns. Nothing really new for me, but anyone who thinks JSF hasn't changed much since version 1.2 should attend this presentation (or watch it on GeeCON's channel on parleys.com when it becomes available ;)). The guys who created version 2.0 addressed all the problems developers were complaining about in previous versions. I remember Seam as a framework that was built on JSF and addressed such problems. Now that they are solved in JSF itself, I wonder what the differences between Seam and JSF 2.0 are.

On the second day Jonas Bonér talked about actors, agents, STM and other solutions that make concurrent programming much easier than threads with shared state and locks. The next time I need to do some concurrent programming in Java, I'll definitely use something from Akka.

Vaclav Pech's presentation was quite similar. He also talked about concurrent programming with actors, but in Groovy. As usual, Vaclav gave a very good presentation.

Vaadin is the next thing I want to use. At work I use GWT and I don't really like it. GWT compilation takes very long, and asynchronous callbacks make the code more complicated. Besides, most of the work is done on the server anyway. Programming in Vaadin is very similar to programming in GWT, but everything is server side. Vaadin uses GWT for presentation, so it is possible to write custom components for Vaadin in GWT, but then compilation is needed only once. Damn, this could save many hours of my life.

Bruno Bossola's presentation was very good and funny. No surprise here :) Bruno talks about things that many developers really need. I agree that somewhere along the way we lost real object-oriented programming. Bruno also mentioned a book which I feel is unfairly forgotten: "Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development" by Craig Larman. When I started working for NCDC more than a few years ago, my boss put this book in the company library and recommended it to us. I read it and I feel it made me a better developer. If you put all your code into one huge class or even one method, buy or borrow this book and read it. Even though it's old, it's worth it. Some things don't change even in IT ;)

If I may give the organizers a hint, I'd suggest making the presentations longer. Many presenters didn't manage to show everything they wanted. I think 1.5 hours is the minimum.

I heard food was also a problem on the first day. There wasn't enough for everybody, but I managed to grab my plate, so I'm not complaining :)