Java Architect: August 2010

Wednesday, August 11, 2010

Don't Ignore serialVersionUID

Okay, I admit that this one should have been totally obvious to me long ago. But I've had just 4.5 years with Java... so perhaps I can be forgiven ;)
The serializable class BlaBlaBla does not declare a static final serialVersionUID field of type long BlaBlaBla.java
If you're like me, you roll your eyes and politely add a @SuppressWarnings("serial") to the top of the class definition (or, worse, you just shut the warning message off in your IDE altogether. Even I don't do that!). You reason with yourself that current versions of Java conveniently and automatically compute the serialVersionUID at run-time, so there's no need to bother with the formality of a version number on your class: it's just a nuisance holdover from days of Java yore!

IT'S A TRAP!

Now that I've found myself well into a new project with this lazy philosophy, I'm starting to run into problems. I'm finding that when I make the most trivial changes to my shared classes, I need to recompile both the server and the client components. The two components that were supposed to be loosely coupled are now hopelessly intertwined. So I did some further research on how the JVM computes the ad-hoc serialVersionUID at runtime when it isn't provided.

In a nutshell, backward compatibility with respect to serialization and de-serialization is a lot less fragile than the automatic serialVersionUID generation assumes. That generation algorithm computes a SHA-1 hash over the class name, modifiers, interfaces, fields, constructors, and methods, so almost any change to the class produces a different version number.

In reality, serialization and de-serialization generally break only when one of the following things happens to your class (according to an article at JavaWorld):

Delete fields
Change class hierarchy
Change non-static to static
Change non-transient to transient
Change type of a primitive field

Ensure Minimal Coupling Between Components
To ensure that your components which use Serialization have minimal runtime dependencies on each other, you have a few options:

Declare a specific serialVersionUID, and update it whenever you make a change that breaks backward compatibility.
Don't rely on any classes for use as transfer objects which will potentially change. This one is pretty obvious, but sometimes you will be surprised down the road at which classes are modified more often than others.
Don't use your own objects at all when transferring data. Instead, rely on classes like Integers, Strings, or HashMaps to shuttle data around among components. (Obviously, protocols like SOAP and REST rely on XML documents for this to ensure maximum de-coupling, but you're presumably using something like EJB remoting to avoid the complexity or overhead of these protocols).
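
As a minimal sketch of the first option, here is a transfer class that pins its own version number. The class name CustomerDto and its fields are hypothetical, made up for illustration; ObjectStreamClass is the standard JDK class that reports which serialVersionUID the serialization runtime will use.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionDemo {

    // Hypothetical transfer object shared by client and server.
    static class CustomerDto implements Serializable {
        // Declared explicitly so that compatible changes (for example, adding a
        // field) do not change the version number written into the stream.
        private static final long serialVersionUID = 1L;

        String name;
        int loyaltyPoints;
    }

    // Returns the version number the serialization runtime uses for the class.
    static long declaredVersion() {
        return ObjectStreamClass.lookup(CustomerDto.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("serialVersionUID in use: " + declaredVersion());
    }
}
```

With the field declared, the reported value stays at 1 until you decide to change it, no matter how many methods you add to the class.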

What is the difference between iterator access and index access?

Index-based access reaches an element directly by its index. The cursor of the data structure can go straight to position n and fetch the element; it does not traverse the first n-1 elements.
In iterator-based access, the cursor has to traverse each element to reach the desired one, so reaching the nth element means traversing n-1 elements first.
Insertion, update, or deletion is faster with iterator-based access when the operation targets elements in the middle of the data structure.
Insertion, update, or deletion is faster with index-based access when the operation targets elements at the end of the data structure.
Traversal or search is generally faster in an index-based data structure.
ArrayList offers index access; LinkedList offers iterator access.
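
The two access styles can be sketched as follows. The class and method names (AccessStyles, byIndex, insertAfter) are invented for this illustration; the collection calls are the standard java.util ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class AccessStyles {

    // Index access: an ArrayList jumps straight to position n.
    static String byIndex(List<String> list, int n) {
        return list.get(n);
    }

    // Iterator access: walk the list element by element and insert at the
    // cursor, which on a LinkedList needs no shifting of later elements.
    static List<String> insertAfter(LinkedList<String> list, String marker, String added) {
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals(marker)) {
                it.add(added);
                break;
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> arrayList = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(byIndex(arrayList, 2));

        LinkedList<String> linkedList = new LinkedList<>(Arrays.asList("a", "b", "d"));
        System.out.println(insertAfter(linkedList, "b", "c"));
    }
}
```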

Can a static block throw an exception?

Yes. A static block can throw only unchecked (runtime) exceptions; a checked exception must be caught inside the block with try-catch.
A typical scenario: a JDBC connection is created in a static block and the attempt fails. The exception can be caught and logged, and the application can exit. If the exception instead escapes the static block, the JVM wraps it in an ExceptionInInitializerError. If System.exit() is not called, the application may continue, and the next time the class is referenced the JVM throws NoClassDefFoundError, because the class failed to initialize.
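
A small sketch of that failure mode, assuming a hypothetical class Broken whose static block always fails (it stands in for the JDBC connection that cannot be opened):

```java
public class StaticInitFailure {

    // Hypothetical class whose static block always throws an unchecked exception.
    static class Broken {
        static int value;

        static {
            value = load();
        }

        static int load() {
            throw new IllegalStateException("connection failed");
        }
    }

    // Touches the class and reports which Throwable type the JVM raised.
    static String touch() {
        try {
            return "value=" + Broken.value;
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(touch()); // first use: initialization runs and fails
        System.out.println(touch()); // later use: the class is already marked unusable
    }
}
```

On the first access the JVM reports ExceptionInInitializerError; every later access to the same class reports NoClassDefFoundError.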

What the heck is Serialization?

Why serialize at all?

A typical enterprise application will have multiple components and will be distributed across various systems and networks. In Java, everything is represented as objects; if two Java components want to communicate with each other, there needs to be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.

This is how the Serialization algorithm works in Java:
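
To make the idea concrete, here is a minimal round trip through the standard object streams. The Message class is hypothetical, and an in-memory byte array stands in for the network between the two components.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class RoundTrip {

    // Hypothetical message exchanged between two components.
    static class Message implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        final int priority;

        Message(String text, int priority) {
            this.text = text;
            this.priority = priority;
        }
    }

    // Sender side: turn the object into bytes.
    static byte[] serialize(Message m) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(m);
        }
        return bytes.toByteArray();
    }

    // Receiver side: rebuild an equivalent object from the bytes.
    static Message deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Message) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Message copy = deserialize(serialize(new Message("hello", 3)));
        System.out.println(copy.text + " / " + copy.priority);
    }
}
```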

Static vs init block in java

The static block runs only once, when the JVM initializes the class, whereas the init {} block runs every time an instance of the class is created. The static block also runs before any init block.



public class LoadingBlocks {

    static {
        System.out.println("Inside static");
    }

    {
        System.out.println("Inside init");
    }

    public static void main(String args[]) {
        new LoadingBlocks();
        new LoadingBlocks();
        new LoadingBlocks();
    }
}

Output:

Inside static
Inside init
Inside init
Inside init

Tuesday, August 10, 2010

SRP: The Single Responsibility Principle

THERE SHOULD NEVER BE MORE THAN ONE REASON FOR A CLASS TO CHANGE.

It is important to separate two responsibilities into two separate classes, because each responsibility is an axis of change. When the requirements change, that change will be manifest through a change in responsibility amongst the classes. If a class assumes more than one responsibility, then there will be more than one reason for it to change.
If a class has more than one responsibility, then the responsibilities become coupled. Changes to one responsibility may impair or inhibit the class's ability to meet the others. This kind of coupling leads to fragile designs that break in unexpected ways when changed.
For example, suppose we define a Rectangle class with two methods. One draws the rectangle on the screen, the other computes the area of the rectangle. Two different applications use the Rectangle class. One application does computational geometry. It uses Rectangle to help it with the mathematics of geometric shapes. It never draws the rectangle on the screen. The other application is graphical in nature. It may also do some computational geometry, but it definitely draws the rectangle on the screen.

This design violates the SRP. The Rectangle class has two responsibilities. The first responsibility is to provide a mathematical model of the geometry of a rectangle. The second responsibility is to render the rectangle on a graphical user interface.

The violation of SRP causes several nasty problems. Firstly, we must include the GUI in the computational geometry application. In a Java application, the .class files for the GUI have to be deployed to the target platform. Secondly, if a change to the GraphicalApplication causes the Rectangle to change for some reason, that change may force us to rebuild, retest, and redeploy the ComputationalGeometryApplication. If we forget to do this, that application may break in unpredictable ways.
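
One way to repair the design, sketched under assumptions: split the two responsibilities into separate classes. The names GeometricRectangle and RectangleRenderer are chosen for this illustration, and a plain-text rendering stands in for a real GUI toolkit.

```java
public class SrpSketch {

    // Responsibility 1: the mathematical model only. No GUI dependency.
    static class GeometricRectangle {
        final double width;
        final double height;

        GeometricRectangle(double width, double height) {
            this.width = width;
            this.height = height;
        }

        double area() {
            return width * height;
        }
    }

    // Responsibility 2: rendering. Only the graphical application needs this class.
    static class RectangleRenderer {
        String draw(GeometricRectangle r) {
            StringBuilder sb = new StringBuilder();
            for (int row = 0; row < (int) r.height; row++) {
                for (int col = 0; col < (int) r.width; col++) {
                    sb.append('#');
                }
                sb.append('\n');
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        GeometricRectangle r = new GeometricRectangle(4, 2);
        System.out.println("area = " + r.area());
        System.out.print(new RectangleRenderer().draw(r));
    }
}
```

The computational geometry application now depends only on GeometricRectangle, so a change to the rendering code no longer forces it to be rebuilt or redeployed.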

Wednesday, August 4, 2010

Pass by reference or value?

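Java always passes arguments by value; for objects, the value that gets passed is a copy of the reference. Check out this program to understand: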

public class TestPassByReference {
    public static void main(String[] args) {
        // declare and initialize variables and objects
        int i = 25;
        String s = "Java is fun!";
        StringBuffer sb = new StringBuffer("Hello, world");

        // print variable i and objects s and sb
        System.out.println(i);  // print it (1)
        System.out.println(s);  // print it (2)
        System.out.println(sb); // print it (3)

        // attempt to change i, s, and sb using methods
        iMethod(i);
        sMethod(s);
        sbMethod(sb);

        // print variable i and objects s and sb (again)
        System.out.println(i);  // print it (7)
        System.out.println(s);  // print it (8)
        System.out.println(sb); // print it (9)
    }

    public static void iMethod(int iTest) {
        iTest = 9;                 // change it
        System.out.println(iTest); // print it (4)
        return;
    }

    public static void sMethod(String sTest) {
        sTest = sTest.substring(8, 11); // change it
        System.out.println(sTest);      // print it (5)
        return;
    }

    public static void sbMethod(StringBuffer sbTest) {
        sbTest = sbTest.insert(7, "Java "); // change it
        System.out.println(sbTest);         // print it (6)
        return;
    }
}

Output of the program :

25
Java is fun!
Hello, world
9
fun
Hello, Java world
25
Java is fun!
Hello, Java world

Tuesday, August 3, 2010

Equals and Hash code contract

What is a hashCode?

First of all, what the heck are hashcodes for? Well, oddly enough, they're used heavily in Hashtables. For the purposes of this article, I'm going to assume you know what a Hashtable is and how useful it can be. Needless to say, Hashtables can help you keep a large number of objects organized and allow you to access them very quickly. Of course, to do so, a Hashtable relies on the power of the hashCode method.
In essence, when you invoke the get(Object o) method of a Hashtable, the Hashtable will use the hashCode method of the object you passed to it in order to access an object (or list of objects). As long as the hashCode method is working properly, everything works just fine. If it doesn't, however, you can have some rather serious problems.
So, what makes a valid hashCode? Well, here's what is said about hashCodes in the API Specification for Object:

The general contract of hashCode is:

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

Study: A Proper hashCode

Let's start by looking at a good program. In this case, we've defined a new class, MyObject, which defines its own equals and hashCode methods. In general, it is good practice to define your own hashCode method any time you override equals (more about this later). Here it is:

import java.util.Hashtable;
import java.util.Date;

public class MyObject
{
    int a;
    
    public MyObject(int val)
    {
        a = val;
    }
    
    public boolean equals(Object o)
    {
        boolean isEqual = false;
        
        if ( o instanceof MyObject )
        {
            if ( ((MyObject)o).a == a )
            {
                isEqual = true;
            }
        }
        
        return isEqual;
    }
    
    public int hashCode()
    {
        return a;
    }
    
    public static void main(String[] args)
    {
        Hashtable<MyObject, String> h = new Hashtable<MyObject, String>();
        
        MyObject[] keys = 
        {
            new MyObject(11),
            new MyObject(12),
            new MyObject(13),
            new MyObject(14),
            new MyObject(15),
            new MyObject(16),
            new MyObject(17),
            new MyObject(18),
            new MyObject(19),
            new MyObject(110)
        };
        
        for ( int i = 0; i < 10; i++ )
        {
            h.put(keys[i], Integer.toString(i+1));
        }
        
        long startTime = new Date().getTime();
        
        for ( int i = 0; i < 10; i++ )
        {
            System.out.println(h.get(keys[i]));
        }
   
        long endTime = new Date().getTime();
        
        System.out.println("Elapsed Time: " + (endTime - startTime) + " ms");
    }
}

Executing the above code leaves you with this output:

1
2
3
4
5
6
7
8
9
10
Elapsed Time: 0 ms

As you can see, we easily retrieved the objects we had originally put into the Hashtable and it took practically no time at all. How does our hashCode method do? Does it pass all 3 of the criteria laid out earlier? Let's look at each of the criteria one at a time.

1. Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

Does our hashCode meet that criterion? Does our hashCode continually return the same value (assuming that our variable, a, hasn't changed)? Certainly, it does: it returns the value of a. Okay, next criterion.

2. If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

How about this one? Does our hashCode method still work here? Sure, it does. If two objects have the same value for a, they will be equal (by the equals method). In such a situation, they would also return the same hashCode value. Our hashCode method works here. Okay, on to the final criterion.

3. It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

Well, this isn't really a requirement at all - it's more of a suggestion, if anything. It is best if the hashCodes for unequal objects are different, but it's not required. We'll look at this a little more in a few minutes.
So, there you have it: we've successfully overridden the hashCode method. So, how do you know when you should do such a thing? Well, in general, it's considered good practice to override hashCode any time you override equals. The reason lies in the default behavior of the equals and hashCode methods.

When Should you Override hashCode()?

In Java, the default equals method (as defined in Object) compares the two objects to see if they are, in fact, the same object. That implementation does not take into account any data that might be contained by the object. For example, had we not overridden the equals method in our class, MyObject, we'd see that the following code:

MyObject obj1 = new MyObject(1);
MyObject obj2 = new MyObject(1);
System.out.println(obj1.equals(obj2));

...would produce the output "false." The reason for this is that, even though the two objects contain the same data, they are different objects: two separate objects on the heap. Fortunately, we overrode the equals method and, given the MyObject class as defined originally, we'd get the output "true" from this example. However, what about the hashCode?
Well, the default hashCode method works in a similar fashion to the default equals method. It typically derives an int from the object's identity (for example, its internal address) and uses that as the hashCode. That won't work so well here. We just defined a way for two distinct objects (which will, necessarily, have distinct memory addresses) to be considered "equal." The default hashCode implementation, however, will almost always return different hashCodes for the two objects. That violates the second rule defined above: any objects that are considered equal (by their equals method) must generate the same hashCode value. Therefore, whenever you override the equals method, you should also override the hashCode method.
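
As a small sketch of why the pairing matters, here is a hypothetical Key class (not part of the MyObject example) that overrides both methods, so a lookup with a distinct but equal key object still finds the stored value:

```java
import java.util.HashMap;
import java.util.Map;

public class EqualKeys {

    // Hypothetical key class that overrides equals and hashCode together.
    static final class Key {
        private final int a;

        Key(int a) {
            this.a = a;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).a == a;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(a); // equal keys always share a hash code
        }
    }

    // Stores a value under one key object and reads it back with a second,
    // distinct object that is equal to the first.
    static String lookupWithEqualKey() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key(1), "found");
        return map.get(new Key(1));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey());
    }
}
```

If the hashCode override were removed, the two Key objects would fall back to identity-based hash codes and the lookup would almost certainly return null.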

Study: Faulty hashCodes

What happens if we override the hashCode method poorly? Let's violate one of the rules, shall we? Let's change our hashCode method to look like this:

public int hashCode()
{
    return (int)(Math.random() * 5);
}

This one actually violates a couple of rules. Not only does it not guarantee that two objects that are equal have the same hashCode, it doesn't even guarantee that the same object will keep the same hashCode from one invocation to the next. Any idea what havoc this might wreak on a poor, unsuspecting Hashtable? If I execute the main method again, as I did before, I get this output:

null
2
null
4
null
6
null
null
null
null
Elapsed Time: 0 ms

Eek! Look at all of the objects I'm missing! Without a properly functioning hashCode function, our Hashtable can't do its job. Objects are being put into the Hashtable but we can't properly get at them because our hashCode is random. This is certainly not the way to go. If you were to run this, you might even get different output than I got!
Even if the hashCode that is returned is always the same for a given object, we must ensure that the hashCodes that are returned for two objects that are equal are identical (Rule #2). Let's modify our MyObject class so that we hold true to Rule #1 but not to Rule #2. Below are the modified parts of our class:

public class MyObject
{
    int a;
    int b;
    
    public MyObject(int val1, int val2)
    {
        a = val1;
        b = val2;
    }
    
    ...
    
    public int hashCode()
    {
        return a - b;
    }
    
    ...
    
    public static void main(String[] args)
    {
        ....
        MyObject[] keys = 
        {
            new MyObject(11, 0),
            new MyObject(11, 1),
            new MyObject(11, 2),
            new MyObject(11, 3),
            new MyObject(11, 4),
            new MyObject(11, 5),
            new MyObject(11, 6),
            new MyObject(11, 7),
            new MyObject(11, 8),
            new MyObject(11, 9)
        };
        ...
    }
}

Executing this code gives us some more disturbing results, although they may not appear that way at first. Here's my output:

1
2
3
4
5
6
7
8
9
10
Elapsed Time: 0 ms

So what's wrong with that, you ask? Well, what should the put method do? If you first put an object into a Hashtable using a specific key and then put a new value into the Hashtable using a key that is equal to that one, the original value should be replaced with this new one. That's not what's happening here. Instead, our Hashtable is treating our keys as if they're all unequal. Eek! This is the same result you could expect if you were to override equals without overriding hashCode. Here's the output we should get, assuming we have a good hashCode method:

10
10
10
10
10
10
10
10
10
10
Elapsed Time: 0 ms
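
A sketch of the repaired version, assuming (as in the original listing) that equals compares only the field a. The class name Pair is made up here to keep the example self-contained; the point is that hashCode must ignore b because equals ignores it.

```java
import java.util.Hashtable;

public class ConsistentHash {

    // Mirrors the two-field MyObject: equality depends on a only,
    // so hashCode must depend on a only as well.
    static class Pair {
        final int a;
        final int b;

        Pair(int a, int b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Pair && ((Pair) o).a == a;
        }

        @Override
        public int hashCode() {
            return a; // b is ignored because equals ignores it
        }
    }

    // Puts ten equal keys, then reports how many entries the table holds
    // and what the first key maps to.
    static String run() {
        Hashtable<Pair, String> h = new Hashtable<>();
        Pair[] keys = new Pair[10];
        for (int i = 0; i < 10; i++) {
            keys[i] = new Pair(11, i);
            h.put(keys[i], Integer.toString(i + 1));
        }
        return h.size() + " entry, first key maps to " + h.get(keys[0]);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Every put now replaces the previous value, so the table ends up with a single entry and every key maps to "10".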

Inefficient hashCodes

Okay, one more thing to go over. What happens if we have a valid hashCode, but the values aren't very distinct? In this case, I'm going to hold to requirements 1 and 2, but I'm going to ignore requirement 3 entirely. Let's modify MyObject.hashCode() and our main method to look like this:

public int hashCode()
{
    return 0;
}

public static void main(String[] args)
{
    Hashtable<MyObject, String> h = new Hashtable<MyObject, String>();
    
    MyObject[] keys = new MyObject[10000];
    for ( int i = 0; i < 10000; i++ )
    {
        keys[i] = new MyObject(i);
    }
    
    for ( int i = 0; i < 10000; i++ )
    {
        h.put(keys[i], Integer.toString(i+1));
    }
    
    long startTime = new Date().getTime();
    
    for ( int i = 0; i < 10000; i++ )
    {
        h.get(keys[i]);
    }

    long endTime = new Date().getTime();
    
    System.out.println("Elapsed Time: " + (endTime - startTime) + " ms");   
}

Note that this is a valid hashCode method. It always returns the same value for a given object, assuming that nothing used in the equals method changes (not that it cares anything about that). It also returns the same value for two objects that are "equal." What it doesn't do is return a different value for objects that are not equal. It always returns the same value. This is a valid hashCode but, as this example will show, an inefficient one. Executing this 5 times using this new hashCode method, I get this output:

Elapsed Time: 7016 ms
Elapsed Time: 7125 ms
Elapsed Time: 7297 ms
Elapsed Time: 7047 ms
Elapsed Time: 7218 ms

That gives me an average time of 7140.6 ms, roughly 7 seconds. Switching back to my original hashCode method and executing the same main method, I got this output:

Elapsed Time: 16 ms
Elapsed Time: 16 ms
Elapsed Time: 16 ms
Elapsed Time: 15 ms
Elapsed Time: 16 ms

That's an average of about 16 milliseconds, a far cry from the 7 seconds we saw earlier! We're seeing a dramatic increase in the amount of time required to retrieve objects from our Hashtable using the poor hashCode method. The reason for this is that, with every key object having the same hashCode, our Hashtable has no choice but to index every value under the same hashCode. In such a case, we've got everything in one long list and our Hashtable does us no good at all. We might as well have stored the objects in an ArrayList.
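
Wall-clock timings vary from machine to machine, so here is a rough sketch that counts key comparisons instead. The Key class and its constantHash flag are invented for this illustration; the counter simply records how often the Hashtable has to call equals.

```java
import java.util.Hashtable;

public class CollisionCost {

    static long equalsCalls = 0; // counts key comparisons made by the table

    // Hypothetical key that can use either a constant or a distinct hash code.
    static class Key {
        final int a;
        final boolean constantHash;

        Key(int a, boolean constantHash) {
            this.a = a;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).a == a;
        }

        @Override
        public int hashCode() {
            return constantHash ? 0 : a;
        }
    }

    // Fills a Hashtable with n keys, looks each one up, and returns the
    // number of equals calls that took.
    static long comparisons(int n, boolean constantHash) {
        equalsCalls = 0;
        Hashtable<Key, String> h = new Hashtable<>();
        Key[] keys = new Key[n];
        for (int i = 0; i < n; i++) {
            keys[i] = new Key(i, constantHash);
            h.put(keys[i], Integer.toString(i + 1));
        }
        for (int i = 0; i < n; i++) {
            h.get(keys[i]);
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("constant hashCode: " + comparisons(1000, true) + " comparisons");
        System.out.println("distinct hashCode: " + comparisons(1000, false) + " comparisons");
    }
}
```

With a constant hashCode, every put and get has to walk one long chain of entries, so the comparison count grows roughly with the square of the number of keys.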

Summary

Hopefully you see the importance of creating a valid hashCode for any objects you're using. Whenever creating your own hashCode method, it's important to remember the 3 rules. I've listed them here in a summarized format; refer to the API Spec for full details.

The hashCode method must always return the same value (assuming the object has not changed) from one invocation to the next.
The hashCode for multiple "equal" objects must be identical.
The hashCodes for multiple "unequal" objects should be different, but it is not required.

With that knowledge, you should have a firm grasp on how to override hashCodes and make the most out of the Hashtable (and other Map classes) in Java.
