Java Object Size In Memory

Mon, 25 Apr 2011 15:58:00 +0000

Anyone who has worked with Java in a high-end application will be well aware of the double-edged sword that is Java garbage collection. When it works, it is awesome, but when it doesn’t, it is an absolute nightmare. We work on a ticketing system where it is imperative that the system is as near real-time as possible. The biggest issue that we have found is the JVM running out of memory, which causes a stop-the-world garbage collection. This then results in cluster failures, since an individual node is inaccessible for long enough that it is kicked out of the cluster.

There are various ways to combat this issue, and the first instinct would be to suggest that there is a memory leak. After eliminating this as a possibility, the next challenge was to identify where the memory was being taken up. This took some time and effort, and the Hibernate second level cache was identified as the culprit. We were storing far too much in the second level cache.

This is another double-edged sword. The Hibernate second level cache is absolutely imperative to a high performance system. It does, however, come with a price. The cache needs to be managed carefully to ensure the right balance between performance and memory requirements.

To this end, it was important to be able to identify what was taking up all the memory in the cache. Each object might only take a couple of hundred bytes, but with our second level cache set to store hundreds of thousands of items, this quickly takes up hundreds of megabytes. With the metadata of the cache, this could easily hike it up near a gigabyte of memory usage. This gets substantially worse with cache evictions and the adding of new items into the cache.
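The arithmetic behind that estimate can be sketched roughly as follows. Every figure here (200 bytes per object, 500,000 entries, 150 bytes of cache metadata per entry) is an illustrative assumption rather than a measurement from our system, and the class and method names are hypothetical.

```java
// Back-of-the-envelope estimate of cache memory.
// All figures are illustrative assumptions, not measurements.
class CacheFootprintEstimate {

    /** Returns the estimated footprint in bytes for the given number of entries. */
    static long estimateBytes(long entries, long bytesPerObject, long overheadPerEntry) {
        return entries * (bytesPerObject + overheadPerEntry);
    }

    public static void main(String[] args) {
        long entries = 500_000;       // assumed: "hundreds of thousands of items"
        long bytesPerObject = 200;    // assumed: "a couple of hundred bytes"
        long overheadPerEntry = 150;  // assumed: keys, nodes and other cache metadata

        long total = estimateBytes(entries, bytesPerObject, overheadPerEntry);
        System.out.printf("Estimated footprint: %d MB%n", total / (1024 * 1024));
    }
}
```

Plugging in your own entry counts and measured object sizes gives a quick feel for whether a cache region is worth investigating.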

The correct way to resolve this is to identify specific object types that “overload” the cache, that is, items that have a large number of instances stored in the cache. Identifying classes that store a large number of items is easy enough - we just traverse the cache and count up the number of items. However, there might be a class that stores a smaller number of items but takes a sizeable amount of memory. For this reason, it is important to understand the object sizes in memory as well.
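The counting step could look something like the sketch below. It assumes the cached objects have already been gathered into a plain collection; the traversal of the cache itself is not shown, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

// Counts instances per class in a collection of cached objects.
// Assumption: the cache contents are already available as a flat collection.
class InstanceCounter {

    /** Returns a map of class name to number of instances, sorted by class name. */
    static Map<String, Long> countByClass(Collection<?> cachedObjects) {
        Map<String, Long> counts = new TreeMap<>();
        for (Object o : cachedObjects) {
            if (o == null) {
                continue; // nothing to classify
            }
            counts.merge(o.getClass().getName(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Collection<Object> cached = Arrays.<Object>asList("a", "b", 42, 3.14, "c");
        countByClass(cached).forEach((type, n) -> System.out.println(type + ": " + n));
    }
}
```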

If you have ever tried to find a way to identify object sizes, you will know that this is no easy task. You can calculate to some degree of accuracy the size of an object based on the data it stores but this is a manual process.
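To give an idea of what the manual calculation involves, here is a rough sketch that adds up the field sizes of a class using reflection. The header size, reference size and alignment are assumptions (values commonly quoted for a 64-bit HotSpot JVM with compressed references) and differ between JVMs and settings. It covers only the shallow size and ignores everything the object refers to, and the Ticket class is purely hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Rough shallow-size estimate based on declared fields.
// The three constants are assumptions and vary between JVMs and settings.
class ShallowSizeEstimate {
    static final int HEADER_BYTES = 12;   // assumed object header size
    static final int REFERENCE_BYTES = 4; // assumed size of an object reference
    static final int ALIGNMENT = 8;       // assumed object alignment

    /** Size in bytes of a field of the given declared type. */
    static int fieldBytes(Class<?> type) {
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        if (type == byte.class || type == boolean.class) return 1;
        return REFERENCE_BYTES; // any object or array reference
    }

    /** Sums instance fields over the class hierarchy and rounds up to the alignment. */
    static long estimate(Class<?> type) {
        long size = HEADER_BYTES;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    size += fieldBytes(f.getType());
                }
            }
        }
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Hypothetical entity used only for illustration.
    static class Ticket {
        long id;
        int seats;
        boolean paid;
        String holder; // only the reference is counted, not the String itself
    }

    public static void main(String[] args) {
        System.out.println("Estimated shallow size of Ticket: " + estimate(Ticket.class) + " bytes");
    }
}
```

Even this small example shows why the manual route is tedious: the String behind the holder field, and anything else reachable from the object, still has to be sized separately.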

The only real way to get this information is to use a Java agent to calculate a more accurate memory usage. For this purpose, we used the classmexer agent, which requires a simple installation step of adding the following parameter to the java command: -javaagent:classmexer.jar. You can then figure out the memory utilisation of an object by calling:

```java
MemoryUtil.deepMemoryUsageOf(objectInstance)
```

You can also pass in a collection of objects:

```java
MemoryUtil.deepMemoryUsageOfAll(objectInstanceCollection)
```

This was the simple part.

Traversing the node structure of JBoss Cache and collating statistics on the number of each type of object and its memory utilisation was a little more interesting.
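As a rough indication of the collation step only, the sketch below gathers a count and a byte total per class. It assumes the objects have already been pulled out of the cache into a flat collection and takes the sizing function as a parameter; the placeholder sizer used in main is not a real measurement. With the agent loaded, one would presumably pass a call to MemoryUtil.deepMemoryUsageOf instead. All class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.ToLongFunction;

// Collates per-class counts and byte totals for a collection of cached objects.
// Assumption: the cache has already been flattened into a collection.
class CacheStatistics {

    /** Running totals for one class. */
    static final class TypeStats {
        long count;
        long bytes;
    }

    /** Groups objects by class name and sums the size reported by the sizer. */
    static Map<String, TypeStats> collate(Collection<?> cachedObjects, ToLongFunction<Object> sizer) {
        Map<String, TypeStats> stats = new TreeMap<>();
        for (Object o : cachedObjects) {
            if (o == null) {
                continue;
            }
            TypeStats s = stats.computeIfAbsent(o.getClass().getName(), k -> new TypeStats());
            s.count++;
            s.bytes += sizer.applyAsLong(o);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Placeholder sizer: a fixed 16 bytes per object, purely for demonstration.
        ToLongFunction<Object> placeholderSizer = o -> 16L;
        Map<String, TypeStats> stats =
                collate(Arrays.<Object>asList("a", "b", 42), placeholderSizer);
        stats.forEach((type, s) ->
                System.out.println(type + ": " + s.count + " objects, " + s.bytes + " bytes"));
    }
}
```

One caveat with this approach: if cached objects share references, summing a deep size per object may count the shared data more than once.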

I will cover this separately.

Hibernate Domain Model Testing

Tue, 23 Dec 2008 22:14:42 +0000

One of my pet peeves with Hibernate has always been how difficult it is to test. I want to test the persistence of data, loading the data back, and any specific functionality within the domain model.

Simple? NO! The main problem was the management of the data set. In the past, I had set up some fairly interesting classes to test the functionality using reflection, injecting the data from the classes themselves through the data provider mechanism of TestNG. However, this was error prone and clunky at best. It also made dependency management of data quite cumbersome.

With a view to resolving this, I also looked at DbUnit, unitils and Ejb3Unit. They all did some things that I liked but lacked some functionality that was important.

This led me to write a simple testing infrastructure. The goal was straightforward.

I need to be able to define data in a CSV file based on entities (actually it was separated by the pipe character |, so PSV).
The framework should automatically persist the data (and fail on errors).
It should test that it can load all that data back.
It should run as many automated tests on the DOM as possible.

The framework uses the CSV files to read the data for each of the classes (using the excellent SuperCsv library). It needs an Id field for internal reference. As long as the ids match within the CSV files for the relationships, the data will be persisted correctly into the database even when the persisted ids are different.
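To make the file format concrete, here is a minimal sketch of reading pipe-separated records. It deliberately swaps SuperCsv for plain String.split so that it runs without any library, and it does not handle quoting or escaping. The header row and the column names are assumptions about the layout, not the framework's actual format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Parses pipe-separated lines into records keyed by column name.
// Assumption: the first line is a header row naming the fields.
class PipeSeparatedReader {

    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> records = new ArrayList<>();
        if (lines.isEmpty()) {
            return records;
        }
        String separator = Pattern.quote("|");
        String[] header = lines.get(0).split(separator, -1);
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split(separator, -1);
            if (values.length != header.length) {
                // Fail loudly rather than load a half-formed record.
                throw new IllegalArgumentException("Wrong number of fields: " + line);
            }
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < header.length; i++) {
                record.put(header[i].trim(), values[i].trim());
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("id|name", "1|Kraya", "2|Example Ltd");
        for (Map<String, String> record : parse(lines)) {
            System.out.println(record);
        }
    }
}
```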

For example, I could have a Contact.csv with 5 records (ids 1 through 5) and a Company.csv with 3 records (ids 1 through 3).

The Contact.csv records can map to the id specified in the Company.csv file and when the records get persisted, they will be associated correctly, even if the ids in the database end up being different.
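The idea of keeping CSV ids separate from database ids can be sketched as a simple lookup table. This is only an illustration of the principle, not the framework's code: the counter stands in for ids generated by the database, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Remembers which database id each CSV id ended up with, per entity name.
// Assumption: a simple counter stands in for database-generated ids.
class IdRemapping {
    private final Map<String, Map<String, Long>> persistedIds = new HashMap<>();
    private long nextDbId = 100; // deliberately different from the CSV ids

    /** "Persists" a record and records the id it was given. */
    long persist(String entity, String csvId) {
        long dbId = nextDbId++;
        persistedIds.computeIfAbsent(entity, k -> new HashMap<>()).put(csvId, dbId);
        return dbId;
    }

    /** Looks up the database id for a CSV id used in a relationship column. */
    long resolve(String entity, String csvId) {
        Map<String, Long> ids = persistedIds.get(entity);
        if (ids == null || !ids.containsKey(csvId)) {
            throw new IllegalStateException("No persisted " + entity + " with CSV id " + csvId);
        }
        return ids.get(csvId);
    }

    public static void main(String[] args) {
        IdRemapping ids = new IdRemapping();
        for (int i = 1; i <= 3; i++) {
            ids.persist("Company", String.valueOf(i));
        }
        // A Contact row referring to Company 2 is linked to whatever id Company 2 received.
        long companyDbId = ids.resolve("Company", "2");
        System.out.println("Contact links to Company with database id " + companyDbId);
    }
}
```

This also shows why the dependent entity has to be persisted after the one it refers to: the lookup fails if the Company rows have not been loaded yet.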

The framework also looks for the CSV file which has the same name as the class within the location defined within the configuration file. This means that as long as the filename matches the class name, the data loading is automatic.
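The naming convention amounts to something like the following sketch, where the data directory would come from the configuration file. The directory value and the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Resolves the data file for an entity class by naming convention.
// Assumption: the data directory is supplied by configuration.
class CsvLocator {

    /** Returns dataDirectory/SimpleClassName.csv for the given entity class. */
    static Path csvFor(Class<?> entityClass, String dataDirectory) {
        return Paths.get(dataDirectory).resolve(entityClass.getSimpleName() + ".csv");
    }

    public static void main(String[] args) {
        // String is used here only as a stand-in for an entity class such as Company.
        System.out.println(csvFor(String.class, "src/test/data"));
    }
}
```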

For simple classes, the test case is as simple as:

```java
public class CompanyTest extends DOMTest {
    public CompanyTest() {
        super(Company.class);
    }
}
```

The system (with the help of TestNG) is also flexible enough to define object model dependencies. Just override the persist method (the override simply calls super.persist) and define the groups to be persist and <ClassName>.persist.

In this particular case, it would be:

```java
@Override
@Test(groups = {"persist", "Company.persist"})
public void persist() {
    super.persist();
}
```

For all dependent classes, I then depend on the Company.persist group (For the ContactTest class for example, since it needs to link to the Company object)

You can specify OneToOne and ManyToOne relationships with just the CSV files - just defining the field name and the id of the object to pull in.

ManyToMany is more complex and requires an interim object to be created within the test section. If the Contact to Company relationship above was ManyToMany, we would create a ContactCompany class with just the two fields - Contact & Company, then create a csv file with three fields, id, Contact, & Company. The framework currently always needs an id field.

You would then need to write a method within the ContactTest or CompanyTest (I use the owning side) to read the CSV file in and pump the data. This process is a little bit complex just now.
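A minimal sketch of that pumping step might look like this. It works on in-memory objects rather than Hibernate entities, assumes rows in the order id, Contact, Company, and treats Contact as the owning side; the classes and method are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Links contacts to companies from join rows of the form id|Contact|Company.
// Assumption: plain objects stand in for the real entities.
class ManyToManyPump {

    static class Company {
        final String name;
        Company(String name) { this.name = name; }
    }

    static class Contact {
        final String name;
        final Set<Company> companies = new LinkedHashSet<>();
        Contact(String name) { this.name = name; }
    }

    /** Adds each referenced company to the referenced contact (the owning side). */
    static void link(List<String> joinRows,
                     Map<String, Contact> contactsByCsvId,
                     Map<String, Company> companiesByCsvId) {
        for (String row : joinRows) {
            String[] fields = row.split("\\|", -1);
            if (fields.length != 3) {
                throw new IllegalArgumentException("Expected id|Contact|Company: " + row);
            }
            Contact contact = contactsByCsvId.get(fields[1].trim());
            Company company = companiesByCsvId.get(fields[2].trim());
            if (contact == null || company == null) {
                throw new IllegalArgumentException("Unknown id in row: " + row);
            }
            contact.companies.add(company);
        }
    }

    public static void main(String[] args) {
        Map<String, Contact> contacts = new HashMap<>();
        contacts.put("1", new Contact("Alice"));
        Map<String, Company> companies = new HashMap<>();
        companies.put("1", new Company("Acme"));
        companies.put("2", new Company("Globex"));

        link(Arrays.asList("1|1|1", "2|1|2"), contacts, companies);
        System.out.println("Alice is linked to " + contacts.get("1").companies.size() + " companies");
    }
}
```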

With an appropriate amount of test data, you are able to write a test suite that can consistently test your domain model. More importantly, you can configure it to drop the database at the start of each run so that once the tests are complete, you have a database structure and data that can be used for testing of higher level components (EJB/Spring/UI/WebApp).

We currently use this framework to test the domain model as well as distribute a data set for development and testing of the higher tier functionalities.

For the future, there are several additional features this framework needs:

It currently needs the setters/getters & constructors to be public. This needs to be FIXED
Refactor the ManyToMany Relationship code to make it easier and simpler to test and pump data
See if we can ensure that additional tests which change data are run within a transaction and rolled back so that the database is left in the “CSV Imported” state on completion of tests
Easier Dependency management if possible

This framework is still inside the walls of Kraya, but once the above issues are resolved and it is in a releasable state, it will be published into the open source community. If you are interested in getting a hold of it, email me and I’ll provide you with the latest version.

The easier and quicker it is to test, the more time we can spend on writing code… :-) The higher the coverage of the tests, the more confident you can be of your final product.

To more testing…
