Saturday, February 07, 2009

Caching Data and IFactory

Recently I have been working on a project that extracts data from a huge database server for a daily reporting service. The process generates data in a required format based on complicated business logic.

The project is structured around a Repository service, which acts as a gateway and provides interfaces for getting domain objects by domain class. To retrieve the data, I use an IFactory pattern:

interface IFactory<T> 
{
T CreateObject(IDBService db);
}

where IDBService is an interface that provides methods for getting data from the database with SQL.

The only problem with this strategy is that the process is too slow. It constantly queries the database, making thousands of SQL calls.

I needed a way to cache data at the Repository to reduce those SQL calls. Since the amount of data is unpredictable, I have to limit the cache. I use a maximum cache count in the application configuration, 10000 for example. If the number of required items is below this limit, the Repository caches all of the data for later use. Otherwise, I have to get one item at a time.

The more I got into this caching feature, the better the process performed. With a maximum cache count, I reduced the SQL calls down to 8, and reduced the run time from hours to minutes, or even less than a minute! That's a great improvement.

For example, I created two factory classes for retrieving data:

public class SpectInfoListFac : IFactory<List<SpectInfo>>
{
...
public List<SpectInfo> CreateObject(IDBService db) {...}
...
}

public class SpectInfoFac : IFactory<SpectInfo>
{
...
public SpectInfo CreateObject(IDBService db) {...}
...
}

For caching, I use the first factory to get a list of data back when there is not too much of it. The factory's constructor takes information about the data and the date range, as well as the maximum count. If there is too much data (over the limit), the factory returns null and I use the second factory class to get one specific item.
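To make the fallback concrete, here is a minimal sketch of how the two factories and the Repository could fit together. It is not the project's real code: the members of IDBService (Count, QueryRange, QueryOne), the fields of SpectInfo, the SpectInfoRepository class and the InMemoryDb fake are all assumptions made up for illustration, since the post only names the types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: the post does not show IDBService or SpectInfo.
public class SpectInfo
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
}

public interface IDBService
{
    int Count(DateTime from, DateTime to);                  // one SQL call
    List<SpectInfo> QueryRange(DateTime from, DateTime to); // one SQL call
    SpectInfo QueryOne(int id);                             // one SQL call per item
}

public interface IFactory<T>
{
    T CreateObject(IDBService db);
}

// Bulk factory: returns null when the range holds more rows than the limit.
public class SpectInfoListFac : IFactory<List<SpectInfo>>
{
    private readonly DateTime _from, _to;
    private readonly int _maxCount;

    public SpectInfoListFac(DateTime from, DateTime to, int maxCount)
    {
        _from = from; _to = to; _maxCount = maxCount;
    }

    public List<SpectInfo> CreateObject(IDBService db)
    {
        return db.Count(_from, _to) > _maxCount ? null : db.QueryRange(_from, _to);
    }
}

// Single-item factory, used as the fallback.
public class SpectInfoFac : IFactory<SpectInfo>
{
    private readonly int _id;
    public SpectInfoFac(int id) { _id = id; }
    public SpectInfo CreateObject(IDBService db) { return db.QueryOne(_id); }
}

// Tries the bulk factory once; a null result means "too much data, do not cache".
public class SpectInfoRepository
{
    private readonly IDBService _db;
    private readonly Dictionary<int, SpectInfo> _cache;

    public SpectInfoRepository(IDBService db, DateTime from, DateTime to, int maxCount)
    {
        _db = db;
        List<SpectInfo> all = new SpectInfoListFac(from, to, maxCount).CreateObject(db);
        _cache = all == null ? null : all.ToDictionary(s => s.Id);
    }

    public SpectInfo Get(int id)
    {
        if (_cache != null)
        {
            SpectInfo hit;
            return _cache.TryGetValue(id, out hit) ? hit : null;
        }
        return new SpectInfoFac(id).CreateObject(_db); // one query per lookup
    }
}

// In-memory fake that counts calls, for demonstration only.
public class InMemoryDb : IDBService
{
    private readonly List<SpectInfo> _rows;
    public int Calls { get; private set; }
    public InMemoryDb(List<SpectInfo> rows) { _rows = rows; }

    public int Count(DateTime from, DateTime to) { Calls++; return InRange(from, to).Count(); }
    public List<SpectInfo> QueryRange(DateTime from, DateTime to) { Calls++; return InRange(from, to).ToList(); }
    public SpectInfo QueryOne(int id) { Calls++; return _rows.FirstOrDefault(r => r.Id == id); }

    private IEnumerable<SpectInfo> InRange(DateTime from, DateTime to)
    {
        return _rows.Where(r => r.Date >= from && r.Date <= to);
    }
}
```

With the fake database, a repository built under the limit answers every Get from the cache after two calls (one count, one bulk query), while a repository built over the limit makes one call per lookup. One design caveat of this sketch: a cached repository returns null for an id outside the cached range instead of going back to the database.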

However, what about different levels of caching? Take finding people as another example: if I want to find males in Canada with a name like "David Chu", I may get too many results. I could add one more condition, province = "AB", then city = "Calgary", then district = "NW"... This would result in too many factory classes. Can I combine all of these into one?

Here I changed the IFactory interface to be more generic:

interface IFactory<T> 
{
IEnumerable<T> CreateObject(IDBService db, int level);
}

public class SpectInfoFac : IFactory<SpectInfo>
{
...
public IEnumerable<SpectInfo> CreateObject(IDBService db, int level) {...}
...
}

There is only one method to create or retrieve objects. The level parameter sets the granularity of the data range. It is up to the implementing factory class to decide what each level means. For example, 0 for all, 1 for AB, 2 for Calgary...

The result is an enumerable collection. It could contain more than one item, exactly one for a specific data item, or be null if nothing was found.

I still have the option of several IFactory implementations, or one implementation that covers all the cases. Either way, my simplified factory pattern provides a caching option.

I love refactoring my code!
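As a rough illustration of the level idea, the sketch below applies it to the people example. Everything beyond the IFactory signature is an assumption for illustration: the Person class, the Query method on IDBService, the PersonFac and LevelSearch classes, the InMemoryDb fake, and the choice to return null both when nothing is found and when the result is over the limit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain class and data service; neither is shown in the post.
public class Person
{
    public string Name { get; set; }
    public string Province { get; set; }
    public string City { get; set; }
}

public interface IDBService
{
    // A null argument means "do not filter on this column".
    IEnumerable<Person> Query(string province, string city);
}

public interface IFactory<T>
{
    IEnumerable<T> CreateObject(IDBService db, int level);
}

public class PersonFac : IFactory<Person>
{
    private readonly string _province, _city;
    private readonly int _maxCount;

    public PersonFac(string province, string city, int maxCount)
    {
        _province = province; _city = city; _maxCount = maxCount;
    }

    // Level 0: no extra filter; level 1: province; level 2: province and city.
    // Assumption: null is returned when nothing is found or the result exceeds maxCount.
    public IEnumerable<Person> CreateObject(IDBService db, int level)
    {
        string province = level >= 1 ? _province : null;
        string city = level >= 2 ? _city : null;
        List<Person> found = db.Query(province, city).ToList();
        return (found.Count == 0 || found.Count > _maxCount) ? null : found;
    }
}

public static class LevelSearch
{
    // Starts at the coarsest level and narrows until the factory returns data.
    public static IEnumerable<T> Narrow<T>(IFactory<T> factory, IDBService db, int maxLevel)
    {
        for (int level = 0; level <= maxLevel; level++)
        {
            IEnumerable<T> result = factory.CreateObject(db, level);
            if (result != null) return result;
        }
        return null;
    }
}

// In-memory fake, for demonstration only.
public class InMemoryDb : IDBService
{
    private readonly List<Person> _rows;
    public InMemoryDb(List<Person> rows) { _rows = rows; }

    public IEnumerable<Person> Query(string province, string city)
    {
        return _rows.Where(p => (province == null || p.Province == province)
                             && (city == null || p.City == city));
    }
}
```

Calling LevelSearch.Narrow(new PersonFac("AB", "Calgary", 2), db, 2) tries level 0, then 1, then 2, and returns the first result that fits under the limit, so a single factory class replaces one class per filter combination.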
