Java Practices->Implement code tables

Implement code tables

Most applications use code tables. A reasonable definition of a code table might be:

A code table is a list of closely related items, each of which has minimal substructure.

Examples:

a list of countries (or other geographical category)
the list of credit cards accepted by a web site - Mastercard, Visa, etc.
the number of items shown to the user in a search result - 10, 50, 100

Code tables are often presented in drop down lists, as in:

(Radio buttons or other presentation styles may also be appropriate, according to the needs of each case.)

Code Table Structure
Most applications use a relational database. Code tables usually represent the simplest kinds of tables that you can model in a database. Here are some example code tables, as defined by SQL CREATE TABLE statements:

Number of items shown in a search result:

CREATE TABLE NumResults (  
  Id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT,
  NumItems TINYINT UNSIGNED UNIQUE NOT NULL, 
  PRIMARY KEY (Id)
) TYPE=InnoDB;

List of countries supported by an application:

CREATE TABLE Country (
  Id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
  Text VARCHAR(50)UNIQUE NOT NULL,
  ShortText VARCHAR(2)UNIQUE NOT NULL,
  PRIMARY KEY (Id)
) TYPE=InnoDB;

The same list of countries, but with a field added to control sort order:

CREATE TABLE Country (
  Id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
  OrderWith MEDIUMINT UNIQUE UNSIGNED NOT NULL,
  Text VARCHAR(50)UNIQUE NOT NULL,
  ShortText VARCHAR(2)UNIQUE NOT NULL,
  PRIMARY KEY (Id)
) TYPE=InnoDB;

It's often useful to picture each row in a code table as representing a set of aliases for a single idea. One alias can be more appropriate than another, according to the context in which it's used. For example, in a report which tries to squeeze as much information as possible onto pages of fixed width, short abbreviations are often useful. When presenting a drop-down list to an end user, it may be more desirable to show a longer description, instead of an abbreviation.

So, you may decide to represent an item in a code table using what amounts to several aliases for the same item:

a numeric id
the 'regular' text, usually seen by the user
a short abbreviation used when appropriate
some other alias appropriate to a given case

For this idea to make sense, each of the above fields would need a UNIQUE constraint, and each would need to be non-null.

Code Table Evolution
As shown above, code tables don't have a specific, definitive structure. You aren't locked into a specific style. A single application often has many code tables, but those code tables don't necessarily share the exact same form. Also, it's not uncommon for a code table to start its life in a simple form, and then later grow into something more elaborate. In this sense, code tables are roughly similar to Java's enumeration types, which can start out being very simple. Since enumerations are also classes, you can add more structure to them later, if needed.

In-Memory Caching
Code tables often represent relatively static data. Since the data doesn't change very often, it usually makes sense to consider reading in all code tables once upon startup. Then, each time a code table is needed, the in-memory representations of the code tables are referenced directly, instead of repeatedly going back to the database. This usually improves application performance.

In-Memory Joins
If code tables are cached in memory after startup, then you will often be able to move logic formerly implemented by a database JOIN operation into in-memory operations in Java instead. (This is an exception to the rule of not performing database tasks in code.) There are two advantages to this:

it will likely improve performance.
your SQL statements will be simpler, since the JOINs can be left out

For illustration, consider an application that models a team sport, in which each player on a team is assigned a specific position of some sort (bowler, pitcher, fullback, or whatever). In this case, the Team table has a foreign key into the Position code table. To retrieve the list of players on a team, you might have an explicit join to the Position code table, as in :

SELECT 
 Name, Number, PositionName
FROM 
 Team JOIN Position ON PositionFK = Position.Id
WHERE
 Team.Id = ?

Alternatively, the JOIN might be dropped in favor of returning the 'raw' PositionFK identifier, instead of the PositionName text, as in :

SELECT 
 Name, Number, PositionFK
FROM 
 Team
WHERE 
 Team.Id = ?

Of course, the PositionFK identifier would need to be translated into text (in Java-land) before presenting the result to the user.

Id For Code, Text For User
The fact that a code table item is essentially a collection of aliases for a single idea can be put to use in the following way. In Java-land, it's best to identify items using internal, static codes, instead of the text visible to the user. Forms in web applications are a good example of this:

<select name="Flavor">
  <option value='1'>Chocolate</option>
  <option value='2'>Strawberry</option>
  <option value='3'>Vanilla</option>
</select>

Here, the text ('Chocolate') is shown to the user in a drop-down, but the value submitted to the server is actually a numeric id ('1'), as controlled by the value attribute of the option tag. This separates two things nicely. The text may change for many reasons (change the spelling, add translations into another language), but such changes will not ripple into Java, since Java-land uses numeric codes, not the text. (Some object to exposing database primary keys to the end user like this. But in this case, it doesn't seem to do any harm.)

Sorting
Sorting of code tables can be tricky. If the sort is alphabetical, then that's simple to implement. However, sorts aren't always alphabetical. Sometimes they depend on arbitrary rules. For example, if an application is used only in Australia, a list of countries having Australia and New Zealand at the beginning may be required. In such cases, one option is to define a column in the underlying code table which explicitly defines the sort order.

The sort order can also be affected by whether an application is multilingual. If a code table is sorted alphabetically in each language, then the order of presentation to the user will usually not be the same in all languages.

Monster Code Tables
Some applications put all of their code tables into a single "monster" code table. The monster code table has two primary keys -- one to identify the code table, and one to identify the value within the code table. This seems to be an inferior design:

it lumps together items which aren't logically related to each other. It's almost always a mistake to put implementation details before meaning. That's a bad sign.
foreign keys to code tables are no longer possible. Since foreign keys are the heart and soul of a relational database, this is a major drawback.
since some foreign keys are absent, the clarity of a database's structure (and related SQL statements) is significantly reduced.
when a single code table has special needs, then it usually can't fit into the structure of the monster code table.
code tables can't evolve independently and still remain in the monster code table.
JOINs to monster code tables are usually doubled. What used to be a single join becomes 2 joins -- one join to define the code table, another to define the value within the code table.
in large SQL statements, the large number of joins can rapidly become annoying.

Number of Results:
Flavor of Ice Cream: