[jdom-interest] About optional values, nulls, and exceptions: its relational all over again

Mon Jul 10 21:36:49 PDT 2000

I just started subscribing to this list and have been reading the argument about optional values, nulls, and exceptions. I'd like to offer a few observations to this interesting discussion:

1) In the world of relational databases Chris Date, Hugh Darwen, and a few of their colleagues have been arguing for years that nulls are bad things. They believe that a relational database that didn't allow 
nulls would be more logical, consistent, faster, easier to write, (choose your superlatives and insert here <g>), easier to validate and generally better database. 
   They offer theoretical and practical reasons for this. For instance, on the theoretical side nulls mess up basic algebra: The Sum of column A + the sum of column B does not equal the Sum of Column 
A + Column B. I think the reason is that if A is added to B and one of them is null then the result is null. But when you sum an individual column in an RDBMS you skip over null rows (someone correct 
me if I'm wrong but its something close to that). Nulls really complicate database internals enormously.
   Another argument they make is that null has too many meanings: 
     - The value isn't known
     - No one bothered to type the value in (kinda the same as the first)
     - The value is not applicable to the thing being described.
     - still other meanings. 

2) Unfortunately,. nullable columns are very common in database design (and I've got to admit that I've designed database schemas that have nullable columns). Therefore, XML documents generated 
from RBDMS tables _will_ have nullable columns (er, optional attributes and elements).
   In some designs the number of nullable columns in a table may be far greater than the number of non-nullable columns. There are problem domains that map easily to this sort of thing. One ends up 
with a table that looks kinda like a sparse matrix. Now, you can get rid of those columns by switching over to having the column names be values that are part of pri keys in dependent tables.
   The same philosophy can be applied to XML btw: Don't make optional things be elements or attributes. Now you can have the same argument that the relational database people have about whether 
nullable colums (er optional attributes and elements) are necessary. Mind you, I'll pass on joining in that argument: BTDTGTTS. 

3) It is worth noting what JDBC does about nullable columns for something like an int field: You have to do a separate call after the get to find out if the previous column was null or not. 
   getInt returns a simple int, not an Integer. Then you do a rsResulSet.wasNull() call to get a boolean on whether the previous return value was null. 
   Basically, the reason this is necessary is that int doesn't have null in its normal range of values.
   Oddly enough, for SQL Types that are translated into Java Class types (eg date columns that are returned as java.sql.Date) it appears that one can also still call wasNull() to check if the returned value 
was null. Not clear if there is ever any situation where that is useful. Probably not.

As far as throwing exceptions is concerned: Well, I like clean simple interfaces. However, while nulls may seem like they ought to occur rarely (ie that optional elements and attributes should occur 
infrequently) there will certainly be data where nulls (ie missing optional things) occur very frequently. Throwing exceptions in these cases does seem wrong to me. 

So if people are going to have lots of optional thingies perhaps handling their missing cases with exceptions isn't such a good idea. Of course, of you could take the position that people shouldn't make 
DTDs and XML Schemas that do that sort of thing. Geez, I'm burned out on that argument. But feel free to take that position. Some formidable and impressively accomplished minds in the relational 
world certainly do <g>.