Building on the database that spurred my Color Considerations, I spent the better part of today debating the best way to log observers in a database. Having documented it to explore the pros and cons with my coworkers, I figured I'd share my musings with y'all as well. What way do you think is best? Is there a better pattern I left out?
OUR EXAMPLE
For a blue-footed booby capture on Nov. 11, 2007, four biologists were recorded on the capture event: Mike, Roy, Phil and Jay.
How do I record this? Let’s review a few options:
OPTION 1
Two Integer Fields to an Observer lookup table
Observer1ID: “272” (MIKE)
Observer2ID: “312” (ROY)
NEGATIVE: The rest of the observers disappear in the data abyss.
OPTION 2
One Integer Field to an Observer lookup, One Free Text Field
Observer1ID: “272” (MIKE)
OtherObservers: “ROY, PHIL, JAY”
NEGATIVE: Nothing is lost, but only Observer1 can be queried reliably. The other observers are listed, but there is no dependable way to find PHIL: he could be entered as “PHIL”, “P.”, “PHILLIP”, “PHILIP”, or “FILLUP”.
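To make the free-text problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the Observation table layout, and records 124 through 126 are assumptions made up for illustration; only record 123 comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Observation ("
    " ObservationID INTEGER PRIMARY KEY,"
    " Observer1ID INTEGER NOT NULL,"
    " OtherObservers TEXT)"
)

# Record 123 is the example record; 124-126 are hypothetical rows
# showing how the same person drifts across spellings in free text.
rows = [
    (123, 272, "ROY, PHIL, JAY"),
    (124, 272, "ROY, PHILLIP"),
    (125, 312, "P., JAY"),
    (126, 312, "FILLUP"),
]
conn.executemany("INSERT INTO Observation VALUES (?, ?, ?)", rows)


def find_by_free_text(name):
    """Return the IDs of observations whose free-text field contains name."""
    cur = conn.execute(
        "SELECT ObservationID FROM Observation "
        "WHERE OtherObservers LIKE ? ORDER BY ObservationID",
        (f"%{name}%",),
    )
    return [row[0] for row in cur]


phil_hits = find_by_free_text("PHIL")
print(phil_hits)  # [123, 124]
```

The substring search finds two of the four records Phil was on. Records 125 and 126 are silently missed, and no single pattern catches every variant without also matching other people.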
OPTION 3
Observation Table linked to Observation_Observer Table by ObservationID
Observation_Observer Table has the following Structure for example record, “123”
ObservationID  ObserverID  Rank
123            272         1
123            312         2
123            21          3
123            128         4
NEGATIVE: The weakness of this model is that it is possible to enter an observation with no observer. Enforcing the creation of a child record in the database is difficult, so one would have to rely on the interface for data integrity. Additionally, querying subtables is a pain and is often done incorrectly by users.
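A rough sketch of Option 3 and the queries it requires, again assuming SQLite through Python's sqlite3 module. Observation 124 is a hypothetical record with no observers, and treating ObserverID 21 as PHIL is an assumption based on the order of the names in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Observation (ObservationID INTEGER PRIMARY KEY);
CREATE TABLE Observation_Observer (
    ObservationID INTEGER NOT NULL REFERENCES Observation(ObservationID),
    ObserverID    INTEGER NOT NULL,
    "Rank"        INTEGER NOT NULL,
    PRIMARY KEY (ObservationID, ObserverID)
);
-- 123 is the example record; 124 is a hypothetical row with no child rows.
INSERT INTO Observation (ObservationID) VALUES (123), (124);
INSERT INTO Observation_Observer VALUES
    (123, 272, 1), (123, 312, 2), (123, 21, 3), (123, 128, 4);
""")

# Every observation a given observer took part in (21 is assumed to be PHIL).
phil_obs = [row[0] for row in conn.execute(
    "SELECT ObservationID FROM Observation_Observer WHERE ObserverID = ?",
    (21,),
)]

# Observations with no observer at all: nothing in the schema prevents these.
orphans = [row[0] for row in conn.execute("""
    SELECT o.ObservationID
    FROM Observation o
    LEFT JOIN Observation_Observer oo
           ON oo.ObservationID = o.ObservationID
    WHERE oo.ObservationID IS NULL
""")]

# One row per observation: the top-ranked observer (lowest Rank),
# or None when the observation has no observers.
leads = conn.execute("""
    SELECT o.ObservationID, oo.ObserverID
    FROM Observation o
    LEFT JOIN Observation_Observer oo
           ON oo.ObservationID = o.ObservationID
          AND oo."Rank" = (SELECT MIN(x."Rank")
                           FROM Observation_Observer x
                           WHERE x.ObservationID = o.ObservationID)
    ORDER BY o.ObservationID
""").fetchall()

print(phil_obs, orphans, leads)
```

Finding Phil is now an exact match on an ID, but the orphan check and the null-aware lead-observer join are exactly the kind of subtable queries that users tend to get wrong.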
OPTION 4
Modified Option 3 – With Observers assigned roles.
Observation Table linked to Observation_Observer Table by ObservationID
Observation_Observer Table has the following Structure for example record, “123”
ObservationID  ObserverID  RoleID
123            272         1 (Primary Observer)
123            312         2 (Secondary Observer)
123            21          6 (Sample Collector)
123            128         2 (Secondary Observer)
NEGATIVE: This model shares most of the weaknesses of Option 3. Adding roles does codify each observer's part better, but it introduces another weakness: nothing requires any particular role, such as a Primary Observer, to be present. At least in Option 3 one could produce a 1-to-1 relationship by selecting the highest-ranked observer (lowest number), as long as one tested for nulls first.
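The missing-role problem can be sketched the same way, assuming SQLite through Python's sqlite3 module. The Role lookup table and observation 124 (which has only Secondary Observers) are hypothetical additions; the role IDs 1, 2 and 6 are the ones shown in the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Role (RoleID INTEGER PRIMARY KEY, RoleName TEXT NOT NULL);
CREATE TABLE Observation (ObservationID INTEGER PRIMARY KEY);
CREATE TABLE Observation_Observer (
    ObservationID INTEGER NOT NULL REFERENCES Observation(ObservationID),
    ObserverID    INTEGER NOT NULL,
    RoleID        INTEGER NOT NULL REFERENCES Role(RoleID),
    PRIMARY KEY (ObservationID, ObserverID)
);
INSERT INTO Role VALUES
    (1, 'Primary Observer'), (2, 'Secondary Observer'), (6, 'Sample Collector');
-- 123 is the example record; 124 is hypothetical and has no Primary Observer.
INSERT INTO Observation (ObservationID) VALUES (123), (124);
INSERT INTO Observation_Observer VALUES
    (123, 272, 1), (123, 312, 2), (123, 21, 6), (123, 128, 2),
    (124, 312, 2), (124, 128, 2);
""")

PRIMARY_ROLE_ID = 1

# Observations that do not have exactly one Primary Observer.
# The schema accepts them, so they have to be hunted down after the fact.
bad_primaries = conn.execute("""
    SELECT o.ObservationID, COUNT(oo.ObserverID)
    FROM Observation o
    LEFT JOIN Observation_Observer oo
           ON oo.ObservationID = o.ObservationID
          AND oo.RoleID = ?
    GROUP BY o.ObservationID
    HAVING COUNT(oo.ObserverID) <> 1
    ORDER BY o.ObservationID
""", (PRIMARY_ROLE_ID,)).fetchall()

print(bad_primaries)  # [(124, 0)]
```

Observation 124 is perfectly valid as far as the database is concerned, yet any report that joins on the Primary Observer role will quietly drop it.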
OPTION 5
Modified Option 2 & 4 combined – Primary Observer in the Main Table with Secondary Observers in a subtable assigned roles.
Observation Table
PrimaryObserverID: “272”
Observation_Observer Table has the following Structure for example record, “123”
ObservationID  ObserverID  RoleID
123            312         2 (Secondary Observer)
123            21          6 (Sample Collector)
123            128         2 (Secondary Observer)
NEGATIVE: The only weakness of this model is that entering secondary observers may be a little more tedious than typing into the free-text field of Option 2. One may also want to allow free text for anecdotal observations rather than creating another Observer record. Perhaps certain applications would enter structured data in the Observation Comment field, and the PrimaryObserver would be “Anecdotal – See comments”. For most incidentals, however, follow-up is very important, which makes adding a new observer an obvious and necessary step so that a biologist has a phone number or email to confirm the sighting.
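Here is a minimal sketch of how Option 5 might look in practice, assuming SQLite through Python's sqlite3 module. The Observer table, the mapping of IDs 21 and 128 to PHIL and JAY, and the role IDs (borrowed from the Option 4 example: 2 = Secondary Observer, 6 = Sample Collector) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE Observer (ObserverID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Observation (
    ObservationID     INTEGER PRIMARY KEY,
    PrimaryObserverID INTEGER NOT NULL REFERENCES Observer(ObserverID)
);
CREATE TABLE Observation_Observer (
    ObservationID INTEGER NOT NULL REFERENCES Observation(ObservationID),
    ObserverID    INTEGER NOT NULL REFERENCES Observer(ObserverID),
    RoleID        INTEGER NOT NULL,
    PRIMARY KEY (ObservationID, ObserverID)
);
-- 272 and 312 are named in the post; 21 = PHIL and 128 = JAY are assumed.
INSERT INTO Observer VALUES
    (272, 'MIKE'), (312, 'ROY'), (21, 'PHIL'), (128, 'JAY');
INSERT INTO Observation VALUES (123, 272);
INSERT INTO Observation_Observer VALUES
    (123, 312, 2), (123, 21, 6), (123, 128, 2);
""")


def insert_without_primary(observation_id):
    """Try to save an observation with no primary observer; report success."""
    try:
        conn.execute(
            "INSERT INTO Observation (ObservationID, PrimaryObserverID) "
            "VALUES (?, NULL)",
            (observation_id,),
        )
    except sqlite3.IntegrityError:
        return False  # rejected by the NOT NULL constraint
    return True


def everyone_on(observation_id):
    """Names of the primary and all secondary observers, alphabetically."""
    cur = conn.execute("""
        SELECT Name FROM Observer
        WHERE ObserverID IN (
            SELECT PrimaryObserverID FROM Observation WHERE ObservationID = ?
            UNION
            SELECT ObserverID FROM Observation_Observer WHERE ObservationID = ?
        )
        ORDER BY Name
    """, (observation_id, observation_id))
    return [row[0] for row in cur]


accepted = insert_without_primary(124)
crew = everyone_on(123)
print(accepted, crew)
```

The database itself now refuses an observation without a primary observer, so that rule no longer depends on the interface. The trade-off is that a "who was on this observation" query has to look in two places, hence the UNION.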
I have my favorite. Which option do you think has the best combination of ease of use and future queryability? Do you have a better design pattern?