In my last post I went off on relational database design patterns. Now let's try implementing these patterns in MS Access.
Download the MS Access 2007 Project used in this post.
In the old days we made spreadsheets. And they got ugly really quick. Let's work with our city-zipcode analogy from last time.
We started with what is basically a spreadsheet in MS Access:
We outgrew this quickly when we added Meridian:
I didn't mention this in my last post, but I see this all too often. Basic database design cue, if you're numbering your columns your design is wrong.
Why? Boise has 29 zipcodes. New York has 161. Washington, DC has 278. Do you really want 278 columns for storing zipcodes?
So you try something else. You create what really is the right pattern, but it lacks referential integrity. If you could spell and type perfectly it would be ideal.
Ok, the answer is "in view" or "in query" for an MS Access world. This is what we want to come up with, it just isn't maintainable in the real world. Take Coeur d'Alene, Idaho for example. In our main database at work there is 34 ways to spell Coeur d'Alene because someone left town a textfield with no zip code validation. What we need to do is break this table into a discrete entity. We only have to type Coeur d'Alene once and numerical dyslexics like me only have to type zip codes once. We throw a table in the middle that is not human-readable that ties the primary keys (unique autonumbers that mean nothing) into the middle table to tie together this many-to-many relationship. Ideally, humans don't even view this table, it sits in the background and stores the relationships human define using forms looking at real world data (city names and zipcodes in this case).
* There's some cool things going on here. Since zip codes in New England start with leading zeros, I've made zipcodes a text field. Uniqueness is being enforced through an Index that doesn't allow duplicates and an Input Mask is requiring all zipcodes to be five digits. With this setup both "123" and "ababa" would not be allowed. Additionally we couldn't add "83714" twice. This is being enforced at the table level so all forms, queries and other programs (ArcGIS) that may use this table will have to comply with these validation rules.
Now we relate these together. In MS Access 2007, go to the Database Tools tab and select Relationships. Click Show Table to add your tables and then drag and drop your relationships.
Your tables should already have autonumbers set up as Primary Keys for Cities.CityID, ZipCodes.ZipCodeID and Cities_ZipCodes.ID. When you make the relationships choose the following checkboxes:
- Select Enforce Referential Integrity - This will ensure that you do not add a CityID or ZipCodeID to Cities_ZipCodes that does not exist in the main Cities or ZipCodes table.
- Do not select Cascade Update Related Records. We we do not need to cascade updates because we are following best practices and not allowing our primary keys to mean anything in the real world. If you were to use zipcode for example as your primary key you would have to set this, because invariably zipcodes will change and your primary keys will have to change. When this is your model bad things will happen. Not right away. After you have thousands of records. Don't let your primary keys mean anything. Choose long integers or guids. Period.
- Cascade Delete Related Records - This will delete related records in Cities_ZipCodes when you delete a City or ZipCode from the main Cities or ZipCodes table. This will prevent orphaned records in Cities_ZipCodes. Having recommended this, I will add that this is a dangerous setting for scenarios where inexperienced users have the ability to delete records as the entire history for a record could be deleted, e.g. the 278 zipcodes for New York City if a user were to delete the one record for New York. Make sure your security settings match your users. And consider not really deleting anything, just adding a flag that makes data appear to disappear for most users.
Save and close. Now with one query we can recreate the table dynamically that we had above without 34 misspelling of Coeur d'Alene.
In MS Access 2007 Select the Create Tab and click Query Design. Now use Show Table again to add your three tables. Auto-magically they will relate to each other using the rules you just added in the Relationship Diagram. All that's left is to double-click (or drag) Cities.City and ZipCodes.ZipCode and they will drop into the results area.
You can also swing over to the SQL by clicking an icon in the lower right. When you're a big boy you may come here first, but we all start GUI (it's a fact ask a mid-wife).
SELECT Cities.City, ZipCodes.ZipCode
FROM ZipCodes INNER JOIN (Cities INNER JOIN Cities_ZipCodes ON Cities.CityID = Cities_ZipCodes.CityID) ON ZipCodes.ZipCodeID = Cities_ZipCodes.ZipCodeID
ORDER BY Cities.City, ZipCodes.ZipCode;
Now run the query and you will get the same table we had in Figure 3. So we got reporting down. Now how do we swing data entry.
We'll start with a table hack. In MS Access you can bastardize a table to act like a mini-form. This is a good place to start.
Open Cities_ZipCodes and switch to design view. We're going to add lookups for those two foreign keys, CityID and ZipCodeID so they are human readable.
Select CityID and in the lower panel click the Lookup tab. Add the following settings:
|Property||Value||What is going on here?|
|Display Control||Combobox||This will display a combobox pick list instead of a textbox|
|Row Source Type||Table/Query||This is the data source type we are using to fill the pick list|
|Row Source||Cities||Use the cities table to fill the list|
|Bound Column||1||We saying that the field CityID matches the first field in the Cities table|
|Column Count||2||Pass across the first two fields: CityID and City|
|Column Heads||No||Don't display column headings, only really useful if you are bringing back a slew of fields such as City, County and State|
|Column Widths||0";1"||By settings CityID to 0" I am hiding it and displaying City 1" wide, but since it is the only one it will stretch to however wide the column ends up being. Note this is in inches, semi-colon delimited.|
Repeat these settings for ZipCodeID, Save and open the table.
Surprise! We have a human-readable spreadsheet... but every value is validated for referential integrity!
Now before we move to forms we have one more table perversion that MS Access allows: Subdatasheets. This method, combined with the look-ups we just authored will let us manage many-to-many relationships as if they were convenient hierarchies in table view.
Open the Cities table in datasheet (table) view. If you're using MS Access 2007 you'll see some + symbols auto-magically added themselves to your datasheet. Click one and the related records in Cities_ZipCodes will be displayed.
Now, for various reasons, for example you added a second relationship to cities (maybe a states table) this won't appear magically. No worries. Follow these steps:
Select the Home tab, browse across the Records Fieldset and you'll see a small More button. Click this and select SubDataSheet. Find the table you wish to relate, if you have the Cities table open select Cities_ZipCodes and identify the parent and child fields that match, in this case CityID.
This turned into a long post. In the next installment I'll add some forms to manage our postal codes.
Download the MS Access 2007 Project used in this post.