Best unofficial Apache Server developers community |
|
I am considering designing a relational DB schema for a DB that never actually deletes anything (sets a deleted flag or something). 1) What metadata columns are typically used to accomodate such an architecture? Obviously a boolean flag for IsDeleted can be set. Or maybe just a timestamp in a Deleted column works better, or possibly both. I'm not sure which method will cause me more problems in the long run. 2) How are updates typically handled in such architectures? If you mark the old value as deleted and insert a new one, you will run into PK unique constraint issues (e.g. if you have PK column id, then the new row must have the same id as the one you just marked as invalid, or else all of your foreign keys in other tables for that id will be rendered useless).
posted via StackOverflow
|
|
 
|
If your goal is auditing, I'd create a shadow table for each table you have. Add some triggers that get fired on update and delete and insert a copy of the row into the shadow table. |
|
 
|
I think what you're looking for here is typically referred to as "knowledge dating". In this case, your primary key would be your regular key plus the knowledge start date. Your end date might either be null for a current record or an "end of time" sentinel. On an update, you'd typically set the end date of the current record to "now" and insert a new record the starts at the same "now" with the new values. On a "delete", you'd just set the end date to "now". |
|
 
|
2.a) version number solves the unique constraint issue somewhat although that's really just relaxing the uniqueness isn't it. 2.b) you can also archive the old versions into another table. |
|
 
|
There's a couple of ways to do this sort of thing; I've previously used an "active" flag, with a compound uniqueness constraint with the primary key and that flag on each table. This works well when you don't particularly care about deletion times, or if you already have a timestamp column for updates in your table (the flag gets updated at update time, and prevents a select on max "updated" column). I liked the solution; it's fast, and (relatively) simple. |
|
 
|
Here are some additional questions that you'll also want to consider
Typically the systems that care a lot about audit use tables as Steve Prentice mentioned. It often has every field from the original table with all the constraints turned off. It often will have a action field to track updates vs deletes, and include a date/timestamp of the change along with the user. For an example see the PostHistory Table at http://data.stackexchange.com/stackoverflow/query/new |