Audit Tables#

An “audit table”, as implemented in this library, is a supplemental table that records changes to some source table over time.

While it’s certainly possible to design tables which obviate the need for something like an audit table, the purpose of a feature like this is to make it trivial to add history/change tracking features to existing tables, after the fact.

Additionally, since it is implemented with database-level triggers. It also ensures that even manual alterations to data in audited tables are tracked without writing any code.

Take the following code example:

from sqlalchemy import Column, types
from sqlalchemy.orm import declarative_base

from sqlalchemy_declarative_extensions import declarative_database
from sqlalchemy_declarative_extensions.audit import audit

Base = declarative_database(declarative_base())


@audit()
class Foo(Base):
    __tablename__ = "foo"

    id = Column(types.Integer(), primary_key=True)
    name = Column(types.Unicode())
    json = Column(JSONB())


audit_table = Foo.__audit_table__

Essentially, any existing declared model (or raw table through audit_table) can be annotated in order to enable auditing.

This should automatically register:

  • A new table with the (default) name of: {table}_audit.

  • A set of functions which insert data into the audit table

  • A set of triggers which automatically execute when insert/update/delete occur on the source table.

Now imagine some code which makes alterations to the data in the table:

foo1 = Foo(id=1, name=None, json=None)
foo2 = Foo(id=2, name='wat?', json={})
pg.add(foo1)
pg.add(foo2)
pg.commit()

foo1.name = 'wow!'
pg.delete(foo2)
pg.commit()

After calling some_function(), the foo_audit table should look like so:

audit_pk

audit_operation

audit_timestamp

audit_current_user

id

name

json

1

I

2020-01-01T00:00:00

user

1

null

null

2

I

2020-01-01T00:00:01

user

2

wat?

{}

3

U

2020-01-01T00:00:02

user

1

wow!

null

4

D

2020-01-01T00:00:03

user

2

wat?

{}

Context Columns#

One can optionally include what we call “context” columns. These are additional columns added to the audit table, which lets users capture extra context about the operations, which is not included in the table itself.

For example, in the context of an API, it might be useful to record the currently logged in user.

from sqlalchemy import Column, types

context_columns = [
    Columns('audit_job_id', types.Unicode(), nullable=False)
    Columns('audit_username', types.Unicode(), nullable=True)
]

@audit(context_columns=context_columns)
class Foo(Base):
    __tablename__ = "foo"

    ...

Note

Note that the nullability of these columns is important! Any non-nullable column included in a context column will be required to be set at the session level (a SQL native feature).

If missing, this will prevent the original insertion from begin committed, when the audit table’s triggered insertion fails!

You would then use sqlalchemy_declarative_extensions.audit.set_context_values to set that value at the session/transaction level:

set_context_values(session, audit_job_id='abcd', audit_username='example@example.com')

Or alternatively set local audit.audit_username = 'example@example.com' in raw SQL.

Naming convention#

The naming of all autogenerated objects (table, function, trigger) are controlled through sqlalchemy’s native “naming convention” system.

Below is the equivalent of the naming scheme we use by default.

from sqlalchemy import MetaData

metadata = MetaData(
    naming_convention={
        "audit_table": "%(table_fullname)s_audit"
        "audit_function": "%(schema)s_%(table_name)s_audit"
        "audit_trigger": "%(schema)s_%(table_name)s_audit"
    }
)

Note the audit_table option is slightly different. We default to using the fully qualified name of the table (i.e. including the schema).