Audit Logging IR

How to design audit logs that help investigations

I've been on both sides of incident investigations: the engineer who built the logging, and the analyst who needed to piece together what happened. Here's what I've learned about building audit logs that actually help when things go wrong.

The Investigation Mindset

When an incident occurs, investigators ask questions like:

  • Who accessed what, and when?
  • What changed, and what were the previous values?
  • Was there unusual behavior leading up to this?
  • What was the sequence of events?

Your audit logs need to answer these questions definitively. If they can't, you're flying blind.
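
To make that concrete, here is a minimal sketch of how two of those questions turn into queries. It assumes each event is a structured record with actor, resource, action and timestamp fields like the ones used throughout this post. The function names and the "VIEW" action value are hypothetical, and a real investigation would run the equivalent query in your log store or SIEM rather than over a Python list.

```python
def events_for_resource(events, resource_type, resource_id):
    """Return every event touching one resource, oldest first."""
    matches = [
        e for e in events
        if e["resource"]["type"] == resource_type
        and e["resource"]["id"] == resource_id
    ]
    # Sorting the raw strings only works because every timestamp is assumed
    # to use the same ISO 8601 UTC format.
    return sorted(matches, key=lambda e: e["timestamp"])


def has_accessed(events, user_id, resource_type, resource_id):
    """Answer: did this user ever touch this resource?"""
    return any(
        e["actor"]["userId"] == user_id
        for e in events_for_resource(events, resource_type, resource_id)
    )


# Hypothetical sample data for illustration.
sample_events = [
    {"timestamp": "2024-01-15T14:32:01.234Z", "actor": {"userId": "user-123"},
     "resource": {"type": "ticket", "id": "ticket-789"}, "action": "UPDATE"},
    {"timestamp": "2024-01-15T14:30:00.000Z", "actor": {"userId": "user-123"},
     "resource": {"type": "ticket", "id": "ticket-789"}, "action": "VIEW"},
    {"timestamp": "2024-01-15T14:31:00.000Z", "actor": {"userId": "user-999"},
     "resource": {"type": "ticket", "id": "ticket-111"}, "action": "VIEW"},
]

timeline = events_for_resource(sample_events, "ticket", "ticket-789")
```

If a question like this cannot be expressed as a simple filter over your events, that is usually a sign a field is missing.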

Essential Fields

Every audit event should include:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T14:32:01.234Z",
  "eventType": "TICKET_UPDATE",
  "actor": {
    "userId": "user-123",
    "email": "agent@company.com",
    "roles": ["Agent"],
    "sessionId": "sess-456",
    "ip": "192.168.1.100",
    "userAgent": "Mozilla/5.0..."
  },
  "resource": {
    "type": "ticket",
    "id": "ticket-789"
  },
  "action": "UPDATE",
  "outcome": "SUCCESS",
  "metadata": {
    "changedFields": ["status", "priority"],
    "previousValues": {
      "status": "open",
      "priority": "low"
    },
    "newValues": {
      "status": "in_progress",
      "priority": "high"
    }
  }
}

Why Each Field Matters

  • id: Unique identifier for the event itself, so it can be referenced, deduplicated, and correlated across systems
  • timestamp: ISO 8601 with milliseconds, always UTC
  • actor: Everything about who performed the action. Include session and network info for forensics.
  • resource: What was acted upon, with enough info to find it
  • action + outcome: What happened and whether it succeeded
  • metadata: Context specific to this event type. For changes, include before/after values.
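
As a rough illustration of how these fields can be assembled, here is a minimal Python sketch. The helper names (utc_timestamp, diff_fields, build_audit_event) are made up for this example, not part of any library, and the sketch only builds the event in memory without writing it anywhere.

```python
import uuid
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """ISO 8601 timestamp in UTC with millisecond precision and a Z suffix."""
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def diff_fields(before: dict, after: dict) -> dict:
    """Build change metadata: which fields changed, plus before/after values.

    A key that is absent on one side is reported as None, so this sketch
    cannot tell "missing" apart from an explicit None.
    """
    changed = sorted(
        key for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )
    return {
        "changedFields": changed,
        "previousValues": {key: before.get(key) for key in changed},
        "newValues": {key: after.get(key) for key in changed},
    }


def build_audit_event(event_type, actor, resource, action, outcome, metadata=None):
    """Assemble one audit event with the fields described above."""
    return {
        "id": str(uuid.uuid4()),
        "timestamp": utc_timestamp(),
        "eventType": event_type,
        "actor": actor,
        "resource": resource,
        "action": action,
        "outcome": outcome,
        "metadata": metadata or {},
    }


before = {"status": "open", "priority": "low", "assignee": "user-123"}
after = {"status": "in_progress", "priority": "high", "assignee": "user-123"}

event = build_audit_event(
    event_type="TICKET_UPDATE",
    actor={"userId": "user-123", "sessionId": "sess-456", "ip": "192.168.1.100"},
    resource={"type": "ticket", "id": "ticket-789"},
    action="UPDATE",
    outcome="SUCCESS",
    metadata=diff_fields(before, after),
)
```

Computing the diff in one place keeps the before/after values consistent across event types, and unchanged fields (assignee here) stay out of the log.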

What to Log

Log security-relevant events, not everything. Focus on:

  1. Authentication events: Login success, failure, logout, token refresh
  2. Authorization decisions: Especially denials—these are gold for detecting attacks
  3. Data access: Who viewed sensitive records
  4. Data modifications: Creates, updates, deletes with before/after state
  5. Configuration changes: Role assignments, permission changes, settings modifications
  6. Administrative actions: User management, system configuration

Anti-Patterns

1. Logging Too Little

"We log logins" is not enough. If you can't answer "did user X ever view document Y?", your logging is insufficient.

2. Logging Too Much

Logging every function call creates noise that buries signal. Nobody wants to grep through millions of "user moved mouse" events.

3. Missing Context

"Status changed" is useless. "Status changed from 'open' to 'closed' by user@email at 14:32:01 from IP 192.168.1.100" is actionable.

4. Mutable Logs

If an attacker can delete or edit logs, they can cover their tracks. Audit logs should be append-only, or shipped to a separate, protected system that the application's own credentials cannot modify.
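
One complementary technique, which is an addition of mine rather than a replacement for append-only storage, is to make tampering detectable by hash-chaining entries: each entry's hash covers the previous entry's hash. The sketch below uses only the Python standard library, and the function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def chain_hash(prev_hash: str, event: dict) -> str:
    """Hash the event together with the previous entry's hash."""
    # sort_keys gives a stable serialization, so the same event always hashes the same.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "hash": chain_hash(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or removed entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["hash"] != chain_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

This catches edits and deletions in the middle of the log, but not truncation of the most recent entries, and an attacker who can rewrite the whole log can also recompute the chain. It only helps if the latest hash is periodically recorded somewhere the attacker cannot reach.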

Implementation Tips

  • Use structured logging (JSON) so logs are queryable
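  • Log at the service/API layer, not the database layer: the service knows who the actor is and what they were trying to do, while the database usually sees only a shared service account running SQL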
  • Include request context (trace ID, session ID) for correlation
  • Ship logs to a centralized SIEM (Splunk, Sentinel, etc.)
  • Set retention policies based on compliance requirements
  • Alert on suspicious patterns; don't just store and forget
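
To show what alerting on a pattern can look like, here is a minimal sketch of a sliding-window detector for repeated failed logins from one IP. In practice this rule would normally live in your SIEM. The class name, the "LOGIN" event type, the "FAILURE" outcome value and the default threshold are assumptions for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedLoginDetector:
    """Flag an IP once it produces `threshold` failed logins within `window`."""

    def __init__(self, threshold=5, window=timedelta(minutes=5)):
        self.threshold = threshold
        self.window = window
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def observe(self, event: dict) -> bool:
        """Feed one audit event; return True if it should trigger an alert."""
        if event["eventType"] != "LOGIN" or event["outcome"] != "FAILURE":
            return False
        # Normalize the Z suffix so fromisoformat accepts it on older Pythons.
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        recent = self._failures[event["actor"]["ip"]]
        recent.append(ts)
        # Drop failures that have fallen out of the window.
        while recent and ts - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold
```

Keying on IP is one choice among several. Keying on userId instead catches password guessing against a single account from many addresses.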

Takeaway

Build your audit logging as if you'll need to explain exactly what happened to regulators, lawyers, and executives—because someday you might.