Filter & Score Logs

Not all logs are created equal. In a production system, "DEBUG" logs can consume 80% of your storage budget while providing 1% of the value. In this step, we implement intelligent filtering and scoring.

Goal

Score logs based on severity and content
Filter out low-value logs (e.g., successful health checks, debug logs)
Prioritize critical errors

Configuration

1. Severity Scoring

We map each standard logging level to a numeric priority and can also upgrade a log to FATAL when its message contains a crash keyword (e.g., "panic" or "crash").

    - mapping: |
        root = this

        # Calculate numeric priority (higher is more critical)
        let priority = match this.level {
          "FATAL" => 10
          "ERROR" => 8
          "WARN" => 5
          "INFO" => 3
          "DEBUG" => 1
          _ => 0
        }

        # Detect crash keywords; a missing message is treated as an empty string
        let msg = this.message.or("")
        let is_crash = $msg.contains("panic") || $msg.contains("crash")

        # Upgrade level and priority if a crash keyword was found
        root.level = if $is_crash { "FATAL" } else { this.level }
        root.priority_score = if $is_crash { 10 } else { $priority }
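
The keyword check above is case-sensitive and limited to two fixed words. As a minimal sketch, assuming you also want to catch "exception" and variants such as "PANIC", the check could be driven by a list instead. The `keywords` list is illustrative rather than part of the original pipeline, and the fragment assumes it runs after the scoring mapping so that `priority_score` already exists.

```yaml
pipeline:
  processors:
    - mapping: |
        root = this

        # Hypothetical keyword list; adjust it to your own failure vocabulary.
        let keywords = ["panic", "crash", "exception"]

        # Lowercase once so "PANIC" and "Panic" match too; a missing message becomes "".
        let msg = this.message.or("").string().lowercase()
        let hit = $keywords.any(k -> $msg.contains(k))

        # Only overwrite level and score when a keyword is found;
        # otherwise both assignments are skipped and the existing values stay.
        root.level = if $hit { "FATAL" }
        root.priority_score = if $hit { 10 }
```

Keeping the keywords in one list means adding a new trigger word is a one-line change, at the cost of matching substrings inside longer words (for example "crash" inside "crashlytics").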

2. Intelligent Filtering

We drop logs that fall below a certain priority, or filter out specific noisy patterns.

    - mapping: |
        root = this

        # Drop DEBUG logs unless explicitly enabled
        # (an unset variable is treated as "false")
        if this.level == "DEBUG" && env("DEBUG_LOGGING_ENABLED").or("false") != "true" {
           root = deleted()
        }

        # Drop Health Checks (often very noisy)
        if this.service == "health-check" && this.level == "INFO" {
           root = deleted()
        }
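
The rules above filter on level and service name. To filter on the numeric score instead, one option is a configurable threshold. This is a sketch under assumptions: `MIN_PRIORITY_SCORE` is a hypothetical environment variable name, and the fragment expects the `priority_score` field produced by the scoring mapping in section 1.

```yaml
pipeline:
  processors:
    - mapping: |
        root = this

        # MIN_PRIORITY_SCORE is a hypothetical variable name; default to 3 (INFO) when unset.
        let min_score = env("MIN_PRIORITY_SCORE").or("3").number()

        # Logs without a score are treated as 0, so any positive threshold drops them.
        root = if this.priority_score.or(0) < $min_score { deleted() }
```

With the default of 3, this keeps INFO and above; raising the value to 5 would keep only WARN, ERROR, and FATAL without editing the mapping itself.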

Complete Step 4 Configuration

production-pipeline-step-4.yaml
input:
  http_server:
    address: "0.0.0.0:8080"
    path: /logs/ingest
    rate_limit: "1000/1s"
    auth:
      type: header
      header: "X-API-Key"
      required_value: "${LOG_API_KEY}"

pipeline:
  processors:
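    - mapping: |
        # Parse the raw request body; non-JSON payloads are wrapped as a plain message
        let doc = content().parse_json().catch({"message": content().string()})
        root = $doc
        # Default a missing level to INFO and normalize to uppercase
        root.level = $doc.level.or("INFO").string().uppercase()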

    # Score & Filter
    - mapping: |
        root = this

        # 1. Scoring
        let score = match this.level {
          "FATAL" => 10
          "ERROR" => 8
          "WARN" => 5
          "INFO" => 3
          "DEBUG" => 1
          _ => 0
        }
        root.score = $score

        # 2. Filtering
        # Drop low-value logs (score below 3: DEBUG and unknown levels)
        if $score < 3 {
           root = deleted()
        }

        # Drop noisy health checks
        if this.message.or("").contains("HealthCheck") && $score < 5 {
           root = deleted()
        }

output:
  stdout: {}

Deployment & Verification

Test Dropped Log (DEBUG):

curl -X POST http://localhost:8080/logs/ingest \
  -H "X-API-Key: $LOG_API_KEY" \
  -d '{"level": "DEBUG", "message": "This should be dropped"}'

Result: No output (log dropped).

Test Kept Log (ERROR):

curl -X POST http://localhost:8080/logs/ingest \
  -H "X-API-Key: $LOG_API_KEY" \
  -d '{"level": "ERROR", "message": "Keep this"}'

Result: Log output with score=8.

Next Steps

We've reduced noise. Now we must ensure we don't leak sensitive user data.

👉 Step 5: Redact Sensitive Data
