
Pipeline Editor Tutorial

Master log parsing with Grok patterns, processors, and pipelines.

Updated: 2025-01-11

Interactive Tutorial Available

This tutorial is also available as an interactive walkthrough in the app. Open the Pipeline Editor and click "Start Tutorial".

Sample Logs

We'll use real Django application logs throughout this tutorial:

Django Log (with Multiline Traceback)

2024-11-30 10:15:25,890 INFO django.server "GET /api/products/ HTTP/1.1" 200 8456
2024-11-30 10:15:26,901 WARNING django.security.csrf Forbidden (CSRF token missing or incorrect): /api/orders/
2024-11-30 10:15:27,012 ERROR myapp.views Exception in view function
Traceback (most recent call last):
  File "/app/myapp/views.py", line 45, in create_order
    order = Order.objects.create(**validated_data)
  File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
django.db.utils.IntegrityError: duplicate key value violates unique constraint "orders_order_number_key"
2024-11-30 10:15:28,123 INFO django.server "POST /api/orders/ HTTP/1.1" 500 1234
2024-11-30 10:15:29,234 INFO celery.worker Task myapp.tasks.send_notification[abc-123] received

Mixed Format Log (for Fallback Demo)

{"level":"info","msg":"ok"}
2024-11-30 10:15:31,000 INFO legacy.service Old format log message
{"level":"error","msg":"fail"}
2024-11-30 10:15:33,000 ERROR legacy.service Database connection failed

Step 1: Understanding Raw Logs

Before parsing, logs are plain text: the parser has no names for any part of a line.

Log Structure

Each log entry typically contains a timestamp, level, logger, and message.

Multiline Entries

Tracebacks span multiple lines but represent a single log event. We'll handle these in Step 6.

Step 2: Creating a Grok Pattern

Grok patterns use named placeholders to extract fields: each placeholder in the pattern maps one part of the raw log line to a named field.

Pattern Syntax

The basic syntax is: %{PATTERN_NAME:field_name}

  • PATTERN_NAME: A predefined pattern (like TIMESTAMP_ISO8601)
  • field_name: The name you want for the extracted field

Complete Pattern

%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{NOTSPACE:logger} %{GREEDYDATA:message}
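Under the hood, each Grok placeholder expands to a regular expression. As a rough sketch (the regexes below are simplified stand-ins, not the exact definitions the editor uses), the complete pattern behaves like a Python regex with named groups:

```python
import re

# Simplified regex equivalents of the Grok placeholders used above:
#   %{TIMESTAMP_ISO8601} -> date, time, and milliseconds
#   %{LOGLEVEL}          -> common level keywords
#   %{NOTSPACE}          -> a run of non-whitespace characters
#   %{GREEDYDATA}        -> everything to the end of the line
PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>DEBUG|INFO|WARNING|ERROR|CRITICAL) "
    r"(?P<logger>\S+) "
    r"(?P<message>.*)"
)

line = '2024-11-30 10:15:25,890 INFO django.server "GET /api/products/ HTTP/1.1" 200 8456'
fields = PATTERN.match(line).groupdict()
# fields["level"] == "INFO", fields["logger"] == "django.server"
```

Note how each named group lands in the output dict under the field name you chose, exactly as the extracted fields appear in the editor.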

Step 3: Understanding Processors

Processors are reusable parsing rules. There are four processor types; this tutorial exercises two of them, Grok and JSON.

To create a processor: Processors tab → + New Processor → Name it and select a type.

Step 4: Building a Pipeline

Logs flow through a pipeline in order. A pipeline (here named "Django Logs") is an ordered list of steps, and each step references a processor. A raw line such as "2024-11-30 INFO Started" enters the pipeline, step 1 applies the "Django Grok" processor (a Grok processor), and the extracted fields come out the other end.

Why Pipelines?

  • Reusability: Same processor in multiple pipelines
  • Composition: Combine multiple parsing strategies

Create One

  1. Pipelines tab → + New Pipeline
  2. Name it "Django App Logs"
  3. + Add Step → Select your processor
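Conceptually, the relationship between pipelines and processors can be sketched as a small data model. This is a hypothetical illustration of the idea, not the app's actual implementation; the names `Processor` and `Pipeline` are placeholders:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Processor:
    name: str
    parse: Callable[[str], Optional[dict]]  # returns extracted fields, or None on failure

@dataclass
class Pipeline:
    name: str
    steps: list = field(default_factory=list)  # ordered references to processors

# A trivial stand-in parser; a real Grok processor would extract named fields.
django_grok = Processor("Django Grok", parse=lambda line: {"raw": line})

app_logs = Pipeline("Django App Logs", steps=[django_grok])
audit_logs = Pipeline("Audit Logs", steps=[django_grok])  # same processor, reused
```

The key point is reusability: both pipelines hold a reference to the same processor object, so editing the processor updates every pipeline that uses it.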

Step 5: Multi-Step Fallback Pipeline

Mixed-format logs flow through a fallback chain. The sample input interleaves JSON and plain-text entries:

{"level":"info","msg":"ok"}
2024-11-30 ERROR plain log
{"level":"error","msg":"fail"}
2024-11-30 INFO another

The pipeline has two steps, each with stopOnSuccess enabled: a JSON Parser first, then a Grok Parser.

How It Works

  • JSON logs → Parsed by Step 1, stops there
  • Plain text → Step 1 fails, falls through to Step 2

stopOnSuccess

  • Enabled ✓: Stop at first successful step (fallback chain)
  • Disabled ○: Run all steps (combine fields from multiple parsers)
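The fallback logic above can be sketched in a few lines of Python. This is a minimal illustration of the stopOnSuccess idea, assuming each parser returns a dict of fields on success and None on failure (the Grok regex here is a simplified stand-in):

```python
import json
import re

GROK = re.compile(r"(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<logger>\S+) (?P<message>.*)")

def parse_json(line):
    try:
        obj = json.loads(line)
        return obj if isinstance(obj, dict) else None
    except json.JSONDecodeError:
        return None

def parse_grok(line):
    m = GROK.match(line)
    return m.groupdict() if m else None

def run_pipeline(line, steps):
    """steps: list of (parser, stop_on_success) pairs, tried in order."""
    fields = {}
    for parser, stop_on_success in steps:
        result = parser(line)
        if result is not None:
            fields.update(result)
            if stop_on_success:
                break  # fallback chain: first success wins
    return fields

steps = [(parse_json, True), (parse_grok, True)]
```

With stopOnSuccess disabled on every step, the loop never breaks, so fields from multiple parsers accumulate into one entry.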

Pro Tip

Check Step Traces to see which processor handled each entry!

Step 6: Multiline Log Parsing

Before parsing, raw lines are grouped into entries using a start pattern (^%{TIMESTAMP_ISO8601}). Consider these raw lines:

2024-11-30 10:15:25 INFO Server started
2024-11-30 10:15:27 ERROR Exception occurred
Traceback (most recent call last):
File "app.py", line 42
ValueError: invalid input
2024-11-30 10:15:28 INFO Request handled

Grouping produces three entries: lines without a timestamp are grouped with the previous entry, so the traceback stays attached to the ERROR line.

Key Concept

Lines matching the start pattern begin a new entry. Lines without a match are appended to the previous entry.

Configure It

  1. Expand Multiline Configuration in your pipeline
  2. Set Start Pattern: ^%{TIMESTAMP_ISO8601}
  3. Lines without timestamps will group with the previous entry
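The grouping rule is simple enough to sketch directly. This is an illustrative implementation, assuming the start pattern expands to a timestamp-matching regex:

```python
import re

# ^%{TIMESTAMP_ISO8601}, roughly: a line that begins with a date and time
START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def group_multiline(lines):
    entries = []
    for line in lines:
        if START.match(line) or not entries:
            entries.append(line)            # start a new entry
        else:
            entries[-1] += "\n" + line      # continuation: append to previous entry
    return entries

lines = [
    "2024-11-30 10:15:25 INFO Server started",
    "2024-11-30 10:15:27 ERROR Exception occurred",
    "Traceback (most recent call last):",
    '  File "app.py", line 42',
    "ValueError: invalid input",
    "2024-11-30 10:15:28 INFO Request handled",
]
entries = group_multiline(lines)  # three entries; the traceback stays with the ERROR line
```

The grouped entries are then handed to the pipeline, so the Grok processor sees one entry per event rather than one per line.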

Step 7: Testing & Validation

Run your sample logs, then click entries in the Test Results panel to explore them; toggle Step Traces to see which processor handled each. For every parsed entry the panel lists its field values, for example:

timestamp: 2024-11-30 10:15:25
level: INFO
message: Server started

Each entry is marked either "Parsed successfully" or "Parse failed".

Debugging Tips

  • Entry not parsed? Check multiline grouping or pattern syntax
  • Wrong fields? Patterns are whitespace-sensitive
  • Fallback not working? Enable stopOnSuccess on earlier steps
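The whitespace-sensitivity tip is worth seeing concretely. In regex terms (a sketch, not the editor's exact behavior), a literal space in the pattern matches only a space, so a tab-separated log silently fails:

```python
import re

# The pattern expects exactly one literal space between the fields.
pat = re.compile(r"(?P<level>[A-Z]+) (?P<logger>\S+)")

pat.match("INFO django.server")   # matches
pat.match("INFO\tdjango.server")  # None: a tab is not a literal space
```

When a seemingly correct pattern fails, inspect the raw entry for tabs or doubled spaces.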

You're Ready!

You've learned:

  • ✓ Log structure recognition
  • ✓ Grok pattern creation
  • ✓ Processor types
  • ✓ Pipeline building
  • ✓ Fallback chains
  • ✓ Multiline handling
  • ✓ Testing and debugging

Now go parse some real logs!
