Sed, a powerful mini-language from the 70s

There's a moment when using sed stops feeling like typing weird incantations… and starts feeling like you're programming a living stream of text.

At first, it looks like this:

  
  
sed -n '1,2p' file
  
  

And you think:

"ok… print lines 1 and 2… neat, but whatever"

But if you stay with it — if you push just a bit further — you discover something unexpected:

sed is not a command. It is a language.

A small one. A strange one. But a real one, with control flow, memory, and a model of execution.

And that's how you end up writing things like:

script.sed

  
  
/./H
/^$/ {
    x
    /./ {
        s/\n/ /g
        s/^ //
        p
    }
}
$ {
    x
    s/\n/ /g
    s/^ //
    /./ { p }
}
  
  
  
  
sed -n -E -f script.sed ex3.log
  
  

…and suddenly, you're not filtering text anymore.

👉 You're processing a stream.


The first principle: sed is a stream machine

At its core, sed does this:

  
  
read line → transform → (maybe print) → repeat
  
  

One line at a time. By default, no context and no memory.
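As a rough mental model only (this is not how sed is implemented), the cycle can be imitated with a plain shell loop. The function name emulate_print_matching and the sample text are made up for this sketch; the loop mimics sed -n '/error/p'.

```shell
# Hypothetical helper: imitates `sed -n '/error/p'` with an explicit loop.
emulate_print_matching() {
  while IFS= read -r line; do             # read one line (the "pattern space")
    case $line in
      *error*) printf '%s\n' "$line" ;;   # maybe print
    esac
  done                                    # repeat until input ends
}

# Made-up sample input.
sample='ok
disk error
ok again
network error'

printf '%s\n' "$sample" | emulate_print_matching
printf '%s\n' "$sample" | sed -n '/error/p'
```

Both commands should print the same two lines, which is the whole point: each line is looked at in isolation.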

That's why the simplest commands feel trivial:

Basic streaming

  
  
sed '1,2p' file      # print lines 1 and 2 twice (every line is also printed by default)
sed '/error/p' file  # print lines matching "error" twice
sed '5q' file        # print lines 1 to 5, then quit
sed '/foo/d' file    # delete lines matching "foo"
  
  

Core commands you must know

Command  Meaning
p        print the pattern space
d        delete the pattern space and start the next cycle
q        print the pattern space (unless -n), then quit
s///     substitute

Example

  
  
sed -n '1,5p' file
  
  

👉 -n disables the default output, so only the lines you print explicitly with p appear
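A small sketch of what -n changes, on a made-up three-line input (the variable names are only for this example):

```shell
# Hypothetical three-line input, not from the article.
with_echo=$(printf 'one\ntwo\nthree\n' | sed '1p')         # default output stays on
explicit_only=$(printf 'one\ntwo\nthree\n' | sed -n '1p')  # -n: only p prints

printf 'without -n:\n%s\n' "$with_echo"       # "one" shows up twice
printf 'with -n:\n%s\n' "$explicit_only"      # only "one"
```

Without -n, p adds a second copy of the line on top of the automatic print at the end of each cycle.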


Editing the stream

You can also modify lines:

  
  
sed 's/foo/bar/g' file   # replace text
  
  

Insert / append / change

These are surprisingly expressive:

  
  
sed '/Alice/i BEFORE' file   # insert before
sed '/Alice/a AFTER' file    # append after
sed '/Alice/c REPLACED' file # replace line
  
  
Command  Effect
i        insert before
a        append after
c        replace line
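The one-line spelling of i, a and c shown above is a GNU sed extension. A sketch of the more portable spelling (backslash, newline, then the text), run on made-up names:

```shell
# Hypothetical sample data.
names=$(printf 'Bob\nAlice\nCarol\n')

# Portable form: the text starts on the line after the backslash.
inserted=$(printf '%s\n' "$names" | sed '/Alice/i\
BEFORE')

appended=$(printf '%s\n' "$names" | sed '/Alice/a\
AFTER')

changed=$(printf '%s\n' "$names" | sed '/Alice/c\
REPLACED')

printf '%s\n---\n%s\n---\n%s\n' "$inserted" "$appended" "$changed"
```

BEFORE lands above Alice, AFTER lands below it, and REPLACED takes its place.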

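You can combine an address range with per-line processing, for example:

  
  
sed -n -e '2,4 {/Alice/c REPLACED' -e '}; p' file
  
  

Note that we separate the commands with ; and wrap the change command c inside {}, so it only applies to lines 2 to 4 that match Alice. With GNU sed, the text of a one-line c runs to the end of its line, which is why the closing } sits in its own -e fragment.

After printing its text, c deletes the pattern space and starts the next cycle, so no further command runs for that line, inside or outside the block.

Every other line reaches p and is printed unchanged.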
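A sketch of the same idea on made-up input, with one -e fragment per line. With GNU sed the text of c runs to the end of its line, so the closing brace cannot share that line. The function name change_in_range and the sample names are assumptions for illustration.

```shell
# Hypothetical helper: replace "Alice" lines, but only within lines 2 to 4.
change_in_range() {
  sed -n \
    -e '2,4{' \
    -e '/Alice/c\' \
    -e 'REPLACED' \
    -e '}' \
    -e 'p'
}

# Alice appears on lines 1, 3 and 5; only line 3 is inside the range.
printf 'Alice\nBob\nAlice\nCarol\nAlice\n' | change_in_range
```

Only the Alice on line 3 should turn into REPLACED. That line never reaches p, because c ends the cycle, yet the replacement text is still written even under -n.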


The first illusion breaks

Up to now, sed is just: "a tool that edits lines"

But then you hit a problem:

Problem: multi-line data

  
  
Error: Something failed:
    at module A
    at module B
  
  

👉 This is not line-based anymore

Enter: loops and multi-line thinking

  
  
sed -n -e ":loop N; s/\n[[:space:]]\+/ /g; t loop; p" ex.log
  
  

Input:

  
  
Error: Something failed:
    at module A
    at module B
Next event
  
  

Output:

  
  
Error: Something failed: at module A at module B
Next event
  
  

What just happened?

You just wrote a loop:

  
  
:loop
N
s/\n[[:space:]]\+/ /g
t loop
  
  

Breakdown:

  • N → append next line
  • s → try to merge lines
  • t loop → repeat if substitution worked

💥 This is a real loop. Equivalent to:

  
  
while can_merge:
    merge_lines()
  
  

👉 At this moment, sed stops being a filter — and becomes a program.
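One caveat with the loop above: under -n, GNU sed drops the pattern space when N runs out of input, so a file that ends on a continuation line would lose its last record. A hedged sketch of a variant that guards N with $! and emits one logical line at a time with P and D (the function name join_continuations is made up):

```shell
# Hypothetical helper: fold indented continuation lines into the line above.
join_continuations() {
  sed -e ':loop' \
      -e '$!N' \
      -e 's/\n[[:space:]][[:space:]]*/ /' \
      -e 't loop' \
      -e 'P' \
      -e 'D'
}

printf 'Error: Something failed:\n    at module A\n    at module B\nNext event\n' |
  join_continuations
```

Here $!N only appends when a next line exists, P prints the finished first line, and D keeps the rest of the buffer for the next round.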


The big shift: pattern space is not a line anymore

With N, pattern space becomes:

  
  
line1\nline2\nline3
  
  

👉 You are now working on buffers.
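To see the buffer for yourself, the l command dumps the pattern space unambiguously, showing embedded newlines as \n and marking the end with $. A tiny sketch on made-up input:

```shell
# Append two more lines to the first one, then dump the pattern space.
dump=$(printf 'line1\nline2\nline3\n' | sed -n 'N; N; l')

printf '%s\n' "$dump"   # line1\nline2\nline3$
```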


The second big shift: sliding windows

Now consider:

  
  
sed -n -E 'N; h; s/\n/->/g ;p; g; D' ex2.log
  
  

Input:

  
  
A
B
C
D
E
  
  

Output:

  
  
A->B
B->C
C->D
D->E
  
  

What is happening?

This is where sed becomes… almost magical.

Two worlds appear:

  1. Pattern space → computation

      
      
    A\nB  →  transformed  →  A->B
      
      
  2. Hold space → memory

      
      
    A\nB
      
      

The key operations

  
  
h   # save structure
g   # restore structure
D   # slide window
  
  

What you built

A sliding window:

  
  
[A,B]
[B,C]
[C,D]
[D,E]
  
  

The invariant

👉 D needs \n to exist

So you:

  1. destroy structure (s/\n/->/)
  2. print
  3. restore structure (g)
  4. slide (D)
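A quick way to convince yourself that the restore step matters is to leave it out. In this sketch (same A to E input, variable names made up), D finds no newline after the substitution, behaves like d, and the window jumps instead of sliding:

```shell
# Without h/g: the newline is gone when D runs, so D acts like d.
no_restore=$(printf 'A\nB\nC\nD\nE\n' | sed -n 'N; s/\n/->/; p; D')

# With h/g: the newline is back before D, so the window slides by one line.
with_restore=$(printf 'A\nB\nC\nD\nE\n' | sed -n 'N; h; s/\n/->/; p; g; D')

printf 'no restore:\n%s\n' "$no_restore"       # A->B, C->D
printf 'with restore:\n%s\n' "$with_restore"   # A->B, B->C, C->D, D->E
```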

Final model

  
  
structure = truth
display   = transformation
  
  

👉 You compute on a view, but preserve the truth.


When it starts being surprisingly good

Up until now, you've been transforming lines.

Then suddenly, you see this:

  
  
sed -n -E 'N; h; s/\n/->/g; p; G; D' ex2.log
  
  
  
  
A->B
A->B->C
A->B->C->D
A->B->C->D->E
  
  

And something feels… different.

This is no longer line editing.

👉 This is stateful stream computation.


The key: two memory regions

To understand this, you must accept one fundamental truth:

sed operates on two distinct spaces

1. Pattern space (the present)

👉 This is:

  • the current working buffer
  • what commands modify
  • what p prints

Think:

  
  
pattern = what I am computing right now
  
  

2. Hold space (the memory)

👉 This is:

  • persistent across cycles: it survives from one input line to the next
  • invisible unless accessed
  • controlled manually

Think:

  
  
hold = what I choose to remember
  
  

The commands that matter here

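Command  Meaning
N        append the next input line to pattern space (joined by \n)
h        copy pattern space into hold space
G        append hold space to pattern space (joined by \n)
D        delete pattern space up to the first \n, then restart the cycle without reading input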
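As a warm-up for h and G, here is a well-known one-liner that reverses a file using nothing but those two commands. It is a side example, not the script discussed in this section:

```shell
# 1!G : on every line but the first, append what we remembered so far
# h   : remember the new, longer buffer
# $p  : on the last line, print the whole thing
reversed=$(printf 'A\nB\nC\n' | sed -n '1!G; h; $p')

printf '%s\n' "$reversed"   # C, B, A on separate lines
```

Each new line is put in front of everything seen before, so the hold space ends up holding the input in reverse order.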

Let's walk through execution

Input:

  
  
A
B
C
D
E
  
  

Iteration 1

Step 1 — N

  
  
pattern = A\nB
hold    = (empty)
  
  

Step 2 — h

  
  
pattern = A\nB
hold    = A\nB
  
  

👉 snapshot of structure

Step 3 — s/\n/->/

  
  
pattern = A->B
hold    = A\nB
  
  

👉 ⚠️ structure is destroyed in pattern

Step 4 — p

  
  
OUTPUT: A->B
  
  

Step 5 — G

  
  
pattern = A->B\nA\nB
hold    = A\nB
  
  

👉 re-inject original structure

Step 6 — D

  
  
pattern = A\nB
hold    = A\nB
  
  

👉 remove display part, keep structure


Iteration 2

Step 1 — N

  
  
pattern = A\nB\nC
  
  

Step 2 — h

  
  
hold = A\nB\nC
  
  

👉 overwrite previous snapshot

Step 3 — s

  
  
pattern = A->B->C
  
  

Step 4 — p

  
  
OUTPUT: A->B->C
  
  

Step 5 — G

  
  
pattern = A->B->C\nA\nB\nC
  
  

Step 6 — D

  
  
pattern = A\nB\nC
  
  

What is really happening

At every iteration, the structure grows:

  
  
A\nB
A\nB\nC
A\nB\nC\nD
  
  

But you never build from transformed data.
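One way to check this is to dump the pattern space with l right after G. The sketch below adds l to the same script and runs it on a shortened, made-up input (A, B, C):

```shell
# Same script as above, plus `l` after G to reveal the buffer before D runs.
trace=$(printf 'A\nB\nC\n' | sed -n 'N; h; s/\n/->/g; p; G; l; D')

printf '%s\n' "$trace"
# A->B
# A->B\nA\nB$
# A->B->C
# A->B->C\nA\nB\nC$
```

Every l line starts with the display and is followed by the untouched structure, which is what D keeps for the next round.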

The key insight

👉 You are maintaining two representations:

Role               Content
structure (truth)  A\nB\nC
display (view)     A->B->C

The trick

  
  
h   # save structure
s   # destroy structure (for display)
p   # show display
G   # append the saved structure after the display line
D   # drop the display line, restart with the structure
  
  

The invariant

D only restarts the cycle with leftover data if a \n exists; without one it behaves like d and starts a fresh cycle. So:

  • you temporarily break the structure
  • then restore it before D

This is the algorithm

  
  
grow structure   (N)
save structure   (h)

display = transform(structure)
print(display)

restore structure (G + D)
repeat
  
  

This is NOT obvious

At first glance, you might think:

"it builds A->B, then B->C…"

Wrong.

👉 It always builds from:

  
  
A\nB\nC\nD...
  
  

Think like this

  
  
pattern = working buffer
hold    = checkpoint of truth
  
  

Why this matters

This pattern unlocks:

  • cumulative computations
  • prefix building
  • streaming aggregation
  • rolling transformations
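As a hedged sketch of prefix building, here is a different script built on the same principle (truth in hold space, view in pattern space). It accumulates with H instead of N, and the path components are made up:

```shell
# H          : append the current line to hold space (the untouched truth)
# g          : copy the truth into pattern space (the view)
# s/^\n//    : drop the leading newline left by the first H
# s|\n|/|g   : render the view
prefixes=$(printf 'usr\nlocal\nbin\n' | sed -n 'H; g; s/^\n//; s|\n|/|g; p')

printf '%s\n' "$prefixes"   # usr, usr/local, usr/local/bin
```

The substitutions only ever touch the copy, so hold space keeps growing from clean data.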

Minimal mental model

👉 sed =

  
  
(pattern space) + (hold space) + (control flow)
  
  

sed is now a state machine

You now have:

  • memory (h, g)
  • loops (:, t)
  • buffers (N, D)

👉 This is no longer "text processing"
👉 This is stream programming


Final evolution: grouping blocks

Now consider real-world logs:

  
  
line1
line2



line3
line4
line5




line6
line7
  
  

Goal:

  
  
line1 line2
line3 line4 line5
line6 line7
  
  

Script

  
  
/./H
/^$/ {
    x
    /./ {
        s/\n/ /g
        s/^ //
        p
    }
}
$ {
    x
    s/\n/ /g
    s/^ //
    /./ { p }
}
  
  

▶️ Run it:

  
  
sed -n -E -f script.sed ex3.log
  
  

What is happening?

Expression  Effect
/./H        accumulate non-empty lines
/^$/        block separator
x           bring accumulated block into pattern space
s/\n/ /g    flatten block
s/^ //      drop the leading space left by the first H
$           handle last block (EOF)
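To try it end to end, this sketch writes the script and a sample log into a temporary directory and runs them. The temp directory and the exact number of blank lines in the sample are choices made for the example, and the script assumes GNU sed.

```shell
workdir=$(mktemp -d)

# The script from this section, saved as-is.
cat > "$workdir/script.sed" <<'EOF'
/./H
/^$/ {
    x
    /./ {
        s/\n/ /g
        s/^ //
        p
    }
}
$ {
    x
    s/\n/ /g
    s/^ //
    /./ { p }
}
EOF

# Sample log: blocks separated by one or more empty lines.
printf 'line1\nline2\n\n\nline3\nline4\nline5\n\nline6\nline7\n' > "$workdir/ex3.log"

result=$(sed -n -f "$workdir/script.sed" "$workdir/ex3.log")
rm -r "$workdir"

printf '%s\n' "$result"
```

Runs of empty lines are harmless: after the first one, hold space is empty, so the inner /./ test fails and nothing is printed.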

🔥 You just:

  • grouped variable-length blocks
  • transformed them
  • handled edge cases
  • wrote reusable logic

👉 This is not a one-liner anymore.
👉 This is a program.


The final mental model

sed is:

  
  
a streaming engine
+ a working buffer (pattern space)
+ a memory buffer (hold space)
+ control flow (loops, branches)
  
  

The secret to mastering sed

It's not about memorizing commands. It's about understanding:

1. The execution loop

  
  
read → process → print → repeat
  
  

2. The invariant

  • pattern space = computation
  • hold space = state

3. The transformations

Command  Action
N        grow
D        slide
H        accumulate
h / g    checkpoint / restore

Why it feels "shamanic"

Because you're not writing steps.

You're maintaining invariants:

  • "newline must exist"
  • "structure must be preserved"
  • "state must be restored"

The master of sed doesn't edit text.
They shape streams.

In a sense, sed is to plain text what jq is to JSON.

Note:

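To modify the file in place, add the -i flag, for example sed -i 's/foo/bar/g' file with GNU sed. BSD/macOS sed expects a backup suffix right after -i, which may be empty: sed -i '' 's/foo/bar/g' file.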
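A cautious sketch of in-place editing that keeps a backup. Attaching the suffix directly to -i (here .bak, an arbitrary choice) is a spelling both GNU and BSD sed accept; the temp file is only for the demo.

```shell
tmpfile=$(mktemp)
printf 'foo\n' > "$tmpfile"

# Edit in place, keeping the original as "$tmpfile.bak".
sed -i.bak 's/foo/bar/' "$tmpfile"

edited=$(cat "$tmpfile")
backup=$(cat "$tmpfile.bak")
rm -f "$tmpfile" "$tmpfile.bak"

printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
```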