sed
[<command> | ] sed [-e "<sed-script>"] [-f "<script-file>"] [input-file]* [ | <command>]
- sed is a line-by-line editor
- Reads the current line from the input stream, removes the trailing newline,
places it in the pattern space, and then runs the commands
- Can concatenate consecutive lines in the pattern space (each line will be
separated by \n):
For Unix-formatted files:
- A single blank line: '^$'
- Two consecutive blank lines (concatenated): '^\n$'
- Three consecutive blank lines (concatenated): '^\n\n$'
- When the commands are finished, unless the -n option is used, the
pattern space is printed out to the output stream, with the
trailing newline restored
- By default, the pattern space is cleared at the end of each cycle
- The hold space retains its data
- By default, if the -e or -f option is not used, the first
non-option argument is run as a sed script
- Typically, piping is used for input and output
- A trailing newline character (Unix) at the end of the last line of a file
(a non-empty line) does not start a new line; it simply terminates the
last line of the file
- In some editors, pressing <enter> at the end of the last line in the file
(a non-empty line) does not add a trailing blank line; the last line is
still the same line (only now it ends with a newline character (Unix)).
It may appear to add one in editors that allow the cursor to be placed
below the last line
- Using '...' instead of "..." prevents the Unix shell from
expanding $ and `...`
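The cycle and the quoting rule described above can be observed with a few one-liners. This is a minimal sketch; the sample input and the variable names (input, twice, once, last) are made up for illustration.

```shell
# Sample input (hypothetical): three short lines.
input=$(printf 'one\ntwo\nthree')

# Without -n, the pattern space is auto-printed at the end of each cycle,
# so an explicit p prints every line twice.
twice=$(printf '%s\n' "$input" | sed 'p')

# With -n, only the explicit p produces output.
once=$(printf '%s\n' "$input" | sed -n 'p')

# Single quotes keep the shell from expanding $, so sed receives the
# $ (last line) address unchanged.
last=$(printf '%s\n' "$input" | sed -n '$p')

printf '%s\n' "$twice" "$once" "$last"
```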
Terms
Pattern space
- The current line (or concatenated set of lines), minus the
trailing newline
Hold space
- A second buffer that keeps its data between cycles (initially empty)
Cycle
- Advancing to the next line of the file (which initially is the
first line) and applying the script from the beginning
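To show how the pattern space, the hold space, and the cycle interact, here is a small sketch that joins all input lines with commas. The input and the variable name joined are illustrative only.

```shell
# Each cycle loads one line into the pattern space. The hold space
# survives between cycles, so it can accumulate the lines.
#   1h    first cycle: copy the pattern space into the hold space
#   1!H   later cycles: append a newline plus the pattern space to the hold space
#   $...  last cycle: fetch the hold space, turn newlines into commas, print
joined=$(printf 'a\nb\nc\n' | sed -n '1h;1!H;${g;s/\n/,/g;p;}')
echo "$joined"
```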
Syntax
command Run the command on the pattern space
command;command; Run multiple commands on the pattern space
ADDRESS command Run the command if the pattern space matches ADDRESS (the separating space is usually optional)
ADDRESS!command Run the command if the pattern space does not match ADDRESS
ADDRESS {commands} Run the commands if the pattern space matches ADDRESS (the separating space is usually optional)
ADDRESS!{commands} Run the commands if the pattern space does not match ADDRESS
ADDRESS {commands};ADDRESS {commands} Run multiple conditional commands on the pattern space
- The ADDRESS is essentially an if statement, used to determine if
the command should be applied to the pattern space:
/one/ command # if (/one/) command
/one/!command # if (!/one/) command
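The address forms above can be tried on a three-line sample. This is a sketch; the input and the variable names (matched, unmatched, grouped) are assumptions made for the example.

```shell
# Sample input (hypothetical).
input=$(printf 'one\ntwo\nthree')

# ADDRESS command: print only lines that contain "o".
matched=$(printf '%s\n' "$input" | sed -n '/o/p')

# ADDRESS!command: print only lines that do not contain "o".
unmatched=$(printf '%s\n' "$input" | sed -n '/o/!p')

# ADDRESS {commands}: run two substitutions, but only on the "two" line.
grouped=$(printf '%s\n' "$input" | sed '/two/{s/two/2/;s/$/ (changed)/;}')

printf '%s\n' "$matched" "$unmatched" "$grouped"
```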
Regular expressions
/regex/ Regular expression address
^ Start of line (start of the pattern space)
$ End of line (end of the pattern space)
/one/ Lines that contain "one"
/^$/ Empty line
/./ Non-empty line
\%regex% Use different delimiter
\|regex| Use different delimiter
/regex/I Case insensitive
- Backreference and matching variable (respectively): s|(ab)\1|\1| (with -r; in basic regular expressions, write the group as \(ab\))
- Possibly not supported: {n}
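A short sketch of the regular expression addresses listed above, assuming GNU sed (the I modifier and -r are GNU extensions). The sample paths and variable names are hypothetical.

```shell
# Sample input (hypothetical): three paths and one empty line.
paths=$(printf '/usr/bin\n/home/User\n\n/tmp')

# \%regex% : a different delimiter avoids escaping the slashes.
home=$(printf '%s\n' "$paths" | sed -n '\%^/home/%p')

# /./ matches non-empty lines, /^$/ matches empty lines.
nonempty=$(printf '%s\n' "$paths" | sed -n '/./p' | wc -l | tr -d ' ')
empty=$(printf '%s\n' "$paths" | sed -n '/^$/p' | wc -l | tr -d ' ')

# /regex/I : case-insensitive match (GNU extension).
nocase=$(printf '%s\n' "$paths" | sed -n '/user/Ip')

# Backreference in the pattern, matching variable in the replacement.
dedup=$(echo 'abab cd' | sed -r 's|(ab)\1|\1|')

printf '%s\n' "$home" "$nonempty" "$empty" "$nocase" "$dedup"
```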
Line number
n Line number
1 Line 1
$ Last line
Line range
START,END First line that matches START, to first subsequent line that matches END (inclusive)
1,5 Lines 1 through 5
5,$ Line 5 through last line
/one/,/two/ First line that contains "one" to first line that contains "two"
/one/,15 First line that contains "one" to line 15
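The ranges above, applied to a six-line sample. A minimal sketch; the input and variable names are invented for the example.

```shell
# Sample input (hypothetical): lines "1", "2", "one", "4", "two", "6".
input=$(printf '%s\n' 1 2 one 4 two 6)

# 1,2 : lines 1 through 2.
head2=$(printf '%s\n' "$input" | sed -n '1,2p')

# 5,$ : line 5 through the last line.
tail2=$(printf '%s\n' "$input" | sed -n '5,$p')

# /one/,/two/ : first line containing "one" to the next line containing "two".
between=$(printf '%s\n' "$input" | sed -n '/one/,/two/p')

# /one/,4 : first line containing "one" to line 4.
mixed=$(printf '%s\n' "$input" | sed -n '/one/,4p')

printf '%s\n' "$head2" "$tail2" "$between" "$mixed"
```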
Commands
:LABEL Specify the location of LABEL for branch commands
bLABEL Unconditionally branch to LABEL (goto, jump) (remains in the current cycle; pattern space unchanged)
tLABEL Branch to LABEL only if there has been a successful substitution
since the last input line was read or conditional branch was taken
{ COMMANDS } A group of commands may be enclosed between { and } characters.
Allows a group of commands to be triggered by a single address (or address-range) match.
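Two common branch loops, sketched with one -e per command so that labels end cleanly at the end of each script fragment. The inputs and variable names are illustrative.

```shell
# b (unconditional branch): keep appending lines with N until the last
# line is reached, then replace every embedded newline with a space.
one_line=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')

# t (branch after a successful substitution): replace two spaces with
# one, and repeat until the substitution no longer matches.
squeezed=$(echo 'a    b' | sed -e ':a' -e 's/  / /' -e 'ta')

printf '%s\n' "$one_line" "$squeezed"
```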
Basic commands
q Quit, printing the current pattern space by default (unless -n is used)
-e '...' Run the following commands
-n Only prints out lines explicitly requested using "p" (by default, the entire
pattern space is printed)
Regular expressions
-r Extended regular expressions (GNU extension) (requires fewer regex characters
to be escaped) (default: basic regular expressions)
Basic regular expressions (default)
Must escape the following to use them as regex operators: ?, (, ), +, |, {, } (\?, \+, and \| are GNU extensions)
Extended regular expressions (-r)
None of the above characters must be escaped, unless they are meant
to be used as literal characters
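The same substitution written both ways, as a sketch that assumes GNU sed (\+ in basic regular expressions and -r are GNU extensions). The variable names are hypothetical.

```shell
# Basic regular expressions: groups and + must be escaped to act as operators.
bre=$(echo 'aaa-bbb' | sed 's/\(a\+\)-\(b\+\)/\2-\1/')

# Extended regular expressions (-r): the same operators are written bare.
ere=$(echo 'aaa-bbb' | sed -r 's/(a+)-(b+)/\2-\1/')

# A literal + is bare in basic mode and escaped in extended mode.
bre_literal=$(echo '1+1' | sed 's/1+1/2/')
ere_literal=$(echo '1+1' | sed -r 's/1\+1/2/')

printf '%s\n' "$bre" "$ere" "$bre_literal" "$ere_literal"
```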
Pattern space commands
- [nN]ext, [dD]elete, [pP]rint
- NDP are the multi-line equivalents of ndp
d Delete the pattern space; immediately start next cycle
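D Delete text in the pattern space up to and including the first newline (first line in the pattern space), then restart the cycle without reading new input; if there is no newline, behaves like d
n If auto-print is enabled, print the pattern space; then replace it with the next line of input (applying any remaining commands to that line)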
N Add a newline to the pattern space, then append the next line of input
p Print the pattern space (to standard output) (used in conjunction with -n)
P Print the pattern space up to the first newline (first line in the pattern space)
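A sketch of how the single-line and multi-line commands combine. The inputs and variable names are invented, and the last command is the widely used sliding-window idiom rather than anything specific to these notes.

```shell
# n;d : print the current line, load the next line, then delete it.
# Keeps lines 1, 3, ...
odd=$(printf '%s\n' 1 2 3 4 | sed 'n;d')

# N : append the next line to the pattern space, then join the pair
# by replacing the embedded newline.
pairs=$(printf '%s\n' 1 2 3 4 | sed 'N;s/\n/ /')

# $!N;P;D : keep a two-line window in the pattern space. P prints the
# first line of the window unless both lines are identical; D drops the
# first line and restarts the cycle with the remainder.
uniq_lines=$(printf '%s\n' a a b b b c | sed '$!N;/^\(.*\)\n\1$/!P;D')

printf '%s\n' "$odd" "$pairs" "$uniq_lines"
```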
Hold space commands
- [hH]old, [gG]et, [x]change
- HG are the multi-line equivalents of hg
h Replace the hold space with the pattern space
H Append a newline to the hold space, and then append the pattern space to the hold space
g Replace the pattern space with the hold space
G Append a newline to the pattern space, and then append the hold space to the pattern space
x Exchange the hold and pattern spaces
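Two small sketches of the hold space commands: reversing the order of lines, and printing the line before a match. The inputs and variable names are made up for illustration.

```shell
# Reverse the order of lines.
#   1!G  every line but the first: append the hold space (the lines seen
#        so far, already reversed) after the current line
#   h    save the result back to the hold space
#   $p   on the last line, print the whole accumulated buffer
reversed=$(printf '%s\n' a b c | sed -n '1!G;h;$p')

# Print the line that precedes a match.
#   h at the end of each cycle remembers the current line;
#   on a match, x swaps in the remembered line, p prints it, x swaps back.
before=$(printf '%s\n' one two three | sed -n '/two/{x;p;x;};h')

printf '%s\n' "$reversed" "$before"
```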
Substitution
s/search/replace/flags
- Flags
I, i Case insensitive
g Global (apply to all matches)
M, m Multi-line (allows ^ and $ to match for individual lines)
(\` and \' will always match the beginning or end of the buffer)
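e After a successful substitution, execute the pattern space as a shell command and replace the pattern space with its output (GNU extension; trailing newline is suppressed)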
NUMBER Only replace the NUMBERth match
p Print the new pattern space if a substitution was made
w FILE Write the result (if modified) to FILE
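The effect of the most common flags on one sample string. A minimal sketch; the string and variable names are hypothetical, and the I flag assumes GNU sed.

```shell
# Sample input (hypothetical).
s='a-b-c A-B-C'

# No flag: only the first match is replaced.
first=$(echo "$s" | sed 's/-/+/')

# g: every match is replaced.
all=$(echo "$s" | sed 's/-/+/g')

# NUMBER: only the second match is replaced.
second=$(echo "$s" | sed 's/-/+/2')

# I combined with g: case-insensitive, every match (GNU extension).
nocase=$(echo "$s" | sed 's/a/x/Ig')

# p with -n: print only the lines where a substitution was made.
changed=$(printf 'x1\ny2\n' | sed -n 's/x/X/p')

printf '%s\n' "$first" "$all" "$second" "$nocase" "$changed"
```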
Sample commands
1d Delete the first line
$d Delete the last line
1,5d Delete the first 5 lines
n;n;d; Delete every third line (skip skip delete ...)
q Print the first line
5q Print the first 5 lines
-n $p Print the last line
-n /regex/p Print lines that match regex
/regex/!d Print lines that match regex
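The sample commands above, run against seven numbered lines. This sketch assumes GNU sed for the n;n;d case, where the behavior of n at end of input matters; the variable names are illustrative.

```shell
# Sample input (hypothetical): the numbers 1 through 7, one per line.
lines=$(printf '%s\n' 1 2 3 4 5 6 7)

no_first=$(printf '%s\n' "$lines" | sed '1d')        # delete the first line
no_last=$(printf '%s\n' "$lines" | sed '$d')         # delete the last line
no_third=$(printf '%s\n' "$lines" | sed 'n;n;d')     # delete every third line
first5=$(printf '%s\n' "$lines" | sed '5q')          # print the first 5 lines
last=$(printf '%s\n' "$lines" | sed -n '$p')         # print the last line
matching=$(printf '%s\n' "$lines" | sed '/[35]/!d')  # keep lines matching [35]

printf '%s\n' "$no_third"
```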
Examples
File names
path=`echo "$file" | sed -e "s|/\?[^/]*$||g"` # Extract <path> from <path>/<filename>
filename=`echo "$file" | sed -e "s|^.*/\([^/]*$\)|\1|g"` # Extract <filename> from <path>/<filename>
path_is_absolute=`echo "$file" | sed -e "s|^/.*$|true|g"` # "true", if the path starts with "/"
file_ext=`echo "$file" | sed -e "s|^.*\.\([^.]*\)$|\1|g"` # Extract <ext> from [<path>/]<base>.<ext>
file_base=`echo "$filename" | sed -e "s|\.\([^.]*\)$||g"` # Extract <base> from <base>.<ext>
number=`echo "$file" | sed -re "s|([0-9]*).*|\1|g"` # Extract the leading number from the filename
is_file_group=`echo "$data" | sed -e "s|^[ \t]*\[.*\][ \t]*$|true|g"` # [text]
file_group_name=`echo "$data" | sed -e "s|^[ \t]*\[||g;s|\][ \t]*$||g"` # text
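A worked run of the file name expressions above on one sample path. The path /home/user/archive.tar.gz is a made-up example, and \? assumes GNU sed.

```shell
# Sample path (hypothetical).
file='/home/user/archive.tar.gz'

# Strip the last path component (and the slash before it).
path=$(echo "$file" | sed -e 's|/\?[^/]*$||g')

# Keep only the last path component.
filename=$(echo "$file" | sed -e 's|^.*/\([^/]*$\)|\1|g')

# Keep only the text after the last dot.
file_ext=$(echo "$file" | sed -e 's|^.*\.\([^.]*\)$|\1|g')

# Drop the last dot and everything after it.
file_base=$(echo "$filename" | sed -e 's|\.\([^.]*\)$||g')

# "true" when the path starts with "/"; otherwise the input is returned unchanged.
path_is_absolute=$(echo "$file" | sed -e 's|^/.*$|true|g')

printf '%s\n' "$path" "$filename" "$file_ext" "$file_base" "$path_is_absolute"
```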
Trailing blank lines
# Identify files with a trailing blank line
#
# Description:
# - If the last line of the file is a blank line, the sed command will output "found".
# - Print the file name for each such file.
# - Supports Unix-, DOS-, and Mac-formatted files.
#
# Regular expressions for matching a trailing blank line in the pattern space:
# - Unix: /^$/
# - DOS: /^\r+$/
# - Mac: /\r\r$/
#
for file in $(find . -name "*.txt"); do
result=$(cat "$file" | sed -nre '/(^\r*$)|(\r\r$)/ {s/.*/found/;$p}');
if [ "$result" = "found" ]; then
echo "$file";
fi;
done;
# Delete trailing blank lines
#
# Description:
# - Concatenate each set of consecutive blank lines in the pattern space.
# - When the end of the file is reached, delete the pattern space
# (which will delete the trailing blank lines, if there were any).
# - This command will overwrite all of the files, even those without trailing blank lines.
#
# - The regular expression matches any single blank line in the file. It also matches any number
# of blank lines that have been concatenated in the pattern space.
#
# - Supports Unix-, DOS-, and Mac-formatted files.
#
# Regular expressions for matching concatenated trailing blank lines in the pattern space:
# - Unix: /^\n*$/
# - DOS: /^\r(\r\n)*$/
# - Mac: /\r{2,}$/
#
for file in `find . -name "*.txt"`; do
cat "$file" | sed -re '/\r{2,}$/ {$s/\r*$/\r/}; :a /^[\r\n]*$/ {$d;N;ba}' > ~/sed.tmp;
mv ~/sed.tmp "$file";
done;
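To see the concatenate-then-delete idea in isolation, here is a simplified sketch for Unix line endings only. It is not the multi-format script above, and the sample input is hypothetical.

```shell
# Input: "x", a blank line, "y", then two trailing blank lines.
#   :a            loop label
#   /^\n*$/       the pattern space holds only blank lines
#   $d            ...and we are at the end of input: delete them
#   N;ba          otherwise append the next line and test again
# "END" is printed afterwards so that command substitution does not
# strip the trailing newlines we want to inspect.
trimmed=$(printf 'x\n\ny\n\n\n' | sed -e ':a' -e '/^\n*$/{$d;N;ba' -e '}'; printf 'END')
printf '%s\n' "$trimmed"
```

The interior blank line survives because the loop stops as soon as a non-blank line is appended; only blank lines that run up to the end of input are deleted.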
# Identify files with trailing blank lines, and delete the offending lines from those files.
#
# Description:
# - Supports Unix-, DOS-, and Mac-formatted files.
#
files=`for file in $(find . -name "*.txt"); do
result=$(cat "$file" | sed -nre '/(^\r*$)|(\r\r$)/ {s/.*/found/;$p}');
if [ "$result" = "found" ]; then
echo "$file";
fi;
done`;
for file in $files; do
echo "$file";
cat "$file" | sed -re '/\r{2,}$/ {$s/\r*$/\r/}; :a /^[\r\n]*$/ {$d;N;ba}' > ~/sed.tmp;
mv ~/sed.tmp "$file";
done;