The three pattern styles VMS, ULTRIX and TPU can be used in search
and substitute commands. The pattern style is set by the SET SEARCH
PATTERN command.
1. VMS style patterns
The VMS pattern style enables the special interpretation of wildcard
characters and a quote character in the search-string parameter as
shown below:
VMS-Style Wildcards

Wildcard    Matches
*           One or more characters of any kind on a line.
**          One or more characters of any kind crossing lines.
%           A single character.
\<          Beginning of a line.
\>          End of a line.
\[set-of-characters]
            Any character in the specified set. For example,
            \[abc] matches any letter in the set "abc" and
            \[c-t] matches any letter in the set "c" through
            "t."
\[~set-of-characters]
            Anything not in the specified set of characters.
\           Lets you specify the characters \,*,% or ] within
            wildcard expressions. For example, \\ matches the
            backslash character (\).
\.          Repeats the previous pattern zero or more times,
            including the original.
\:          Repeats the previous pattern at least once,
            including the original; that is, a null occurrence
            does not match.
\w          Any empty space created by the space bar or tab
            stops, including no more than one line break.
\d          Any decimal digit.
\o          Any octal digit.
\x          Any hexadecimal digit.
\a          Any alphabetic character, including accented
            letters, other marked letters, and non-English
            letters.
\n          Any alphanumeric character.
\s          Any character that can be used in a symbol:
            alphanumeric, dollar sign, and underscore.
\l          Any lowercase letter.
\u          Any uppercase letter.
\p          Any punctuation character.
\f          Any formatting characters: backspace, tab, line
            feed, vertical tab, form feed, and carriage return.
\^          Any control character.
\+          Any character with bit 7 set; that is, ASCII decimal
            values from 128 through 255.
For example, the following command finds a line that starts with an
uppercase letter:
PATTERN SEARCH "\<\u"
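For readers more familiar with regular expressions, a few of the
VMS-style wildcards can be approximated in Python. The following is
only a sketch: the mapping table and the function vms_to_regex are
hypothetical, cover a small ASCII-only subset of the table above, and
are not how the editor itself implements the wildcards.

```python
import re

# Hypothetical, ASCII-only approximations of a few VMS-style wildcards.
VMS_TO_REGEX = {
    "\\<": "^",            # beginning of a line
    "\\>": "$",            # end of a line
    "\\d": "[0-9]",        # any decimal digit
    "\\o": "[0-7]",        # any octal digit
    "\\x": "[0-9A-Fa-f]",  # any hexadecimal digit
    "\\l": "[a-z]",        # any lowercase letter
    "\\u": "[A-Z]",        # any uppercase letter
    "\\\\": "\\\\",        # quoted backslash
    "\\*": "\\*",          # quoted asterisk
    "\\%": "%",            # quoted percent sign
    "%": ".",              # a single character
    "*": ".+",             # one or more characters on a line
}


def vms_to_regex(pattern):
    """Translate the supported subset of a VMS-style pattern to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        two = pattern[i:i + 2]
        if two in VMS_TO_REGEX:
            out.append(VMS_TO_REGEX[two])
            i += 2
        elif pattern[i] == "\\":
            # Wildcards outside the subset are rejected rather than guessed.
            raise ValueError("unsupported wildcard: " + two)
        elif pattern[i] in VMS_TO_REGEX:
            out.append(VMS_TO_REGEX[pattern[i]])
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


regex = vms_to_regex(r"\<\u")
match = re.search(regex, "alpha\nBeta\ngamma", re.MULTILINE)
```

Here the pattern \<\u becomes ^[A-Z], and the search finds the "B"
that begins the second line. The ** wildcard, character sets and the
repeat operators are deliberately left out of the sketch.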
2. ULTRIX style patterns
The ULTRIX pattern style enables the special interpretation of
wildcard characters and a quote character in the search-string
parameter as shown below:
ULTRIX-Style Wildcards

Wildcard    Matches
.           A single character.
^           Beginning of a line.
$           End of a line.
[set-of-characters]
            Any character in the specified set. For example,
            [abc] matches any letter in the set "abc" and [c-t]
            matches any letter in the set "c" through "t."
[^set-of-characters]
            Anything not in the specified set of characters.
\           Lets you specify the characters \,.,^,$,[,],or * in
            wildcard expressions. For example, \\ matches the
            backslash character (\).
*           Repeats the previous pattern zero or more times,
            including the original.
+           Repeats the previous pattern at least once,
            including the original; that is, a null occurrence
            does not match.
For example, the following command finds a line that starts with
a, b or c:
PATTERN SEARCH "^[abc]"
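The ULTRIX-style wildcards listed above resemble common regular
expression syntax, so the same pattern can be tried outside the
editor. The following Python sketch is an analogy only; the sample
text is invented, and the editor's matching may differ in detail from
Python's re module.

```python
import re

# Invented sample text; the last line starts with spaces, not a letter.
text = "apple\nzebra\ncherry\n  banana"

starts_abc = re.compile(r"^[abc]")

# Keep the lines whose first character is a, b or c.
matching_lines = [line for line in text.splitlines()
                  if starts_abc.search(line)]
```

With this sample, matching_lines holds "apple" and "cherry".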
3. TPU style patterns
The TPU pattern style enables the use of TPU patterns. For full
details of TPU patterns see the DEC Text Processing Utility Manual.
3.1 Simple examples
The first example searches for abc or def, and the second example
replaces all occurrences of abc or def with ghi:
PATTERN SEARCH "'abc' | 'def'"
PATTERN SUBSTITUTE "'abc' | 'def'" "'ghi'" ALL
In the examples 'abc', 'def' and 'ghi' are TPU
strings and | is the TPU pattern alternation operator.
The outermost quotes in the examples must be omitted
if the parameters are prompted for or if a dialog box is
used.
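As a rough analogy, the alternation in the examples above behaves
like the | operator in a regular expression. The Python sketch below
uses an invented sample string and is not TPU code.

```python
import re

# Invented sample; every occurrence of abc or def is replaced by ghi.
sample = "abc x def y abcdef"
result = re.sub(r"abc|def", "ghi", sample)
```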
3.2 Search string
The search string is a TPU expression that must evaluate to a
TPU pattern.
3.3 Replace string
The replace string is a TPU expression that must evaluate to a
TPU string.
3.4 Partial pattern assignment variables
Partial pattern assignment variables allow a substitution
to be a function of the found pattern.
For example, the following command replaces a date of the
form yyyy/mm/dd with one of the form dd/mm/yyyy:
PATTERN SUBSTITUTE -
"(_year@_v1)+'/'+(_month@_v2)+'/'+(_day@_v3)" -
"str(_v3)+'/'+str(_v2)+'/'+ str(_v1)"
when applied to: 1998/04/21 generates: 21/04/1998
In the above example _year, _month and _day are TPU
variables holding patterns that match the year, month
and day parts of a date. For details of how to set up
these variables see Section 3.8.
@ is the TPU partial pattern assignment operator and _v1,
_v2 and _v3 are partial pattern assignment variables that
are set to the found year, month and day.
A partial pattern assignment variable holds a TPU range
and when used in the replacement string must be converted
to a string using the TPU procedure STR.
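The date example can be compared with capture groups in a regular
expression, where each named group plays the role of a partial
pattern assignment variable. The Python sketch below is an analogy
under that assumption; the names YEAR, MONTH, DAY and swap_date are
hypothetical, and the sub-patterns mirror the definitions of _year,
_month and _day given in Section 3.8.

```python
import re

# Sub-patterns corresponding to _year, _month and _day.
YEAR = r"[0-9]{4}"
MONTH = r"[01][0-9]"
DAY = r"[0-3][0-9]"

# The named groups v1, v2 and v3 stand in for _v1, _v2 and _v3.
DATE = re.compile(rf"(?P<v1>{YEAR})/(?P<v2>{MONTH})/(?P<v3>{DAY})")


def swap_date(text):
    """Rewrite yyyy/mm/dd dates in text as dd/mm/yyyy."""
    return DATE.sub(lambda m: m["v3"] + "/" + m["v2"] + "/" + m["v1"], text)


converted = swap_date("1998/04/21")
```

The replacement is built from the captured pieces in the same order
as the TPU replacement string: day, month, year.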
For example, the following command will prefix any lines
that start with any three characters from ABCDEFGHI with
XYZ_ :
PATTERN SUBSTITUTE -
"LINE_BEGIN + (ANY('ABCDEFGHI',3)@_v1)" -
"'XYZ_'+ str(_v1)" -
ALL
before substitution
abc
012
defghi
after substitution
XYZ_abc
012
XYZ_defghi
In the above example LINE_BEGIN is a TPU keyword that
matches the beginning of a line and ANY is a TPU pattern
procedure that matches a specified number of characters
from a specified set of characters.
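The same substitution can be sketched in Python. This is an analogy
only: the function name prefix_lines is hypothetical, and the sketch
assumes a case-insensitive search, as the sample output above
suggests (lowercase lines match the uppercase set).

```python
import re


def prefix_lines(text):
    """Prefix XYZ_ to lines that start with three letters from A to I."""
    return re.sub(r"^([A-I]{3})", r"XYZ_\1", text,
                  flags=re.MULTILINE | re.IGNORECASE)


after = prefix_lines("abc\n012\ndefghi")
```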
3.5 New line
A new line is generated for each line feed character
in the replacement string. A line feed character can be
introduced by means of the TPU procedure ASCII with the
value 10 as a parameter.
For example, to replace any numbers at the end of lines
with the string 'xxx' (a line feed is necessary because
the search pattern includes the end of the line):
PATTERN SUBSTITUTE -
"_n + LINE_END" -
"'xxx' + ASCII(10)" -
ALL
before substitution
123 456
789
after substitution
123 xxx
xxx
In the above example LINE_END is a TPU keyword that
matches the end of a line and _n is a TPU variable holding a
pattern that matches a number.
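The need for the line feed can be illustrated with a Python sketch
in which the end of the line is part of the match, as it is with
LINE_END. This is a model of the behaviour described above, not TPU
code; it assumes each line ends with a line feed character, and it
presumes that omitting the line feed would join the lines.

```python
import re

text = "123 456\n789\n"

# The match includes the line end, so the replacement must restore it.
NUMBER_AT_LINE_END = re.compile(r"[0-9]+\n")

with_line_feed = NUMBER_AT_LINE_END.sub("xxx\n", text)
without_line_feed = NUMBER_AT_LINE_END.sub("xxx", text)
```

In this model with_line_feed keeps the two lines separate, while
without_line_feed runs them together into a single line.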
When a partial pattern assignment variable is converted
to a string by the TPU procedure STR, an optional second
parameter can be set to ASCII(10). This causes each end
of line in the range described by the variable to be
converted to a line feed character (without the parameter,
each end of line is represented by the null string). For example:
PATTERN SUBSTITUTE -
"(LINE_BEGIN + _n + LINE_END + _n + LINE_END)@_v1" -
"STR(_v1, ASCII(10)) + STR(_v1, ASCII(10))" -
ALL
before substitution
123
456
after substitution
123
456
123
456
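The effect of the second parameter can be modelled in Python as
follows. The function range_to_string is a hypothetical stand-in for
STR applied to a range, written only from the description above: each
line end inside the range becomes the given string, or the null
string by default.

```python
def range_to_string(lines, line_end=""):
    """Join lines, replacing each line end in the range by line_end."""
    return "".join(line + line_end for line in lines)


# The range found by the pattern covers two complete lines.
found = ["123", "456"]

joined = range_to_string(found)           # line ends dropped
kept = range_to_string(found, "\n")       # line ends kept as line feeds
replacement = kept + kept                 # the range written twice
```

Without the second parameter the two lines would collapse into
"123456"; with it, the replacement reproduces both lines twice.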
Carriage return characters adjacent to line feed characters
in the replacement string are ignored.
3.6 Errors
The search and replace strings are TPU expressions
which have to be evaluated and may generate various TPU
compilation / evaluation error messages.
The following error messages are generated for invalid
search or replace strings:
Error in search pattern
Error in replacement string
These messages will normally be preceded by various TPU
error messages. For example, the search string "'aaa' +
bbb" would result in the following error messages:
Undefined procedure call BBB
Operand combination STRING + INTEGER unsupported
Error in search pattern
3.7 Global variables
Partial pattern assignment variables and pattern variables
(such as _year in an earlier example) need to be global
and must not clash with any TPU global variables used by
LSE. This can be achieved by starting any such variable
names with an underscore character.
3.8 Pattern variables
Any complicated search or substitution is likely to need
various pattern variables to have already been set up.
This can be achieved in various ways.
The definitions can be set up by issuing TPU commands,
for example:
TPU "_digits:='0123456789'"
TPU "_digit:=any(_digits)"
TPU "_year:=any(_digits,4)"
TPU "_month:=any('01',1)+_digit"
TPU "_day:=any('0123',1)+_digit"
TPU "_n:=span(_digits)"
The file LSE$PATTERNS.TPU in the LSE$EXAMPLE directory
contains some examples of patterns which can be added to
LSE by means of the following commands:
OPEN FILE LSE$EXAMPLE:LSE$PATTERNS.TPU
EXTEND *
TPU "LSE$PATTERNS_MODULE_INIT"
3.9 Use for developing DTM user filters
The global replace feature for user-defined filters, introduced
in Digital Test Manager for OpenVMS version V4.0, can
be simulated using the PATTERN SUBSTITUTE command. This
allows DTM user-defined filters to be developed
interactively using LSE.
For example, to replace any numbers at the end of lines
with the string 'xxx':
global_replace(
_n + LINE_END,
'xxx' + ASCII(10),
NO_EXACT,
OFF,
ON);
The LSE equivalent (assuming that the current search
attributes are equivalent to NO_EXACT) is:
PATTERN SUBSTITUTE -
"_n + LINE_END" -
"'xxx' + ASCII(10)" -
ALL
The LSE equivalent of the pattern to replace parameter
(first parameter of the global_replace routine) is the
same except that the parameter has to be in quotes.
The LSE equivalent of the replacement string parameter
(second parameter) depends on the evaluate replacement
parameter (fourth parameter). If that parameter is set to
ON, the LSE equivalent is the same. If it is set to OFF,
the LSE equivalent is the same except that the parameter
has to be in quotes.
The LSE equivalent of the search mode parameter (third
parameter) is the setting of the search options (set by
the SET SEARCH command).
LSE does not have equivalents of the evaluate replacement
parameter (fourth parameter) or the convert linefeeds
parameter (fifth parameter). It always evaluates the
replacement string parameter and it always converts
linefeed characters (and ignores adjacent carriage return
characters).