More about Perl, CGIplus and the RTE

Version 1.3.0, 20th May 2023

Copyright © 2000-2023 Mark G. Daniel
Licensed under the Apache License, Version 2.0 (the "License");
https://www.apache.org/licenses/LICENSE-2.0
Software distributed under the License is on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
either express or implied.

Perl is a powerful, interpreted runtime environment commonly used in Web environments. In a CGI context the overheads of starting Perl (particularly under VMS) can severely impact request response time and system overhead. WASD's persistent Run-Time Environment (RTE) and persistent scripting (CGIplus) both improve script response time and reduce system impact.

Developed and tested against various Perl versions; see the Releases list. With other versions, your mileage may vary.

CGI, CGIplus, RTE ... which to choose?

With multiple WASD scripting technologies available, choosing which is best suited to a given application may sometimes seem daunting. As a general comment, the RTE environment offers the best performance and most flexibility for general Perl scripting environments. For a locally developed, maximal performance requirement, the CGIplus environment is recommended. The following table attempts to summarize the considerations.


Technology: Description

CGI: Standard CGI is just that, the lowest common denominator between server environments. Nothing persists (except perhaps the process). The script is activated with each request, receiving request parameters via CGI environment variables. Processing occurs, with response output. The script runs down and the process becomes quiescent or exits. Perl engine activation, parsing of the script source (and any required modules), script activation, execution and rundown are all fairly expensive undertakings. Eliminating one or more of these improves latency and reduces system resource consumption.

CGIplus: CGIplus allows the script itself to persist. This means Perl engine activation, script source parsing and script resource instantiation occur only once for any number of successive requests. This is particularly significant with highly latent resources, such as databases, network connections, etc. CGIplus scripts must be designed and implemented to take persistence into account. Care must be taken to ensure that there is no leakage of data between requests and that resources used and no longer required are released - script rundown can no longer be used to perform these tasks! It is generally a simple undertaking to make a script able to behave in a CGI or CGIplus manner depending on how it is activated. The Perl module CGIplus.pm is available to perform the few extra steps required for CGIplus.

RTE: A Run-Time Environment allows the scripting engine to persist. For interpreters such as Perl this can significantly reduce engine activation latency and resource consumption. The script itself does not persist and so all state is lost between activations. In the case of the Perl RTE engine the parsed code for that script is cached and will be used again when the script is next activated. Eliminating this step on subsequent activations provides a significant saving in latency and system resources. Generally any plain old CGI script can be used with the RTE; no specific changes are required for this environment. The RTE can be a little fussy about namespaces: all variables must have an explicit package reference or be declared with "my".

More information on these technologies is available in the WASD Scripting Overview.
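
The pattern a persistent script has to follow can be illustrated with a small, self-contained simulation. This is only a sketch: it does not use the WASD CGIplus interface or CGIplus.pm, and the names expensive_setup(), handle_request() and serve() are hypothetical, invented for the illustration. It shows costly initialisation happening once while all per-request state lives in "my" variables, so nothing leaks from one request to the next.

```perl
use strict;
use warnings;

my $setup_count = 0;

# Hypothetical stand-in for a costly one-off step (database connect, etc.).
sub expensive_setup {
    $setup_count++;
    return { connected => 1 };
}

# Handle one request; all per-request state is lexical ("my").
sub handle_request {
    my ($resource, $query) = @_;
    my @words = split /\+/, $query;     # fresh for every request
    return scalar(@words) . " word(s): @words";
}

# Simulate a persistent script servicing successive requests.
sub serve {
    my @queries = @_;
    my $resource = expensive_setup();   # once, not once per request
    return map { handle_request($resource, $_) } @queries;
}

my @responses = serve('this+is+a+query+string', 'one+two');
print "$_\n" for @responses;
print "setup ran $setup_count time(s)\n";
```

Under standard CGI the equivalent of expensive_setup() would run for every request; under the RTE the engine and parsed script are retained but the script's own state is not.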

Hints and Kinks

A few suggestions for when using Perl with WASD.

CGI Variables

The Perl interpreter needs to be informed about the source of its environment variables. In the WASD standard CGI environment the CGI variables are stored as DCL symbols. This requires the definition of a logical name, which can be on a system-wide basis (which of course affects all Perl scripting).

$ DEFINE /SYSTEM PERL_ENV_TABLES CLISYM_GLOBAL,LNM$PROCESS

Alternatively, it may be applied only to the scripting environment through the use of a Perl "wrapper" procedure. The example PERL.COM shows how this can be done.

Note that the next comment only applies to standard Perl CGI. It does not apply when using the PerlRTE or the CGIplus.pm module described below. With these, internal processing performs all such munging transparently.

Many, perhaps most, open-source Perl CGI scripts will be using CGI variable names without the common VMS (and WASD-default) prefix of "WWW_". These will not function correctly using $ENV{} to retrieve values with the standard Perl interpreter. To remove the "WWW_" prefix add an appropriate rule to HTTPD$MAP.

set /cgi*-bin/*.pl CGIprefix=
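
Where a script has to run both with and without such a rule, one possible approach is to look a variable up under either name. The following is a minimal sketch; cgi_var() is a hypothetical helper, not part of WASD, CGI.pm or CGIplus.pm, and it assumes %ENV has already been pointed at the CGI variables as described above.

```perl
use strict;
use warnings;

# Return a CGI variable whether or not the server prefixes it with "WWW_".
# cgi_var() is a hypothetical helper written for this illustration.
sub cgi_var {
    my ($name) = @_;
    return $ENV{$name}       if defined $ENV{$name};
    return $ENV{"WWW_$name"} if defined $ENV{"WWW_$name"};
    return;                             # not present under either name
}

my $method = cgi_var('REQUEST_METHOD');
print "Request method: ", (defined $method ? $method : '(not set)'), "\n";
```

The unprefixed name is tried first so that the mapping rule, when present, takes precedence.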

POSTed Requests

CGI.pm and Perl v5.6.0 could not read a POSTed multipart stream satisfactorily, producing the error "CGI.pm: Server closed socket during multipart read (client aborted?)". This is apparently a known problem fixed by migrating to Perl 5.6.1 (or later) and its CGI.pm.

When not using the PerlRTE engine some redirection is necessary to get the correct stream to connect to Perl's <STDIN>. The server controls the script process via SYS$INPUT and supplies any POSTed body via the separate stream HTTP$INPUT, which requires redirection before Perl engine activation.

$ define /user perl_env_tables clisym_global,lnm$process
$ define /user sys$input http$input
$ perl device:[directory]script.pl

This approach can be seen in use with PerlRTE_example4.pl and PerlRTE_example4.com below. PerlRTE does not actually require the wrapper, but because the two scripts are run in both environments (to illustrate differences in latency) it is used in both.

There are also often issues in getting POSTed content to the script without VMS record-boundary munging interfering. This generally requires the <STDIN> stream to be placed into binary mode, which is sometimes not trivial in standard VMS Perl. The PerlRTE environment can be used as if it were the standard Perl verb, eliminating this requirement; see the comments in the PERLRTE.C prologue. This is done with the standard CGI example PerlRTE_example5 (look at PerlRTE_example5.com).
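
As a rough illustration of reading a POSTed body as bytes, the sketch below puts the stream into binary mode with Perl's core binmode and reads exactly CONTENT_LENGTH bytes. The helper name read_body() is hypothetical, and whether core binmode alone is sufficient on a particular VMS Perl is an assumption this sketch does not verify.

```perl
use strict;
use warnings;

# Read exactly $length bytes from a filehandle in binary mode.
# read_body() is a hypothetical helper; whether binmode alone is enough on a
# given VMS Perl is an assumption, not something this sketch verifies.
sub read_body {
    my ($fh, $length) = @_;
    binmode $fh;
    my $body = '';
    while (length($body) < $length) {
        my $got = read($fh, $body, $length - length($body), length($body));
        die "read failed: $!" unless defined $got;
        last if $got == 0;              # premature end of stream
    }
    return $body;
}

if (($ENV{REQUEST_METHOD} || '') eq 'POST') {
    my $body = read_body(\*STDIN, $ENV{CONTENT_LENGTH} || 0);
    print "Content-Type: text/plain\n\n";
    print "Received ", length($body), " bytes\n";
}
```

Looping on read() rather than issuing a single call allows for a stream that delivers the body in several pieces.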

Binary Streams

By default VMS I/O streams are record-oriented and RMS has a tendency to want to adjust carriage-control on these. This is often counter-productive when needing to return a binary (non-textual) response.

The VMS::Stdio Perl extension allows I/O streams to be very finely controlled using low-level C-RTL and RMS functionality. See the appropriate Perl documentation page. The CGIplus.pm module described below uses this extensively.

Perl RTE

The Perl Run-Time Environment provides a persistent Perl engine, caching both the running Perl interpreter and any one or more Perl scripts activated and parsed by that interpreter.

The Perl Run-Time Environment should execute any standard Perl script using any collection of modules. This approach uses standard techniques and code described in the "perlembed" document to load and keep cached multiple script and module sources. How isolated are these scripts in reality? The document indicates effectively enough! Each is treated as an autonomous package and so storage restrictions etc. need to be observed. However apart from that it would seem as if any old (perhaps slightly tweaked) CGI script could be used within this environment.

The embedding code maintains the last modification time of each script cached and checks this against the last modification time of the script file before each activation. If there is a difference in the two times (i.e. the file has changed in some way) the cache is overwritten with a fresh evaluation of the script. There is no need to explicitly flush this cache in any way.
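
The modification-time check can be sketched in Perl itself. This is an illustration of the idea only: the real logic lives in the C embedding code, and cached_script() and %cache are hypothetical names. The sketch also simplifies by compiling each script into an anonymous subroutine rather than into its own package.

```perl
use strict;
use warnings;

my %cache;      # path => { mtime => ..., code => ... }

# Return a compiled code reference for $path, recompiling only when the
# file's modification time differs from the cached one.
# cached_script() is a hypothetical name; PerlRTE's real logic is in C.
sub cached_script {
    my ($path) = @_;
    my $mtime = (stat $path)[9];
    die "cannot stat $path: $!" unless defined $mtime;

    my $entry = $cache{$path};
    if (!$entry || $entry->{mtime} != $mtime) {
        open my $fh, '<', $path or die "cannot open $path: $!";
        my $source = do { local $/; <$fh> };    # slurp the whole file
        close $fh;
        my $code = eval "sub {\n$source\n}";    # parse once, keep the result
        die "compile failed for $path: $@" unless $code;
        $entry = $cache{$path} = { mtime => $mtime, code => $code };
    }
    return $entry->{code};
}
```

A caller would run cached_script($path)->() for each activation; an unchanged file costs one stat() and no parsing.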

The CGI environment variables are available via the ENV associative array. There are a number of ways to customize the startup of this environment. See the PerlRTE.c and PerlRTeng.c sources for detailed information.

#WASD#

Placing #WASD# as the first line of a script interpreted by the PerlRTE engine makes the response plain-text. This can assist in locating syntax errors, etc., during initial interpreter load, or during script debugging.

#WASD#
require CG1;
use CG1 qw(:standard);

new CG1;

(rest of script)

Shown in plain-text as:

%PERLRTE-E-CALLARGV, Can't locate CG1.pm in @INC (@INC contains:
perl_root:[lib.VMS_AXP.5_8_6] perl_root:[lib] perl_root:[lib.site_perl.VMS_AXP]
perl_root:[lib.site_perl] /perl_root/lib/site_perl .) at (eval 1) line 3.
BEGIN failed--compilation aborted at (eval 1) line 3.

Caution When Persistent!

When using a persistent engine, similar cautions and requirements apply as to other, more mainstream persistent environments, such as FastCGI, Apache mod_perl and ActiveState PerlEx (which PerlRTE impersonates). Most of these cautions advise against the use of global variables.

Here is some relevant advice from the mod_perl documentation:

Global Variables Persistence

Since the child process generally doesn't exit before it has serviced several requests, global variables persist inside the same process from request to request. This means that you must never rely on the value of the global variable if it wasn't initialized at the beginning of the request processing. See "Variables globally, lexically scoped and fully qualified" for more information.

You should avoid using global variables unless it's impossible without them, because it will make code development harder and you will have to make certain that all the variables are initialized before they are used. Use my () scoped variables wherever you can.

You should be especially careful with Perl Special Variables which cannot be lexically scoped. You have to use local() instead.

Here is an example with Perl hash variables, which store the iteration state in the hash variable and that state persists between requests unless explicitly reset. Consider the following registry script:

    #file:hash_iteration.pl
    #----------------------
    our %hash;
    %hash = map {$_ => 1 } 'a'..'c' unless %hash;
  
    print "Content-type: text/plain\n\n";
  
    for (my ($k, $v) = each %hash) {
        print "$k $v\n";
        last;
    }

That script prints different values on the first 3 invocations and prints nothing on the 4th, and then repeats the loop. (when you run with httpd -X). There are 3 hash key/value pairs in the global variable %hash.

In order to get the iteration state to its initial state at the beginning of each request, you need to reset the iterator as explained in the manpage for the each() operator. So adding:

    keys %hash;

before using %hash solves the problem for the current example.

https://perl.apache.org/docs/1.0/guide/porting.html#Global_Variables_Persistence
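
The effect described in the extract can be reproduced in a single process by treating successive calls as successive requests. This is a minimal sketch; first_key() is a hypothetical name, and the order in which keys come back is whatever Perl's hash ordering happens to be for the run.

```perl
use strict;
use warnings;

our %hash = map { $_ => 1 } 'a' .. 'c';

# One simulated request: return the first key that each() hands back.
# With $reset true the iterator is rewound first, as the guide recommends.
sub first_key {
    my ($reset) = @_;
    keys %hash if $reset;               # void-context keys resets each()
    my ($k) = each %hash;
    return $k;
}

# Without the reset, successive "requests" walk through the hash.
my @walk = map { first_key(0) } 1 .. 3;
keys %hash;                             # rewind before the second experiment
# With the reset, every "request" sees the same first key.
my @fixed = map { first_key(1) } 1 .. 3;

print "no reset: @walk\n";
print "reset:    @fixed\n";
```

The first line of output lists three different keys, the second repeats one key three times.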

The strict pragma

It's _absolutely_ mandatory (at least for development) to start all your scripts with:

    use strict;

If needed, you can always turn off the 'strict' pragma or a part of it inside the block, e.g:

    {
      no strict 'refs';
      ... some code
    }

It's more important to have the strict pragma enabled under mod_perl than anywhere else. While it's not required by the language, its use cannot be too strongly recommended. It will save you a great deal of time. And, of course, clean scripts will still run under mod_cgi (plain CGI)!

https://perl.apache.org/docs/1.0/guide/porting.html#The_strict_pragma
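
What the pragma buys can be seen by compiling small snippets under it. In this sketch check_snippet() is a hypothetical helper that evaluates a string with "use strict" in force and reports whether Perl accepted it; the snippets themselves are made up for the illustration.

```perl
use strict;
use warnings;

# Compile a snippet under "use strict" and report whether it was accepted.
# check_snippet() is a hypothetical helper used only for this illustration.
sub check_snippet {
    my ($source) = @_;
    my $ok = eval "use strict; $source; 1";
    return $ok ? 'ok' : "rejected: $@";
}

# A misspelled (hence undeclared) variable is caught at compile time.
print check_snippet('my $count = 1; $cuont++'), "\n";

# Lexical and fully qualified package variables are both accepted.
print check_snippet('my $count = 1; $count++'), "\n";
print check_snippet('$main::total = 1; $main::total++'), "\n";
```

Without the pragma the misspelling would silently create a new global, which in a persistent engine then survives into later requests.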

CGI.pm has been deprecated and removed from more recent Perl distributions (on VMS from v5.22)

CGI.pm is a particularly common and useful Perl module for CGI scripting. This extract from the author of CGI.pm discusses function and object-oriented usage and emphasizes the latter for persistent environments.

Function-Oriented vs Object-Oriented Use

CGI.pm can be used in two distinct modes called function-oriented and object-oriented. In the function-oriented mode, you first import CGI functions into your script's namespace, then call these functions directly. A simple function-oriented script looks like this:

    #!/usr/local/bin/perl
    use CGI qw/:standard/;
    print header(),
          start_html(-title=>'Wow!'),
          h1('Wow!'),
          'Look Ma, no hands!',
          end_html();

The use operator loads the CGI.pm definitions and imports the ":standard" set of function definitions. We then make calls to various functions such as header(), to generate the HTTP header, start_html(), to produce the top part of an HTML document, h1() to produce a level one header, and so forth.

In addition to the standard set, there are many optional sets of less frequently used CGI functions. See Importing CGI Methods for full details.

In the object-oriented mode, you use CGI; without specifying any functions or function sets to import. In this case, you communicate with CGI.pm via a CGI object. The object is created by a call to CGI::new() and encapsulates all the state information about the current CGI transaction, such as values of the CGI parameters passed to your script. Although more verbose, this coding style has the advantage of allowing you to create multiple CGI objects, save their state to disk or to a database, and otherwise manipulate them to achieve neat effects.

The same script written using the object-oriented style looks like this:

    #!/usr/local/bin/perl
    use CGI;
    $q = new CGI;
    print $q->header(),
          $q->start_html(-title=>'Wow!'),
          $q->h1('Wow!'),
          'Look Ma, no hands!',
          $q->end_html();

The object-oriented mode also has the advantage of consuming somewhat less memory than the function-oriented coding style. This may be of value to users of persistent Perl interpreters such as mod_perl.

https://cpansearch.perl.org/src/LDS/CGI.pm-3.29/cgi_docs.html

At the very least the CGI.pm module, when used for functional programming, should be initialised at the start of the code (and therefore before the beginning of each subsequent request).

require CGI;
use CGI qw(:standard);

new CGI;

(rest of script)

Similar but sometimes subtly different cautions apply for CGIplus scripts.

Demonstrations

These examples and demonstrations rely on the following configuration.

# HTTPD$CONFIG
[DclScriptRunTime]
.PL PERL
# or perhaps (depending on the local system)
# .PL $PERL_ROOT:[000000]PERL.EXE

# HTTPD$MAP
set /cgi-bin/*.pl CGIprefix=
exec+ /plrte/* (CGI-BIN:[000000]PERLRTE.EXE)/cgi-bin/*

Note the differences in latency between standard CGI and the RTE!
(and that jumping back and forth between them often causes the first to fail)

Script Sources:
  PerlRTE_example1.pl
  PerlRTE_example2.pl
  PerlRTE_example3.pl (loads the CGI.pm module for extra latency)
  PerlRTE_example4.pl (its standard CGI POST wrapper is PerlRTE_example4.com)
  PerlRTE_example5.pl (its standard CGI POST wrapper is PerlRTE_example5.com)

Standard CGI:
  /cgi-bin/PerlRTE_example1
  /cgi-bin/PerlRTE_example2
  /cgi-bin/PerlRTE_example3
  one two three
  using PerlRTE_example5 (upload will not work with Perl 5.6.0)

If the CGI variables appear to be empty, make adjustments in line with CGI Variables.

Persistent RTE:
  /plrte/PerlRTE_example1   [restart RTE]
  /plrte/PerlRTE_example2   [restart RTE]
  /plrte/PerlRTE_example3   [restart RTE]
  one two three
  using PerlRTE_example5 (upload will not work with Perl 5.6.0)

35x Performance

Benchmarking the PerlRTE_example3.pl script (which requires CGI.pm) using h2load on the author's development system

h2load --h1 -n 100 -c 1 -t 1 "http://192.168.1.3/cgi-bin/CGIplusPM_example3.pl"
h2load --h1 -n 100 -c 1 -t 1 "http://192.168.1.3/prlrte/CGIplusPM_example3.pl"

indicated 0.62 and 23 requests/second for the CGI and RTE usages respectively. Expressed another way, the RTE usage responded approximately 35 times faster! Remember this is the same script; there are no changes apart from the invocation path.

Scripting Engine

The Perl RTE is suitable for use as the main Perl scripting engine. That is, it can activate standard CGI, CGIplus and RTE scripts. To enable this for the execution of all CGI Perl scripts by the server add or change the following HTTPD$MAP rules and reload.

  if (!script-name:/plrte/*) map /cgi-bin/*.pl* /plrte/*.pl*
  exec+ /plrte/* (CGI-BIN:[000000]PERLRTE.EXE)/cgi-bin/*

Also see the mapping caution under CGIplus.pm.

Building PerlRTE

Compile* + Link

$ SET DEFAULT WASD_ROOT:[SRC.PERL]
$ @BUILD_PERLRTE BUILD

* To compile the RTE sources the Perl distribution header files must be available from PERL_ROOT:[000000].

Link-only

$ SET DEFAULT WASD_ROOT:[SRC.PERL]
$ @BUILD_PERLRTE LINK
$ @INSTALL

CGIplus Perl

CGIplus Perl is an alternative, perhaps complementary, method to Perl RTE for increasing efficiency and reducing request latency, again using the idea of persistence. This time it is up to the script to maintain the state.

These examples also demonstrate the use of the VMS::DCLsym and VMS::Stdio extensions. The principles may be more generally applied to other scripts.

Script Source:  CGIplusPM_example1.pl
Standard CGI:  /cgi-bin/CGIplusPM_example1/wasd_root/src/perl/?this+is+a+query+string
CGIplus:  /cgiplus-bin/CGIplusPM_example1/wasd_root/src/perl/?this+is+a+query+string   [restart]

30x Performance

Benchmarking this using h2load on the author's development system

h2load --h1 -n 100 -c 1 -t 1 "http://192.168.1.3/cgi-bin/CGIplusPM_example1.pl/wasd_root/src/perl/?this+is+a+query+string"
h2load --h1 -n 100 -c 1 -t 1 "http://192.168.1.3/cgiplus-bin/CGIplusPM_example1.pl/wasd_root/src/perl/?this+is+a+query+string"
indicated 0.8 and 25 requests/second for the CGI and CGIplus usages respectively.  Expressed another way, the CGIplus usage responded approximately 30 times faster!  Remember this is the same script - no changes apart from invocation path.
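
The arithmetic behind the two headline figures can be checked directly from the requests/second values quoted in this document. The sketch below is only a convenience calculation; speedup() is a hypothetical helper, and the per-request latencies are simply the reciprocals of the quoted rates.

```perl
use strict;
use warnings;

# Speed-up factor from two requests/second figures.
sub speedup {
    my ($slow_rps, $fast_rps) = @_;
    return $fast_rps / $slow_rps;
}

for my $case (
    [ 'RTE',     0.62, 23 ],    # CGI versus RTE figures quoted above
    [ 'CGIplus', 0.8,  25 ],    # CGI versus CGIplus figures quoted above
) {
    my ($name, $cgi_rps, $fast_rps) = @$case;
    printf "%-8s %.1fx faster; %.0f ms vs %.0f ms per request\n",
        $name, speedup($cgi_rps, $fast_rps),
        1000 / $cgi_rps, 1000 / $fast_rps;
}
```

The ratios work out to a little over 37 and 31 respectively, so the "35x" and "30x" headings are rounded figures.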

CGIplus.pm

The CGIplus.pm Perl module is intended to assist authors in writing scripts that may be used transparently in both vanilla CGI environments (including non-WASD) and under WASD CGIplus. Perl implementation refinements courtesy of Dick Munroe (munroe@csworks.com).

A script using CGIplus.pm (or any other autonomous CGIplus behaviour) should never be activated using an RTE path; that is, one using the following mapping syntax.

exec+ /plrte/* (CGI-BIN:[000000]PERLRTE.EXE)/cgi-bin/*

When an RTE becomes quiescent the server will give it another script. With the CGIplus.pm request processing loop still active and servicing the one script, it is that script, unintended and probably incorrect, that handles the new request. Always activate CGIplus.pm-enabled scripts via a CGIplus path.

exec+ /plplus/* /cgi-bin/* 

CGIplus.pm will detect such a mapping mistake and die!

Binary Responses and CGIplus

VMS RMS complicates output streams under Perl. This is a particular issue with CGIplus end-of-file sentinels, which must be output as a single record. CGIplus.pm attempts to provide a simple mechanism for providing binary streams if necessary, while still ensuring its own records are not interfered with. This uses Charles Bailey's VMS::Stdio extension module built into most versions of VMS Perl.

This script demonstrates how the module's stream functions can be directly used.

Script Source:  CGIplusPM_example2.pl
Standard CGI:  /cgi-bin/CGIplusPM_example2
CGIplus:  /cgiplus-bin/CGIplusPM_example2   [restart]

This script demonstrates how simply to return a binary file as a response.

Script Source:  CGIplusPM_example3.pl
Standard CGI:  /cgi-bin/CGIplusPM_example3
CGIplus:  /cgiplus-bin/CGIplusPM_example3   [restart]

Any accessible image location may be added to these scripts following the script part. Again, because these are being accessed via a CGIplus script, notice the difference in latency between the initial and subsequent requests.

Notes

There have been a number of changes implemented in the CGIplus.pm supplied with PerlRTE 1.2. These should be backward compatible. The original module is supplied in the source directory if required.

  1. The module can be used with PerlRTE v1.2 as the scripting engine (the previous version could not).

  2. All CGI variable names have any leading "WWW_" automatically stripped. This is a more common convention for CGI variable names. This behaviour can be changed - check the module source.

  3. CGI variables are now also available from the more standard ENV associative array. Previously they were only accessible using CGIplus::var().

Releases

The installed release can be checked using  $ MCR CGI-BIN:[000000]PERLRTE /VERSION

v1.3.0  20-MAY-2023
•  move CGI (not-plus) variables to use GATEWAY_SYMBOLS (WASD v12.1.1)
v1.2.9  01-MAR-2023
•  verified OK against Perl 5.34 (x86-64 indigenous)
•  in line with other WASD apps, moved under Apache License (allowed under Artistic License section 4(c)(ii))
v1.2.8  18-MAY-2016
•  verified OK against Perl 5.24.1 (VMS ports)
v1.2.7  06-JUN-2015
•  verified OK against Perl 5.22.1 (VMS ports)
•  deprecated CGI Perl module extracted from Perl 5.18 and deposited into directory [.CGIPM].
v1.2.6  11-JAN-2014
•  verified OK against Perl 5.18.2 (VMS ports)
   note: v5.18.1 unsuitable due to bug (PERL_ENV_TABLES)
v1.2.5  03-JAN-2011
•  proctor detect (for WASD v10.1.0)
v1.2.4  24-JAN-2008
•  make STDOUT autoflush (Perl 5.10.0)
•  verified OK against Perl 5.10.0
•  #WASD# at start of script issues plain-text header
v1.2.3  23-NOV-2005
•  verified OK against CPQ AXPVMS PERL V5.8-6 and HP I64VMS PERL V5.8-6
v1.2.2  28-JUN-2003
•  minor tweak
v1.2.1  19-APR-2003
•  minor tweak
v1.2.0  02-JAN-2003
•  refinement for non-RTE/CGIplus CGI variables
•  refinement to CGI.pm support
v1.1.0  27-JUL-2002
•  support for Perl 5.8.0 (previously 5.6.0)
•  support CGI.pm
v1.0.0  28-OCT-2000
•  initial