/OPTIMIZE[=(opt[,...])] D=/OPTI=(LEV=4,INL=SPE,NOL,NOP,TUN=GEN,UNR=0)

 /[NO]OPTIMIZE

 Controls how the compiler produces optimized code.

 The default is /OPTIMIZE, which is the same as /OPTIMIZE=(LEVEL=4,
 INLINE=SPEED, NOLOOPS, NOPIPELINE, TUNE=GENERIC, UNROLL=0).  For a
 debugging session, use the negative form (/NOOPTIMIZE or
 /OPTIMIZE=LEVEL=0) to ensure that the debugger has sufficient
 information to locate errors in the source program.
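
 For example, a debugging build might be compiled and linked as
 follows.  This is a sketch only: PROG.FOR is a hypothetical file
 name, and the /DEBUG qualifiers are an assumption about how the
 debugging session is set up.

     $ FORTRAN/NOOPTIMIZE/DEBUG PROG.FOR
     $ LINK/DEBUG PROG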

 In most cases, /OPTIMIZE makes the program execute faster.  In
 exchange for faster execution, /OPTIMIZE can produce larger object
 modules and longer compile times than /NOOPTIMIZE.

 To allow full interprocedural optimization when compiling multiple
 source modules, consider separating source file specifications with
 plus signs (+), so the files are concatenated and compiled as one
 program.  Full interprocedural optimization can reduce overall
 program execution time.  Avoid concatenating source files when
 they are very large and memory or disk space is limited.
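
 As an illustration, assume two hypothetical source files, MAIN.FOR
 and SUBS.FOR.  The first command below concatenates them with a
 plus sign so they are compiled as one program; the second uses a
 comma-separated list, which is normally compiled as separate
 modules.

     $ FORTRAN/OPTIMIZE MAIN.FOR+SUBS.FOR
     $ FORTRAN/OPTIMIZE MAIN.FOR,SUBS.FOR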

 INLINE=keyword
   Controls the inlining performed by the compiler.  The keyword can
   be any of the following:

       Keyword   Meaning
       -------   -------
       NONE      Suppresses all inlining of routines.

       MANUAL    This is the same as INLINE=NONE for VSI Fortran.

       SIZE      Inlines calls that the compiler determines will
                 improve run-time performance without significantly
                 increasing the size of the program.

       SPEED     Inlines calls that the compiler determines will
                 improve run-time performance, even where doing so
                 may significantly increase the size of the program.

       ALL       Inlines every procedure call that can be inlined
                 while still generating correct code.  Recursive 
                 routines will not cause an infinite loop at
                 compile time.

   /OPTIMIZE=INLINE is equivalent to /OPTIMIZE=(INLINE=SPEED).
   /OPTIMIZE=NOINLINE is equivalent to /OPTIMIZE=(INLINE=NONE).

   For all optimization levels other than 0, the inlining mode is
   the one specified on the command line.  If no inlining mode is
   explicitly specified, the compiler derives it from the
   optimization level, as follows:

       Level       Inlining Mode
       -----       -------------
       0           NONE
       1           NONE
       2           NONE
       3           NONE
       4           SPEED
       5           SPEED
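
   For example, the following sketch (PROG.FOR is a hypothetical
   file name) specifies INLINE=SIZE explicitly, so the inlining mode
   is taken from the command line rather than derived from LEVEL=3:

       $ FORTRAN/OPTIMIZE=(LEVEL=3,INLINE=SIZE) PROG.FOR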

 LEVEL=n
   Controls the level of optimization performed by the compiler.
   The "n" is an integer in the range 0 through 5.  LEVEL=0 is the
   same as /NOOPTIMIZE; LEVEL=4 is the same as /OPTIMIZE.  The
   following explains the level numbers:

       Level Number  Meaning
       ------------  -------
       LEVEL=0       Disables nearly all optimizations. 

       LEVEL=1       Enables local optimizations within the 
                     source program unit and recognition of common 
                     subexpressions.     

       LEVEL=2       Enables global optimizations and optimizations     
                     performed with LEVEL=1. 

       LEVEL=3       Enables additional global optimizations that
                     improve speed (at the cost of extra code size) 
                     and optimizations performed with LEVEL=2. 

       LEVEL=4       Enables interprocedural analysis, automatic        
                     inlining of small procedures (with heuristics 
                     limiting the amount of extra code), and 
                     optimizations performed with LEVEL=3. LEVEL=4 
                     is the default.

       LEVEL=5       Activates software pipelining, loop transformation
                     optimizations, and optimizations performed with
                     LEVEL=4.  Loop transformation optimizations apply
                     to array references within loops.  Software
                     pipelining allows instructions within a loop to
                     "wrap around" and execute in a different
                     iteration of the loop.  In certain cases, loop
                     transformation and software pipelining can
                     improve run-time performance.

   For more information about these LEVEL numbers, see the HP Fortran
   for OpenVMS User Manual.

 [NO]LOOPS
   Specifies a group of loop transformation optimizations that apply
   to array references within loops.  These optimizations can
   improve the performance of the memory system and usually apply to
   multiply nested loops.

   The loops chosen for loop transformation optimizations are always
   "counted" loops (which include DO or IF loops, but not DO WHILE
   loops).

   Conditions that typically prevent the loop transformation
   optimizations from occurring include subprogram references that
   are not inlined (such as an external function call), complicated
   exit conditions, and uncounted loops.

   The types of optimizations associated with this option are:

     Loop blocking
     Loop distribution
     Loop fusion
     Loop interchange
     Loop scalar replacement
     Outer loop unrolling

   This type of optimization can be specified for /OPTIMIZE=LEVEL=2
   or higher; it is performed by default if /OPTIMIZE=LEVEL=5 is in
   effect.
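
   For example, to request loop transformation optimizations at the
   default optimization level (a sketch; PROG.FOR is a hypothetical
   file name):

       $ FORTRAN/OPTIMIZE=(LEVEL=4,LOOPS) PROG.FOR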

 [NO]PIPELINE
   Applies instruction scheduling to certain innermost loops,
   allowing instructions within a loop to "wrap around" and execute
   in a different iteration of the loop.  This can reduce the impact
   of long-latency operations, resulting in faster loop execution.
   /OPTIMIZE=PIPELINE also enables prefetching of data to reduce the
   impact of cache misses.      

   This type of optimization can be specified for /OPTIMIZE=LEVEL=2
   or higher; it is performed by default if /OPTIMIZE=LEVEL=5 is in
   effect.
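
   For example, the following sketch should keep the other LEVEL=5
   optimizations while turning software pipelining off (PROG.FOR is
   a hypothetical file name):

       $ FORTRAN/OPTIMIZE=(LEVEL=5,NOPIPELINE) PROG.FOR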

 TUNE=keyword (Alpha only)
   Specifies the kind of optimized code to be generated.  The
   keyword can be any of the following:

       Keyword   Meaning
       -------   -------
       GENERIC   Generates and schedules code that will execute 
                 well for all generations of Alpha processors. 
                 This provides generally efficient code for those 
                 cases where all processor generations are likely 
                 to be used.

       HOST      Generates and schedules code optimized for the
                 processor generation in use on the system being 
                 used for compilation. 

       EV4       Generates and schedules code optimized for the     
                 21064, 21064A, 21066, and 21068 implementations
                 of the Alpha chip.

                 Programs compiled with the EV4 option run without 
                 instruction emulation overhead on all Alpha 
                 processors.                                      

       EV5       Generates and schedules code optimized for the     
                 21164 implementation of the Alpha chip. This 
                 processor generation is faster than EV4.            

                 Programs compiled with the EV5 option run without 
                 instruction emulation overhead on all Alpha 
                 processors.                                      

       EV56      Generates code for some 21164 chip implementations 
                 that use the byte and word manipulation instruction 
                 extensions of the Alpha architecture. 
                                                                              
                 Programs compiled with the EV56 option may incur 
                 emulation overhead on EV4 and EV5 processors, but 
                 will still run correctly on OpenVMS Version 7.1 (or 
                 later) systems.                                

       EV6       Generates and schedules code for the 21264 chip
                 implementation that uses the following extensions
                 to the base Alpha instruction set: BWX (Byte/Word
                 manipulation) and MAX (Multimedia) instructions, 
                 square root and floating-point convert instructions, 
                 and count instructions. 

                 Programs compiled with the EV6 option may incur 
                 emulation overhead on EV4, EV5, EV56, and PCA56 
                 processors, but will still run correctly on OpenVMS 
                 Version 7.1 (or later) systems.

       EV67      Generates and schedules code for the 21264 chip
                 implementation that uses the following extensions
                 to the base Alpha instruction set: BWX (Byte/Word
                 manipulation) and MAX (Multimedia) instructions,
                 square root and floating-point convert instructions,
                 and CIX (Count) instructions.

                 Programs compiled with the EV67 option may incur 
                 emulation overhead on EV4, EV5, EV56, EV6, and PCA56
                 processors, but will still run correctly on OpenVMS 
                 Version 7.1 (or later) systems.

       PCA56     Generates code for the 21164PC chip implementation 
                 that uses the byte and word manipulation instruction 
                 extensions and multimedia instruction extensions 
                 of the Alpha architecture.      
                                                                              
                 Programs compiled with the PCA56 option may incur
                 emulation overhead on EV4, EV5, and EV56
                 processors, but will still run correctly on OpenVMS
                 Version 7.1 (or later) systems.

   The default is /OPTIMIZE=TUNE=GENERIC.
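
   For example, to tune the generated code for the processor
   generation of the system doing the compilation (a sketch;
   PROG.FOR is a hypothetical file name):

       $ FORTRAN/OPTIMIZE=TUNE=HOST PROG.FOR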

 UNROLL=n
   Controls loop unrolling done by the optimizer.  UNROLL=n unrolls
   loop bodies n times, where "n" is an integer in the range 0
   through 16.  UNROLL=0 (the default) lets the optimizer use its
   default unroll amount.  For more information, see the
   HP Fortran for OpenVMS User Manual.
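
   For example, to request that loop bodies be unrolled four times
   at the default optimization level (a sketch; PROG.FOR is a
   hypothetical file name, and the value 4 is chosen only for
   illustration):

       $ FORTRAN/OPTIMIZE=(LEVEL=4,UNROLL=4) PROG.FOR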