1 Frequently_Asked_Questions

Included below are some frequently asked questions about PCA and
their answers.

2 80%_of_Time_Spent_in_P1_Space

Why is 80% of my program in P1 space? How do I get the wait time
reflected in code I can change?

When your program is waiting for a system service to complete, the
program counter points to a location in the system service vector in
P1 space. Since the most common form of system service wait is
waiting for an I/O operation to complete, your program thus appears
to be spending most of its time in P1 space.

If your program does a lot of terminal I/O, you should expect the
program to be I/O-bound and to appear to spend a lot of time in P1
space; the terminal is a slow device. If your program primarily does
disk or tape I/O and appears to spend a lot of time in P1 space, you
should investigate why the program is I/O-bound. By reprogramming
your program's I/O to reduce the I/O wait time, you may be able to
speed up your program considerably.

To get the system service wait time reflected in the code of your
own program, gather stack PC values using the STACK_PCS command in
the Collector, then use the /MAIN_IMAGE qualifier on a PLOT or
TABULATE command in the Analyzer. This charges the time spent
outside your image (including that spent in P1 space) to the actual
location within your image that caused it to be spent.

2 Charging_Back_Shareable_Image_Data_Points

How do I get the time spent in shareable images to be charged to the
parts of my program that used it?

Gather STACK_PCS data in the Collector, and use the /MAIN_IMAGE
qualifier on your PLOT or TABULATE commands in the Analyzer. This
charges the time spent outside your image to the PC within the image
that caused it to be spent.

2 Charging_Back_RTL_Data_Points

How do I get the time spent in a specific RTL to be charged to the
parts of my program that used it?
Gather STACK_PCS data in the Collector, then use the
/MAIN_IMAGE=SHARE$mumbleRTL and /STACK=n qualifiers on the PLOT or
TABULATE commands in the Analyzer.

2 Analyzing_Individual_Instructions

How can I find the specific instructions within a line that are
taking the most time?

Use PLOT LINE module_name\%LINE nnn BY BYTE. Then, look at a machine
listing to correlate byte offsets from the beginning of the line
with the specific instructions.

2 Getting_Rid_of_Terminal_I/O

How do I get rid of all the time spent in terminal I/O?

Place an event marker before each terminal I/O statement and a
different event marker after the terminal I/O statement. Then use
SET FILTER foo TIME <> the_first_event_marker_name in the Analyzer.
This discards all the time spent waiting for terminal I/O.

2 ACCVIO_in_Program_Run_with_PCA

Why does my program ACCVIO when linked with PCA?

If the PCAC> prompt never appears, the Collector has probably not
been installed as a privileged image. Possibly, the system manager
forgot to edit the system start-up file to include
@SYS$MANAGER:PCA$STARTUP. If the PCAC> prompt does appear, see
PCA_Changes_Program_Behavior below.

2 PCA_Changes_Program_Behavior

Why does my program behave differently when run with PCA?

One of the following conditions probably exists:

1. Uninitialized stack variables
2. Dependence on memory above SP
3. Assumptions about memory allocation

Cases 1 and 2 occur because PCA comes in as a handler and uses the
stack above the user program's stack. Consequently, the stack is
manipulated in ways that differ from a run without PCA. Although it
is unlikely, compiler code-generation bugs have also caused this
sort of behavior.

Case 3 occurs because PCA lives in the process memory space and
requests memory by means of SYS$EXPREG. PCA requests a large amount
of memory at initialization to minimize the alteration of memory
allocation, but allocation may still change. Finally, PCA itself may
have a bug that smashes the stack or random user memory.
HP appreciates your input because these bugs are hard to track down,
and because they have been known to come and go based on the order
of modules in a linker options file. HP recommends that you try the
following:

- Try GO/NOCOLLECT. If the program malfunctions, the problem lies in
  your LINK with PCA.

- Try a run with simple PC sampling only. PC sampling is the mode
  least likely to provoke bugs. If your program malfunctions, the
  problem probably lies with your program. If your program does not
  malfunction, the problem probably lies with PCA.

If you are still convinced that PCA is changing the behavior of your
program: If you have a support contract, contact your HP support
representative. Otherwise, contact your HP account representative or
your authorized reseller.

When reporting a problem, please include the following information:

o The versions of PCA and of the OpenVMS operating system being
  used.

o As complete a description of the problem as possible; try not to
  overlook any details.

o The problem reduced to as small a size as possible.

o If the problem is with the Collector:

  - Does the program run with GO/NOCOLLECT?
  - Does the program run with the OpenVMS debugger?
  - If Counters, Coverage, or Events are involved, does the program
    behave properly when breakpoints are put in the same locations
    with the OpenVMS debugger?
  - The version of the compiler(s) used.
  - The files needed to build the program, including build
    procedures.

o If the problem is with the Analyzer:

  - The .PCA file involved
  - The sources referenced by the .PCA file
  - A PCA initialization file that reproduces the problem

o All files should be submitted on machine-readable media (magnetic
  tape preferred, floppy diskette, or tape cassette).

o Any stack dumps that occurred, if applicable.

2 Creating_Data_File_Takes_Very_Long

Why does it take so long to create a performance data file?

The Collector copies the portions of the DST it needs to the
performance data file. This can take some time for large programs.
The DST is placed in the performance data file to avoid confusion
over which image contains the DST for the data gathered. Also, PCA
does not need all the information in the DST and condenses it. This
avoids the overhead of reading useless information every time the
file is used.

2 Using_PCA_in_a_Batch_Process

Will PCA run in batch?

Yes. However, you should avoid using screen mode.

2 Avoiding_Recompilation

My application takes 10 days to compile. Is there a way I can avoid
compiling my whole application with /DEBUG?

Yes. As long as the objects contain traceback information, PCA
provides all functionality except annotated source listings and
codepath analysis, because traceback contains most of the DST
information that PCA needs. Once you find which modules are of
interest, you can compile those with /DEBUG, relink the application,
and gather the data again.

2 CPU_Time_Stamp_Explained

What exactly is a CPU time stamp?

The time stamp found in the PCA performance data file always
expresses the CPU time from the start of the current program
execution. In the data file, it is represented in 10-millisecond
increments (the number of CPU ticks), but to the user it is always
presented in milliseconds. This CPU time represents the total amount
of CPU time consumed by the program and by the Collector from the
time the program started executing.

2 CALL_Instruction_Gets_a_Spike

PCA tells me that a large amount of time is being spent at a CALL
instruction. Why?

The CALL instruction should consume only a small part of the time
spent executing the routine. First, check page faulting. Sometimes
the faulting behavior of a program causes a moderately used routine
to get paged out just before it is called. If that is not the case,
check for JSB linkages to an RTL routine. For performance reasons,
some RTL routines use JSB linkages. This can cause confusion for the
user when the /MAIN_IMAGE qualifier is used.
This is especially true with PC sampling data, but it can occur with
any kind of data for which you can gather stack PC data. Because a
JSB linkage does not place a call frame on the stack, the return
address to the site of the call is lost to PCA. Consequently, the
first return address found by /MAIN_IMAGE is the site of the call to
the routine that called the RTL by means of a JSB linkage.

As an example, suppose routine MAIN called routine FOO, which in
turn called the RTL by means of a JSB linkage. Then suppose that a
PC sampling hit occurred in the RTL. This causes the PC of the call
to FOO and the PC of the call to MAIN to be recorded. Thus, in the
presence of the /MAIN_IMAGE qualifier, the first PC within the image
is the PC of the call to FOO. Consequently, FOO's call site is
inflated by the number of data points in the RTL that fall in
routines with JSB linkages.

Note that the above can yield useful information. If you compare the
time with /MAIN_IMAGE to the time without it, you can tell how much
time was spent in JSB linkage routines. You cannot, however,
separate the various JSB linkage routines. Note further that if the
JSB routine is called from the main program, the data points are
lost because there is no caller of the main program.

2 0.0%_With_********_in_Plots

Why does the Analyzer report 0.0% for a line and then output a full
line of stars, indicating that the line was covered?

Probably the total number of data points is over 2000, and the
percentage is less than 0.05%. Therefore, rounding makes it 0.0%.

2 Optimizing_Calls_To_Utility_Routines

I have a utility routine which I have optimized as much as I can. I
need to know who is calling it and how often, so I can reduce the
number of calls to it. How do I get this information?

Use /MAIN_IMAGE=utility-routine on the PLOT command.
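The qualifier is simply added to whatever PLOT command you use. For
example, assuming the utility routine is named UTIL_PACK (a
hypothetical name), the command might look like this:

    PCAA> PLOT/MAIN_IMAGE=UTIL_PACK PROGRAM BY ROUTINE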
This gives you the following options:

- PLOT CALL_TREE BY CHAIN_ROUTINE lists all the call chains that
  pass through utility-routine, with the number of data points for
  each call chain.

- PLOT/STACK=1 PROGRAM BY ROUTINE lists all the callers of
  utility-routine.

- If one particular caller of utility-routine is of interest, try
  the following:

  PCAA> SET FILTER filter-name CHAIN=(*,caller,utility-routine,*)

  This ensures that the data being viewed comes only from chains
  that contain the subchain caller,utility-routine.

- Many other combinations of /CUMULATIVE, /MAIN_IMAGE, and /STACK
  with various filters and nodespecs may be useful.

2 FAULT_ADDRESS_Data_Kind

What information does the /FAULT_ADDRESS data kind provide?

When a page fault occurs, two virtual addresses are gathered: the PC
of the instruction and the virtual address that caused the fault.
The CPU time is also gathered. In general, the PC that caused the
fault, that is, the /PAGEFAULT data kind, is most significant
because PCA can plot it against the PROGRAM_ADDRESS domain and show
where the page faulting is occurring in your program. The
/FAULT_ADDRESS data kind can also be plotted against the
PROGRAM_ADDRESS domain to find where the page faulting is occurring.
This can be useful in laying out CLUSTERs for the link. (Note that
these same page faults should also show up at the branch or call
instructions when plotting /PAGEFAULT.)

2 Stack_PC_Data_and_Page_Fault_Data

Why can't I do stack PC analysis with page fault data?

While the Collector is gathering page fault data, walking the stack
might cause additional page faults. This problem has not been
addressed for the current release of OpenVMS.

2 Routines_and_Coverage_Data

Why am I getting several coverage data points associated with my
routine declaration when I do COVERAGE BY CODEPATH?

Several languages generate prologue code at each routine entry to
initialize the language-specific semantics.
As far as PCA is concerned, code is code and deserves codepath
analysis. This environment is usually set up by a CALLS or JSB to an
RTL routine. PCA considers CALLS, CALLG, and JSB to be transfers of
control, because control does not in principle have to come back,
and places a BPT in the instruction that follows.

2 HEX_Numbers_in_CALLTREE_Plots

Why are hexadecimal numbers showing up in the CALL_TREE plot?

The Analyzer was not able to symbolize the return address it found
in the call stack. If IO_SERVICES or SERVICES data is being
gathered, these may be addresses in the relocated system service
vector.

2 Getting_Right_To_Hot_Spots

How can I avoid all the source header information and get right to
the most interesting line?

Use the traverse commands: NEXT, FIRST, PREVIOUS, and CURRENT.

2 Comparing_Different_Kinds_of_Data

How can I easily compare different kinds of data?

Use the INCLUDE command.

2 Virtual_Memory_in_the_Analyzer

The Analyzer is running out of virtual memory. What do I do?

Raise the appropriate quotas, limit the number of displays, and
limit the memory used by displays (use /SIZE=n). Limit the size of
your plots with the following methods:

- Use limiting nodespecs. For example, if PROGRAM_ADDRESS BY LINE
  does not work, try MODULE foo BY LINE or ROUTINE fee BY LINE.

- Use the traverse commands after issuing
  PLOT/your_qualifiers PROGRAM_ADDRESS BY MODULE.

- Use the /NOZEROS, /MINIMUM, and /MAXIMUM qualifiers.

- Use filters with CALL_TREE nodespecs to reduce the number of call
  chains.

2 Virtual_Memory_in_the_Collector

The Collector is running out of virtual memory. What do I do?

If you are doing coverage or counter analysis, limit the number of
breakpoint settings by using MODULE BY LINE or BY CODEPATH node
specifications instead of PROGRAM_ADDRESS BY LINE or BY CODEPATH.
Then, do several collection runs to gather the data.

2 Some_Plots_Appear_So_Quickly

Why do some plots execute more quickly than others?
Some PLOT commands execute more quickly than others because PCA uses
all available information from the previous plot to produce the
requested one. For example, if you enter PLOT PROGRAM BY LINE and
then enter PLOT/DESCENDING, PCA only sorts the previous plot.
However, if you use a different nodespec, such as PLOT ROUTINE
foo$bar BY CODEPATH, PCA must rebuild its internal tables and read
the data again, which takes more time.

In addition, the number of filters and/or buckets you use affects
the time it takes to build a plot. This is because filters affect
the amount of data the Analyzer looks at, and because all buckets
must be searched for each data point.

2 Missing_Subroutine_Calls

Why don't I see all of my subroutine calls in a CALL_TREE plot?

It may be that your routine has a JSB linkage. See the discussion of
JSB linkages under CALL_Instruction_Gets_a_Spike.

2 Bad_Offsets_In_MACRO_Modules

Why do I get bad offsets when plotting MACRO modules by byte?

When plotting MACRO modules by byte, the offset is actually from the
beginning of the module, including the data psects. You get bad
offsets because the linker moves the psects around based on the
psect attributes. Thus, the offsets you get may have no relationship
to any listing you have. However, if you use PLOT ROUTINE foo BY
BYTE, the offsets are from the beginning of the routine. (This works
only if you have an .ENTRY foo ... directive in your program.)

2 LIB$FIND_IMAGE_SYMBOL

Can PCA measure shareable images activated "on the fly" with
LIB$FIND_IMAGE_SYMBOL?

Yes, if you relink against the image you want to activate. PCA uses
a structure built by the image activator to find all the shareable
image information it needs. By relinking against the image, you make
the image activator aware of the image, and LIB$FIND_IMAGE_SYMBOL
works.
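As a hedged sketch (all file and image names here are hypothetical),
the relink might use a linker options file that names the shareable
image, so that the image activator knows about it before
LIB$FIND_IMAGE_SYMBOL is called:

    $ CREATE MYSHR.OPT
    MYSHR/SHAREABLE
    $ LINK/DEBUG MAIN.OBJ, MYSHR.OPT/OPTIONS

This assumes MYSHR is a logical name or file specification that
resolves to the shareable image being activated on the fly.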