Tests With WASD Under Heavy Load

Originally posted to the info-WASD mailing list by Gérard Labadie of HP France, <gerard.REMOVE.labadie@laposte.net>.
Minor editing and formatted into HTML.
13th February 2003
Thanks to Gérard.


Here are some results I obtained while benchmarking WASD with Apache Bench. They may be useful to other people.

  1. Using Supported Methods

    Alpha VMS 7.3-1, TCP/IP 5.3 ECO 1, WASD 7.2 (yes I know, very old), Apache Bench.

    Before benching, I check the output of

      $ ana/sys
      clue mem/look
    

    and note the number of valid elements shown for the Non-Paged Dynamic Storage Pool lookaside list of size 384 (Size: 384, Valid: xxxxx elements).

    A BG device needs 2 elements, so if I want to do a

      $ Apache_Bench -n 20000 -c 200 (-n is the number of requests)
    

    I will need 40000 elements. If I do not have them, my test will hang or fail with an exceeded quota error. I can pre-populate these lookaside lists with a program using EXE_STD$ALLOCBUF; the program NPP.C provides an example implementation. Some severe tests hit the limit of 10,000 BG devices, which can be seen with

      $ ana/sys
      tcpip sh inetcb/stat
    
    Check whether max_socket contains 2710 (hexadecimal), i.e. 10000 decimal (or even 10001).
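
    The sizing rule above (2 pool elements of size 384 per BG device) and the reading of max_socket can be checked with a little arithmetic. The following is only a rough sketch in Python, not part of the original post: the function names are hypothetical, and it assumes, as the text does, one BG device per request.

```python
# Rough sizing sketch. The function names are illustrative only; they are
# not part of WASD, Apache Bench or TCP/IP Services.

ELEMENTS_PER_BG_DEVICE = 2  # from the text: a BG device needs 2 elements of size 384


def elements_needed(requests: int) -> int:
    """Lookaside-list elements needed, assuming one BG device per request."""
    return requests * ELEMENTS_PER_BG_DEVICE


def shortfall(requests: int, valid_elements: int) -> int:
    """Elements that still have to be pre-populated (0 if there are enough)."""
    return max(0, elements_needed(requests) - valid_elements)


if __name__ == "__main__":
    print(elements_needed(20000))   # the -n 20000 run from the text: 40000
    print(shortfall(20000, 15000))  # 15000 is a made-up "Valid:" count
    print(int("2710", 16))          # max_socket shown in hex, converted to decimal
```

    With a made-up "Valid:" count of 15000, the sketch reports that 25000 more elements would have to be pre-populated before a 20000-request run.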

    I define a symbol, useful during the bench

      $ look :== pipe (wr sys$output """clue mem/lookaside""" ; wr sys$output """tcpip sh inetcb/stat""") | ana/sys | sea sys$pipe max_socket,"""Size:  384"""
    

    This will show 3 lines

    1. the number of elements in the lookaside list of size 384 in non-paged dynamic memory
    2. the same for paged memory (not relevant here)
    3. the values for active/peak/max sockets

    (it is possible to get only 2 lines with a more precise search)

    It is possible to reduce the inet parameter tcp_msl in order to speed up the "reusability" of the BG devices. The documentation says (the TCP/IP stack being a port from Tru64):

    http://www.tru64unix.compaq.com/docs/internet/TITLE.HTM

    http://www.openvms.compaq.com/doc/73final/6631/6631pro_002.html#index_x_114

    The TCP protocol includes a concept known as the Maximum Segment Lifetime (MSL). When a TCP connection enters the TIME_WAIT state, it must remain in this state for twice the value of the MSL; otherwise, undetected data errors on future connections can occur. The inet subsystem attribute tcp_msl determines the maximum lifetime of a TCP segment and the timeout value for the TIME_WAIT state. In some situations, the default timeout value for the TIME_WAIT state (60 seconds) is too large, so reducing the value of the tcp_msl attribute frees connection resources sooner than the default setting.

    Performance Benefits and Tradeoffs  You can decrease the value of the tcp_msl attribute to make the TCP connection context time out more quickly at the end of a connection. However, this will increase the chance of data corruption. You can modify the tcp_msl attribute without rebooting the system.

    When to Tune  Usually, you do not have to modify the timeout limit for the TCP connection context.

    Recommended Values  The value of the tcp_msl attribute is set in units of 0.5 second. The default value is 60 units (30 seconds), which means that the TCP connection remains in TIME_WAIT state for 60 seconds, or twice the value of the MSL. Do not reduce the value of the tcp_msl attribute unless you fully understand the design and behavior of your network and the TCP protocol. It is strongly recommended that you use the default value; otherwise, there is the potential for data corruption.
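
    The unit conversion described above is easy to get wrong, so here is a small sketch of it in Python. It is an illustration only: the helper names are hypothetical and are not a TCP/IP Services interface. It uses the two facts stated in the documentation: tcp_msl is expressed in units of 0.5 second, and TIME_WAIT lasts twice the MSL.

```python
# Unit-conversion sketch. The helper names are illustrative only; they are
# not a TCP/IP Services or sysconfig interface.

SECONDS_PER_UNIT = 0.5  # from the text: tcp_msl is set in units of 0.5 second


def msl_seconds(tcp_msl: int) -> float:
    """Maximum Segment Lifetime in seconds for a given tcp_msl setting."""
    return tcp_msl * SECONDS_PER_UNIT


def time_wait_seconds(tcp_msl: int) -> float:
    """Duration of TIME_WAIT in seconds: twice the MSL."""
    return 2 * msl_seconds(tcp_msl)


if __name__ == "__main__":
    # 60 is the documented default; 32 and 10 are values used in this post.
    for units in (60, 32, 10):
        print(units, msl_seconds(units), time_wait_seconds(units))
```

    For the default of 60 units this gives an MSL of 30 seconds and a TIME_WAIT of 60 seconds, matching the documentation quoted above.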

    Making Changes Permanent  Once a "good" value for your site has been selected, there are two methods.

    1. edit TCPIP$ETC:SYSCONFIGTAB.DAT
        inet: tcp_msl=32         !for example
      
    2. edit SYS$STARTUP:TCPIP$SYSTARTUP.COM adding these lines
        $ @sys$startup:tcpip$define_commands
        $ sysconfig -r inet tcp_msl=32
      
  2. Using Unsupported Methods

    First be sure you have enough NPAGEDYN and that CHANNELCNT is above 33000. If not, modify them and reboot.

    The numbering of BG devices is limited to 9999, but we should be able to use 32767 (of course this is unsupported). Using

      $ ana/sys
      read sys$system:tcpip$net_globals 
      read/image sys$system:tcpip$internet_services
      exam inet$gl_ptr_inetcb+inetcb$w_max_socket
    

    I can see this location, and so modify it (with a program, or using Delta). On my system, the address is 81A4EA40 (but it may change after a modification of a system parameter and a reboot, so you should always check). So I can put 32767 in this location.

      SDA> tcpip sh inetcb/stat 
    

    will show 32767 in max_socket. My tests have shown a limit at 16431 BG devices; in fact TCBHASHSIZE must be raised from the default of 16384 to 32767, so I put in the file TCPIP$ETC:SYSCONFIGTAB.DAT

      $ sysconfig -r inet tcbhashsize=32767
    

    Now I have been able to use 28768 BG devices (why not 32767? I do not know), but more tests will be needed, as it seems the counter of active sockets is wrong (it stays at 12342 when it should be about 30).

      $ tcpip sh dev /port=8080 
    

    during the bench always shows devices above 10000, even though all BG devices in the range 1-9999 seem to be available. Maybe the code of $ASSIGN, which should find the next available BG device, is modified for TCPIP.

    I have set tcp_msl to the unreasonable value of 10 (5 seconds) for my tests and had no data corruption. Your mileage may vary.