Tuesday, April 30, 2013

Solaris: using pmap to identify shared and private memory of a process

The Solaris operating system contains a number of nice commands to explore the status of processes.
They all access the process details maintained in the process directory /proc.

In this article I'll take a look at the pmap command (the link points to the current Solaris documentation at Oracle, which inherited Solaris when it acquired Sun Microsystems). pmap lets you investigate the memory layout of a process in various ways (refer to the link for examples). A user can investigate their own processes; the root user can investigate any process.

pmap output for one process

In the first part of the discussion I will look into the pmap details of one process. As the output below shows, pmap can answer these questions:
  • How much shared and private memory is a process using?
  • Which components are using how much memory? This can reveal a library, the stack or some other component as a memory eater.

    I am using the  pmap -x pid  command to get a listing of all components and their address mapping.

    Here is the  pmap -x  example for a simple 'sleep 50' command.

    647:    /bin/sleep 50
     Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
    08046000       8       8       4       - rw---    [ stack ]
    08050000       4       4       -       - r-x--  sleep
    08061000       4       4       -       - rw---  sleep
    08062000       8       8       -       - rw---    [ heap ]
    D1C90000      56      24       -       - r-x--  methods_unicode.so.3
    D1CAD000       4       4       4       - rwx--  methods_unicode.so.3
    D1CB0000    1772      36       -       - r-x--  de_DE.UTF-8.so.3
    D1E7A000       4       4       4       - rwx--  de_DE.UTF-8.so.3
    D1E80000    1080     664       -       - r-x--  libc.so.1
    D1F90000      24      12      12       - rwx--    [ anon ]
    D1F9E000      32      32      28       - rwx--  libc.so.1
    D1FA6000       8       8       8       - rwx--  libc.so.1
    D1FC0000       4       4       4       - rwx--    [ anon ]
    D1FC4000     160     160       -       - r-x--  ld.so.1
    D1FF0000       4       4       4       - rwx--    [ anon ]
    D1FF4000       4       4       -       - rwxs-    [ anon ]
    D1FFC000       8       8       8       - rwx--  ld.so.1
    D1FFE000       4       4       4       - rwx--  ld.so.1
    -------- ------- ------- ------- -------
    total Kb    3188     992      80       -
    
    

    Some Notes:

  • 647 in the first line is the process id.
  • The RSS (resident set size) column reports the physical memory in use, i.e. shared and private combined.
  • The Anon column reports the private memory, thus shared can be calculated as RSS - Anon.
  • The Mode column determines how to handle the various lines. The read bit is always set since it does not make sense to put something into memory which cannot be read. Looking at the write and execute bits there are these options:
    • r--: data, read-only
    • rw-: data
    • rwx: data, executable
    • r-x: this is code, it cannot be overwritten
    Things get a little more complex when considering that certain components appear more than once. If we look at libc.so.1 we see three occurrences, one of them in mode r-x (code) and the other two in mode rwx (writable data). You'll also note four entries for [ anon ], which is what pmap prints when it cannot find a common name for the entry in this address space.

    What I want to do now is simplify and condense the pmap output by

  • reducing the number of columns: Address, Kbytes and Locked will be skipped
  • replacing RSS by a column Shared
  • replacing Mode by a column Type which holds only two possible values: code or data
  • merging all data lines for a mapped file into one (possibly all r--, rw- and rwx lines of a file will be merged into one; this simple example contains only rw- and rwx lines, but more complex cases can contain all three)

    The table below shows calculations for libc.so.1 and [ anon ] rows: how to get from RSS and Anon to shared and private and how to merge multiple data lines into one. There should be only one code line anyway so nothing needs to be done here (other than maybe introduce a check to find out if this is really the case).

    Mapped file / mode   RSS  Anon  Shared  Private  Shared merged    Private merged     New name
    libc.so.1 r-x        664     0     664        0  664              0                  libc.so.1 code
    libc.so.1 rwx         32    28       4       28  4 = 4 + 0        36 = 28 + 8        libc.so.1 data
                           8     8       0        8
    [ anon ] rwx          12    12       0       12  4 = 0+0+0+4      20 = 12+4+4+0      [ anon ] data
                           4     4       0        4
                           4     4       0        4
                           4     0       4        0
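
    The arithmetic in the table can be sketched in a few lines of Perl. This is only an illustration and not part of the scripts in this article: the rows are hard-coded from the pmap listing above, and the helper name condense is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rows copied from the 'pmap -x' listing above: [ RSS, Anon, mode, mapped file ].
# A '-' in the Anon column is treated as 0.
my @rows = (
    [ 664, '-', 'r-x', 'libc.so.1' ],
    [  32,  28, 'rwx', 'libc.so.1' ],
    [   8,   8, 'rwx', 'libc.so.1' ],
    [  12,  12, 'rwx', '[ anon ]'  ],
    [   4,   4, 'rwx', '[ anon ]'  ],
    [   4,   4, 'rwx', '[ anon ]'  ],
    [   4, '-', 'rwx', '[ anon ]'  ],
);

# condense() is a hypothetical helper: it returns a hash keyed by
# "file type" whose values are the merged [ shared, private ] pairs.
sub condense {
    my @input = @_;
    my %merged;
    for my $row (@input) {
        my ($rss, $anon, $mode, $file) = @$row;
        $anon = 0 if $anon eq '-';
        my $type = $mode eq 'r-x' ? 'code' : 'data';
        $merged{"$file $type"}[0] += $rss - $anon;   # shared  = RSS - Anon
        $merged{"$file $type"}[1] += $anon;          # private = Anon
    }
    return %merged;
}

my %merged = condense(@rows);
for my $name (sort keys %merged) {
    printf "%8d %8d  %s\n", @{ $merged{$name} }, $name;
}
```

    The three printed rows correspond to the merged columns of the table: 4 and 20 for [ anon ] data, 664 and 0 for libc.so.1 code, 4 and 36 for libc.so.1 data.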

    Here is what I want the condensed 'pmap -x' output to look like:

    678:  /bin/sleep 50
          Shared      Private Type Mapped File
    ------------ ------------ ---- ----------
               4           20 data [ anon ]
               8            0 data [ heap ]
               4            4 data [ stack ]
               0            4 data de_DE.UTF-8.so.3
              36            0 code de_DE.UTF-8.so.3
               0           12 data ld.so.1
             160            0 code ld.so.1
               4           36 data libc.so.1
             664            0 code libc.so.1
               0            4 data methods_unicode.so.3
              24            0 code methods_unicode.so.3
               4            0 code sleep
               4            0 data sleep
    ------------ ------------ ---- ----------
             912           80      Total
    
    Looking at the total line you'll see that shared and private add up to 912 + 80 = 992, which is the RSS total in the original pmap output.

    Here is a little nawk script to show how it can be done.

    NR==1     { header = $0 }   # first line: pid and command
    $1~/----/ { exit }          # no more processing after the separator line
    NR>2      {
      # Capture 4 columns of interest
      rss = $3;     if(rss=="-")     rss = 0;
      private = $4; if(private=="-") private = 0;
      mode = substr($6,1,3);
      file = $7 " " $8 " " $9 " " $10;
    
      # Some calculations
      shared = rss - private;
      type   = "data"; if(mode=="r-x") type = "code";
    
      # Accumulate totals for each (file,type) combination
      sharedTotal[file,type] +=shared;
      privateTotal[file,type] +=private;
    }
    
    END {
      if( header=="" ) exit;
      print header;
      printf "%12s %12s %4.4s %s\n", "Shared", "Private", "Type", "Mapped File";
      printf "%12s %12s %4.4s %s\n", "------------", "------------", "----", "----------";
    
      shared = 0; private = 0;
      command = "sort +3";
      for( ij in sharedTotal ) {
        split(ij, a, SUBSEP);
        printf "%12d %12d %4.4s %s\n", sharedTotal[ij], privateTotal[ij], a[2], a[1] | command ;
        shared += sharedTotal[ij];
        private += privateTotal[ij];
      }
      close(command);
    
      printf "%12s %12s %4.4s %s\n", "------------", "------------", "----", "----------";
      printf "%12d %12d %4.4s %s\n", shared, private, "", "Total";
    
    }
    

    Note the interesting use of the pipe in printf "..." | command in the 'for' loop, which sorts the printed lines by mapped file name; this construct does not exist in the old awk. It is also necessary to close the pipe with close(command) before printing the footer lines. Otherwise sort would not see the end of its input until the program finishes, so the footer lines would appear first and the sorted lines would be printed at the very end.

    Naturally, bigger programs produce bigger pmap output: firefox, for example, created more than 600 lines.

    Comparing pmap for two (or more) processes

    Now what you really want to do is apply this memory check to all of your processes and do a comparison of the totals.

    When you compare the entries for two different processes, some of the mapped files will appear in both lists (libc.so.1 will probably be in every process map). Here the distinction between shared and private memory becomes significant. The private memory is really private and belongs to just one process, whereas the shared memory is shared between processes. The consequences for counting memory are: private memory can simply be counted per process and the total is the sum of all, whereas the shared memory of two processes is the sum of

  • the memory in common
  • the shared memory used by just the first process
  • the shared memory used by just the second process
    In order to determine that, one has to go through the list of mapped files and check for each of them whether it is unique to the process or shared with the second one.

    This idea can be applied to more processes too of course.
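
    One simple way to approximate this counting rule is sketched below in Perl. The two process maps contain made-up numbers in the condensed format from above, and combine is a hypothetical helper name. Private memory is added up, while for shared memory only the largest value seen per mapped file and type is counted once. This is an approximation, since the condensed output does not tell which pages two processes really have in common.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical condensed maps for two processes:
# "type file" => [ shared Kb, private Kb ]
my %proc_a = (
    'code libc.so.1' => [ 664,  0 ],
    'data libc.so.1' => [   4, 36 ],
    'code sleep'     => [   4,  0 ],
);
my %proc_b = (
    'code libc.so.1' => [ 700,  0 ],
    'data libc.so.1' => [   4, 40 ],
    'code vi'        => [ 200,  0 ],
);

# combine() is an illustrative helper.
# Shared memory: keep the biggest value per key, so a file mapped by
# several processes is counted only once.
# Private memory: plain sum over all processes.
sub combine {
    my @procs = @_;
    my %shared_max;
    my $private_total = 0;
    for my $proc (@procs) {
        for my $key (keys %$proc) {
            my ($shared, $private) = @{ $proc->{$key} };
            $shared_max{$key} = $shared
                if !exists $shared_max{$key} || $shared > $shared_max{$key};
            $private_total += $private;
        }
    }
    my $shared_total = 0;
    $shared_total += $_ for values %shared_max;
    return ($shared_total, $private_total);
}

my ($shared, $private) = combine(\%proc_a, \%proc_b);
printf "shared %d Kb, private %d Kb\n", $shared, $private;
```

    With these sample numbers the shared total is 700 + 4 + 4 + 200 = 908 Kb, not the naive sum 672 + 904 = 1576 Kb, while the private total is simply 36 + 40 = 76 Kb.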

    This little shell script runs the awk script from above (stored as pmapx.awk) for every pid belonging to USER and stores its output in a file pmapx.pid. Another awk script prints the 'Total' line of each of these files, sums up the values for shared and private memory and prints an overall total.

    #!/bin/sh
    
    PSLIST=`/bin/ps -u $USER -o pid | sed 1d`
    [ -z "$PSLIST" ] && exit 1
    
    # Run 'pmap -x' for each process and condense its output with the script above
    for pid in $PSLIST  ; do
      pmap -x $pid | nawk -f pmapx.awk > pmapx.$pid
    done
    
    # Sort the filenames numerically
    FILENAMES=`/bin/ls pmapx.* | sort -t. +1n`
    
    nawk '
    BEGIN { newFile = 1 }
    newFile==1 { 
      cmd = $0; 
      newFile = 0;
      next;
    }
    /^-----/ {
      # The dashed lines serve as separators
      pmap = 1 - pmap;   # pmap toggles between 1 (inside the table) and 0
      next
    }
    pmap==1 {
      # There is some pmap output to be parsed
      file = $4 " " $5 " " $6 " " $7;
      type = $3;
      # Find the biggest shared
      if( $1 > shared[type,file] ) shared[type,file] = $1;
    }
    /Total/ {
      # Use the 'Total' line to get the already accumulated private memory
      private += $2;
      printf "%12d %12d   %s\n", $1, $2, cmd;
      # Now expect a new file
      newFile = 1;
    }
    END {
      for( ij in shared )
        sharedTotal += shared[ij];
      printf "%12s %12s   %s\n", "------------", "------------", "---------------";
      printf "%12d %12d   %s\n", sharedTotal, private, "Total"
    }
    ' $FILENAMES
    

    This produces the following output (shortened a little).
    First it lists the pmap errors for processes which cannot be examined.
    Then the totals of the condensed 'pmap -x' files are shown together with process id and name.
    At the end there is a total line. As explained above, the total shared is not equal to the sum of the shared memory entries in the list, whereas the private total is equal to the sum of the private memory entries.

    pmap: cannot examine 627: permission denied
    ...
            1048           24   828:        /bin/ksh /usr/dt/bin/Xsession
            2180           84   863:        /usr/bin/iiimx -iiimd
            2740          556   864:        iiimd -nodaemon -desktop -udsfile /tmp/.iiim-andreash/:0.0 -vardir /ex
            3588         2552   867:        /usr/lib/gconfd-2 8
    ...
            2312           44   920:        /usr/dt/bin/sdt_shell -c unsetenv _ PWD;            unsetenv DT;
            1284           24   922:        -csh -c unsetenv _ PWD;             unsetenv DT;      setenv DISPLAY :
            1032           20   934:        /bin/ksh /usr/dt/config/Xsession2.jds
           15324          404   936:        /usr/bin/gnome-session
            1752           36   943:        /usr/bin/gnome-keyring-daemon
            3376          232   948:        /usr/lib/bonobo-activation-server --ac-activate --ior-output-fd=23
            5072          276   950:        gnome-smproxy --sm-client-id default0
           10120          436   952:        /usr/lib/gnome-settings-daemon --oaf-activate-iid=OAFIID:GNOME_Setting
            9556         2960   964:        /usr/bin/metacity --sm-client-id=default1
           15176        23648   1050:       /usr/bin/gnome-terminal
     ...
            1560           32   10640:      /bin/bash /usr/bin/firefox
            1588           28   10652:      /bin/bash /usr/lib/firefox/run-mozilla.sh /usr/lib/firefox/firefox-bin
           40628        98560   10656:      /usr/lib/firefox/firefox-bin
            1060           60   22265:      sh
            1248           48   28233:      csh
            1524           48   29139:      vi
    ------------ ------------   ---------------
           72260       148444   Total
    

    The root user could run this script for every user in order to get an overview of the whole system.

Monday, April 8, 2013

    String extracts in Perl with split, match and regular expressions

    Lately I had to solve the following issue:
    extract process id (pid) and program name from the header line of pmap.

    The strings can take these forms from simple to complex:

    123:     cmd
    123:     cmd -x foo
    123:     /usr/bin/cmd
    123:     /usr/bin/cmd -x foo
    
    and more complex with more parameters which are trickier to parse
    123:     /usr/bin/cmd -x /home/foo
    123:     /usr/bin/cmd -x 456: -d /home/foo
    
    i.e. very generally speaking there is a pid followed by a colon and then a more or less complex command line, where the program name can be fully qualified and carry a number of parameters. The last example deliberately introduces digits and a colon again as a parameter.

    Here is an attempt to describe the string in words, as a sequence of

  • a number of digits
  • a colon
  • a tab
  • a program name, optionally qualified
  • optionally: an arbitrary number of space separated parameters (could be multiple spaces)

    There are various solutions to this in Perl and here I'll show two.

    # Example string: pid, colon, tab (\t), command line
    $str = "123:\t/usr/bin/cmd -x /home/foo";
    
    # First I split the string using an optional colon :* 
    # and a sequence of white space \s+ as field delimiters.
    # This will give me the pid and the program name and strip off the parameters
    ($pid,$cmd) = split /:*\s+/,$str;
    
    # In case of a fully qualified program name 
    # everything up to the last slash needs to be removed
    $cmd =~ s/.*\///;
    
    print "pid = $pid  X  cmd = $cmd\n";
    

    Always looking for more concise code, I wondered whether these two lines couldn't be shortened. Here is a one-liner, which of course requires some explanation.

    # Example string: pid, colon, tab (\t), command line
    $str = "123:\t/usr/bin/cmd -x /home/foo";
    
    # I try to match the following regular expression
    #   a sequence of digits    (\d+)    which will become $1 if successful
    #   a colon and a tab
    #   an optional sequence of characters ending in slash   (\S+\/)*   
    #                which will become $2
    #   a sequence of characters   (\S+)    which will become $3
    # The remainder of the string is not important as 
    # we anchor the regular expression at the beginning.
    $str =~ /^(\d+):\t(\S+\/)*(\S+)/ ;
    
    print "pid = $1  X  cmd = $3\n";
    

    For readability I would have preferred the first code, but a closer look revealed a flaw: it does not handle incorrect strings. Assume the string below, where the colon is missing and an extra string sits between pid and program name.

    $str = "123 xyz        /usr/bin/cmd -x 456:  /home/foo";
    
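    The codes will result in
    # Code 1
    pid = 123  X  cmd = xyz
    
    # Code 2
    pid = /home/  X  cmd =
    
    In both cases the string is taken apart at the wrong place. Code 1 splits at the first white space and silently returns the stray string as the program name. In code 2 the regular expression does not match at all, and since Perl leaves $1, $2 and $3 untouched after a failed match, the print shows whatever an earlier successful match left behind, i.e. unforeseeable results.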
    I can turn the second code to my advantage, though, by applying a check.
    if( $str =~ /^(\d+):\t(\S+\/)*(\S+)/ ) {
      print "pid = $1  X  cmd = $3\n";
    }
    
    i.e. I use the values only when the regular expression has really matched. The check gives me assurance.
    I can't do this with the split in the first code other than by adding a post-check (does the pid really consist of digits, etc.), which would make the code longer.
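
    Such a post-check could look like the sketch below. The sub name parse_with_split and the particular checks are chosen here for illustration only; the sub returns an empty list when the line does not look like a pmap header. Note that checking the pid for digits is not enough for the incorrect string above, so the sketch also verifies that the pid is directly followed by a colon.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_with_split() is a hypothetical helper: split as in the first code,
# then validate the pieces before trusting them.
sub parse_with_split {
    my ($str) = @_;
    my ($pid, $cmd) = split /:*\s+/, $str;

    # Post-checks: the pid is numeric, the string starts with "pid:",
    # and there is a command at all
    return () unless defined $pid && $pid =~ /^\d+$/;
    return () unless index($str, "$pid:") == 0;
    return () unless defined $cmd && length $cmd;

    $cmd =~ s/.*\///;          # strip the directory part
    return ($pid, $cmd);
}

for my $str ("123:\t/usr/bin/cmd -x 456: -d /home/foo",
             "123 xyz        /usr/bin/cmd -x 456:  /home/foo") {
    if (my ($pid, $cmd) = parse_with_split($str)) {
        print "pid = $pid  X  cmd = $cmd\n";
    } else {
        print "not a pmap header: $str\n";
    }
}
```

    The first string is accepted with pid 123 and program name cmd; the second one is rejected because it does not start with "123:".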

    So I decided to use the regular expression in my code: it is still fairly readable since it extracts just three parts of the overall string.
    If I wanted to extract more, say five or eight components, I would probably fall back to the split and a subsequent validity check.

Thursday, April 4, 2013

    A general approach to command line switches and their default values in Perl

    In the UNIX world you'll rarely find a program which doesn't support a few or many arguments (or command line parameters) which influence the execution of the program.

    When Perl programs require arguments (one of the simplest cases: an input filename) one could inspect the @ARGV array (an approach which works well in easy cases) or one could turn to one of the Perl modules, in particular if the arguments are command line switches.

    In this article I will discuss a few types of command line switches and the logic behind them.

    What is a command line switch?

    Just to recap: a command line switch is traditionally denoted as a hyphen followed by a letter, optionally followed by a value, e.g.  -d or  -d 25 . Note the space between the switch and its value. Some programs require this space whereas others require the value to be attached to the switch like -d25 and still others allow both. Some programs allow switches to be concatenated like  -ltr instead of  -l -t -r . Others allow switches to be more than one letter. Some programs allow a switch to appear multiple times like -v in awk.
    Further complexities exist: one switch might override others. Some switches exclude each other mutually.
    All these cases would need to be handled properly.

    On top of that (I think it was) the GNU world introduced double hyphen switches with (usually) string switches e.g.  --verbose .

    In the remainder of this article I will only use the simple case of single letter switches with or without argument. I will be using Getopt::Std, one of the core Perl modules and its function getopts. Its basic usage is  getopts('ab:',\%opts); for two switches  -a and  -b foo.
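
    As a minimal sketch of this basic usage: parse_args below is a hypothetical wrapper and the sample command lines are made up. The wrapper only exists so that several command lines can be tried in one run; getopts itself always reads from @ARGV and removes the switches it has processed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# parse_args() is an illustrative wrapper around getopts('ab:', ...):
# -a is a boolean switch, -b requires a value (marked by the colon).
sub parse_args {
    local @ARGV = @_;            # getopts works on @ARGV
    my %opts;
    getopts('ab:', \%opts) or die "usage: prog [-a] [-b value] [file ...]\n";
    return (\%opts, [@ARGV]);    # switches found, remaining arguments
}

for my $cmdline ([qw(-a -b foo file.txt)], [qw(-b foo)], [qw(-abfoo)]) {
    my ($opts, $rest) = parse_args(@$cmdline);
    printf "%-20s a=%s b=%s rest=%s\n",
        "@$cmdline",
        $opts->{a} ? 'set' : 'unset',
        defined $opts->{b} ? $opts->{b} : '(none)',
        "@$rest";
}
```

    The last command line shows that Getopt::Std also accepts the concatenated form -abfoo, which it treats like -a -b foo.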

    Various types of command line switches with or without default values

    The typical distinction between command line switches is whether they are boolean (switched on or off) or carry an additional argument. Then there is the question: if a command line switch is absent should there be a default value used in the program?

    The following table explains the differences and shows a few examples.

    Switch         Example           Default
    -a             ...               false
    -d $HOME/tmp   output directory  /tmp
    -u joe,sandy   user list         current user
    -p 1507        process id        all processes

    Notes:

  • -a: A boolean switch by its very nature has a default value true or false which should be the opposite of what the switch intends to trigger.
  • -d: Certain things in the program require a default value e.g. the program needs to know where to store its output files. It's left to the programmer to decide which of the default values can be overruled by command line switches.
  • -u: Some command line switches can take more complex arguments, in this case a comma separated list of users. Its absence should be covered by a reasonable default value e.g. the current user.
  • -p: Some switches do specify a setting which acts as a filter or a kind of a restriction but its absence does not imply a default value but is somewhat vague. In the 'user list' example before another default behaviour could have been 'all users' instead of 'current user'.

    Rather than defining a list of variables to set the defaults like

    $OUTDIR = "/tmp";
    $USERS = $ENV{'USER'};
    ...
    
    and later somehow associating these variables with the switches, a (in my view) cleaner approach is to
  • define the defaults in a hash (the keys are the switches)
  • create a new hash (again with switches for keys) and set them to either the defaults or values supplied by the command line
    The following Perl program handles the cases above.
  • boolean switches and unspecified defaults are set to undef, all others are set to their reasonable default values.
  • the  ... ? ... : ... operator is used to set the actual variables
    (Getopt::Std sets boolean switches to 1, which represents true. The opposite (and default) could be anything that evaluates to false in an if(...) clause; I chose 'undef' rather than 0.)

    #!/usr/bin/perl
    use strict;
    
    use Getopt::Std;        # to process command line arguments
    
    # Define the defaults in a hash
    my %defaults;
    $defaults{"a"}  = undef;
    $defaults{"d"}  = "/tmp";
    $defaults{"u"}  = $ENV{'USER'};
    $defaults{"p"}  = undef;
    
    # Retrieve the command line switches into a hash
    # making sure which ones are boolean and which require an argument with ':'
    my %opts;
    getopts('ad:u:p:',\%opts);
    
    # Put either the default values or the command line switch arguments into a hash
    my %vars;
    foreach my $key (keys %defaults) {
      $vars{$key}    = exists $opts{$key} ? $opts{$key} : $defaults{$key} ;
    }
    
    # Test output: see what is contained in 'vars'
    foreach my $key (keys %vars) {
      print $key," ",$vars{$key},"\n";
    }
    print "\n";
    
    # Check decision tree for boolean and unspecified switches
    print "a is set\n" if( $vars{"a"} );
    print "p: all processes\n" unless( $vars{"p"} );
    

    If run without any command line switches:

    u andreas
    p 
    a 
    d /tmp
    
    p: all processes
    

    With -a and -u

    ... -a -u joe,sandy
    
    u joe,sandy
    p 
    a 1
    d /tmp
    
    a is set
    p: all processes
    

    With -p and -d

    ... -d $HOME/tmp -p 1507
    
    u andreas
    p 1507
    a 
    d /export/home/andreas/tmp
    

    With this general approach the single hash 'vars' contains all the information. Its contents can be used directly later in the program (like the -d output directory) or in a decision based on defined vs. undefined.

    Of course there are more issues like the ones mentioned above (e.g. conflicting switches) or the validity of values (e.g. does the output directory exist and is it writable?), but they need to be resolved somewhere else in the code.
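
    A validity check of this kind could be sketched as follows. The helper name check_vars, the error texts and the rules themselves (what counts as a valid user list or process id) are assumptions made for this example; the helper takes a reference to a hash like 'vars' and returns a list of problems, which is empty if all is well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# check_vars() is a hypothetical helper, not part of the program above.
sub check_vars {
    my ($vars) = @_;
    my @errors;

    # -d: the output directory must exist and be writable
    my $dir = defined $vars->{d} ? $vars->{d} : '';
    push @errors, "output directory '$dir' does not exist or is not writable"
        unless -d $dir && -w $dir;

    # -p: if given, the process id must be a positive integer
    push @errors, "process id '$vars->{p}' is not a number"
        if defined $vars->{p} && $vars->{p} !~ /^[1-9]\d*$/;

    # -u: a comma separated list of non-empty user names
    push @errors, "user list is empty or malformed"
        unless defined $vars->{u} && $vars->{u} =~ /^[^,\s]+(?:,[^,\s]+)*$/;

    return @errors;
}

# Sample values as they could come out of the 'vars' hash
my %vars = ( a => undef, d => '.', u => 'joe,sandy', p => 'abc' );
print "error: $_\n" for check_vars(\%vars);
```

    With these sample values the process id 'abc' is reported as invalid. Whether the current directory passes the -d check depends on where the sketch is run.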