2016/09/01
KAICHO: s_naray[at]yahoo[dot]co[dot]jp
※The mail address is not written in plain form, to avoid spam.

osw2csv - convert OSWatcher/ExaWatcher logs to csv files

■Abstract

osw2csv is a tool that generates csv (comma-separated values) files from the log files produced by OSWatcher or ExaWatcher, both provided by Oracle.

■Download

package name   SRPM package            RPM package (for RHEL/CentOS/OracleLinux 5/6/7/8)
osw2csv        osw2csv-3.1-3.src.rpm   osw2csv-3.1-3.noarch.rpm

■How to install

This is mainly for RHEL5/6/7/8, including RHEL clone distributions (ex. CentOS, Scientific Linux, Oracle Linux).

Just install the osw2csv package, for example:
 # rpm -Uvh osw2csv-3.1-3.noarch.rpm

You may need the perl module "Time::ParseDate" before installing osw2csv. The module can be installed from CPAN by running the command "cpan Time::ParseDate", or by installing the appropriate rpm package that includes it, as listed below.

Distribution   Package name (example)   Where to get it
EL5            perl-Time-ParseDate      External Package
EL6            perl-Time-modules        Standard Package
EL7            perl-Time-ParseDate      External Package
EL8            perl-Time-ParseDate      External Package
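
Before installing, it can be handy to check whether perl can already load Time::ParseDate. The following is a minimal Python sketch, assuming "perl" is on the PATH. The helper name perl_module_available is made up for this illustration and is not part of osw2csv. It relies only on the common "perl -MModule -e 1" idiom, which exits with a non-zero status when the module cannot be loaded.

```python
import subprocess


def perl_module_available(module, perl="perl"):
    """Return True if `perl -M<module> -e 1` succeeds, i.e. the module loads.

    Hypothetical helper for illustration; not part of osw2csv.
    """
    try:
        result = subprocess.run(
            [perl, "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # perl itself is not installed or not executable
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if perl_module_available("Time::ParseDate"):
        print("Time::ParseDate is available")
    else:
        print("Time::ParseDate is missing; install it from CPAN or an rpm package")
```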

■How to use

If you need csv files generated from all existing OSWatcher log files under /home/user/OSWatcherlog/, just run "osw2csv /home/user/OSWatcherlog". You will then get top.csv, iostat.csv, meminfo.csv, netstat-s.csv and netstat-i.csv in your current directory. The OSWatcher log files can be analyzed even when they are compressed with gzip/bzip2/xz. For example:

# osw2csv /home/user/OSWatcherlog
Searching top data files ... found 7 file(s)
generating './top.csv' by '/usr/bin/top2csv' ...
  entering directory /home/user/OSWatcherlog/oswbb_hostname_2015-02-26.tgz/oswtop
    hostname_top_15.02.24.0600.dat.gz
    hostname_top_15.02.24.0700.dat.gz
    hostname_top_15.02.24.0800.dat.gz
    hostname_top_15.02.24.0900.dat.gz
    hostname_top_15.02.24.1000.dat.gz
    hostname_top_15.02.24.1100.dat.gz
    hostname_top_15.02.24.1200.dat.gz
Searching meminfo data files ... found 7 file(s)
generating './meminfo.csv' by '/usr/bin/meminfo2csv' ...
  entering directory /home/user/OSWatcherlog/oswbb_hostname_2015-02-26.tgz/oswmeminfo
    hostname_meminfo_15.02.24.0600.dat.gz
    hostname_meminfo_15.02.24.0700.dat.gz
    hostname_meminfo_15.02.24.0800.dat.gz
    hostname_meminfo_15.02.24.0900.dat.gz
    hostname_meminfo_15.02.24.1000.dat.gz
    hostname_meminfo_15.02.24.1100.dat.gz
    hostname_meminfo_15.02.24.1200.dat.gz
Searching iostat data files ... found 7 file(s)
generating './iostat.csv' by '/usr/bin/iostat2csv' ...
  entering directory /home/user/OSWatcherlog/oswbb_hostname_2015-02-26.tgz/oswiostat
    hostname_iostat_15.02.24.0600.dat.gz
    hostname_iostat_15.02.24.0700.dat.gz
    hostname_iostat_15.02.24.0800.dat.gz
    hostname_iostat_15.02.24.0900.dat.gz
    hostname_iostat_15.02.24.1000.dat.gz
    hostname_iostat_15.02.24.1100.dat.gz
    hostname_iostat_15.02.24.1200.dat.gz
Searching netstat-s data files ... found 7 file(s)
generating './netstat-s.csv' by '/usr/bin/netstat-s2csv' ...
  entering directory /home/user/OSWatcherlog/oswbb_hostname_2015-02-26.tgz/oswnetstat
    hostname_netstat_15.02.24.0600.dat.gz
    hostname_netstat_15.02.24.0700.dat.gz
    hostname_netstat_15.02.24.0800.dat.gz
    hostname_netstat_15.02.24.0900.dat.gz
    hostname_netstat_15.02.24.1000.dat.gz
    hostname_netstat_15.02.24.1100.dat.gz
    hostname_netstat_15.02.24.1200.dat.gz
Searching netstat-i data files ... found 7 file(s)
generating './netstat-i.csv' by '/usr/bin/netstat-i2csv' ...
  entering directory /home/user/OSWatcherlog/oswbb_hostname_2015-02-26.tgz/oswnetstat
    hostname_netstat_15.02.24.0600.dat.gz
    hostname_netstat_15.02.24.0700.dat.gz
    hostname_netstat_15.02.24.0800.dat.gz
    hostname_netstat_15.02.24.0900.dat.gz
    hostname_netstat_15.02.24.1000.dat.gz
    hostname_netstat_15.02.24.1100.dat.gz
    hostname_netstat_15.02.24.1200.dat.gz
#
# ls -al *.csv
-rw-r--r-- 1 root root  1869968 Sep  1 16:38 iostat.csv
-rw-r--r-- 1 root root   170546 Sep  1 16:38 meminfo.csv
-rw-r--r-- 1 root root   139475 Sep  1 16:38 netstat-i.csv
-rw-r--r-- 1 root root   350505 Sep  1 16:38 netstat-s.csv
-rw-r--r-- 1 root root    71945 Sep  1 16:38 top.csv
#
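
Once the csv files exist, they can be loaded with any csv-aware tool. Below is a minimal Python sketch, assuming (as the csvgrep description in this document suggests) that the first row holds the column names and the first column of each row holds the time. The function name load_csv_columns and the sample column names are hypothetical and do not describe osw2csv's exact output.

```python
import csv
import io


def load_csv_columns(stream):
    """Read a csv stream into (times, columns).

    Assumption: row 0 is a header and column 0 is the time stamp.
    Returns the list of time strings and a dict mapping each remaining
    column name to its list of raw string values.
    """
    reader = csv.reader(stream)
    header = next(reader)
    names = header[1:]
    times = []
    columns = {name: [] for name in names}
    for row in reader:
        if not row:
            continue  # skip blank lines
        times.append(row[0])
        for name, value in zip(names, row[1:]):
            columns[name].append(value)
    return times, columns


# Made-up sample data standing in for a generated file such as meminfo.csv
sample = io.StringIO(
    "time,MemFree,Cached\n"
    "2015/02/24 06:00:00,1000,200\n"
    "2015/02/24 06:00:30,900,250\n"
)
times, columns = load_csv_columns(sample)
print(times[0], columns["MemFree"])
```

For a real file, pass open("meminfo.csv", newline="") instead of the in-memory sample.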

Detailed command-line syntax:

usage:  # /usr/bin/osw2csv [options] datadirs_or_datafiles_or_data_tar+gz/bz2/xz_files
description:
        This script generates csv files from the log files produced by
        OSWatcher/ExaWatcher in the designated directory, and writes them
        into the current directory.
        The generated files will be named "top.csv", "iostat.csv"...
        Data files may be compressed with gzip(.gz)/bzip2(.bz2)/xz(.xz).
options:
        -a <list_of_analysis>
        --analyze <list_of_analysis>
                This specifies which analyses to run.
                Currently a combination of the following, joined with ','.
                ex = "-a top,meminfo"
                default = "-a meminfo,iostat,netstat-s,netstat-i,top"
                list_of_analysis:
                    iostat    ... iostat info of whole system
                    meminfo   ... /proc/meminfo of whole system
                    netstat-s ... network info of whole system
                    netstat-i ... network info of each interfaces
                    top       ... top header info of whole system
                    slabinfo  ... slabinfo info of whole system
                    top<ELM>  ... shows <ELM> of each process from top data.
                                  <ELM> is one of the top headers, like VIRT,
                                  RES, SHR, %CPU or %MEM etc. ex. topVIRT, top%CPU
                    topNUM    ... analyzes process number from top, needs -w.
                    ps<ELM>   ... shows <ELM> of each process from ps data.
                                  <ELM> is one of the ps headers, like SZ, PID,
                                  PRI, WCHAN, or TIME etc.  ex. psSZ, ps%CPU
                    psNUM     ... analyzes process number from ps, needs -w.
        -w
        --wrapup
                Wrap up the data of processes which have the same name.
                This needs -a ps<ELM> or top<ELM>.
                This adds "wrapup" to the output filename, like "top%CPUwrapup".
        -o <output_csv_file>
        --output <output_csv_file>
                This specifies the output csv file which is generated by
                this command.
                default: "top.csv" for top, "iostat.csv" for iostat...
                Note: this needs the "-a" option, as only one
                output_csv_file can be specified.
        -d <output_directory>
        --directory <output_directory>
                This specifies the output directory.
                default: "./"(current directory).
        -n
        --nonamecheck
                This specifies "do not check whether filenames include the
                analysis type string".
                By default, this command checks the filename, file extension
                and directory name of each file to decide whether the file
                needs to be analyzed.
                ex: "abc-iostat-def/ghi-iostat-jkl.dat" is analyzed only as
                iostat.
                Note: this needs the "-a" option, as without the name check
                one file could be handled by several analyses.
        --dirnamecheck
                This specifies "check directory names which include analysis
                type of strings". By default, directory name is never checked.
        -t <temporary directory>
        --tempdir <temporary directory>
                This specifies the temporary directory into which archived
                files (ex. tar.gz) will be extracted.
                default: "."(current directory).
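
The options above can be combined on one command line. As a rough illustration, the Python sketch below assembles an argument list using only the -a, -w and -d options described in the usage text. The function build_osw2csv_command and the path /tmp/csv are hypothetical. The sketch only builds the command and does not run it.

```python
def is_per_process(name):
    """True for top<ELM>/ps<ELM> style names such as "top%CPU" or "psSZ"."""
    return any(name.startswith(p) and len(name) > len(p) for p in ("top", "ps"))


def build_osw2csv_command(inputs, analyses=None, output_dir=None, wrapup=False):
    """Build an argument list for osw2csv from the documented options.

    Hypothetical helper; it assembles the command but does not execute it.
    """
    if not inputs:
        raise ValueError("at least one data directory or file is required")
    command = ["osw2csv"]
    if analyses:
        # -a takes analysis names joined with ','
        command += ["-a", ",".join(analyses)]
    if wrapup:
        # per the usage text, -w needs -a ps<ELM> or top<ELM>
        if not any(is_per_process(a) for a in (analyses or [])):
            raise ValueError("-w needs -a with a top<ELM> or ps<ELM> analysis")
        command.append("-w")
    if output_dir:
        command += ["-d", output_dir]
    # options come first, data directories/files last
    command += list(inputs)
    return command


print(build_osw2csv_command(["/home/user/OSWatcherlog"],
                            analyses=["top", "meminfo"],
                            output_dir="/tmp/csv"))
```

On a machine where osw2csv is installed, the resulting list could be handed to subprocess.run(command, check=True).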

Special analyses for "top" and "ps" are also provided. They generate csv files which contain specific information (ex. VIRT/RES/SHR/%CPU/%MEM) for each process at each time from the top/ps log files. Note that generating csv files from large log files can take a long time and consume a lot of CPU and memory. Huge log files can occasionally require more than 8GB of memory!
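
Because these per-process files can be large, it may help to read them one row at a time rather than loading them whole. The Python sketch below finds the peak value of each column in a single pass. It assumes a header row of process names, the time in the first column, and numeric cells that may be empty where a process was absent. The function name, this layout, and the sample data are assumptions for illustration and do not describe osw2csv's exact output.

```python
import csv
import io


def column_peaks(stream):
    """Return {column_name: (peak_value, time_of_peak)} in one pass.

    Assumptions: row 0 is a header, column 0 is the time, and other cells
    are numbers or empty/non-numeric (those cells are skipped).
    """
    reader = csv.reader(stream)
    header = next(reader)
    peaks = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        time = row[0]
        for name, cell in zip(header[1:], row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # empty or non-numeric cell
            if name not in peaks or value > peaks[name][0]:
                peaks[name] = (value, time)
    return peaks


# Made-up sample standing in for a per-process file such as a top<ELM> csv
sample = io.StringIO(
    "time,oracle,sshd\n"
    "00:12:10,1500,20\n"
    "00:12:20,2600,\n"
    "00:12:30,2400,25\n"
)
for name, (value, when) in column_peaks(sample).items():
    print(name, value, when)
```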

Another small tool, "csvgrep", is also provided. It is a grep-like tool which picks out specific data (by time, by value etc.) from an existing csv file.

usage:  csvgrep [options] csv_files
description:
    This script simply "greps" csv files. Multiple options are combined as an "and" condition.
options:
    --time <regex>  or  --row <regex>
        find out the times(1st column) which the regex matches
    --timerange <range>
        find out the times(1st column) which the range includes.
        range ex. "2016/05/23 1:23:45 - 2016/05/24 2:34:56"
    --col <regex>
        find out the columns(except 1st column) which the regex matches
    --val <regex>
        find out the columns/times which the regex matches
    --valcmp <(<|<=|==|=>|>)number>
        find out the columns/times which have the value matches the comparison
    --diffcmp <(<|<=|==|=>|>)number>
        find out the columns/times which have the difference matches the
        comparison
    --invert-match or -v
        output if the conditions are not matched
    --quiet or -q
        quiet, suppress printing what this program is processing to STDERR.
ex.:
        Find the oracle processes whose VIRT memory increases or decreases
        by 1000kb or more between 00:12:1# and 00:12:3# in top-pVIRT.csv.

        # /usr/bin/csvgrep --time "00:12:[1-3]" --col "oracle" --diffcmp ">=1000" top-pVIRT.csv
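
To make the --timerange and --valcmp/--diffcmp argument formats more concrete, here is a small Python sketch that parses such strings. It is not csvgrep's implementation. The function names are hypothetical, and accepting both ">=" (used in the example above) and "=>" (listed in the usage text) is an assumption.

```python
import operator
import re
from datetime import datetime

# "=>" is listed in the usage text and ">=" appears in the example,
# so this sketch accepts both spellings (an assumption).
_OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    "=>": operator.ge,
    ">": operator.gt,
}


def parse_comparison(expression):
    """Turn a string such as ">=1000" into a predicate on numbers."""
    match = re.fullmatch(r"\s*(<=|>=|=>|==|<|>)\s*(-?\d+(?:\.\d+)?)\s*", expression)
    if match is None:
        raise ValueError("bad comparison: %r" % expression)
    compare = _OPERATORS[match.group(1)]
    threshold = float(match.group(2))
    return lambda value: compare(value, threshold)


def parse_timerange(text, fmt="%Y/%m/%d %H:%M:%S"):
    """Split "START - END" into two datetime objects."""
    start, end = text.split(" - ")
    return (datetime.strptime(start.strip(), fmt),
            datetime.strptime(end.strip(), fmt))


at_least_1000 = parse_comparison(">=1000")
start, end = parse_timerange("2016/05/23 1:23:45 - 2016/05/24 2:34:56")
print(at_least_1000(1500), at_least_1000(999))
print(start <= datetime(2016, 5, 23, 12, 0, 0) <= end)
```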

■License

WTFPL.

■Support

Support is available via a support bbs or e-mail.