#!/bin/ksh
##
## SCRIPT: host2host_filecnt_4subdirs_alllevs_locrmt_showBYgui
##
## Where:  in $FEDIR/scripts   where $FEDIR=/apps/nns_com/fea
##
##############################################################################
## PURPOSE:
##      For a given Host & Directory ( passed in vars $1 & $2 ),
##      this script creates a FILE-COUNTS report -- for ALL
##      sub-directories of the Host:Directory --- at all levels under
##      the directory.
##
##      For each sub-directory, the report shows TWO separate counts:
##      DIRECTORY-file-counts and NON-DIRECTORY-file-counts.
##      -----
##      This is meant mainly as a comparison/difference utility to look
##      for differences in two huge directory hierarchies. It is meant
##      as a first step --- by looking at summary file counts for
##      sub-directories, rather than huge lists of filenames & file-sizes
##      in those directories.
##
##      If differences in sub-directory file-counts are found,
##      then one can 'zero in' on specific sub-directories where there
##      are differences in counts --- to look at differences in two lists
##      of file-names & file-sizes, for the 2 differing sub-directories.
##
##      If files that should be equal in size are different in size, one
##      can 'zero in' further, if necessary, doing a 'diff' on the two
##      files.
##
##############################################################################
## PROCESSING LOGIC:
##      The Unix 'find' command, executing on 'Host' (local or remote),
##      is used to find all the sub-directories of Host:Directory.
##
##      And the 'find' command uses a '1-dir-level' find-cmd-utility script
##      'host2host_filecnt_4dir_1level_2stdout' in $FEDIR/scripts
##      to execute a combination of Unix commands ('ls', 'wc', 'echo',
##      'grep', 'sed') to provide the 2 file-counts for each sub-directory.
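##
##      ILLUSTRATIVE SKETCH (an assumption, NOT a copy of the real
##      'host2host_filecnt_4dir_1level_2stdout' script, whose exact
##      commands and output layout are not shown in this file):
##      The shell function below shows one way the 2 counts for a single
##      directory level could be produced with 'ls -Ap', 'grep', 'wc -l'
##      and 'sed'. The function name 'filecnt_1level_sketch' and the
##      SKETCH_* variables are hypothetical. The function is only defined
##      here; nothing in this script calls it.
##
filecnt_1level_sketch () {
   SKETCH_DIR="$1"
   ## 'ls -Ap' lists every entry except '.' and '..' (one per line when
   ## piped) and appends '/' to directory names. 'sed' strips the blank
   ## padding that some 'wc' implementations put in front of the count.
   SKETCH_DIRCNT=`ls -Ap "$SKETCH_DIR" | grep '/$' | wc -l | sed 's/ //g'`
   SKETCH_FILECNT=`ls -Ap "$SKETCH_DIR" | grep -v '/$' | wc -l | sed 's/ //g'`
   ## Assumed layout: two 14-wide count columns, then the directory name.
   printf '%14d %14d  %s\n' "$SKETCH_DIRCNT" "$SKETCH_FILECNT" "$SKETCH_DIR"
}
##
##      Possible use:   filecnt_1level_sketch /some/directory
##      Caveat: a filename containing an embedded newline would be
##      counted more than once by this line-counting approach.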
##
##---------------------------------------------------------------------------
## REMOTE HOST NOTE:
##      If 'Host' is a remote host that does not have access to the
##      $FEDIR/scripts directory, a copy of the 'find-cmd-utility' script
##      is made in a 'temp' directory on the remote host and the
##      'find' command is executed via 'rsh' with '-exec' executing
##      the temp-copy of the 'find-cmd-utility' script. Example:
##
##      dirHOST="$1"
##      DIRNAME="$2"
##      SCRIPT4FINDEXEC="/tmp/${USER}_host2host_filecnt_4dir_1level_2stdout"
##
##      rcp engprd00:$FEDIR/scripts/host2host_filecnt_4dir_1level_2stdout \
##          ${dirHOST}:$SCRIPT4FINDEXEC
##
##      rsh $dirHOST find $DIRNAME -type d -exec $SCRIPT4FINDEXEC {} \;
##---------------------------------------------------------------------------
##
##      This script puts the report (stdout from the 'find' command)
##      into a temp-file (on the host running this script)
##      and shows the report file with shofil_ver2.tk (the 'xpg' utility).
##
##      This script checks for 2 arguments (host & directory) and checks that the
##      host:directory exists. If any of these conditions are not satisfied,
##      this script pops an 'xconfirm' error msg and exits.
##
##      --------
##      NOTE1: The 'find'-command in this script is called either directly on
##             'this host' or via 'rsh' to a remote host.
##
##             In either case (local or remote execution of the 'find'-command),
##             the counts from THIS script are for the files in the
##             sub-directories of the directory, $2, 'ON' THE HOST, $1, RUNNING
##             THE 'find' COMMAND. ($2 may be NFS-mounted to the $1 host.)
##
##      NOTE2: Someday one might want to provide 3 SORT OPTIONS, in this
##             script that issues the 'find'-command:
##                full-file-name sort, file-cnt-sort, dir-cnt-sort
##
##             An indicator (NAME,FILECNT,DIRCNT) could be prompted-for
##             in an 'xconfirm' prompt from this script.
##
##             The 'sort' cmd could be run by this script, using
##             the stdout from the 'find'-command as input to 'sort', running
##             on 'this host'.
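##
##      ILLUSTRATIVE SKETCH for NOTE2 (NOT implemented in this script):
##      The function below shows how the 3 sort options might be mapped
##      to 'sort' keys. It assumes the layout that this script's own
##      'sort -k3' and 'awk' steps rely on: directory-count in column 1,
##      file-count in column 2, sub-directory name in column 3.
##      The function name 'sort_counts_sketch' is hypothetical, and
##      largest-count-first ordering is an assumption. The function is
##      only defined here; nothing in this script calls it.
##
sort_counts_sketch () {
   ## $1 is the indicator (NAME, FILECNT or DIRCNT); count lines are
   ## read from stdin and written, sorted, to stdout.
   case "$1" in
      FILECNT) sort -k2,2nr ;;   ## numeric, largest file-count first
      DIRCNT)  sort -k1,1nr ;;   ## numeric, largest dir-count first
      *)       sort -k3 ;;       ## default: by sub-directory name
   esac
}
##
##      Possible use:
##         find "$DIRNAME" -type d -exec $SCRIPT4FINDEXEC {} \; | \
##            sort_counts_sketch FILECNT > $FIND_OUT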
##
##############################################################################
## CALLED BY:
##      host2host_fileman.tk in $FEDIR/tkGUIs
##############################################################################
##
## CALL FORMAT (2 examples):
##
##   $FEDIR/scripts/host2host_filecnt_4subdirs_alllevs_locrmt_showBYgui \
##           cvxprd00 /CDMDATA/modcop
##
##   $FEDIR/scripts/host2host_filecnt_4subdirs_alllevs_locrmt_showBYgui \
##           engprd00 /data/cvn78/division/catia_files/00573
##
##############################################################################
## MAINTENANCE HISTORY:
## Written by: Blaise Montandon 05dec2001 Based on files-info script
##                                        'host2host_info4allfiles_4dir_showBYgui'
##                                        in $FEDIR/scripts
##                                        (includes rmt-dir-existence check)
##                                        and test scripts
##                                        'test_filecnt_4subidrs_find_alllevs'
##                                        & 'test_filecnt_4dir_1level'
##                                        in $FEDIR/Dscrtest.
## Updated by: Blaise Montandon 12dec2001 Add 'awk' to accumulate TOTAL
##                                        subdirs & files.
## Updated by: Blaise Montandon 11jan2002 Add quotes around $DIRNAME to
##                                        handle embedded blanks in dirname.
##############################################################################

THISHOST=`hostname`

#############################################################################
## Set scripts pathname in case this toolchest is not started from
## the nnsFEAmenu system --- and a $FEDIR utility script (or help) is needed.
#############################################################################

if test "$FEDIR" = ""
then
   FEDIR=/apps/nns_com/fea
   export FEDIR
fi

############################################################
## Assure that the DISPLAY variable is set.
## Set to do X-displays on 'this host'.
############################################################
#
# if test "${DISPLAY}" = ""
# then
#    DISPLAY="$THISHOST:0"
#    export DISPLAY
# fi

############################################################
## If this is a remote login,
## set X-display to the 'remote host'.
############################################################
# if test ! "$REMOTEHOST" = ""
# then
#    REMOTEHOST1=`echo $REMOTEHOST | cut -d. -f1`
#    DISPLAY="$REMOTEHOST1:0"
#    export DISPLAY
# fi

############################################################
## Set dirHOST & DIRNAME vars using $1 & $2 -- and check
## that they are not null.
############################################################

## FOR TESTING:
# set -x

dirHOST="$1"
DIRNAME="$2"

if test "$dirHOST" = ""
then
   xconfirm -c -header "FilCnt,AllSubdirs: NO_INPUT_warning" \
      -B Dismiss \
      -t "The script" \
      -t " $0" \
      -t "needs 2 inputs --- host name & directory name." \
      -t "" \
      -t "No host was specified. Exiting..." \
      -icon warning > /dev/null
      # -font $CONFIRM_FONT \
   exit
fi

if test "$DIRNAME" = ""
then
   xconfirm -c -header "FilCnt,AllSubdirs: NO_INPUT_warning" \
      -B Dismiss \
      -t "The script" \
      -t " $0" \
      -t "needs 2 inputs --- host name & directory name." \
      -t "" \
      -t "No directory was specified. Exiting..." \
      -icon warning > /dev/null
      # -font $CONFIRM_FONT \
   exit
fi

############################################################
## Check that the directory exists.
## If not, exit with msg.
############################################################
## Directory-Existence-Check is broken into two
## mutually-exclusive cases:
##   1) 'specified-host' is 'this-host'
##   2) 'specified-host' is NOT 'this-host' (the case of
##      a 'remote-host' directory)
############################################################

############################################################
## 1) 'specified-host' is 'this-host'
############################################################

if test "$dirHOST" = "$THISHOST"
then

   if test ! -d "$DIRNAME"
   then
      xconfirm -c -header "FilCnt,AllSubdirs: DIRECTORY_NOT_FOUND_warning" \
         -B Dismiss \
         -t "The directory $DIRNAME " \
         -t "is NOT KNOWN TO host $dirHOST." \
         -t "" \
         -t "*EXITING*." \
         -icon warning > /dev/null
         # -font $CONFIRM_FONT \
      exit
      # break
      # continue
   fi
   ## END OF if test ! -d "$DIRNAME"

fi
## END OF if test "$dirHOST" = "$THISHOST"

############################################################
## 2) 'specified-host' is NOT 'this-host'
############################################################

if test ! "$dirHOST" = "$THISHOST"
then

   ## WORKS, when $DIRNAME has no embedded blanks.
   # RETcode=`rsh $dirHOST \
   #    "if test -d $DIRNAME ; then echo 1 ; else echo 0 ; fi"`

   ## WORKS? when $DIRNAME has embedded blanks?
   RETcode=`rsh $dirHOST \
      "if test -d \"$DIRNAME\" ; then echo 1 ; else echo 0 ; fi"`

   ## $RETcode is quoted so that 'test' does not fail with a syntax
   ## error if 'rsh' returns nothing.
   if test "$RETcode" = "0"
   then
      xconfirm -c -header "FilCnt,AllSubdirs: DIRECTORY_NOT_FOUND_warning" \
         -B Dismiss \
         -t "The directory $DIRNAME " \
         -t "is NOT KNOWN TO host $dirHOST." \
         -t "" \
         -t "*EXITING*." \
         -icon warning > /dev/null
         # -font $CONFIRM_FONT \
      exit
      # break
      # continue
   fi
   ## END OF if test "$RETcode" = "0"

fi
## END OF if test ! "$dirHOST" = "$THISHOST"

################################################################
## CHECK THAT THE DIRECTORY IS NOT ONE OF SEVERAL HIGH-LEVEL
## DIRECTORIES THAT ARE KNOWN TO
## HAVE TENS OF THOUSANDS OF SUB-DIRECTORIES.
################################################################

if test \( "$DIRNAME" = "/" -o "$DIRNAME" = "/usr/people" \
        -o "$DIRNAME" = "/data" -o "$DIRNAME" = "/apps" \
        -o "$DIRNAME" = "/engprod/data" -o "$DIRNAME" = "/engprod/apps" \
        -o "$DIRNAME" = "/engprod" \)
then
   xconfirm -c -header "FilCnt,AllSubdirs: BIG_DIRECTORY_exit" \
      -B Dismiss \
      -t "The directory specified" \
      -t " $dirHOST : $DIRNAME" \
      -t "is a directory that typically contains MANY THOUSANDS of" \
      -t "sub-directories. It could take a long time to generate" \
      -t "the FILE-COUNTS-in-all-sub-directories report." \
      -t "And the report will likely be far more than you need." \
      -t "" \
      -t "Try a lower-level directory that is likely to have MANY HUNDREDS" \
      -t "of sub-directories (or fewer), rather than MANY THOUSANDS." \
      -t "" \
      -t "Exiting..." \
      -icon warning > /dev/null
      # -font $CONFIRM_FONT \
   exit
fi

############################################################
## WARN THE USER ABOUT HUGE DIRECTORIES. GIVE THEM
## A CANCEL OPTION.
############################################################

GO_NOGO=`xconfirm -c -header "FilCnt,AllSubdirs: BIG_DIRECTORY_warning" \
   -b CANCEL -B GO \
   -t "IF the directory specified" \
   -t " $dirHOST : $DIRNAME" \
   -t "is a directory that contains MANY THOUSANDS of" \
   -t "sub-directories (& even more files), it could take a long time" \
   -t "to generate the FILE-COUNTS-in-all-sub-directories report." \
   -t "" \
   -t "And the report will likely be far more than you need." \
   -t "" \
   -t "One option is to try a lower-level directory" \
   -t "that is likely to have MANY HUNDREDS of" \
   -t "sub-directories (or fewer), rather than MANY THOUSANDS." \
   -t "" \
   -t "Cancel or Go?" \
   -icon warning`
   # -font $CONFIRM_FONT \

if test "$GO_NOGO" = "CANCEL"
then
   exit
fi

################################################################
## CHECK THAT THE 'find'-COMMAND UTILITY SCRIPT IS KNOWN TO
## THE HOST $dirHOST --- if it is a remote host.
## If not, 'rcp' A COPY TO A TEMP DIRECTORY ON THE REMOTE HOST.
################################################################
## 'This host' is presumed to be an SGI machine or some other
## machine that 'sees' $FEDIR. Otherwise, this script would
## not be running, since it is in that directory.
## (If for some reason the check is necessary, this
## script-existence-check could also be done for 'this host',
## i.e. the host running this script -- by simply commenting
## out the 'if test ! "$dirHOST" = "$THISHOST"' statement
## and its 'then' and 'fi' statements.)
################################################################

SCRIPT4FINDEXEC="$FEDIR/scripts/host2host_filecnt_4dir_1level_2stdout"

if test ! "$dirHOST" = "$THISHOST"
then

   RETcode=`rsh $dirHOST \
      "if test -f $FEDIR/scripts/host2host_filecnt_4dir_1level_2stdout ; \
       then echo 1 ; else echo 0 ; fi"`

   if test "$RETcode" = "0"
   then
      SCRIPT4FINDEXEC="/tmp/${USER}_host2host_filecnt_4dir_1level_2stdout"
      rcp engprd00:$FEDIR/scripts/host2host_filecnt_4dir_1level_2stdout \
          ${dirHOST}:$SCRIPT4FINDEXEC
   fi
   ## END OF if test "$RETcode" = "0"

fi
## END OF if test ! "$dirHOST" = "$THISHOST"

############################################################
## NOT IMPLEMENTED; but something to consider:
## Ask for sort -- by name or file-cnt or dir-cnt?
############################################################
#
# SORT_TYPE=`xconfirm -c -header "SORT by NAME or FILE-CNT or DIR-CNT?" \
#    -b CANCEL -b DIR-CNT -b FILE-CNT -B NAME \
#    -t "In the report, sort the files of directory" \
#    -t "" \
#    -t " ${dirHOST}:$DIRNAME" \
#    -t "" \
#    -t "by NAME -OR- FILE-CNT -OR- DIR-CNT ?" \
#    -icon warning`
#
#    # -font $CONFIRM_FONT \
#
# if test "$SORT_TYPE" = "CANCEL"
# then
#    exit
# fi
############################################################

###################################################################
## PREPARE TEMP REPORT FILENAME -- in $OUTLIST.
## Also prep temp filename $FIND_OUT, to hold 'find' output.
###################################################################

. $FEDIR/scripts/set_localoutlist

FIND_OUT="${OUTLIST}_find"

##########################################################################
##########################################################################
## PREPARE THE REPORT -- by running the 'find' command on $dirHOST
## with stdout directed to $OUTLIST. But first, prepare a report heading.
##########################################################################
##########################################################################

#######################################################
## PREPARE THE 'ALL-SUBDIRS-FILE-CNTS' REPORT HEADING.
#######################################################

echo "
********************* `date '+%Y %b %d %a %T%p %Z'` ******************

FILE-COUNTS for all sub-directories under the directory

   ${dirHOST}:$DIRNAME

SORTED BY ** FULLY-QUALIFIED SUB-DIRECTORY NAME **.

This report was generated by running a combination of Unix commands,
including the 'ls -Ap' and 'wc -l' commands, on $dirHOST .

The report was assembled on and presented from $THISHOST .


     DIRECTORY  NON-DIRECTORY
   FILES COUNT    FILES COUNT SUB-DIRECTORY FILENAME
-------------- -------------- --------------------------------------------
" > $OUTLIST

#############################################
## RUN THE 'find' COMMAND -- LOCAL OR REMOTE
## --- with stdout directed to $FIND_OUT.
#############################################
## Sort on the 3rd column -- fully-qualified dirname.
#############################################

if test "$dirHOST" = "$THISHOST"
then

   ## FOR TESTING:
   # set -x

   ## FIND_OUT=`find "$DIRNAME" ...`

   find "$DIRNAME" -type d \
      -exec $SCRIPT4FINDEXEC {} \; | sort -k3 > $FIND_OUT

   ## FOR TESTING:
   # set -

else

   ## FOR TESTING:
   # set -x

   ## FIND_OUT=`rsh $dirHOST find $DIRNAME ....`

   ## WORKS, when $DIRNAME has no embedded blanks.
   # rsh $dirHOST "find $DIRNAME -type d \
   #    -exec $SCRIPT4FINDEXEC {} \;" | sort -k3 > $FIND_OUT

   ## WORKS? when $DIRNAME has embedded blanks?
   rsh "$dirHOST" "find \"$DIRNAME\" -type d \
      -exec $SCRIPT4FINDEXEC {} \;" | sort -k3 > $FIND_OUT

   ## FOR TESTING:
   # set -

fi
## END OF if test "$dirHOST" = "$THISHOST"

###########################################
## Add the 'find' output to the
## report file, $OUTLIST.
###########################################

cat $FIND_OUT >> $OUTLIST

###########################################
## Use 'awk' to create a TOTALS line in the
## report, $OUTLIST.
###########################################

cat $FIND_OUT | awk 'BEGIN {
   TOTsubdirs = 0
   TOTfiles = 0
   CNTlines=0
}
## START of awk main body
{
   # Skip recs with no fields.
   if ( NF == 0 ) {next}

   TOTsubdirs = TOTsubdirs + $1
   TOTfiles = TOTfiles + $2
   CNTlines = CNTlines + 1
}
## END OF awk main body
END {
   printf ("\nTOTALS (Subdirs,Files,Lines-count) \n\n%14d %14d %14d \n", \
      TOTsubdirs, TOTfiles, CNTlines)
}
## END OF 'END' section
## END OF *awk*' >> $OUTLIST

###########################################
## ADD A TRAILER TO THE REPORT-FILE.
###########################################

echo "
-------------- -------------- --------------------------------------------
     DIRECTORY  NON-DIRECTORY SUB-DIRECTORY FILENAME
   FILES COUNT    FILES COUNT

..................... `date '+%Y %b %d %a %T%p'` ............................

The output above was generated by the script

   $0

----------------
REPORT CONTENTS:

For a given Host:Directory, in this case,

   ${dirHOST}:$DIRNAME

this script creates a FILE-COUNTS report -- for ALL
sub-directories of the Host:Directory --- at ALL LEVELS under
the directory. (to the ends of the directory 'branches')

For each sub-directory, the report shows TWO separate file-counts:
DIRECTORY-file-counts and NON-DIRECTORY-file-counts.

NOTE: On the TOTALS line, the 'Lines' count is often ONE MORE THAN
      the Subdirs Total. 'Lines' count includes the 'parent'
      directory, $DIRNAME .
      But 'links' between directories may result in exceptions
      to this rule.

-------------
REPORT USAGE:

This utility is meant mainly as a comparison/difference utility to look
for differences in two large/huge directory hierarchies. It is meant
as a first step --- looking at summary file counts for
sub-directories, rather than huge lists of filenames & file-sizes
in those directories.

If differences in sub-directory file-counts are found -- when comparing
to a file-count report for another directory that should have the
same files -- then one can 'zero in' on specific sub-directories where
there are differences in counts --- to look at differences in two lists
of file-names (& file-sizes), for the 2 differing sub-directories.

If files that should be equal in size are different in size, one
can 'zero in' further, if necessary, doing a 'diff' on the two
files. Or, proceed to 're-equalize' the directory structure.

-----------------------------
SELECTING LINES OF THE REPORT -- via a string-pattern:

When you browse the report with the 'xpg' utility, you may want
to extract/see just those lines that contain a certain string.

To do that, you can use the 'ShowAllMatches' button with
the plus-or-minus N lines option, with N set to zero (0).

Example:
To see all report lines containing the string 'out', put the string
'out' (without the quotes) in the String entry field of the 'xpg' GUI.
Set N to 0 and click the 'ShowAllMatches' button.

-----------------
PROCESSING METHOD:

The script uses a 'find' command, on the host of the specified directory,
to find all the sub-directories of the directory.

For each sub-directory, this utility uses a combination of Unix commands
--- 'ls -Ap' and 'wc -l' and 'echo' and 'grep' and 'sort' ---
to generate a counts-line for each sub-directory.

----------------------------------------------------------------------------
" >> $OUTLIST

###################################################################
## SHOW the FILE-COUNTS report.
###################################################################

$FEDIR/scripts/shofil $OUTLIST