#!/bin/ksh
##
## SCRIPT NAME: find_big_ideas_datamgmt_fils4dataprod_dir
##
## WHERE : in $FEDIR/scripts where FEDIR=/apps/nns_com/fea
##
## Created 20Sep2000 from
##         /apps/ideas/cron/find_big_ideas_datamgmt_fils4parent
##############################################################################
## PURPOSE:
##   Generates a big-files report, sorted by file-size,
##   to help solve disk-space crunches in file-systems like
##   /data/subs, /data/genesis, or /data (the 3 /data file-systems
##   around 1999-mid2000).
##
##   Interactive. Prompts for two query parameters:
##      - a /data/ dir
##      - a threshold level in Megabytes (to define 'big' files).
##
##   Creates a sorted BIG file size listing -- of I-DEAS files, like
##   .mf1 and .mf2 model files -- in a set of
##           /data//ideas/team/datamgmt*
##   directories for a given 'product-line'.
##
##   Uses commands like 'find', 'ls', and 'sort' on SERVER=engprd00.
##
##############################################################################
## SIMILAR SCRIPTS:
##   Similar to the more general utility-script
##
##           diruse_files_all_levs_sizesort
##           [ and find_big_or_old_files4dir ]
##   in      /apps/nns_com/fea/scripts
##
##   which 1) handles multiple levels of a given directory (rather than
##            multiple directories at the same level, which we want),
##     and 2) allows the user (Admin) to run at any SGI host,
##            not just SERVER,
##            so that /local directories can be queried.
##
##   Some 'awk' reformatting of file-size to col.1 can be seen in
##           'diruse_files_all_levs_sizesort'
##   called by
##           /apps/nns_com/fea/scripts/diruse_files_all_levs_sizesort_bygui.
##
##   'diruse_files_all_levs_sizesort_bygui' is called by the 'spacetools'
##   toolchest drawer named
##           "Show FILE-SIZES 4aDir@AnyHost (ALL levels,SIZE-SORT)".
##
##   The main difference between this script and the "FILE-SIZES 4aDir -
##   ALL levels" utility is that this script is oriented
##   to SEVERAL (usually much less than 30) directories called
##   'ideas/team/datamgmt*' under a /data/ parent directory.
##
##   This script avoids finding files under the 'nearby'
##   'ideas/team/shared*' and 'ideas/team/projects*' directories.
##
##-------
##   One could use the 'spacetools' big/old-files option
##       "Show BIG&OLD SUBDIRS&FILES 4aDir@AnyHost ('local' filsdirs)"
##   to get a similar report to this, but the user would have to know
##   that (1) that utility could be used to satisfy a 'datamgmt'-type
##   query and (2) then key in long directory name like
##           /data/xxxx/ideas/team
##   (or know how to choose the appropriate dir with the BrowseDir or
##    ManyDir buttons).
##############################################################################
## THE DATA DIRECTORIES:
##
##   The commands
##           ls -d1 /data/*/ideas/team/datamgmt*
##     and   ls -d1 /data/*/*/ideas/team/datamgmt*
##   will show ESSENTIALLY ALL current (1999-2000)
##   '.../ideas/team/datamgmt' directories.
##
##   Note that the first asterisk (*) represents 'product-line'
##   names like (in alphabetical order)
##      - carrier
##      - cvn65
##      - cvn74
##      - cvn75
##      - cvn76
##      - cvn77
##      - cvn78
##      - genesis
##      - lsv1
##      - lsv2
##      - manufacturing
##      - nssn
##      - nuclear
##      - other
##      - ssn21
##      - ssn688
##      - subs
##      - test1
##      - test2
##      - test3
##      - training
##   And '*/*' gives, in 1999-2000,
##      - training_lan/nightschool
##      - training_lan/training
##
##   Note that the last asterisk (*) represents a number (1,2,...)
##   or null.
##
##   The lists generated by this script may be seen in file
##           ${REPORTDIR}/spac__ideas_datamgmt_filsiz.lis
##   where REPORTDIR=/usr/people/ideasadm/cleandir
##   and is a 1-level or 2-level name like the above,
##   with slashes replaced by underscores.
##
##   The report is size-sorted so that the largest files
##   appear at the top of the list.
##
##############################################################################
## USE OF THE LIST(s):
##
##   The(se) list(s) can be shown to 'FEA-Data-Coordinators' (and
##   'CAD-Data-Coordinators') in Exx departments or in the Engineering
##   Divisions (subs, carrier, nuclear, commercial) --- to be used to
##   get I-DEAS users to identify old model files to be deleted ---
##   especially at times of disk-space crunches in file-systems like
##   /data/subs, /data/genesis, or /data (around 1999-2000).
##
##   The model files should be deleted through I-DEAS to make sure
##   project meta-data file entries are removed, as well as the actual
##   files at the operating system level.
##
##   Unfortunately, for *users* to access the files through I-DEAS, they
##   need to know the project that the model file was/is in --- to answer
##   the initial I-DEAS prompt for project AND model file. If there is no
##   indication of the project in the model file name and if there are many
##   projects in a 'data-installation', it may be difficult
##   for user-groups to find the model files to remove them from their I-DEAS
##   projects.
##
##   An alternative would be for the I-DEAS Administrator group to create
##   utilities to either
##     1) identify the project that each model file is in (and provide
##        this utility or its output to user-groups)
##   or
##     2) establish an I-DEAS Project 'State' (called, say, 'Delete') and
##        set up jobs to be run periodically by 'ideasadm' --- to remove
##        the files in the 'Delete' state.
##   or
##     3) other?
##
##############################################################################
## ACCESSING/SHOWING THE LIST(S):
##
##   The entire file lists from this script can be shown via 'nedit -read'
##   --- or NNS 'xpg'.
##   ['nedit -read' handles very large lists better-faster.]
##
##   Then, *FILES LISTS* (size-sorted/age-sorted) for specific 'problem'
##   'datamgmt*' directories can be seen with 'spacetools' drawers ---
##   or nnsFEAmenu 'c' options.
##
##############################################################################
## CALLED BY:
##   This script is called by a drawer in 'spacetools' in $FEDIR/scripts,
##   actually in 'spacetools.chestdef' in $FEDIR/scripts,
##   with a drawer-name like
##      "BIG FILES in I-DEAS 'datamgmt#' DIRS @SERVER ('local' fils)"
##
##   Or an ideasadm cron job could be set up to generate a list periodically
##   if that were deemed more efficient than an 'on-line' query.
##   As user 'ideasadm', see 'crontab -l' on engprd00.
##
##   Further crontab documentation is in
##           /apps/ideas/cron/READMEcrontab_ideas_engprd00
##
##   [ That crontab README file may be shown via option 'h cr' of
##     the 'iad' (bmo01 I-DEAS Admin.) menu which is shown by
##     /apps/nns_com/fea/scripts/iad ]
##
##############################################################################
## MAINTENANCE HISTORY:
## Written by: B.Montandon O06   2May2000 First created as
##                       /apps/ideas/cron/find_big_ideas_datamgmt_fils4parent
##                                        based on 'creat_datadirsiz_lists'
##                                        in /apps/ideas/cron.
## Updated by: B.Montandon O06   8Mar2000 Break up first prompt to put intro to
##                                        the prompt on the screen for view
##                                        while the 'datamgmt' parent directory
##                                        list is built. Also add HIbold &
##                                        HIreset color hilites to prompts.
##
## Updated by: Blaise Montandon 08apr2004 To avoid a bug in new SGI 'sort'
##                                        in IRIX 6.5.22, changed several cases
##                                        of 'sort -k5nr' to
##                                        'sort +4 -5nr'.
##############################################################################

if test "$FEDIR" = ""
then
   FEDIR=/apps/nns_com/fea
fi

############################################################################
## SET ENV-VARS 'THISHOST' AND 'SERVER' --- for titles and to check
## whether this script is being run at the server.
############################################################################
## Below, we use 'rsh $SERVER' to execute the 'find -local' command,
## if this script is not already running on the server.
############################################################################

THISHOST=`hostname`
SERVER="engprd00"

############################################################################
## SHOW A CANDIDATE LIST OF DIRECTORIES, for choosing a
## 'parent' of I-DEAS 'datamgmt'-file directories.
############################################################################
## Create the directory list automatically, if feasible.
############################################################################
##
## The NNS I-DEAS data-installations were (May99)
## in directories with names of the form
##
##        '/data/*/ideas/team/datamgmt*'   and
##        '/data/*/*/ideas/team/datamgmt*'
##
## on the $SERVER server.
##
## Commands
##     ls -1dp /data/*/ideas/team/datamgmt* | grep '/$' | sed 's|/$||'
## AND
##     ls -1dp /data/*/*/ideas/team/datamgmt* | grep '/$' | sed 's|/$||'
##
## give/gave the directory names (without filenames). The 'ls -p' and
## the grep-sed pipe filter out filenames.
##
## In Apr2000, most names were of the form '/data/*/ideas/' --- one level
## between /data and /ideas.
##
## But 'ls -1d /data/*/*/ideas/' showed
##        /data/e46/Lessard/ideas
##        /data/e46/results/ideas
##        /data/training_lan/nightschool/ideas
##        /data/training_lan/training/ideas
##
## and 'ls -1d /data/*/*/ideas/team' showed
##        /data/training_lan/nightschool/ideas/team
##        /data/training_lan/training/ideas/team
##
############################################################################
############################################################################
## PRODdirLIST=`ls -1p /data | grep '/$' | sed 's|/$||'`
##
## gives names like
##
##   651
##   asf
##   carrier
##   configsc
##   cvn65
##   cvn##
##   e40
##   e46
##   foundry
##   genesis
##   lost+found
##   lsv1
##   lsv2
##   maestro
##   manufacturing
##   misc
##   nssn
##   nuclear
##   other
##   prod
##   simtec
##   ssn21
##   ssn688
##   subs
##   test#
##   training
##   training_lan
##   vividdb
##
## This includes directories that do not contain sub-directories like
## '/ideas/team/' and '/ideas/team/' and '/ideas/team/datamgmt*'.
############################################################################

echo "${HIbold}\
***************************************************************
Show BIG FILES of ALL 'datamgmt' sub-directories
for a given /data/ parent directory --- size-sorted:
***************************************************************
${HIreset}
This interactive utility prompts for two query parameters:
   - a /data/ dir
   - a threshold level in Megabytes (to define 'big' files).

It generates a big-files report, sorted by file-size.

The report shows I-DEAS work-in-process files, like .mf1 and .mf2
model files and checked-out .dwg binary Drafting files --- in a set of
      /data//ideas/team/datamgmt*
directories.

[If this utility proves better to use than the
 'Big/Old-SubDirs&Files 4aDir@AnyHost' utility with
 host=$SERVER and Dir=/data//ideas/team,
 then the following two prompts could be replaced by a GUI with
   - a listbox widget for selecting parent directory and
   - a slider-bar widget for selecting 'big' threshold.]
${HIbold}
Choose/paste an I-DEAS 'datamgmt' 'parent' directory
from the following list:${HIreset}"

# PRODdirLIST=`ls -1dp /data/*/ideas/team/datamgmt* | grep '/$' | sed 's|/$||'`

PRODdirLIST=`ls -1dp /data/*/ideas/team/datamgmt* | grep '/$' | \
             sed 's|/ideas/team/datamgmt.*/$||' | uniq`

PRODdirLIST=${PRODdirLIST}'
'`ls -1dp /data/*/*/ideas/team/datamgmt* | grep '/$' | \
             sed 's|/ideas/team/datamgmt.*/$||' | uniq`

echo "
$PRODdirLIST
${HIbold}
'Parent' Directory ===>${HIreset} \c"

read PRODdir

if test "$PRODdir" = ""
then
   exit
fi

############################################################################
## PROMPT FOR A Threshold file size, in Meg.
## Convert $BIGinMEG to $BIGinBYTES.
############################################################################

echo "${HIbold}
Enter a Threshold file size, in Meg.   Examples: 10 or 100 or 1 or 0

Megabytes ===>${HIreset} \c"

read BIGinMEG

if test "$BIGinMEG" = ""
then
   exit
fi

BIGinBYTES=`expr $BIGinMEG \* 1000000`

##################################################
## BUILD THE REPORT FILE NAME.
## Replace the '/'es in ${PRODdir} by '_'.
##################################################
## In /local/scratch/$USER if possible,
## otherwise in $HOME.
##################################################

. $FEDIR/scripts/set_localoutlist

REPORTSTR=`echo ${PRODdir} | sed 's|/|_|g'`

OUTLIST2=${OUTLIST}${REPORTSTR}_ideas_datamgmt_BIGfils

rm -f $OUTLIST2

############################################################################
## PREPARE REPORT HEADER FOR FILE SIZE LISTINGS
## -- for the /data//ideas/team/datamgmt# files.
############################################################################

echo "\
..................... `date '+%Y %b %d %a %T%p'` ......................

*BIG FILES* in DIRECTORIES ${SERVER}:$PRODdir/ideas/team/datamgmt*

'BIG' = FILES BIGGER THAN $BIGinMEG Meg.

FILE SIZES are shown in MEGabytes.   The list is SIZE-SORTED.

The list shows files 'local' to server ' $SERVER '

 Disk usage                                 Last-Modified
in MEGabytes  Permissions Owner    Group    Date-Time/Yr Filename
------------- ----------- -------- -------- ------------ ----------------------
GigMeg.KilByt
 |  |   |  |" > $OUTLIST2

#####################################################
## GENERATE THE BODY OF THE FILE-SIZE REPORT.
#####################################################
## NOTE: This does not give fully-qualified filenames,
##       so which datamgmt* directory contains each
##       file is not clear.
#####################################################
## ls -l ${PRODdir}/ideas/team/datamgmt* | sort -k5nr >> $OUTLIST2
##   or
## ls -l ${PRODdir}/ideas/team/datamgmt* | sort +4 -5nr >> $OUTLIST2
#####################################################

##################################################
## MAKE SURE THERE IS A $HOME/.rhosts FILE
## for the userid running this job.
###################################################
## NOT NEEDED, if we are running on SERVER.
###################################################
##
## # echo "+ $USER" > $HOME/.rhosts

. $FEDIR/scripts/mak_rhosts

###################################################
## FOR TESTING:
# set -x

#####################################################################
## AN ALTERNATIVE USING 'find' --- to get fully-qualified
## filenames --- AND to see files in lower directory levels,
## like in /data/test3/ideas/team/datamgmt.
#####################################################################
## Some possible options (stderr to /dev/null):
##
##     -exec ls -l {} \; 2> /dev/null" | sort ...
##
#####################################################################

rsh $SERVER \
   "find ${PRODdir}/ideas/team/datamgmt* -local -type f \
         -size +${BIGinBYTES}c \
         -exec ls -l {} \;" \
   | sort +4 -5nr | \
   awk '{printf ("%13.6f %-10s %-8s %-8s %-3s %2s %5s %s\n", $5/1000000, $1, $3, $4, $6, $7, $8, $9 )}' \
   >> $OUTLIST2

#  | sort -k5nr | \

## FOR TESTING:
# set -

##################################################
## ADD TRAILER TO REPORT.
##################################################

echo "\
 |  |   |  |
GigMeg.KilByt
------------- ----------- -------- -------- ------------ ----------------------
 Disk usage   Permissions Owner    Group    Date-Time/Yr Filename
in MEGabytes                                  Modified

..................... `date '+%Y %b %d %a %T%p'` ......................

The output above (files bigger than $BIGinMEG Meg) was created by script

   $0

For fast performance and low file I/O impact on the network,
the 'find' command in this script is run at the server $SERVER .

--------
A 'pipe' of several commands (find, sort, awk) was used, of the form:

   find ${PRODdir}/ideas/team/datamgmt* -local -type f \\
        -size +${BIGinBYTES}c -exec ls -l {} \; \\
        | sort +4 -5nr | awk '{printf ( ... )}'

where {} represents a filename.

The Unix 'find' command was used to travel through the directories
matching the mask
   ${PRODdir}/ideas/team/datamgmt*

It also travels through sub-directories, if any.
There generally are none under the datamgmt* directories.
An exception is below.

The 'find' command executes the 'ls -l {}' command to
   - provide a list with fully-qualified filenames AND
   - see files in lower directory levels.
     Example: VibroAcoustics files in /data/test3/ideas/team/datamgmt.

I.e. a list is produced that is suitable for size-sorting ---
retaining the identity of the parent directory in each filename.

In contrast, output of the 'ls -lR' command does not provide output
suitable for size-sorting.

See the script and its comments for actual commands used.
-----------------------------------------------------------------------
ACCESSING/SHOWING THE REPORT FILES:

This report file can be seen with 'nedit -read' or NNS 'xpg', namely:

   nedit -read $OUTLIST2
OR
   xpg $OUTLIST2

on host $THISHOST.

'nedit -read' is generally faster on really huge files, but
'xpg' has an easy-to-use (no setup required) print utility ---
and a phenomenally useful 'Show All Matches' button/function.

------------------------------------------------------------------
USAGE OF THESE REPORTS

These I-DEAS 'datamgmt' file reports (size-sorted) are meant to be used
IN OUT-OF-DISK-SPACE CONDITIONS in a /data file system
on the SERVER ' $SERVER '.

They can be used to alert System Administrators, Application
Administrators, and specific User Groups of the specific files,
with the most disk-space pay-back, to remove (via I-DEAS session,
to update the I-DEAS TDM = Team-Data-Management system properly).

.........................................................................
" >> $OUTLIST2

#####################################################################
## SHOW THE REPORT.
#####################################################################

echo "
*******************************************************
Close the report file window and this window will close.
*******************************************************
"

#####################################################################
## $FEDIR/scripts/shofil  Does not work in an 'xwsh' from toolchest.
## Apparently,
## because of '&' batch invocation of shofil.tk within this script.
#####################################################################
# $FEDIR/scripts/shofil $OUTLIST
#####################################################################

SHOFILENAME=$OUTLIST2
export SHOFILENAME

XLPHP_FORMAT="AV"
export XLPHP_FORMAT

## $FEDIR/tkGUIs/shofil.tk &
## DOES NOT SHOW UP in an 'xwsh'.

$FEDIR/tkGUIs/shofil.tk
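
#####################################################################
## ILLUSTRATIVE SKETCH --- NOT CALLED BY THE SCRIPT ABOVE.
#####################################################################
## The three pipe techniques used above (parent-directory list,
## Meg-to-bytes conversion, size-sort plus 'awk' reformat) could be
## wrapped in small functions so they can be tried on sample input
## without access to $SERVER.
##
## ASSUMPTIONS (none of these are in the original script):
##   - the function names datamgmt_parents, meg_to_bytes and
##     format_big_files are hypothetical;
##   - 'sort -k5,5nr' is used instead of 'sort +4 -5nr'; the script
##     deliberately avoids '-k' because of the IRIX 6.5.22 'sort' bug
##     noted in MAINTENANCE HISTORY, so this form suits other hosts only;
##   - filenames contain no blanks (the 'awk' step prints only $9).
#####################################################################

## Reads 'ls -1dp' output on stdin; keeps directory entries (trailing
## '/') and strips the '/ideas/team/datamgmt*' tail to leave the parents.
datamgmt_parents () {
   grep '/$' | sed 's|/ideas/team/datamgmt.*/$||' | uniq
}

## Prints $1 Megabytes as bytes, using the same 1000000 multiplier as
## the script. Returns 1, printing nothing, if $1 is empty or non-numeric.
meg_to_bytes () {
   case "$1" in
      ''|*[!0-9]*) return 1 ;;
   esac
   ## 'expr' exits with status 1 when its result is 0, so its output is
   ## passed through 'echo' to keep a zero threshold from looking invalid.
   echo `expr "$1" \* 1000000`
}

## Reads 'ls -l' lines on stdin, sorts them largest-first on the size
## field (col. 5) and moves the size, in Megabytes, to col. 1.
format_big_files () {
   sort -k5,5nr | \
   awk '{printf ("%13.6f %-10s %-8s %-8s %-3s %2s %5s %s\n", $5/1000000, $1, $3, $4, $6, $7, $8, $9 )}'
}

## Possible usage, at a host where the /data directories are visible:
##    ls -1dp /data/*/ideas/team/datamgmt* | datamgmt_parents
##    BIGinBYTES=`meg_to_bytes "$BIGinMEG"` || exit
##    find $PRODdir/ideas/team/datamgmt* -type f -size +${BIGinBYTES}c \
##         -exec ls -l {} \; | format_big_files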