HSI Checksum Feature

HSI versions and beyond can store checksums as part of the HPSS metadata for files, verify the checksums during file retrievals, and list the checksum hashes.

Checksum Hashing Algorithms

HSI supports the following hashing algorithms:









The default algorithm is defined when the HSI package is built, but can be overridden in the public/private hsirc file(s) and via command line options on the <put> and <hashcreate> commands, as described below.

Transfer Rate Performance Impact

The checksum algorithms are very CPU-intensive. Although the checksum code is compiled with a high level of compiler optimization, transfer rates can be significantly reduced when checksum creation or verification is in effect. The amount of degradation depends on several factors, such as processor speed, network transfer speed, and the speed of the local filesystem.
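To get a rough feel for the CPU cost described above, the hashing rate of the client machine can be measured on its own, without any network or disk involved. The following is a minimal sketch using Python's standard hashlib module, not HSI's own checksum code; the function name is hypothetical, and only "md5" is named on this page (the other algorithm names are hashlib names used for comparison and are not confirmed HSI values).

```python
import hashlib
import time


def hash_throughput_mb_s(algorithm: str, total_mb: int = 64, chunk_mb: int = 1) -> float:
    """Return the approximate rate (MB/s) at which this CPU hashes in-memory data."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    hasher = hashlib.new(algorithm)
    start = time.perf_counter()
    for _ in range(total_mb // chunk_mb):
        hasher.update(chunk)
    hasher.hexdigest()  # finalize so the full hashing cost is included
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return total_mb / elapsed


if __name__ == "__main__":
    # "sha1" and "sha256" are hashlib names chosen for illustration only.
    for name in ("md5", "sha1", "sha256"):
        print(f"{name}: {hash_throughput_mb_s(name):.0f} MB/s")
```

If the measured hashing rate is well above the network and filesystem rates, the checksum is unlikely to be the bottleneck; if it is lower, transfers with checksums enabled can be expected to slow down toward that rate.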

New Commands

The following new commands were added as part of the checksum feature.  The minimum abbreviation is followed by the remainder of the full command in square brackets.

hashconv[ert] - [deprecated - will be removed in HSI 5] converts a VFS-style or HPSSSUM-style checksum hash to a standard-style hash

hashcr[eate] - creates a checksum hash for existing HPSS file(s)

hashdel[ete] - deletes checksum hash for file(s)

hashli[st] - lists the checksum hash for file(s)

lshash - this is an alias for the "hashlist" command

hashver[ify] - verifies checksum hash for file(s)

The usage syntax for all of these commands can be obtained interactively by issuing the command name by itself, for example:

 ? hashdel

Usage: hashdel[ete] [-h]  [-s] [-v][-R]  path ...

  -A  : display absolute pathname for files

  -h  : delete HSI-style checksum info (default)

  -s  : delete HPSSSUM-style checksum info

  -v  : delete VFS-style checksum info

  -R  : [standard option]recursively delete hash entries for files in the specified path(s)
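The bracket notation used in the command list above (for example, hashcr[eate]) can be read as "the part before the bracket is required, the part inside is optional". The sketch below shows one way to interpret that notation; it assumes that any prefix between the minimum abbreviation and the full command name is accepted, which this page does not state explicitly, and the function names are hypothetical.

```python
from typing import Tuple


def parse_abbrev(spec: str) -> Tuple[str, str]:
    """Split a spec such as 'hashcr[eate]' into (minimum, full command name)."""
    if "[" not in spec:
        return spec, spec  # no optional part, e.g. 'lshash'
    minimum, rest = spec.split("[", 1)
    return minimum, minimum + rest.rstrip("]")


def matches(typed: str, spec: str) -> bool:
    """True if 'typed' is at least the minimum abbreviation and a prefix of the full name."""
    minimum, full = parse_abbrev(spec)
    return typed.startswith(minimum) and full.startswith(typed)


if __name__ == "__main__":
    for word in ("hash", "hashcr", "hashcrea", "hashcreate"):
        print(word, matches(word, "hashcr[eate]"))
```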

New hsirc Options

The following new options have been added to the global and/or private hsirc files:

  cksum_enabled = on | off (default = off)

        Automatically enables or disables checksum creation for <put> commands and checksum verification for <get> commands.

  cksum_type = algorithm

        Specifies the checksum hashing algorithm to use when creating new checksums for <put> or <hashcreate> commands. Valid checksum values are shown at the top of this page. If not specified, the build-time default algorithm specified by the HSI administrator is used. The release default is "cksum_type = md5".

These options can be specified in either the global or site stanza(s), or both, within the hsirc file(s).  Site-stanza settings take precedence over global stanza settings, if both are specified.  In addition, the private .hsirc file settings for these options take precedence over the host-global hsirc file in $HPSS_CFG_FILE_PATH/hsirc, if one is found on the HSI client machine during startup.
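The precedence rules above can be modeled as a small lookup. This is a sketch only, not HSI's actual parsing logic: the function names and the dictionary representation of stanzas are hypothetical, and it assumes that site-over-global is resolved within each file before the private file is compared with the host-global file, an ordering this page does not spell out. The command-line override reflects the statement that command line options take precedence over the build-time default and hsirc settings.

```python
from typing import Mapping, Optional


def resolve_file(global_stanza: Mapping[str, str], site_stanza: Mapping[str, str]) -> dict:
    """Merge one hsirc file's stanzas; site-stanza settings win over global ones."""
    merged = dict(global_stanza)
    merged.update(site_stanza)
    return merged


def resolve_option(
    name: str,
    build_default: str,
    host_file: Mapping[str, str],
    private_file: Mapping[str, str],
    command_line: Optional[str] = None,
) -> str:
    """Pick the effective value: command line, then private file, then host file, then build default."""
    if command_line is not None:
        return command_line
    if name in private_file:
        return private_file[name]
    if name in host_file:
        return host_file[name]
    return build_default


if __name__ == "__main__":
    host = resolve_file({"cksum_enabled": "off", "cksum_type": "md5"}, {"cksum_enabled": "on"})
    private = resolve_file({}, {})
    print(resolve_option("cksum_enabled", "off", host, private))
```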

New Command Line Options for PUT and GET commands

The following command line options have been added to the PUT and GET commands:

-c on | off - enables or disables checksum creation and verification for the put and get family of commands, overriding the build-time option and hsirc setting.

-H algorithm - specifies the checksum hashing algorithm to use (PUT command). For GET commands, the hashing algorithm stored in HPSS metadata is used.

-Y style - [deprecated - will be removed in HSI 5] specifies the checksum hashing style to use (GET command). Valid styles are "hsi", "vfs", or "hpsssum" (these are case-insensitive). The default is "hsi". Note that only hsi-style checksums can be created by HSI. Also note that the correct style must be specified for HSI to find an existing checksum. The "hashlist -hsv" command may be used to list every style of checksum hash that is stored in HPSS metadata for a file.
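The reason the GET side does not need an algorithm option can be illustrated with a small model: the algorithm name is recorded next to the hash when the file is stored, so verification can recompute the hash with that same algorithm. The sketch below uses Python's hashlib and a hypothetical StoredChecksum record; it does not reflect the actual layout of HPSS metadata or HSI's implementation.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredChecksum:
    """Hypothetical stand-in for the checksum information kept with a file."""
    algorithm: str  # e.g. "md5"
    hexdigest: str


def compute_checksum(data: bytes, algorithm: str) -> StoredChecksum:
    """Store side: hash the data with the chosen algorithm and record both."""
    return StoredChecksum(algorithm, hashlib.new(algorithm, data).hexdigest())


def verify_checksum(data: bytes, stored: StoredChecksum) -> bool:
    """Retrieve side: recompute with the recorded algorithm and compare."""
    actual = hashlib.new(stored.algorithm, data).hexdigest()
    return actual == stored.hexdigest


if __name__ == "__main__":
    record = compute_checksum(b"payload", "md5")
    print(verify_checksum(b"payload", record))   # unchanged data
    print(verify_checksum(b"payl0ad", record))   # corrupted data
```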