HSI Checksum Feature

HSI versions and beyond can store checksums as part of the HPSS metadata for files, verify those checksums during file retrievals, and list the checksum hashes.

Checksum Hashing Algorithms

HSI supports the following hashing algorithms:









The default algorithm is defined when the HSI package is built, but can be overridden in the public/private hsirc file(s) and via command line options on the <put> and <hashcreate> commands, as described below.

Transfer Rate Performance Impact

The checksum algorithms are very CPU-intensive.  Although the checksum code is compiled with a high level of compiler optimization, transfer rates can be significantly reduced when checksum creation or verification is in effect.  The amount of degradation depends on several factors, such as processor speed, network transfer speed, and the speed of the local filesystem.
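To get a rough sense of the CPU cost on a given client machine, the hashing rate can be measured in isolation.  The sketch below uses Python's standard hashlib module rather than HSI's own checksum code, so the numbers are only indicative.  The function name, buffer sizes, and the algorithm names in the example loop are illustrative assumptions (they are hashlib names, not a statement of which algorithms HSI supports).

```python
import hashlib
import time


def hash_throughput_mb_s(algorithm: str, total_mb: int = 64, chunk_mb: int = 1) -> float:
    """Return the approximate hashing rate in MB/s for an in-memory buffer.

    Measures CPU cost only (no disk or network I/O), using hashlib,
    not HSI's internal checksum implementation.
    """
    chunk = bytes(chunk_mb * 1024 * 1024)  # zero-filled buffer of chunk_mb megabytes
    iterations = max(1, total_mb // chunk_mb)
    digest = hashlib.new(algorithm)

    start = time.perf_counter()
    for _ in range(iterations):
        digest.update(chunk)
    digest.hexdigest()  # finalize the hash
    elapsed = time.perf_counter() - start

    hashed_mb = iterations * chunk_mb
    return hashed_mb / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # hashlib algorithm names, chosen here only for illustration.
    for name in ("md5", "sha1", "sha256"):
        print(f"{name}: {hash_throughput_mb_s(name):.0f} MB/s")
```

If the measured hashing rate is well above the network or filesystem rate, the checksum step is unlikely to be the bottleneck on that machine; if it is lower, transfers with checksums enabled will probably be limited by the CPU.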

New Commands

The following new commands were added as part of the checksum feature.  The minimum abbreviation is followed by the remainder of the full command in square brackets.

hashcr[eate] - creates a checksum hash for existing HPSS file(s)

hashdel[ete] - deletes checksum hash for file(s)

hashli[st] - lists the checksum hash for file(s)

lshash - this is an alias for the "hashlist" command

hashver[ify] - verifies checksum hash for file(s)

The usage syntax for all of these commands can be obtained interactively by issuing the command name by itself, for example:

 ? hashdel

Usage: hashdel[ete] [-A] [-R]  path ...

  -A  : display absolute pathname for files

  -R  : [standard option] recursively delete hash entries for files in the specified path(s)

New hsirc Options

The following new options have been added to the global and/or private hsirc files:

  cksum_enabled = on | off (default = off)

        Automatically enables or disables checksum creation for <put> commands and checksum verification for <get> commands.

  cksum_type = algorithm

        Specifies the checksum hashing algorithm to use when creating new checksums for <put> or <hashcreate> commands.  Valid checksum values are shown at the top of this page.  If not specified, the build-time default algorithm specified by the HSI administrator is used.  The release default is "cksum_type = md5".

These options can be specified in either the global or site stanza(s), or both, within the hsirc file(s).  Site-stanza settings take precedence over global stanza settings, if both are specified.  In addition, the private .hsirc file settings for these options take precedence over the host-global hsirc file in $HPSS_CFG_FILE_PATH/hsirc, if one is found on the HSI client machine during startup.
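The precedence rules above can be modeled as a simple layered lookup.  The following is a minimal sketch, not HSI's actual parser: the function name, the dictionary layout, and the source labels are hypothetical.  It also assumes that any setting in the private .hsirc file outranks any setting in the host-global file, and that within each file the site stanza outranks the global stanza; the text above does not state how a host-global site stanza compares with a private global stanza.

```python
from typing import Mapping

# Sources ordered from lowest to highest precedence.
# ASSUMPTION: the private file as a whole outranks the host-global file.
PRECEDENCE = (
    "host_global",     # global stanza in $HPSS_CFG_FILE_PATH/hsirc
    "host_site",       # site stanza in $HPSS_CFG_FILE_PATH/hsirc
    "private_global",  # global stanza in the private .hsirc
    "private_site",    # site stanza in the private .hsirc
)


def resolve_option(
    name: str,
    settings: Mapping[str, Mapping[str, str]],
    build_default: str,
) -> str:
    """Return the effective value of an hsirc option.

    `settings` maps a source label from PRECEDENCE to the options found there.
    Later (higher-precedence) sources overwrite earlier ones; if no source
    sets the option, the build-time default is returned.
    """
    value = build_default
    for source in PRECEDENCE:
        candidate = settings.get(source, {}).get(name)
        if candidate is not None:
            value = candidate
    return value


if __name__ == "__main__":
    found = {
        "host_global": {"cksum_enabled": "off", "cksum_type": "md5"},
        "private_site": {"cksum_enabled": "on"},
    }
    print(resolve_option("cksum_enabled", found, build_default="off"))
    print(resolve_option("cksum_type", found, build_default="md5"))
```

In this model a command line option would simply be one more layer applied after all of the hsirc sources.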

New Command Line Options for PUT and GET commands

The following command line options have been added to the PUT and GET commands:

-c on | off - enables or disables checksum creation and verification for the put and get family of commands, overriding the build-time option and hsirc setting.

-H algorithm - specifies the checksum hashing algorithm to use (PUT command).  For GET commands, the hashing algorithm stored in HPSS metadata is used.
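Conceptually, verification on retrieval amounts to recomputing the hash of the retrieved data with the algorithm recorded alongside the file and comparing the result with the stored value.  The following is an illustrative sketch of that idea using Python's hashlib; it is not HSI's implementation.  The function names, and the assumption that the stored hash is available as a hexadecimal string, are hypothetical.

```python
import hashlib


def file_digest_hex(path: str, algorithm: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_retrieved_file(path: str, stored_algorithm: str, stored_hex: str) -> bool:
    """Return True if the local file matches the stored checksum.

    The algorithm comes from the stored metadata, not from the caller's
    current default, mirroring the GET behavior described above.
    ASSUMPTION: the stored hash is a hexadecimal string.
    """
    actual = file_digest_hex(path, stored_algorithm)
    return actual == stored_hex.strip().lower()
```

Because the algorithm travels with the stored hash, a file written with one algorithm can still be verified after the default algorithm has been changed.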