intorods option reference
Source options
-o|--source_options: Comma-separated list of attr=value options for connecting to the source filesystem. The supported options are documented in Import data sources.
iRODS options
-d|--irods_options: Comma-separated list of attr=value options for connecting to the destination iRODS instance. If no options are supplied, the configured irods_environment.json is used. For supported options, refer to iRODS options.
-R|--resource: The iRODS resource to use for the destination data objects. If omitted, the configured default resource is used.
-S|--ssl: Use SSL to connect to iRODS.
Source select options
--scan: Scan the input directory for files to copy. This is the default, so this option is normally not needed. It is only present for backward compatibility.
--search: When specified, the source directory is searched for directories to import. This option works in combination with the -P option, which specifies a search pattern, and the -n option, which specifies the minimum search level.
-a|--minage: Minimum age of all objects in the input in order to start copying.
-A|--maxage: Maximum age of all objects in the input in order to start copying.
-f|--flag_filename: Name of a file that has to be present in the source directory in order to start copying. (See also -g.)
-g|--flag_age: Minimum age of the flag file in seconds (see the -f option) before it is considered valid and copying starts.
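The interplay of -f and -g can be pictured with a small sketch. It is not taken from the intorods source: the function names are hypothetical, and it assumes that "age" means the number of seconds since the flag file's last modification time.

```python
import os
import time


def is_old_enough(mtime, min_age_seconds, now):
    """Return True when at least min_age_seconds have passed since mtime."""
    return (now - mtime) >= min_age_seconds


def flag_file_ready(source_dir, flag_filename, min_age_seconds):
    """Check that the flag file exists and is old enough to be trusted.

    Assumption: age is measured from the file's modification time.
    """
    flag_path = os.path.join(source_dir, flag_filename)
    try:
        mtime = os.path.getmtime(flag_path)
    except OSError:
        # No flag file (or it cannot be read): copying should not start.
        return False
    return is_old_enough(mtime, min_age_seconds, time.time())
```

Under this reading, a flag file written a few seconds ago is ignored until it has aged past the -g threshold, which gives the producer of the data time to finish writing.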
-n|--skip_subdirs: In combination with the --search option, this skips the supplied number of subdirectory levels before scanning for the directory pattern. For example, if your import data is under /data/department1/datasets and /data/department2/datasets, you could use:
$ intorods --search -n 2 /data /demoZone/data
-O|--completion_avu: This option requires an argument, which is the attribute name of an AVU. If this attribute is present on the destination collection, the sync action is skipped. It is mainly used to mark imports as complete with the -m option and to skip them on subsequent runs.
-P|--folder_pattern: This option supplies a regular expression to match the import directory names against. It is only used in combination with the --search option. Example: to import directories generated by Illumina sequencers, which generally have names like 230517_NB502001_0011_AHK3YLAFX5, you can use:
$ intorods --search -P '[0-9]{6}_.{8}_[0-9]{4}_.{10}' /data /demoZone/data
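Before starting an import, it can help to try the pattern against a few directory names. The sketch below uses Python's re module with the pattern and the example name from this section. Whether intorods requires the whole name to match or only part of it is not stated here, so the use of re.fullmatch is an assumption, and the helper name is hypothetical.

```python
import re

# Pattern and sample name taken from the Illumina example in this section.
FOLDER_PATTERN = re.compile(r"[0-9]{6}_.{8}_[0-9]{4}_.{10}")


def looks_like_run_folder(name):
    # Assumption: the complete directory name has to match the pattern.
    return FOLDER_PATTERN.fullmatch(name) is not None


names = [
    "230517_NB502001_0011_AHK3YLAFX5",  # example name from the text
    "notes",                            # unrelated directory
    "230517_NB502001_11_AHK3YLAFX5",    # run counter too short
]
matching = [n for n in names if looks_like_run_folder(n)]
print(matching)  # ['230517_NB502001_0011_AHK3YLAFX5']
```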
--scan_filter_file: Name of a filter file that is used to filter files when scanning the input directory. See the section on filtering for the file format.
-T|--timestamp_age: TODO
-w|--last_write: For a file to be selected for copying, its last write has to be longer ago than the specified number.
Metadata options
-m|--metadata: This option adds metadata to a completely synchronized collection. It can be used multiple times, and it requires an arg=value parameter. For details, see the section on metadata.
Copy and compare options
-c|--checksum_file: The checksum file is expected to be in the source directory. It contains a list of files to sync, with corresponding checksums. Each line should contain a sha256 hash, a blank space, and the relative filename. See also Checksum files.
$ cat checksumfile
3e7ad645dd20348351d3a7ffa2a61b80b8944daf280a7a0089819d66fc705453 test_checksumfile_parsing.py
c2cba4b79d42a37717fff37c52808d09c6b08f24956f0905f4deaf33d4b76707 test_sync_functions.py
c57a3d4adbfb348d5f4db53b2ec0d90cbb4a758115251a5891d26739a40107dc test_intorods_functions.py
0f218d4f5147fec04ca763fa4a58e8288b070951e6aa462c691d52bb90671dd9 output2/file2
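A file in this format (sha256 hash, a blank space, relative filename) could be generated with a short script such as the sketch below. The helper names are hypothetical, and the file name checksumfile is simply the one used in the example. Leaving the checksum file itself out of the listing is an assumption, not documented behaviour.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(source_dir, skip_names=("checksumfile",)):
    """Return one 'hash relative/path' line per file below source_dir."""
    source = Path(source_dir)
    lines = []
    for path in sorted(source.rglob("*")):
        # Assumption: the checksum file itself is not listed.
        if path.is_file() and path.name not in skip_names:
            relative = path.relative_to(source).as_posix()
            lines.append(f"{sha256_of(path)} {relative}")
    return lines


def write_checksum_file(source_dir, name="checksumfile"):
    """Write the checksum lines to a file inside source_dir."""
    lines = checksum_lines(source_dir, skip_names=(name,))
    (Path(source_dir) / name).write_text("\n".join(lines) + "\n")
```

Calling write_checksum_file("/data/run1") would then leave a checksumfile next to the data, with paths relative to /data/run1 (the directory name is only an illustration).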
-cf|--checksum_file_format: This option is used to support some very specific checksum file formats. Normally, you will want to use the default, FILE_FORMAT_TEXT.
-cs|--checksum_file_schema: This option is only used with some very specific checksum file formats, and is not documented here.
-cf|--checksum_filter_file|--filter_file: Name of a filter file that is used to filter files in the checksum file. See the section on filtering for the file format.
-t|--copy_procs: Number of parallel copy processes to start.
-x|--verify_checksums: Verify the sha256 checksum of all files that are synchronized. If this option is not specified, files are compared by size and timestamp only. The most efficient way to use this option is:
1. Create an iRODS rule that automatically adds checksums to all data objects. (This is a good idea anyway!)
2. Create a checksum file for your source directory. See the section on checksum files.
-X|--exclude: Exclude these dirs/files from the synchronization. Can be supplied multiple times. This should be a regular expression that matches the relative path of the file(s) to exclude.
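To get a feel for how several exclude expressions act on relative paths, the following sketch filters a list of paths with Python's re module. It is only an illustration: the function name and sample paths are made up, and whether intorods anchors the expression at the start of the path or searches anywhere in it is not stated here, so the use of re.search is an assumption.

```python
import re


def filter_excluded(relative_paths, exclude_patterns):
    """Keep only the paths that match none of the exclude patterns."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    # Assumption: a pattern excludes a path if it is found anywhere in it.
    return [
        path
        for path in relative_paths
        if not any(regex.search(path) for regex in compiled)
    ]


paths = ["output1/file1", "output2/file2", "logs/run.log", "tmp/scratch.bin"]
kept = filter_excluded(paths, [r"^tmp/", r"\.log$"])
print(kept)  # ['output1/file1', 'output2/file2']
```

Anchoring the expressions explicitly with ^ and $, as in the sample patterns, keeps the result the same regardless of which matching mode is used.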
Logging options
--data_source_name: Data source name used when logging to syslog.
--debuglevel: A number in the range 1..5 that sets the debug level.
--syslog: Log to syslog.