The SeqWare jar file uses a simple configuration file that has already been set up for you on the VM. By default it is located at ~/.seqware/settings. You can change this location using an environment variable:
SEQWARE_SETTINGS=~/.seqware/settings
This file contains the web address of the RESTful web service, your username and password, and your Amazon public and private keys, which allow you to push and pull data files to and from the cloud. The example settings file from the VM is shown below. It is ready to use on the VM, but this is where you would change settings if, for example, you set up the Web Service and MetaDB on another server, or you launched a VM on the cloud and wanted to use the local VM command line jar to control the remote server. Another common task is using the ProvisionFiles module (described later) to push data into and pull data out of the cloud; this is the file where you would supply the access and secret keys that you received when signing up for Amazon (keep those safe!). For this tutorial the config file available on the VM should be ready to go, so you will not need to modify it.
Note that the sections for the Oozie Workflow Engine, General Hadoop, Query Engine, and Amazon Cloud Settings are all optional, so they only need to be filled in for deployments of SeqWare that use these tools. Also note that, for security reasons, the settings file must be readable and writable only by its owner. Our tools will abort and refuse to run if this is not the case.
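The location and permission rules above can be checked before running any SeqWare tool. The following Python sketch is illustrative only: the helper names settings_path and is_owner_only_rw are hypothetical, and the exact permission test that the SeqWare tools perform is assumed here to be "owner may read and write, group and others have no access".

```python
import os
import stat


def settings_path():
    """Return the settings location, honouring SEQWARE_SETTINGS if it is set."""
    default = "~/.seqware/settings"
    return os.path.expanduser(os.environ.get("SEQWARE_SETTINGS", default))


def is_owner_only_rw(path):
    """True if the owner can read and write and group/others have no access.

    Assumption: this mirrors the check described in the text; the real
    SeqWare check may be stricter or looser.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    owner_rw = (mode & 0o600) == 0o600
    no_group_or_other = (mode & 0o077) == 0
    return owner_rw and no_group_or_other


if __name__ == "__main__":
    path = settings_path()
    if not os.path.exists(path):
        print("No settings file found at", path)
    elif is_owner_only_rw(path):
        print("Permissions look fine for", path)
    else:
        print("Permissions too open; consider: chmod 600", path)
```

If the check fails, restricting the file with chmod 600 is the usual way to give read and write access to the owner only.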
The format for the settings file is based on Java properties files.
# SEQWARE PIPELINE SETTINGS
# The settings in this file are tagged by when they are used.
# COMMON: Used by all components
# INSTALL: Used when installing a workflow bundle
# SCHEDULE: Used when a user wants to schedule a workflow run
# LAUNCH: Used when a workflow run is to be launched (or dry-run)
# DELETION: Used for the admin web service supporting deletion
#
# Remote users need COMMON and SCHEDULE.
# Workflow developers need COMMON and LAUNCH for testing.
# Administrators need COMMON, DELETION, and INSTALL.
# Cronjobs/daemon processes will need COMMON and LAUNCH.
# Keys that are required for a typical Oozie-sge installation with metadata via web service are marked as required.
# Note that this document was auto-generated using the UserSettingsPlugin
# COMMON
# Common Seqware settings
# required: SeqWare MetaDB communication method, can be 'database' or 'webservice' or 'inmemory' or 'none'
SW_METADATA_METHOD=webservice
# optional: Amazon cloud settings. Only used if reading and writing to S3 buckets.
AWS_ACCESS_KEY=FILLMEIN
# optional: Amazon cloud settings. Only used if reading and writing to S3 buckets.
AWS_SECRET_KEY=FILLMEIN
# COMMON_WS
# Seqware webservice settings. Only used if SW_METADATA_METHOD=webservice
# required: Specify the URL for the seqware-webservice
SW_REST_URL=http://localhost:8080/SeqWareWebService
# required: Specify the username for the seqware-webservice
SW_REST_USER=admin
# required: Specify the password for the seqware-webservice
SW_REST_PASS=admin@admin.com
# COMMON_DB
# Seqware database settings. Only used if SW_METADATA_METHOD=database and by the database check utility
# optional: JDBC user for the seqware metadb
SW_DB_USER=seqware
# optional: JDBC password for the seqware metadb
SW_DB_PASS=seqware
# optional: Host for the metadb
SW_DB_SERVER=localhost
# optional: database name
SW_DB=seqware_meta_db
# SCHEDULE_LAUNCH
# Settings used by scheduling and launching bundles
# required: the default engine to use if otherwise unspecified (one of: oozie, oozie-sge, whitestar, whitestar-parallel, whitestar-sge)
SW_DEFAULT_WORKFLOW_ENGINE=oozie-sge
# INSTALL_LAUNCH
# Settings used by both installing and launching bundles
# required: The directory containing bundle directories (into which bundle archives are unzipped)
SW_BUNDLE_DIR=/home/seqware/SeqWare/provisioned-bundles
# INSTALL
# Settings used to configure the installation of workflow bundles
# required: The directory containing bundle archives (into which a bundle archive is first copied during install)
SW_BUNDLE_REPO_DIR=/home/seqware/SeqWare/released-bundles
# optional: Default is to use compression, this can be set to OFF to disable compression
BUNDLE_COMPRESSION=ON
# LAUNCH
# Oozie engine settings. Only used for both 'oozie' and 'oozie-sge' engines.
# required: URL for the Oozie webservice
OOZIE_URL=http://localhost:11000/oozie
# required: HDFS directory for storing workflow xml
OOZIE_APP_ROOT=seqware_workflow
# required: Hadoop job tracker, used to schedule jobs for oozie-hadoop engine
OOZIE_JOBTRACKER=localhost:8021
# required: Hadoop name node, possibly redundant (should be refactored)
OOZIE_NAMENODE=hdfs://localhost:8020
# required: Hadoop queue onto which to schedule jobs
OOZIE_QUEUENAME=default
# required: Working directory where your workflow steps execute and where we store generated scripts and logs
OOZIE_WORK_DIR=/usr/tmp/seqware-oozie
# optional: Number of times that Oozie and Whitestar will retry user steps in workflows
OOZIE_RETRY_MAX=5
# optional: Minutes to wait before retry for user steps in workflows
OOZIE_RETRY_INTERVAL=5
# optional: Above this threshold, provision file events on the same job/workflow will be batched together
OOZIE_BATCH_THRESHOLD=10
# optional: Number of provision file events that should be batched together
OOZIE_BATCH_SIZE=100
# WHITESTAR
# WhiteStar engine settings. Only used for the 'whitestar' series of engines.
# optional: Restrict the number of parallel jobs invoked in WhiteStar to this amount of memory
WHITESTAR_MEMORY_LIMIT=2147483647
# LAUNCH
# Oozie engine settings. Only used for both 'oozie' and 'oozie-sge' engines.
# required: HDFS implementation class
FS.HDFS.IMPL=org.apache.hadoop.hdfs.DistributedFileSystem
# optional: Only used for 'oozie-sge' engine. Format of qsub flag for specifying number of threads. If present, ${threads} will be replaced with the job-specific value.
OOZIE_SGE_THREADS_PARAM_FORMAT=-pe serial ${threads}
# required: Format of qsub flag for specifying the max memory. If present, ${maxMemory} will be replaced with the job-specific value.
OOZIE_SGE_MAX_MEMORY_PARAM_FORMAT=-l h_vmem=${maxMemory}M
# ADMIN
# Settings used for administrators
# optional: In atypical environments, the default h_vmem constraint for SGE is too stringent. Override them using this (units in megabytes)
SW_CONTROL_NODE_MEMORY=3000
# optional: Location of the admin web service, currently used for deletion
SW_ADMIN_REST_URL=http://localhost:38080/seqware-admin-webservice
# optional: Used to override the JUnique lock used to ensure that utilities don't run concurrently
SW_LOCK_ID=seqware
# optional: Legacy key used to encrypt provisioned files
SW_ENCRYPT_KEY=seqware
# optional: Legacy key used to decrypt provisioned files
SW_DECRYPT_KEY=seqware
# LAUNCH
# Oozie engine settings. Only used for both 'oozie' and 'oozie-sge' engines.
# optional: Used to determine whether provisioned (out) files should be run through MD5 before and after provisioning
SW_PROVISION_FILES_MD5=true
# TESTING
# Used for regression testing
# optional: Used to designate a database for integration tests
BASIC_TEST_DB_HOST=localhost
# optional: Used to designate a database name for integration tests
BASIC_TEST_DB_NAME=seqware_meta_db
# optional: Used to designate a database username for integration tests
BASIC_TEST_DB_USER=seqware
# optional: Used to designate a database password for integration tests
BASIC_TEST_DB_PASSWORD=seqware
# optional: Used to designate a database for extended integration tests
EXTENDED_TEST_DB_HOST=localhost
# optional: Used to designate a database name for extended integration tests
EXTENDED_TEST_DB_NAME=seqware_meta_db
# optional: Used to designate a database username for extended integration tests
EXTENDED_TEST_DB_USER=seqware
# optional: Used to designate a database password for extended integration tests
EXTENDED_TEST_DB_PASSWORD=seqware
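Because the settings file follows the Java properties conventions, a small script can read it for sanity checks. The sketch below is a simplified reader, not a full properties parser: it only handles KEY=VALUE lines and # comments, and ignores escapes and line continuations. The function names are hypothetical, and the list of required keys is taken from the "required" comments in the COMMON_WS section of the example file.

```python
def parse_settings(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignored in this sketch
        settings[key.strip()] = value.strip()
    return settings


def missing_webservice_keys(settings):
    """Return the web service keys that are absent or empty.

    Only applies when SW_METADATA_METHOD=webservice.
    """
    if settings.get("SW_METADATA_METHOD") != "webservice":
        return []
    required = ["SW_REST_URL", "SW_REST_USER", "SW_REST_PASS"]
    return [key for key in required if not settings.get(key)]


# A shortened excerpt of the example file, with SW_REST_PASS left out on purpose.
EXAMPLE = """\
# COMMON
SW_METADATA_METHOD=webservice
SW_REST_URL=http://localhost:8080/SeqWareWebService
SW_REST_USER=admin
OOZIE_SGE_MAX_MEMORY_PARAM_FORMAT=-l h_vmem=${maxMemory}M
"""

if __name__ == "__main__":
    parsed = parse_settings(EXAMPLE)
    print(parsed["OOZIE_SGE_MAX_MEMORY_PARAM_FORMAT"])
    print(missing_webservice_keys(parsed))
```

Splitting on the first = only matters for values such as OOZIE_SGE_MAX_MEMORY_PARAM_FORMAT, whose value contains an = of its own.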
In addition to the user’s ~/.seqware/settings file, the only other configuration is that required for automatic retry. As with the Pegasus workflow engine, it is possible to control the number of attempts that should be made before a job in a workflow is considered failed.
Edit the Oozie site XML and add the following properties, or add to the error codes already listed there.
<property>
  <name>oozie.service.LiteWorkflowStoreService.user.retry.error.code.ext</name>
  <value>SGE137</value>
</property>
<property>
  <name>oozie.service.LiteWorkflowStoreService.user.retry.max</name>
  <value>30</value>
</property>
After restarting, Oozie will use the listed error codes in combination with the OOZIE_RETRY_MAX parameter to determine how many times steps will be retried in case of a specific error. For example, with the configuration above, jobs that return with an SGE error code of SGE137 will automatically be retried 30 or OOZIE_RETRY_MAX times, whichever is higher. The actual error codes will likely depend on your site.
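The retry rule described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not SeqWare or Oozie code. It assumes that the two properties sit inside the usual configuration root element of the Oozie site XML and that multiple error codes are separated by commas; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

PREFIX = "oozie.service.LiteWorkflowStoreService.user.retry."

# The two properties from the text, wrapped in an assumed <configuration> root.
OOZIE_SITE = """<configuration>
  <property>
    <name>oozie.service.LiteWorkflowStoreService.user.retry.error.code.ext</name>
    <value>SGE137</value>
  </property>
  <property>
    <name>oozie.service.LiteWorkflowStoreService.user.retry.max</name>
    <value>30</value>
  </property>
</configuration>"""


def read_site_properties(xml_text):
    """Return a dict of name -> value for every <property> element."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def retry_codes(site_props):
    """Split the extra error codes (assumed to be comma-separated)."""
    raw = site_props.get(PREFIX + "error.code.ext", "")
    return [code.strip() for code in raw.split(",") if code.strip()]


def effective_retries(site_props, oozie_retry_max):
    """Apply the rule from the text: the higher of the two limits wins."""
    site_max = int(site_props[PREFIX + "max"])
    return max(site_max, oozie_retry_max)


if __name__ == "__main__":
    props = read_site_properties(OOZIE_SITE)
    print(retry_codes(props))
    print(effective_retries(props, oozie_retry_max=5))
```

With the example values (a site maximum of 30 and OOZIE_RETRY_MAX=5 from the settings file), the higher limit of 30 applies.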
For versions of the oozie-sge plugin from 1.0.3 onwards, two kinds of error codes are possible. Error codes of the form SGE[0-9]+ refer to the exit status of the actual Bash scripts that form steps in your workflows. Error codes of the form SGEF[0-9]+ refer to the failure code of the SGE infrastructure itself.
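The two code families can be told apart with a pair of regular expressions taken directly from the patterns above. The Python sketch below is illustrative; the function name classify_error_code and the labels "script" and "infrastructure" are hypothetical.

```python
import re

# Patterns as given in the text: SGE[0-9]+ and SGEF[0-9]+.
SCRIPT_CODE = re.compile(r"SGE([0-9]+)")
INFRA_CODE = re.compile(r"SGEF([0-9]+)")


def classify_error_code(code):
    """Return (kind, number) for an oozie-sge error code.

    kind is "script" for the exit status of a workflow step's Bash script
    and "infrastructure" for a failure code of SGE itself.
    """
    match = INFRA_CODE.fullmatch(code)
    if match:
        return ("infrastructure", int(match.group(1)))
    match = SCRIPT_CODE.fullmatch(code)
    if match:
        return ("script", int(match.group(1)))
    raise ValueError("not an oozie-sge error code: %r" % code)


if __name__ == "__main__":
    for code in ("SGE137", "SGEF26"):
        print(code, classify_error_code(code))
```

Using fullmatch keeps the two families disjoint, since the letter F can never match the digits expected after SGE.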
For example, the following output from “qacct -j” refers to a workflow step which failed with an error code of 1 (which would correspond to SGE1 for the Oozie XML parameter above).
$ qacct -j 3702
==============================================================
qname main.q
hostname master
group seqware
owner seqware
project NONE
department defaultdepartment
jobname annotate_5
jobnumber 3702
taskid undefined
account sge
priority 0
qsub_time Fri Aug 29 16:40:08 2014
start_time Fri Aug 29 16:40:20 2014
end_time Fri Aug 29 16:40:21 2014
granted_pe NONE
slots 1
failed 0
exit_status 1
ru_wallclock 1
ru_utime 1.468
ru_stime 0.072
ru_maxrss 112212
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 42375
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 168
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 726
ru_nivcsw 269
cpu 1.540
mem 0.306
io 0.006
iow 0.000
maxvmem 557.734M
arid undefined
The following output from “qacct -j” refers to a workflow step where the actual qsub failed since a logging directory was unavailable (leading to an Eqw state). This would correspond to an Oozie error code of SGEF26.
$ qacct -j 3801
==============================================================
qname main.q
hostname master
group seqware
owner seqware
project NONE
department defaultdepartment
jobname start_0
jobnumber 3801
taskid undefined
account sge
priority 0
qsub_time Fri Sep 12 15:03:02 2014
start_time -/-
end_time -/-
granted_pe NONE
slots 1
failed 26 : opening input/output file
exit_status 0
ru_wallclock 0
ru_utime 0.000
ru_stime 0.000
ru_maxrss 0
ru_ixrss 0
ru_ismrss 0
ru_idrss 0
ru_isrss 0
ru_minflt 0
ru_majflt 0
ru_nswap 0
ru_inblock 0
ru_oublock 0
ru_msgsnd 0
ru_msgrcv 0
ru_nsignals 0
ru_nvcsw 0
ru_nivcsw 0
cpu 0.000
mem 0.000
io 0.000
iow 0.000
maxvmem 0.000
arid undefined
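The two listings above suggest a simple way to work out which error code to add to the Oozie site XML for a given job. The Python sketch below is inferred from those two examples only and is not taken from the oozie-sge plugin: it assumes that a non-zero failed field maps to SGEF plus that number, and that otherwise a non-zero exit_status maps to SGE plus that number. The function names are hypothetical.

```python
def parse_qacct(text):
    """Turn 'field value' lines from qacct -j output into a dict."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Skip separator lines and lines without a value.
        if len(parts) == 2 and not parts[0].startswith("="):
            fields[parts[0]] = parts[1].strip()
    return fields


def oozie_error_code(fields):
    """Derive the error code under the assumptions stated above.

    Returns None when neither field reports a problem.
    """
    # 'failed' may carry a description, e.g. "26 : opening input/output file".
    failed = int(fields["failed"].split()[0])
    exit_status = int(fields["exit_status"].split()[0])
    if failed != 0:
        return "SGEF%d" % failed
    if exit_status != 0:
        return "SGE%d" % exit_status
    return None


# Shortened excerpts of the two listings in the text.
STEP_FAILED = """jobnumber 3702
failed 0
exit_status 1
"""

QSUB_FAILED = """jobnumber 3801
failed 26 : opening input/output file
exit_status 0
"""

if __name__ == "__main__":
    print(oozie_error_code(parse_qacct(STEP_FAILED)))
    print(oozie_error_code(parse_qacct(QSUB_FAILED)))
```

In practice the text passed to parse_qacct would be the captured output of qacct -j for the job number in question.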