
Tapis Jobs

Tapis Jobs service

The Tapis Jobs service launches applications either directly on hosts or as jobs submitted to a scheduler (currently only Slurm). The Tapis v3 Jobs service specializes in running containerized applications on any host that supports a container runtime; Docker and Singularity containers are currently supported. To process jobs, the Jobs service relies on the Systems, Apps, Files, and Security Kernel services.

Life cycle of Jobs

When a job request is received as the payload of a POST call, the following steps are taken:

  • Request authorization - The tenant, owner, and user values from the request and Tapis JWT are used to authorize access to the application, execution system and, if specified, archive system.
  • Request validation - Request values are checked for missing, conflicting or improper values; all paths are assigned; required paths are created on the execution system; and macro substitution is performed to finalize all job parameters.
  • Job creation - A Tapis job object is written to the database.
  • Job queuing - The Tapis job is queued on an internal queue serviced by one or more Job Worker processes.
  • Response - The initial Job object is sent back to the caller in the response. This ends the synchronous portion of job submission.

After these synchronous steps, job processing proceeds asynchronously. Each job is assigned a worker thread, and the job proceeds until it completes successfully, fails, or becomes blocked.

Job Status

PENDING - Job processing beginning
PROCESSING_INPUTS - Identifying input files for staging
STAGING_INPUTS - Transferring job input data to execution system
STAGING_JOB - Staging runtime assets to execution system
SUBMITTING_JOB - Submitting job to execution system
QUEUED - Job queued to execution system queue
RUNNING - Job running on execution system
ARCHIVING - Transferring job output to archive system
BLOCKED - Job blocked
PAUSED - Job processing suspended
FINISHED - Job completed successfully
CANCELLED - Job execution intentionally stopped
FAILED - Job failed

Simple job submission example:

job_response_vm = client.jobs.submitJob(name='mpm-job-vm', description='material point method', appId=app_id, execSystemId=system_id_vm, appVersion='dev')

  • appId - The Tapis application to execute. This must be a valid application that the user has permission to run.
  • name - The user selected name for the job.
  • appVersion - The version of the application to execute.
  • execSystemId - The Tapis execution system ID. It can be inherited from the app.
  • parameterSet - Runtime parameters, organized by category.

The appId, name, and appVersion parameters are required.

For the full list of job submission parameters, see Job Submission Parameters.
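As a minimal sketch of the required-parameter rule, the helper below checks a request before it is handed to submitJob. The helper name build_job_request and the constant REQUIRED_FIELDS are hypothetical and not part of tapipy; only the parameter names come from the list above.

```python
# Hypothetical helper, not part of tapipy: validates a job request locally.
REQUIRED_FIELDS = ("name", "appId", "appVersion")


def build_job_request(**fields):
    """Return a dict of submitJob keyword arguments, or raise if a required one is missing."""
    missing = [key for key in REQUIRED_FIELDS if not fields.get(key)]
    if missing:
        raise ValueError("missing required job fields: " + ", ".join(missing))
    # Drop optional fields left as None so that values from the app definition
    # (for example execSystemId) can be inherited.
    return {key: value for key, value in fields.items() if value is not None}
```

The result could then be passed along as client.jobs.submitJob(**build_job_request(name='mpm-job-vm', appId=app_id, appVersion='dev')).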

Exercise: Running mpm app on VM

Application Arguments

With the appArgs parameter, you can specify one or more command-line arguments for the user application.
Arguments specified in the application definition are appended to those in the submission request. Metadata can be attached to any argument.

The MPM app needs two arguments:

  • directoryInputFlag
  • directoryInput
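The two arguments above could be supplied through parameterSet roughly as sketched here. This assumes each appArgs entry is an object with a name and an arg field; the helper build_app_args is hypothetical, and the values shown are placeholders, since the real flag and input directory depend on how the MPM app is defined.

```python
def build_app_args(directory_input_flag, directory_input):
    """Return a parameterSet dict carrying the two MPM arguments as appArgs."""
    return {
        "appArgs": [
            # Assumed entry layout: "name" identifies the argument, "arg" is its value.
            {"name": "directoryInputFlag", "arg": directory_input_flag},
            {"name": "directoryInput", "arg": directory_input},
        ]
    }


# Placeholder values only; substitute the flag and directory your app expects.
parameter_set = build_app_args("<flag>", "<input-directory>")
```

The resulting dictionary would be passed as parameterSet=parameter_set alongside the other submitJob parameters.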

Submit a job on VM Host

job_response_vm = client.jobs.submitJob(name='mpm-job-vm', description='mpm-job', appId=app_id, execSystemId='tapisv3-exec-<userid>', appVersion='0.0.1')
print(job_response_vm.uuid)

Every time a job is submitted, a unique job ID (UUID) is generated. We will use this job ID with tapipy to get the job status and download the job output.

# Get job uuid from the job submission response
print("****************************************************")
job_uuid_vm=job_response_vm.uuid
print("Job UUID: " + job_uuid_vm)
print("****************************************************")

Jobs List

When you list your jobs now, you can see your jobUuid.

client.jobs.getJobList()

Jobs Status

The job status shows the current state of the job.

# Check the status of the job
print("****************************************************")
print(client.jobs.getJobStatus(jobUuid=job_uuid_vm))
print("****************************************************")

A job passes through different states during execution. For details about each state, see JOB STATES.
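Because a job moves through several states before it stops, it is often convenient to poll until it reaches FINISHED, CANCELLED, or FAILED. The sketch below is one way to do that; wait_for_job and its parameters are hypothetical names, and the status lookup is injected as a function so the loop does not depend on any particular client.

```python
import time

# States after which a job no longer changes (from the Job Status list).
TERMINAL_STATUSES = {"FINISHED", "CANCELLED", "FAILED"}


def wait_for_job(fetch_status, poll_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Call fetch_status() repeatedly until it returns a terminal status.

    Raises TimeoutError if the job is still active after timeout_seconds
    of accumulated waiting.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status} after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With tapipy, fetch_status could be lambda: client.jobs.getJobStatus(jobUuid=job_uuid_vm).status, assuming the response exposes the state as a status attribute.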

Jobs Output

To download the output of a job, provide its jobUuid and an output path. You can download a directory under the job's outputPath in zip format. The outputPath is relative to the specified archive system.

# Download output of the job
print("Job Output file:")

print("****************************************************")
jobs_output_vm = client.jobs.getJobOutputDownload(jobUuid=job_uuid_vm, outputPath='stdout')
print(jobs_output_vm)
print("****************************************************")
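Printing works for a short text file, but a zipped directory or a large output is better written to disk. The following is a minimal sketch, assuming the download call returns either bytes or text; save_job_output is a hypothetical helper, not a tapipy function.

```python
from pathlib import Path


def save_job_output(data, destination):
    """Write downloaded job output to destination and return the number of bytes written."""
    # Assumption: the download is bytes; anything else is converted to UTF-8 text.
    payload = data if isinstance(data, bytes) else str(data).encode("utf-8")
    path = Path(destination)
    path.parent.mkdir(parents=True, exist_ok=True)  # create missing folders
    path.write_bytes(payload)
    return len(payload)
```

For example, save_job_output(jobs_output_vm, 'mpm-job-vm/stdout.txt') would store the result of the download call above in a local file of your choosing.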

What’s next?

If you made it this far, you have successfully created a new app within a container, deployed that tool on an HPC-like system, and can now run it through the cloud from anywhere! That is quite a lot for one workshop.

At this point, it is a good idea to connect with other developers who are publishing apps and running workflows through Tapis by joining the Tapis API Slack channel: tacc-cloud.slack.com
