Scheduling Data Science Job Runs

In this tutorial, you use Data Integration to schedule job runs for your Data Science jobs.

Key tasks include how to:

  • Create a job with a Data Science job artifact.
  • Set up a REST task that creates a job run with the same configuration as the job created from the artifact.
  • Set up a schedule and assign it to the REST task.
  • Have the task scheduler create the Data Science jobs.

Figure: A user on a local machine connects to an Oracle Cloud Infrastructure compartment named data-science-work. The user creates a job artifact, hello_world_job.py, and submits it to a Data Science project named DS Project as the job hello_world_job. In a second workflow, from a Data Integration workspace named hello_world_workspace, a hello_world_REST_task is published to the workspace's Scheduler Application. The Scheduler Application contains hello_world_task_schedule, which pairs hello_world_task with hello_world_schedule and sends hello_world_job instances to DS Project, where the scheduled job runs appear as HELLO WORLD JOB RUN.
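The job artifact shown in the diagram is an ordinary Python script. A minimal sketch of what hello_world_job.py might contain (the greeting text and function names are illustrative assumptions, not prescribed by the tutorial):

```python
# hello_world_job.py - minimal Data Science job artifact (illustrative sketch)

def greet() -> str:
    """Build the message that the job run writes to its logs."""
    return "Hello world!"

def main() -> None:
    # A Data Science job executes this script; anything printed to
    # stdout appears in the job run's logging output.
    print(greet())

if __name__ == "__main__":
    main()
```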

1. Prepare

Create and set up dynamic groups, policies, a compartment, and a Data Science project for your tutorial.
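The exact policy statements depend on your tenancy. As a sketch, assuming a dynamic group named di-workspaces-dg for the Data Integration workspace, a user group named data-science-users, and the data-science-work compartment from the diagram, the statements might look like:

```
allow dynamic-group di-workspaces-dg to manage data-science-family in compartment data-science-work
allow group data-science-users to manage data-science-family in compartment data-science-work
```

The dynamic-group statement is what lets the workspace's REST task call the Data Science service on your behalf; adjust the group names and verbs to match your own security model.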

2. Set Up a Job Run

Create a job with the hello_world_job.py artifact, then start a job run to confirm that the job works.

3. Set Up the Task

Create a REST task that calls the Data Science service to start job runs, and publish it to the Scheduler Application. For a visual relationship of the components, refer to the scheduler diagram.
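As a hedged sketch of what the REST task sends, assuming it calls the Data Science CreateJobRun operation: the request body might resemble the following, where the OCIDs are placeholders you replace with values from your tenancy.

```python
import json

# Placeholder OCIDs - substitute the values from your own tenancy.
COMPARTMENT_ID = "ocid1.compartment.oc1..exampleuniqueID"
PROJECT_ID = "ocid1.datascienceproject.oc1..exampleuniqueID"
JOB_ID = "ocid1.datasciencejob.oc1..exampleuniqueID"

def build_job_run_payload() -> dict:
    """Assemble a CreateJobRun-style JSON body (sketch, not the full schema)."""
    return {
        "compartmentId": COMPARTMENT_ID,
        "projectId": PROJECT_ID,
        "jobId": JOB_ID,
        "displayName": "HELLO WORLD JOB RUN",
    }

# The REST task would POST this body to the Data Science jobRuns endpoint
# for your region (endpoint shape is an assumption here).
print(json.dumps(build_job_run_payload(), indent=2))
```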

4. Schedule and Run the Task

Create a schedule to run the published hello_world_REST_task.
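To illustrate what the schedule contributes (this is a generic fixed-interval illustration, not the Data Integration scheduling API), a sketch of the run times an hourly schedule would produce from a chosen start time:

```python
from datetime import datetime, timedelta

def next_run_times(start: datetime, every: timedelta, count: int) -> list[datetime]:
    """Return the first `count` run times of a fixed-interval schedule."""
    return [start + i * every for i in range(count)]

# An hourly schedule starting at 09:00 yields 09:00, 10:00, 11:00, ...
runs = next_run_times(datetime(2024, 1, 1, 9, 0), timedelta(hours=1), 3)
for t in runs:
    print(t.isoformat())
```

Each of these run times corresponds to one job run that the Scheduler Application creates in the DS Project.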