
Designing Your Overall Workflow

Designing an overall workflow refers to how you go about getting your work done. It involves a number of considerations, including how you divide your tasks into jobs, how big or long those jobs are, and how predictable and consistent those jobs are in terms of running time.

This page introduces the concepts behind these considerations and makes recommendations for how to approach them.

Concepts to consider

Number of tasks

The first thing to consider is the total number of tasks that you need to calculate. By task, we mean the smallest chunk of work that needs to be done. A task might be a single case, or a single set of input data.

For example, let's say you need to run some calculation based on x and y, where x and y each range from 1 to 2^16. You could define a task as the calculation for a specific combination of x and y. In this case, since you're talking about all the combinations of x and y, you have 2^16 * 2^16 = 2^32 total tasks.
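To make this concrete, here is a minimal sketch in Python of enumerating those tasks lazily (the ranges match the example above; nothing here is specific to any particular software):

```python
# Enumerate all (x, y) task combinations lazily, so the full
# 2^32-element set is never held in memory at once.
import itertools

N = 2**16
tasks = itertools.product(range(1, N + 1), range(1, N + 1))
total_tasks = N * N  # 2^16 * 2^16 = 2^32 = 4,294,967,296
```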

This doesn't have anything to do with how many jobs you submit. You could, theoretically, submit a single job for each task, but that's not always the best idea. In the example above, that would mean 2^32 or 4,294,967,296 total jobs. Please don't do that. That's way too many jobs. In this case you're going to want to aggregate tasks into jobs. For example, if each job could handle about 2^25 tasks, then you're only talking about 2^7 or 128 jobs total. And, depending on how long it takes to process 2^25 tasks, and how many other resources that uses, this might be feasible.
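As a sketch of what that aggregation might look like, the following Python fragment assumes a Slurm job array, where SLURM_ARRAY_TASK_ID is Slurm's standard array-index variable; process_task() is a hypothetical stand-in for the real calculation:

```python
# Each array element (job) processes a contiguous chunk of task indices.
# Submitted as, e.g., a 128-element job array: one job per 2^25 tasks.
import os

TASKS_PER_JOB = 2**25  # 2^32 tasks / 2^25 per job = 2^7 = 128 jobs

def process_task(index):
    ...  # hypothetical placeholder for the real calculation

job_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", 0))  # 0..127
start = job_id * TASKS_PER_JOB
for index in range(start, start + TASKS_PER_JOB):
    process_task(index)
```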

Length of each task

As you evaluate, be sure to get some idea of how long each task will take to calculate, and how many other resources it uses.

Let's extend the example above, where there are 2^32 total tasks. Let's assume that the software is only capable of using 1 processor for each task, and that each task takes 0.5 seconds on 1 processor. That means the total processing time is approximately 2^31 or 2,147,483,648 processor-seconds. If you only had 1 processor available to you, it would take 2,147,483,648 seconds, or about 68 years. If you had 1024 processors, it would be more like 24 days. And so on.
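For reference, here is that arithmetic spelled out in Python, using the figures from the example:

```python
# Total work and wall-clock estimates for the example above.
tasks = 2**32
seconds_per_task = 0.5

processor_seconds = tasks * seconds_per_task           # 2^31 = 2,147,483,648
years_on_one = processor_seconds / (3600 * 24 * 365)   # ~68 years
days_on_1024 = processor_seconds / 1024 / (3600 * 24)  # ~24 days
```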

However, this gives you an idea of where you should spend time optimizing. For example, if you can reduce the running time of each task by a factor of 4 (from 0.5 seconds to 0.125 seconds), then you only have about 2^29 or 536,870,912 processor-seconds, which comes to about 17 years on 1 processor, or about 6 days on 1024 processors.

Predictability of task length

Many times, users have a large number of tasks, but there is either no way to know how long each one will run before running it, or, even if you do know, there is a large amount of variation in the task run times. In this case, it's difficult to divide the tasks among jobs statically, since you don't know what's going to happen.

For example, let's say that you have 2^32 or 4,294,967,296 total tasks, and you know that on average only 1 in 16 (6.25%) will take longer than some amount of time to run, say 10 seconds. But you don't know which ones those are. In that case, we recommend starting out with a filtration approach. Basically, build a mechanism that runs each calculation with a timer: if the task completes before the timer expires, it's done; if the timer is reached, that task is aborted. Then submit a single job that does this for all the cases. When it finishes, you've completed more than 93% of the cases, leaving the remainder for further processing.
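A minimal sketch of this filtration approach in Python might look like the following; run_task() is a hypothetical stand-in for your real calculation, and each task runs in its own subprocess so it can be cleanly aborted when the timer expires:

```python
import multiprocessing

def run_task(task_id):
    ...  # hypothetical placeholder for the real calculation

def filter_tasks(task_ids, timeout=10):
    """Run each task with a timer; return (completed, remaining)."""
    completed, remaining = [], []
    for task_id in task_ids:
        proc = multiprocessing.Process(target=run_task, args=(task_id,))
        proc.start()
        proc.join(timeout)
        if proc.is_alive():      # timer reached: abort this task
            proc.terminate()
            proc.join()
            remaining.append(task_id)
        else:                    # finished within the threshold
            completed.append(task_id)
    return completed, remaining
```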

You can repeat this process as many times as you want, with higher thresholds. When you feel comfortable with the number of tasks you have left, go ahead and submit them one or more per job, across a larger number of jobs.
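Continuing that sketch, repeated passes with rising thresholds might look like this (the threshold values are examples only, and filter_tasks() is the function sketched above):

```python
remaining = list(range(1000))    # demo: a small subset of task indices
for threshold in (10, 60, 600):  # seconds; tune these to your workload
    done, remaining = filter_tasks(remaining, timeout=threshold)
    # ...record the results of the `done` tasks here...
```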

How many resources your software can utilize

It's also important to understand the limitations of the software you're using to process your data. If your software can only utilize one processor, then it doesn't make any sense to request more than one processor per job. If you do, then the extra processors will sit there idle, and you won't actually get any benefit.

However, there is nothing wrong with having more than one job running at a time, each using one processor. If you have a large number of tasks that don't depend on one another, this is a perfectly viable option.

For more information, be sure to read this page.

Efficiency of each job on the specified resources

If your software can take advantage of multiple processors, it's worth considering what happens to the running time as you increase the number of processors. Most software scales sublinearly, which means that the total running time decreases more slowly than the processor count increases. For example, if something takes 30 minutes on 1 processor and 20 minutes on 2 processors, that's a sublinear speedup: each case runs faster on 2 processors (a 1.5x speedup), but running on one processor is more efficient (30 processor-minutes of work instead of 40).
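To spell that out with the numbers from the example:

```python
# Speedup and efficiency for 30 minutes on 1 processor
# versus 20 minutes on 2 processors.
t1, t2, procs = 30.0, 20.0, 2

speedup = t1 / t2             # 1.5x: faster, but less than 2x
efficiency = speedup / procs  # 0.75: each processor is 75% utilized
work_1 = t1 * 1               # 30 processor-minutes
work_2 = t2 * procs           # 40 processor-minutes: less efficient overall
```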

For more information, be sure to read this page.