input4mips_validation.parallelisation
Support for parallelisation
This always feels so much harder than it should be
run_parallel(func_to_call, iterable_input, input_desc, n_processes, mp_context=multiprocessing.get_context('fork'), *args, **kwargs)
Run a function in parallel
Yet another abstraction for this, because the ones we had weren't doing what we wanted.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `func_to_call` | `Callable[Concatenate[U, P], T]` | Function to call. | required |
| `iterable_input` | `Iterable[U]` | Input with which to call the function. | required |
| `input_desc` | `str` | Description of the input (used to make the progress bars more helpful). | required |
| `n_processes` | `int` | Number of processes to use during the processing. If set to | required |
| `mp_context` | `BaseContext \| None` | Multiprocessing context to use. By default, we use a fork context. The whole multiprocessing context universe is a bit complex, particularly given we also have logging. In short, spawn is slower, but it is safer and is supported by Windows. Yet forking seems to be the only thing that allows our logging to come through without issue (although maybe we're doing something wrong, it's a bit unclear). See the full docs on multiprocessing contexts: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods | `get_context('fork')` |
| `*args` | `args` | Arguments to use for every call of `func_to_call`. | `()` |
| `**kwargs` | `kwargs` | Keyword arguments to use for every call of `func_to_call`. | `{}` |
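The trade-off between fork and spawn described for `mp_context` can be made concrete with the standard library alone. The sketch below is an illustration, not part of this package: `pick_context` is a hypothetical helper that returns the preferred start method when the platform offers it and otherwise falls back to spawn, which is available on every platform.

```python
import multiprocessing
from multiprocessing.context import BaseContext


def pick_context(preferred: str = "fork") -> BaseContext:
    """Hypothetical helper: return a context for `preferred` if available, else spawn."""
    # Start methods supported on this platform (fork is absent on Windows).
    available = multiprocessing.get_all_start_methods()
    method = preferred if preferred in available else "spawn"
    return multiprocessing.get_context(method)


ctx = pick_context()
# Show which start method was actually selected on this machine.
print(ctx.get_start_method())
```

The returned context could then be passed as `mp_context`, assuming the caller is happy with the fallback.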
Returns:
| Type | Description |
|---|---|
| `tuple[T, ...]` | Result of calling `func_to_call`. |
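To illustrate the calling convention documented above, here is a minimal sketch built on `concurrent.futures.ProcessPoolExecutor`. It is not the implementation in this package: `run_parallel_sketch` is a hypothetical stand-in, it omits `input_desc` and the progress bars, and returning results in input order is a choice of the sketch rather than something the documentation states.

```python
from __future__ import annotations

import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from multiprocessing.context import BaseContext
from typing import Any, Callable, Iterable


def run_parallel_sketch(
    func_to_call: Callable[..., Any],
    iterable_input: Iterable[Any],
    n_processes: int,
    mp_context: BaseContext | None = None,
    *args: Any,
    **kwargs: Any,
) -> tuple[Any, ...]:
    """Hypothetical stand-in: call func_to_call(item, *args, **kwargs) per item."""
    with ProcessPoolExecutor(max_workers=n_processes, mp_context=mp_context) as pool:
        # One task per input element; the shared args/kwargs go to every call.
        futures = [
            pool.submit(func_to_call, item, *args, **kwargs)
            for item in iterable_input
        ]
        # Collect in submission order so results line up with the inputs.
        return tuple(future.result() for future in futures)


if __name__ == "__main__":
    # The guard keeps this safe under spawn, where workers re-import the main module.
    fork_ctx = multiprocessing.get_context("fork")  # assumes a platform with fork
    print(run_parallel_sketch(pow, [1, 2, 3], 2, fork_ctx, 2))  # (1, 4, 9)
```

As in the documented signature, `*args` follows `mp_context`, so extra positional arguments can only be supplied when the context is also passed positionally.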
Source code in src/input4mips_validation/parallelisation.py