Delayed Workers and Reruns

CI machines can start late because of machine unavailability or infrastructure issues. A delayed worker has a considerably shorter effective time limit than the other workers, so there is a risk that important tests will be missed.

Rerunning a single worker can have a similar effect.

Worker Identifier

To reduce the risk of missed tests when a worker is delayed, pass the optional worker_id parameter.

To implement this solution, use the following command:

redefine install --pytest --worker --worker-id=$WORKER_ID

Ensure that the WORKER_ID environment variable contains the worker identifier specific to your CI system.
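How WORKER_ID gets its value depends on the CI system. The sketch below is one possible way to derive it, assuming your provider exposes a per-worker index variable. The function name resolve_worker_id is hypothetical, and the variable names (CIRCLE_NODE_INDEX, CI_NODE_INDEX, BUILDKITE_PARALLEL_JOB) are examples only; check your CI provider's documentation for the variable that applies to your setup.

```shell
# Hypothetical helper: print a worker identifier taken from the first
# CI-provided index variable that is set. The variable names are examples
# and are an assumption about your CI system, not a requirement of redefine.
resolve_worker_id() {
  if [ -n "${CIRCLE_NODE_INDEX:-}" ]; then
    echo "$CIRCLE_NODE_INDEX"
  elif [ -n "${CI_NODE_INDEX:-}" ]; then
    echo "$CI_NODE_INDEX"
  elif [ -n "${BUILDKITE_PARALLEL_JOB:-}" ]; then
    echo "$BUILDKITE_PARALLEL_JOB"
  else
    # No known variable is set: report the problem and fail.
    echo "no worker index variable found" >&2
    return 1
  fi
}
```

With this helper, a CI step could run WORKER_ID="$(resolve_worker_id)" before the install command shown above.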

Handling Worker Reruns

Note that if a single worker is executed repeatedly without re-running the orchestrator, the effective time limit of each subsequent execution may be reduced to zero. To prevent this, include the rerun attempt in the worker_id.

Here's an example:

redefine install --pytest --worker --worker-id="${WORKER_ID}_${ATTEMPT}"

Set the ATTEMPT environment variable to the rerun attempt counter. Each attempt then gets a unique identifier, which avoids a zero effective time limit on repeated executions.
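The composition of the identifier can be sketched as a small helper. This is a minimal illustration, not part of the redefine CLI: the function name build_worker_id and the default attempt value of 1 are assumptions made for the example.

```shell
# Hypothetical helper: join the worker identifier and the rerun attempt
# into a single value of the form "<worker>_<attempt>".
# The attempt defaults to 1 when it is not provided (an assumption).
build_worker_id() {
  worker="$1"
  attempt="${2:-1}"
  echo "${worker}_${attempt}"
}
```

A CI step could then pass --worker-id="$(build_worker_id "$WORKER_ID" "$ATTEMPT")" to the install command. Some CI systems expose a rerun counter that can serve as ATTEMPT (GitHub Actions, for example, sets GITHUB_RUN_ATTEMPT); whether such a variable exists depends on your provider.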

By implementing these strategies, you can mitigate delays, manage worker execution, and minimize the risk of missing important tests in your CI workflow.
