Race condition in CPU JIT configuration
We use pystencils in concurrent build steps (ninja -j N) to generate different kernels in parallel.
Every once in a while, our CI fails with the error:
Traceback (most recent call last):
  File "/builds/hyteg/hog/generate_all_operators.py", line 33, in <module>
    from hog.cse import CseImplementation
  File "/builds/hyteg/hog/hog/cse.py", line 21, in <module>
    import pystencils as ps
  File "/builds/hyteg/hog/env/lib/python3.10/site-packages/pystencils/__init__.py", line 12, in <module>
    from .kernelcreation import create_kernel, create_staggered_kernel
  File "/builds/hyteg/hog/env/lib/python3.10/site-packages/pystencils/kernelcreation.py", line 10, in <module>
    from pystencils.cpu.vectorization import vectorize
  File "/builds/hyteg/hog/env/lib/python3.10/site-packages/pystencils/cpu/__init__.py", line 1, in <module>
    from pystencils.cpu.cpujit import make_python_function
  File "/builds/hyteg/hog/env/lib/python3.10/site-packages/pystencils/cpu/cpujit.py", line 236, in <module>
    _config = read_config()
  File "/builds/hyteg/hog/env/lib/python3.10/site-packages/pystencils/cpu/cpujit.py", line 197, in read_config
    loaded_config = json.load(json_config_file)
  File "/usr/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
My first guess is that this is a race condition between concurrent pystencils invocations. On import, pystencils looks for a config file and, if it does not exist, creates it. There is a chance that another build job (which we run in parallel) finds a partly written config file, which would be consistent with the "Expecting value" error from the JSON decoder.
IMHO pystencils should not write config files to disk automatically, especially considering that we do not even use the JIT. In any case, it should be robust to concurrent invocations.
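To illustrate what "robust to concurrent invocations" could look like, here is a minimal sketch, not the actual pystencils code. It assumes the config is a plain JSON file; the function names `write_config_atomically` and `read_config_or_default` are hypothetical. The writer puts the JSON into a temporary file in the same directory and then renames it into place with `os.replace`, so a concurrent reader sees either no file or a complete one. The reader additionally falls back to a default if the file is missing or cannot be parsed.

```python
import json
import os
import tempfile


def write_config_atomically(path, config):
    """Hypothetical helper: write `config` as JSON without exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file in the target directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            json.dump(config, tmp_file, indent=4)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # Readers see either the old state or the complete new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_config_or_default(path, default):
    """Hypothetical helper: return the parsed config, or `default` if the
    file is missing or does not contain valid JSON."""
    try:
        with open(path) as config_file:
            return json.load(config_file)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

If several jobs write at the same time, the last rename wins, which is harmless as long as they all write the same default config.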
Edited by Daniel Bauer