Skip to content
Snippets Groups Projects

Thread safety

Merged Markus Holzer requested to merge holzer/pystencils:threadSafety into master
1 unresolved thread

This MR introduces a simple check for thread safety in the kernel constrains check. If the assignments are not thread safe, optimizations like OpenMP or GPU should fail.

Merge request reports

Pipeline #67072 passed

Pipeline passed for 1570d896 on holzer:threadSafety

Test coverage 87.24% (-0.02%) from 1 job
Approved by

Merged by Frederik HennigFrederik Hennig 10 months ago (Jul 5, 2024 12:24pm UTC)

Merge details

  • Changes merged into master with 5e4c4ad7 (commits were squashed).
  • Deleted the source branch.

Pipeline #67328 failed

Pipeline failed for 5e4c4ad7 on master

Test coverage 87.25% (-0.02%) from 1 job

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Markus Holzer requested review from @kuron

    requested review from @kuron

  • assigned to @holzer

  • How does this tie in with the existing check_independence_condition/check_double_write_condition? It seems like they also check for things that make loops thread-unsafe.

    Edited by Michael Kuron
  • added 1 commit

    • 86a6112e - Fuse thread safety check with existing check

    Compare with previous version

  • This is honestly a very good question. I just noticed that I am allowed in pystencils to generate a basic kernel with a 5 point stencil update without writing to a 'dst' field. This results in a race condition; thus, the code can not be parallelised. Therefore, I implemented this check quickly. Thinking about it again (and especially with the previous checks in mind), I think that check_independence_condition should exactly check that. Therefore, I fused my implementation with the independence checks.

  • Thanks, this looks better now. Could you please add a test case that checks for the new exception to be raised when it‘s supposed to be raised? Also, right now a bunch of tests are failing -- either your check is too strict or it is finding an actual problem in existing tests.

  • added 1 commit

    • 212380d6 - Remove indepence check for some kernels

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • I checked all the failing tests, and I would say they were actually problematic. They were all Jacobi kernels with no src, dst split. For all of them, I added a dst field and then performed a swap after every timestep. Furthermore, I noticed that indexed kernels fail with the check now. It makes sense because, for indexed kernels, you never know where you actually read or write because the information is created at runtime. Thus I deactivated the check in the boundary handling.

  • Michael Kuron
    Michael Kuron @kuron started a thread on an outdated change in commit 212380d6
232 232
233 233 config = ps.CreateKernelConfig(data_type=dtype,
234 234 default_number_float=dtype,
235 cpu_vectorize_info=opt)
235 cpu_vectorize_info=opt,
236 skip_independence_check=True)
  • Why does the new check fire for this assignment?

  • The reads and writes of the source vector are not independent. If the thread updating, for example, src[1, 0], is faster or slower , src[0,0] is or isn't updated yet. Thus OpenMP or GPU simulations running this kernel will not be deterministic. This is what the independence check is looking for.

    My proposed enhancement checks if reads and writes only occur at the exact same position. Only in these cases, thread safety is guaranteed if I am not overlooking something.

    Thus only assignments like src[0,0] <-- 2*src[0,0] or dst[0,0] <-- src[1,0] are possible. In the later case using src on the lhs would fire the check

    Edited by Markus Holzer
  • Oh, I didn't realize that the lhs of the assignment used the same field. Since this is just a regression test for #62 (closed), I guess it doesn't matter whether it's correct/threadsafe. Did #62 (closed) only occur with this thread-unsafe assignment or can we make the regression test threadsafe by introducing a dst field?

  • Sorry for the delay. #62 (closed) appeared due to the squared variables in the denominator. Usually, we simplify squares beforehand but this did not work, because the squared variable appeared in a Unevaluated node, because the rational got simplified beforehand.

    This means we can make this test thread safe

  • Markus Holzer changed this line in version 5 of the diff

    changed this line in version 5 of the diff

  • Please register or sign in to reply
  • Michael Kuron approved this merge request

    approved this merge request

  • added 1 commit

    Compare with previous version

  • mentioned in commit 5e4c4ad7

  • Please register or sign in to reply
    Loading