address darwin system parallel threading issues, without numpydoc fixes#708
address darwin system parallel threading issues, without numpydoc fixes#708s01st wants to merge 2 commits intoAnswerDotAI:mainfrom
Conversation
|
@jph00 Thoughts? |
|
Thanks for this. Can you run I don't actually know what's happening in this PR. Can you tell me about the problem and solution? |
|
@jph00 can do, apologies, that I forgot to do so |
|
@jph00 I think this is an issue with Specifically the cell: test_eq(urljson('https://httpbin.org/get')['headers']['User-Agent'], url_default_headers['User-Agent'])Re-running this multiple times on both main and on this branch I've found non-deterministic behavior. |
|
@jph00 thoughts? Or how else should I address this? |
jph00
left a comment
There was a problem hiding this comment.
I'm not sure about these changes, but certainly interested in discussing them. Would love to hear your more detailed thoughts.
| @wraps(f) | ||
| def _f(*args, **kwargs): | ||
| res = (Thread,Process)[process](target=g, args=args, kwargs=kwargs) | ||
| if process: |
There was a problem hiding this comment.
I don't think we want this, do we? I'd expect process=True to give us a normal Process. Why would we want something different on Mac?
There was a problem hiding this comment.
Disclaimer my understanding of this is not the best, but I know a bit about concurrency // distributed systems.
My understanding is that what Process does depends on default start method of the platform (which I suppose, could change in an update, but not likely).
Anyway, I believe for true linux systems Process forks a copy while macOS (apple silicon, maybe the old intel chips too) spawns a new interpreter.
So supposing threaded(process=True) linux ends up with the forked process and macOS does spawn which carries the implications of a fresh interpreter, reloading the module, and requires picklablity (which is its own headache in its own way)
There is a push to using spawn, but at the moment picklability + nested functions will likely lead to errors (which I've encountered)
So the below get_context('fork').Process is the special-case to make process=True behave the same way on macOS as it does on Linux — not to make it special, but to avoid macOS being the odd one out.
Provided that I understand.
| if threadpool: pool = ThreadPoolExecutor | ||
| else: | ||
| if not method and sys.platform == 'darwin': method='fork' | ||
| if not method and sys.platform == 'darwin': |
There was a problem hiding this comment.
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES is a good workaround in practice, in our experience. How about we check for that first?
There was a problem hiding this comment.
That certainly helps. Do we want to force that as fix? or UserWarning?
| def run_procs(f, f_done, args): | ||
| "Call `f` for each item in `args` in parallel, yielding `f_done`" | ||
| processes = L(args).map(Process, args=arg0, target=f) | ||
| Proc = get_context('fork').Process if sys.platform == 'darwin' else Process |
There was a problem hiding this comment.
fork feels a bit unexpected to me here too.
I re-ran the tests - sorry about that! |
|
@jph00 thoughts? What needs be done? |
Need to resolve the questions I asked above. I'll take a look at my end and see if I can find a clean fix. |
|
I think I put my comments under review. They are marked as pending. Do they not show up? |
No they don't - I think there must be another button you have to hit to move them from pending to sent... |
|
Funnily enough I actually hit this today when running a test on my mac, so just popped in a fix while I was there - I haven't checked, but I suspect it's probably basically the same as you had here. Take a look - if you think it's missing anything, lemme know. |
|
@jph00 where is the fix? |



No description provided.