small-steps integration of multithreading

there's the "one big step" multithreading branch and it is a pain to keep it updated with changes from master.

while thinking about the issues there (ordering, race conditions, crypto), the idea of "sequential threading" came to mind: stages connected by queue.Queue. It intentionally does not use parallelism within the same phase of processing, so there is only 1 thread per stage:

```
finder -q- reader -q- id-hasher -q- compressor -q- encryptor -q- writer
```

finder: just discovers pathnames to back up (obeying includes, excludes, --one-file-system, etc.)

reader: reads and chunks a file

id-hasher: computes the id-hash of a chunk so we can check whether we already have it

compressor: compresses a chunk

encryptor: encrypts a chunk

writer: writes stuff to the repo

A side effect of this staged approach with one worker per stage is that the code gets untangled: the stages are clearly separated and communicate via well-defined data structures passed over the queues.
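To make the idea more concrete, here is a minimal sketch of two of the stages connected by `queue.Queue`, one thread per stage. The `Chunk` type, the `start_stage` and `run_pipeline` helpers, and the use of `sha256` as the id-hash are assumptions made for illustration only, not the actual design. Error handling is left out.

```python
import hashlib
import queue
import threading
import zlib
from typing import NamedTuple, Optional

DONE = None  # sentinel: no more items will follow


class Chunk(NamedTuple):
    """Hypothetical item type passed between stages."""
    path: str
    data: bytes
    chunk_id: Optional[str] = None


def start_stage(func, inq, outq):
    """Start the single worker thread of one stage."""
    def worker():
        while True:
            item = inq.get()
            if item is DONE:
                outq.put(DONE)  # forward the sentinel so later stages stop too
                return
            outq.put(func(item))
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def hasher(chunk):
    # sha256 is only a stand-in for the real id-hash
    return chunk._replace(chunk_id=hashlib.sha256(chunk.data).hexdigest())


def compressor(chunk):
    return chunk._replace(data=zlib.compress(chunk.data))


def run_pipeline(chunks):
    """Feed chunks through hasher -q- compressor and collect the results."""
    # bounded queues give backpressure between stages
    queues = [queue.Queue(maxsize=4) for _ in range(3)]
    start_stage(hasher, queues[0], queues[1])
    start_stage(compressor, queues[1], queues[2])

    def feed():  # plays the role of the reader stage
        for chunk in chunks:
            queues[0].put(chunk)
        queues[0].put(DONE)
    threading.Thread(target=feed, daemon=True).start()

    results = []  # the calling thread plays the role of the writer stage
    while True:
        item = queues[2].get()
        if item is DONE:
            return results
        results.append(item)
```

Because each stage has exactly one thread and the queues are FIFO, items leave the pipeline in the order they entered it, which sidesteps the ordering issue mentioned above.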

The full-blown implementation does not need to be done in one go; we can start with fewer stages, e.g.:

```
finder/reader -q- hasher/compressor/encryptor -q- writer
```
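The reduced three-stage layout could be sketched roughly like this. Everything here is an assumption for illustration: the `backup` function, the dict used as the "repo", `sha256` as the id-hash, and the `encrypt` placeholder, which does no real encryption.

```python
import hashlib
import queue
import threading
import zlib

DONE = None  # sentinel: end of stream


def encrypt(data):
    # placeholder only: returns the data unchanged, no real encryption
    return data


def backup(chunks):
    """finder/reader -q- hasher/compressor/encryptor -q- writer."""
    q1 = queue.Queue(maxsize=4)
    q2 = queue.Queue(maxsize=4)
    repo = {}  # hypothetical stand-in for the repository

    def reader():
        for data in chunks:
            q1.put(data)
        q1.put(DONE)

    def processor():
        seen = set()  # a real implementation would consult the repo index
        while (data := q1.get()) is not DONE:
            chunk_id = hashlib.sha256(data).hexdigest()  # stand-in id-hash
            if chunk_id in seen:
                continue  # duplicate chunk: skip compression and encryption
            seen.add(chunk_id)
            q2.put((chunk_id, encrypt(zlib.compress(data))))
        q2.put(DONE)

    def writer():
        while (item := q2.get()) is not DONE:
            chunk_id, blob = item
            repo[chunk_id] = blob

    threads = [threading.Thread(target=f) for f in (reader, processor, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return repo
```

The middle stage keeps hashing, compressing and encrypting as separate steps inside one thread, so splitting it into separate stages later mostly means moving a step behind its own queue.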

this can solve: CPU sitting more or less idle while waiting for I/O to complete (read/seek time, write/sync time), and I/O sitting idle while waiting for CPU-bound work to complete.
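A toy comparison of the two situations, with `time.sleep` standing in for both the I/O wait and the processing time. All function names and the 20 ms delays are made up for this sketch. Note that sleeping never holds the GIL, so this only models the overlap; real CPU-bound stages overlap in CPython only to the extent that the underlying C code releases the GIL.

```python
import queue
import threading
import time

DONE = None  # sentinel: end of stream


def slow_read(i):
    time.sleep(0.02)  # simulated read/seek wait
    return i


def slow_process(i):
    time.sleep(0.02)  # simulated hashing/compression/encryption time
    return i * 2


def sequential(n):
    """Read and process one item after the other in a single thread."""
    return [slow_process(slow_read(i)) for i in range(n)]


def pipelined(n):
    """Read in one thread while the calling thread processes."""
    q = queue.Queue(maxsize=2)

    def reader():
        for i in range(n):
            q.put(slow_read(i))
        q.put(DONE)
    threading.Thread(target=reader, daemon=True).start()

    results = []
    while (item := q.get()) is not DONE:
        results.append(slow_process(item))
    return results


def timed(func, n):
    """Return (result, elapsed seconds) for func(n)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start
```

Both variants return the same results; on an otherwise idle machine, `timed(pipelined, 10)` should take noticeably less time than `timed(sequential, 10)`, because reading item N+1 overlaps with processing item N.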

this cannot (and should not) solve: very slow compression algorithms that need same-stage parallelism.


small-steps integration of multithreading #929
