Merged
20 commits
d74c91c
Implement RowMaximum Pivoting Strategy for Distributed LU Factorization
AkhilAkkapelli Jul 8, 2025
c671828
Add linear algebra functions for matrix inversion 'inv' and triangula…
AkhilAkkapelli Aug 10, 2025
6d5a8af
DArray: Add AutoBlocks support to view
jpsamaroo Dec 9, 2025
b1e27c7
DArray: Overload LinearAlgebra._zeros
jpsamaroo Dec 10, 2025
ec05aca
DArray: Add LAPACK.chkfinite method
jpsamaroo Dec 12, 2025
cd5b999
DArray/matvecmul: Properly repartition B
jpsamaroo Dec 12, 2025
0fb9f28
DArray: Force allowscalar during printing
jpsamaroo Dec 13, 2025
144593a
DArray: Add LinAlg 1.12 dispatches
jpsamaroo Dec 14, 2025
25f7dc7
LU: Don't generate empty views
jpsamaroo Dec 16, 2025
78a1e74
tests: Add linear solve tests
jpsamaroo Feb 4, 2026
5a7e34a
DArray/LU: Fix inaccuracies by avoiding ChunkView
jpsamaroo Feb 10, 2026
eb5c2b7
DArray/solver: Use Cholesky factors instead of L/U
jpsamaroo Feb 10, 2026
18860aa
tests/LU: Don't compare exact factors with BLAS
jpsamaroo Feb 10, 2026
be97b63
DArray: Support arbitrary block sizes for Cholesky/ishermitian/issymm…
jpsamaroo Feb 10, 2026
9447445
DArray/LU: generic_lufact doesn't take allowsingular
jpsamaroo Feb 11, 2026
514bf08
CI: Bump limit to 2 hours
jpsamaroo Feb 11, 2026
c8ff89f
tests/LU: Use higher tolerance for FP32
jpsamaroo Feb 11, 2026
0003cc4
DArray/LU: Fixes for 1.9/1.10
jpsamaroo Feb 22, 2026
c98e867
docs/DArray: Add ldiv and inv to supported linalg ops
jpsamaroo Feb 23, 2026
1b8ee85
docs: Fix non-unique Options references
jpsamaroo Feb 23, 2026
12 changes: 6 additions & 6 deletions .buildkite/pipeline.yml
@@ -21,7 +21,7 @@

steps:
- label: Julia 1.9
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
plugins:
- JuliaCI/julia#v1:
@@ -32,7 +32,7 @@ steps:
codecov: true

- label: Julia 1.10
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
plugins:
- JuliaCI/julia#v1:
@@ -43,7 +43,7 @@ steps:
codecov: true

- label: Julia 1.11
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
plugins:
- JuliaCI/julia#v1:
@@ -54,7 +54,7 @@ steps:
codecov: true

- label: Julia 1
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
plugins:
- JuliaCI/julia#v1:
@@ -65,7 +65,7 @@ steps:
codecov: true

- label: Julia nightly
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
plugins:
- JuliaCI/julia#v1:
@@ -76,7 +76,7 @@ steps:
codecov: true

- label: Julia 1 (macOS)
timeout_in_minutes: 90
timeout_in_minutes: 120
<<: *test
agents:
queue: "juliaecosystem"
4 changes: 3 additions & 1 deletion docs/src/darray.md
@@ -704,7 +704,9 @@ From `LinearAlgebra`:
- `*` (Out-of-place Matrix-(Matrix/Vector) multiply)
- `mul!` (In-place Matrix-Matrix and Matrix-Vector multiply)
- `cholesky`/`cholesky!` (In-place/Out-of-place Cholesky factorization)
- `lu`/`lu!` (In-place/Out-of-place LU factorization (`NoPivot` only))
- `lu`/`lu!` (In-place/Out-of-place LU factorization (`NoPivot` and `RowMaximum`))
- `\`/`ldiv!` (In-place/Out-of-place Linear solving with LU and Cholesky factorizations)
- `inv` (Out-of-place matrix inversion)

From `AbstractFFTs`:
- `fft`/`fft!`
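The newly documented operations can be exercised end-to-end. A hedged sketch (not part of this diff; the block sizes and the SPD construction are illustrative assumptions):

```julia
using Dagger, LinearAlgebra

n = 256
M = rand(n, n)
M = M * M' + n * I                    # symmetric positive definite, so Cholesky applies
A = distribute(M, Blocks(64, 64))     # DMatrix with 64x64 blocks
b = distribute(rand(n), Blocks(64))   # DVector

F = cholesky(A)   # distributed Cholesky factorization
x = F \ b         # linear solve against the Cholesky factors

G = lu(A)         # LU with RowMaximum pivoting (now supported)
y = G \ b         # linear solve against the LU factors

Ainv = inv(A)     # out-of-place matrix inversion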
10 changes: 5 additions & 5 deletions docs/src/task-spawning.md
@@ -29,15 +29,15 @@ it'll be passed as-is to the function `f` (with some exceptions).

!!! note "Task / thread occupancy"
By default, `Dagger` assumes that tasks saturate the thread they are running on and does not try to schedule other tasks on the thread.
This default can be controlled by specifying [`Options`](@ref) (more details can be found under [Task and Scheduler options](@ref)).
This default can be controlled by specifying [`Options`](@ref Dagger.Options) (more details can be found under [Task and Scheduler options](@ref)).
The section [Changing the thread occupancy](@ref) shows a runnable example of how to achieve this.

## Options

The [`Options`](@ref Dagger.Options) struct in the second argument position is
optional; if provided, it is passed to the scheduler to control its
behavior. [`Options`](@ref Dagger.Options) contains option
key-value pairs, which can be any field in [`Options`](@ref)
key-value pairs, which can be any field in [`Options`](@ref Dagger.Options)
(see [Task and Scheduler options](@ref)).

## Simple example
@@ -125,7 +125,7 @@ The [`Options`](@ref Dagger.Options) struct in the second argument position is
optional; if provided, it is passed to the scheduler to control its
behavior. [`Options`](@ref Dagger.Options) contains a `NamedTuple` of option
key-value pairs, which can be any of:
- Any field in [`Options`](@ref) (see [Task and Scheduler options](@ref))
- Any field in [`Options`](@ref Dagger.Options) (see [Task and Scheduler options](@ref))
- `meta::Bool` -- Pass the input [`Chunk`](@ref) objects themselves to `f` and
not the value contained in them.

@@ -228,7 +228,7 @@ Note that, as a legacy API, usage of the lazy API is generally discouraged for m

While Dagger generally "just works", sometimes one needs to exert some more
fine-grained control over how the scheduler allocates work. There are two
parallel mechanisms to achieve this: Task options (from [`Options`](@ref)) and
parallel mechanisms to achieve this: Task options (from [`Options`](@ref Dagger.Options)) and
Scheduler options (from [`Sch.SchedulerOptions`](@ref)). Scheduler
options operate globally across an entire DAG, and Task options operate on a
task-by-task basis.
@@ -258,7 +258,7 @@ delayed(+; single=1)(1, 2)

## Changing the thread occupancy

One of the supported [`Options`](@ref) is the `occupancy` keyword.
One of the supported [`Options`](@ref Dagger.Options) is the `occupancy` keyword.
This keyword can be used to communicate that a task is not expected to fully
saturate a CPU core (e.g. due to being IO-bound).
The basic usage looks like this:
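The `occupancy` keyword described above can be sketched as follows (the value is illustrative; `Dagger.ThreadProc` is the processor type named in the Dagger docs):

```julia
using Dagger

io_bound_work() = (sleep(1); 42)   # stands in for an IO-bound operation

# Declare 25% occupancy so the scheduler may co-locate other tasks
# on the same thread while this one waits:
t = Dagger.@spawn occupancy=Dict(Dagger.ThreadProc => 0.25) io_bound_work()
fetch(t)
```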
5 changes: 3 additions & 2 deletions src/Dagger.jl
@@ -7,10 +7,10 @@ import SparseArrays: sprand, SparseMatrixCSC
import MemPool
import MemPool: DRef, FileRef, poolget, poolset

import Base: collect, reduce
import Base: collect, reduce, view

import LinearAlgebra
import LinearAlgebra: Adjoint, BLAS, Diagonal, Bidiagonal, Tridiagonal, LAPACK, LowerTriangular, PosDefException, Transpose, UpperTriangular, UnitLowerTriangular, UnitUpperTriangular, diagind, ishermitian, issymmetric
import LinearAlgebra: Adjoint, BLAS, Diagonal, Bidiagonal, Tridiagonal, LAPACK, LU, LowerTriangular, PosDefException, Transpose, UpperTriangular, UnitLowerTriangular, UnitUpperTriangular, Cholesky, diagind, ishermitian, issymmetric, I
import Random
import Random: AbstractRNG

@@ -125,6 +125,7 @@ include("array/sort.jl")
include("array/linalg.jl")
include("array/mul.jl")
include("array/cholesky.jl")
include("array/trsm.jl")
include("array/lu.jl")

# GPU
14 changes: 14 additions & 0 deletions src/array/alloc.jl
@@ -184,6 +184,18 @@ function Base.zero(x::DArray{T,N}) where {T,N}
return _to_darray(a)
end

# Weird LinearAlgebra dispatch in `\` needs this
function LinearAlgebra._zeros(::Type{T}, B::DVector, n::Integer) where T
m = max(size(B, 1), n)
sz = (m,)
return zeros(auto_blocks(sz), T, sz)
end
function LinearAlgebra._zeros(::Type{T}, B::DMatrix, n::Integer) where T
m = max(size(B, 1), n)
sz = (m, size(B, 2))
return zeros(auto_blocks(sz), T, sz)
end

function Base.view(A::AbstractArray{T,N}, p::Blocks{N}) where {T,N}
d = ArrayDomain(Base.index_shape(A))
dc = partition(p, d)
@@ -192,3 +192,5 @@ function Base.view(A::AbstractArray{T,N}, p::Blocks{N}) where {T,N}
chunks = [tochunk(view(A, x.indexes...)) for x in dc]
return DArray(T, d, dc, chunks, p)
end
Base.view(A::AbstractArray, ::AutoBlocks) =
view(A, auto_blocks(size(A)))
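The two additions above can be exercised like this (a sketch; `view(A, p)` wraps an in-memory array as a `DArray` without copying, and the new `AutoBlocks` method defers block-size selection to `auto_blocks`):

```julia
using Dagger

A = rand(128, 128)
DA = view(A, Blocks(32, 32))   # explicit 32x32 partitioning
DB = view(A, AutoBlocks())     # partitioning chosen automatically

# The LinearAlgebra._zeros overloads are internal: they ensure that `\`
# allocates its result as a block-partitioned DArray rather than a dense
# Array, and are not meant to be called directly.
```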
68 changes: 38 additions & 30 deletions src/array/cholesky.jl
@@ -15,27 +15,31 @@ function LinearAlgebra._chol!(A::DArray{T,2}, ::Type{UpperTriangular}) where T
rzone = one(real(T))
rmzone = -one(real(T))
uplo = 'U'
Ac = A.chunks
mt, nt = size(Ac)
iscomplex = T <: Complex
trans = iscomplex ? 'C' : 'T'

mb, nb = A.partitioning.blocksize
min_bs = min(mb, nb)
info = [convert(LinearAlgebra.BlasInt, 0)]
try
Dagger.spawn_datadeps() do
for k in range(1, mt)
Dagger.@spawn potrf_checked!(uplo, InOut(Ac[k, k]), Out(info))
for n in range(k+1, nt)
Dagger.@spawn BLAS.trsm!('L', uplo, trans, 'N', zone, In(Ac[k, k]), InOut(Ac[k, n]))
end
for m in range(k+1, mt)
if iscomplex
Dagger.@spawn BLAS.herk!(uplo, 'C', rmzone, In(Ac[k, m]), rzone, InOut(Ac[m, m]))
else
Dagger.@spawn BLAS.syrk!(uplo, 'T', rmzone, In(Ac[k, m]), rzone, InOut(Ac[m, m]))
maybe_copy_buffered(A => Blocks(min_bs, min_bs)) do A
Ac = A.chunks
mt, nt = size(Ac)
Dagger.spawn_datadeps() do
for k in range(1, mt)
Dagger.@spawn potrf_checked!(uplo, InOut(Ac[k, k]), Out(info))
for n in range(k+1, nt)
Dagger.@spawn BLAS.trsm!('L', uplo, trans, 'N', zone, In(Ac[k, k]), InOut(Ac[k, n]))
end
for n in range(m+1, nt)
Dagger.@spawn BLAS.gemm!(trans, 'N', mzone, In(Ac[k, m]), In(Ac[k, n]), zone, InOut(Ac[m, n]))
for m in range(k+1, mt)
if iscomplex
Dagger.@spawn BLAS.herk!(uplo, 'C', rmzone, In(Ac[k, m]), rzone, InOut(Ac[m, m]))
else
Dagger.@spawn BLAS.syrk!(uplo, 'T', rmzone, In(Ac[k, m]), rzone, InOut(Ac[m, m]))
end
for n in range(m+1, nt)
Dagger.@spawn BLAS.gemm!(trans, 'N', mzone, In(Ac[k, m]), In(Ac[k, n]), zone, InOut(Ac[m, n]))
end
end
end
end
@@ -56,27 +60,31 @@ function LinearAlgebra._chol!(A::DArray{T,2}, ::Type{LowerTriangular}) where T
rzone = one(real(T))
rmzone = -one(real(T))
uplo = 'L'
Ac = A.chunks
mt, nt = size(Ac)
iscomplex = T <: Complex
trans = iscomplex ? 'C' : 'T'

mb, nb = A.partitioning.blocksize
min_bs = min(mb, nb)
info = [convert(LinearAlgebra.BlasInt, 0)]
try
Dagger.spawn_datadeps() do
for k in range(1, mt)
Dagger.@spawn potrf_checked!(uplo, InOut(Ac[k, k]), Out(info))
for m in range(k+1, mt)
Dagger.@spawn BLAS.trsm!('R', uplo, trans, 'N', zone, In(Ac[k, k]), InOut(Ac[m, k]))
end
for n in range(k+1, nt)
if iscomplex
Dagger.@spawn BLAS.herk!(uplo, 'N', rmzone, In(Ac[n, k]), rzone, InOut(Ac[n, n]))
else
Dagger.@spawn BLAS.syrk!(uplo, 'N', rmzone, In(Ac[n, k]), rzone, InOut(Ac[n, n]))
maybe_copy_buffered(A => Blocks(min_bs, min_bs)) do A
Ac = A.chunks
mt, nt = size(Ac)
Dagger.spawn_datadeps() do
for k in range(1, mt)
Dagger.@spawn potrf_checked!(uplo, InOut(Ac[k, k]), Out(info))
for m in range(k+1, mt)
Dagger.@spawn BLAS.trsm!('R', uplo, trans, 'N', zone, In(Ac[k, k]), InOut(Ac[m, k]))
end
for m in range(n+1, mt)
Dagger.@spawn BLAS.gemm!('N', trans, mzone, In(Ac[m, k]), In(Ac[n, k]), zone, InOut(Ac[m, n]))
for n in range(k+1, nt)
if iscomplex
Dagger.@spawn BLAS.herk!(uplo, 'N', rmzone, In(Ac[n, k]), rzone, InOut(Ac[n, n]))
else
Dagger.@spawn BLAS.syrk!(uplo, 'N', rmzone, In(Ac[n, k]), rzone, InOut(Ac[n, n]))
end
for m in range(n+1, mt)
Dagger.@spawn BLAS.gemm!('N', trans, mzone, In(Ac[m, k]), In(Ac[n, k]), zone, InOut(Ac[m, n]))
end
end
end
end
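Both variants now wrap the task graph in `maybe_copy_buffered`, repartitioning `A` to square blocks (the smaller of the two blocksize dimensions) so that arbitrary input blockings are supported. The spawned kernels form a standard right-looking blocked Cholesky; a serial analogue for the lower-triangular, real-valued case is sketched below (illustrative only, with a hypothetical helper name, mirroring the k/m/n loop nest above):

```julia
using LinearAlgebra

# Serial sketch of the blocked right-looking Cholesky (lower variant).
function blocked_chol_lower!(A::Matrix{Float64}, bs::Int)
    n = size(A, 1)
    nb = cld(n, bs)
    blk(i) = (1 + (i - 1) * bs):min(i * bs, n)
    for k in 1:nb
        Akk = view(A, blk(k), blk(k))
        LAPACK.potrf!('L', Akk)       # factorize diagonal block (potrf_checked! above)
        for m in k+1:nb               # panel solve: A[m,k] := A[m,k] * L[k,k]^-T
            BLAS.trsm!('R', 'L', 'T', 'N', 1.0, Akk, view(A, blk(m), blk(k)))
        end
        for j in k+1:nb               # trailing-matrix update
            BLAS.syrk!('L', 'N', -1.0, view(A, blk(j), blk(k)), 1.0, view(A, blk(j), blk(j)))
            for m in j+1:nb
                BLAS.gemm!('N', 'T', -1.0, view(A, blk(m), blk(k)),
                           view(A, blk(j), blk(k)), 1.0, view(A, blk(m), blk(j)))
            end
        end
    end
    return A
end
```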
7 changes: 5 additions & 2 deletions src/array/darray.jl
@@ -1,6 +1,6 @@
import Base: ==, fetch

export DArray, DVector, DMatrix, Blocks, AutoBlocks
export DArray, DVector, DMatrix, DVecOrMat, Blocks, AutoBlocks
export distribute


@@ -148,6 +148,7 @@ const WrappedDMatrix{T} = WrappedDArray{T,2}
const WrappedDVector{T} = WrappedDArray{T,1}
const DMatrix{T} = DArray{T,2}
const DVector{T} = DArray{T,1}
const DVecOrMat{T} = Union{DVector{T}, DMatrix{T}}

# mainly for backwards-compatibility
DArray{T, N}(domain, subdomains, chunks, partitioning, concat=cat) where {T,N} =
@@ -252,7 +253,9 @@ function Base.getindex(A::ColorArray{T,N}, idxs::NTuple{N,Int}) where {T,N}
if !haskey(A.seen_values, idxs)
chunk = A.A.chunks[sd_idx]
if chunk isa Chunk || isready(chunk)
value = A.seen_values[idxs] = Some(getindex(A.A, idxs))
value = A.seen_values[idxs] = allowscalar() do
Some(getindex(A.A, idxs))
end
else
# Show a placeholder instead
value = A.seen_values[idxs] = nothing
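Two small quality-of-life changes land in this file: the `DVecOrMat` alias and scalar-indexing-safe printing. A hedged illustration (the `nchunks` helper is hypothetical, not part of the PR):

```julia
using Dagger

# DVecOrMat lets one method signature accept either a DVector or a DMatrix:
nchunks(x::DVecOrMat) = length(x.chunks)

# Printing samples individual elements; wrapping getindex in Dagger's
# internal allowscalar helper permits this even for chunk types (e.g.
# GPU arrays) that normally forbid scalar indexing.
```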