Skip to content

Prepare Buscar for pip and conda packaging#87

Open
axiomcura wants to merge 25 commits intoWayScience:mainfrom
axiomcura:buscar-pip-packaging
Open

Prepare Buscar for pip and conda packaging#87
axiomcura wants to merge 25 commits intoWayScience:mainfrom
axiomcura:buscar-pip-packaging

Conversation

@axiomcura
Copy link
Copy Markdown
Member

@axiomcura axiomcura commented Mar 22, 2026

This PR prepares Buscar for distribution through standard Python packaging workflows, with updates aimed at making the project ready for both pip/PyPI installation and conda-based installation through the existing recipe/meta.yaml recipe. It also refreshes the project documentation so users and contributors can clearly see package status, supported Python versions, test status, and coverage outputs.

In this PR:

  • Updated packaging metadata in pyproject.toml for pip-installable distribution.
  • Added and refreshed uv.lock for reproducible uv-managed environments.
  • Removed the old Poetry lockfile in favor of uv-based dependency locking.
  • Updated recipe/meta.yaml so the conda recipe matches the package dependencies and supported Python range.
  • Added README badges for DOI, tests, supported Python versions, and coverage artifacts.
  • Documented local development and coverage commands using uv.
  • Updated CI coverage reporting to generate XML and HTML reports.
  • Added CI artifact uploads for coverage reports.
  • Verified that the package builds into a source distribution and wheel.
  • Verified package metadata with twine check

This closes #78 and #88

@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

axiomcura and others added 5 commits March 31, 2026 14:03
* removed cluster refinement module

* removed heterogeneity

* updated metrics

* created buscar module (separated notebook utils and buscar software)

* updated test module

* updated imports to reflect module changs

* ignore vscode cached files

* updated metrics

* Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update buscar/metrics.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/4.cpjump1-analysis/nbconverted/4.run_buscar_rankings_base_on_moa.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update buscar/metrics.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/1.compound-prioritization/nbconverted/4.measure-phenotypic-activity.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-scores.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* applied copilot's changes

* update processing module

* reran buscar on cfret data

* reran nbconvert

* removed pdf

* update cpjump1 analysis module

* updates

* added mitocheck analysis module

* added label shuffling method

* mitocheck buscar ranking analysis

* removed cpjump analysis folder, this should be in another PR

* fixed bugs

* fix bugs and updates

* updated plots

* fixes and updates

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* updated all plots

* updated plots

* name changes on axis update

* update

* Update notebooks/2.cfret-analysis/plots/nbconverted/1.on-and-off-pca-and-umap-plots.r

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/2.cfret-analysis/plots/nbconverted/3.cfret-signature-effect-size-vs-sig-plot.r

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/2.cfret-analysis/plots/nbconverted/1.on-and-off-pca-and-umap-plots.r

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update notebooks/2.cfret-analysis/plots/nbconverted/4.cfret-compound-scores-plot.r

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update r_buscar_env.yaml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* updated plots and comments

* update

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@axiomcura axiomcura marked this pull request as ready for review May 5, 2026 20:05
@axiomcura axiomcura requested review from d33bs May 5, 2026 20:18
@axiomcura axiomcura changed the title Buscar pip packaging Prepare Buscar for pip and conda packaging May 5, 2026
Copy link
Copy Markdown
Member

@d33bs d33bs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! I left a few comments. I'm requesting changes because I'm concerned mostly about the environment management and how you're planning to publish to PyPI. Please don't hesitate to let me know if you have any questions.

Comment thread recipe/meta.yaml Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a follow up PR which focuses on the conda universe, abiding a separation of concerns. There could be, for instance, completely different needs with this.

Comment thread pyproject.toml Outdated

[project]
name = "buscar"
version = "0.1.0"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a dynamic version so you don't need to use static references. This would free you up to use git tags for the versioning, making for a more flexible release process. Reference this file for an example of how to do this with poetry.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added hatch to handle dynamic versioning:
https://github.com/pypa/hatch/tree/master/backend

it has support for uv. Let me know what you think!

Comment thread src/buscar/__init__.py Outdated
Comment thread src/buscar/data_utils.py Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread buscar/preprocess.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file looks important. What happened to it? Consider keeping it in place and focusing on only pip + conda installation for this PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was moved to the notebooks. It likely got cached when I ported the helper functions into the utils. Preprocessing is not part of buscar.

There was a lot of moving parts when decoupling the notebooks and the buscar package. I am sorry.

Comment thread recipe/meta.yaml Outdated
Comment thread tests/conftest.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and other changes don't directly pertain to the pip and conda packaging focus, distracting from the goals. I suggest avoiding this pattern in the future, it complicates review, maintenance, and history for the project.

Comment thread uv.lock
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a poetry lockfile instead of uv?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were remnants of Buscar's initial Poetry setup, but I have since migrated to using uv. All should be supported by uv

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks for the clarification!

Comment thread tests/conftest.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I checked out the code in this PR, installed, and ran tests I found there were failures: ERROR collecting tests/test_checks.py. I specifically used poetry, abiding the current pyproject.toml designations.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still show tests are not passing. I followed the directions in the contributing file. Did I miss a step? If not, are additional changes needed to ensure this is working?

user@machine buscar: uv sync --frozen --group dev
Audited 64 packages in 7ms
user@machine buscar: uv run --frozen pytest
============================================================================================================ test session starts =============================================================================================================
platform darwin -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/buntend/Documents/work/buscar
configfile: pyproject.toml
testpaths: tests
plugins: cov-7.1.0
collected 0 items / 3 errors                                                                                                                                                                                                                 

=================================================================================================================== ERRORS ===================================================================================================================
___________________________________________________________________________________________________ ERROR collecting tests/test_checks.py ____________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_checks.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_checks.py:5: in <module>
    from buscar.checks import check_for_nans
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
___________________________________________________________________________________________________ ERROR collecting tests/test_metrics.py ___________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_metrics.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_metrics.py:4: in <module>
    from buscar.metrics import calculate_buscar_scores, compute_earth_movers_distance
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
_________________________________________________________________________________________________ ERROR collecting tests/test_signatures.py __________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_signatures.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_signatures.py:4: in <module>
    from buscar.signatures import identify_signatures
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
========================================================================================================== short test summary info ===========================================================================================================
ERROR tests/test_checks.py
ERROR tests/test_metrics.py
ERROR tests/test_signatures.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!

axiomcura and others added 2 commits May 6, 2026 09:27
Co-authored-by: Dave Bunten <ekgto445@gmail.com>
Co-authored-by: Dave Bunten <ekgto445@gmail.com>
@axiomcura
Copy link
Copy Markdown
Member Author

Hey @d33bs Thank you for your thorough review. It's ready for a second round. Hopefully I was able to answer and catch all your comments.

Copy link
Copy Markdown
Member

@d33bs d33bs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing some of the comments @axiomcura ! I am requesting changes again due to tests (CI within the context of this PR and local test failures) and a strong recommendation to include trusted publishing as part of the work in this PR (instead of a suggestion to manually upload builds).

Comment thread README.md
Comment on lines +96 to +99
Publish to PyPI after configuring a PyPI token:

```bash
poetry install
uv publish
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly urge removing this manual procedure from any guidance. It has has implicit security risks and generally might convey that the project is less mature or professional than it is. Trusted publishing is the recommended path for production-grade Python software and I highly recommend following this approach within the context of this PR (you're not far away from it). Concretely, this means adding another CI job definition following the guide here https://docs.pypi.org/trusted-publishers/using-a-publisher/ and configuring your PyPI account to receive the release on trigger from GitHub. Alongside this, you might consider adding more than one maintainer or the Way Lab group for access to the entry on PyPI (providing redundancy for access to the PyPI releases and avoiding a potential blocked future state).

Comment thread tests/conftest.py
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still show tests are not passing. I followed the directions in the contributing file. Did I miss a step? If not, are additional changes needed to ensure this is working?

user@machine buscar: uv sync --frozen --group dev
Audited 64 packages in 7ms
user@machine buscar: uv run --frozen pytest
============================================================================================================ test session starts =============================================================================================================
platform darwin -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/buntend/Documents/work/buscar
configfile: pyproject.toml
testpaths: tests
plugins: cov-7.1.0
collected 0 items / 3 errors                                                                                                                                                                                                                 

=================================================================================================================== ERRORS ===================================================================================================================
___________________________________________________________________________________________________ ERROR collecting tests/test_checks.py ____________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_checks.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_checks.py:5: in <module>
    from buscar.checks import check_for_nans
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
___________________________________________________________________________________________________ ERROR collecting tests/test_metrics.py ___________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_metrics.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_metrics.py:4: in <module>
    from buscar.metrics import calculate_buscar_scores, compute_earth_movers_distance
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
_________________________________________________________________________________________________ ERROR collecting tests/test_signatures.py __________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_signatures.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_signatures.py:4: in <module>
    from buscar.signatures import identify_signatures
src/buscar/__init__.py:9: in <module>
    from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E   ModuleNotFoundError: No module named 'buscar._data_utils'
========================================================================================================== short test summary info ===========================================================================================================
ERROR tests/test_checks.py
ERROR tests/test_metrics.py
ERROR tests/test_signatures.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!

Comment thread .github/workflows/ci.yml
on:
push:
branches: [main]
pull_request:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be triggering for this PR. I'd expect tests to show as passing for this PR based on the implied configuration here. Consider adding the main branch here as a specifier to help encourage the execution within the context of this PR.

Comment thread .github/workflows/ci.yml
Comment on lines +21 to +28
- name: Set up uv
uses: astral-sh/setup-uv@v8.1.0

- name: Install dependencies
run: uv sync --group dev --frozen

- name: Run pre-commit (required)
run: uv run --frozen pre-commit run --all-files --show-diff-on-failure
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using the dev environment for this could you leverage pre-commit-action to install pre-commit and run the checks? This might also free up the dependency within the pyproject.toml file.

Comment thread .github/workflows/ci.yml

steps:
- name: Checkout
uses: actions/checkout@v4
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and other actions are out of date (checkout is for example at v6). Consider setting up a dependabot configuration to help keep this updated automatically. See here for an example of what this might look like.

Comment thread .github/workflows/ci.yml
Comment on lines +58 to +70
- name: Upload coverage artifact
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12'
uses: actions/upload-artifact@v4
with:
name: coverage-xml
path: coverage.xml

- name: Upload HTML coverage artifact
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12'
uses: actions/upload-artifact@v4
with:
name: coverage-html
path: htmlcov
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What purpose do these files serve (do you need them)? If you don't need them, consider ditching these steps. Otherwise, consider indicating within the contributing guide what they're to be used for. You could also consider a service like Codecov to help track and communicate code coverage as you move forward on a PR to PR basis.

Comment thread pyproject.toml
[tool.poetry]
[dependency-groups]
dev = [
"pre-commit>=4.2.0",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebumping this comment. Please see the comments in the CI yml configuration as well.

Comment thread uv.lock
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks for the clarification!

Comment thread .github/workflows/ci.yml
shell: bash
run: |
uv pip install --force-reinstall dist/*.whl
uv run --frozen python -c "import buscar; print(buscar.__name__)"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this point in this job you could add additional steps to upload the package to PyPI automatically using trusted publishing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make buscar pip installable

2 participants