Prepare Buscar for pip and conda packaging#87
Prepare Buscar for pip and conda packaging#87axiomcura wants to merge 25 commits intoWayScience:mainfrom
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
* removed cluster refinement module * removed heterogeneity * updated metrics * created buscar module (separated notebook utils and buscar software) * updated test module * updated imports to reflect module changs * ignore vscode cached files * updated metrics * Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update buscar/metrics.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/4.cpjump1-analysis/nbconverted/4.run_buscar_rankings_base_on_moa.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/2.cfret-analysis/nbconverted/1.cfret-pilot-buscar-analysis.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update buscar/metrics.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/1.compound-prioritization/nbconverted/4.measure-phenotypic-activity.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/4.cpjump1-analysis/nbconverted/3.calculate-on-off-scores.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * applied copilot's changes * update processing module * reran buscar on cfret data * reran nbconvert * removed pdf * update cpjump1 analysis module * updates * added mitocheck analysis module * added label shuffling method * mitocheck buscar ranking analysis * removed cpjump analysis folder, this should be in another PR * fixed bugs * fix bugs and updates * updated plots * fixes and updates --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* updated all plots * updated plots * name changes on axis update * update * Update notebooks/2.cfret-analysis/plots/nbconverted/1.on-and-off-pca-and-umap-plots.r Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/2.cfret-analysis/plots/nbconverted/3.cfret-signature-effect-size-vs-sig-plot.r Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/2.cfret-analysis/plots/nbconverted/1.on-and-off-pca-and-umap-plots.r Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update notebooks/2.cfret-analysis/plots/nbconverted/4.cfret-compound-scores-plot.r Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update r_buscar_env.yaml Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * updated plots and comments * update --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
d33bs
left a comment
There was a problem hiding this comment.
Great job! I left a few comments. I'm requesting changes because I'm concerned mostly about the environment management and how you're planning to publish to PyPI. Please don't hesitate to let me know if you have any questions.
There was a problem hiding this comment.
I suggest a follow up PR which focuses on the conda universe, abiding a separation of concerns. There could be, for instance, completely different needs with this.
|
|
||
| [project] | ||
| name = "buscar" | ||
| version = "0.1.0" |
There was a problem hiding this comment.
Consider using a dynamic version so you don't need to use static references. This would free you up to use git tags for the versioning, making for a more flexible release process. Reference this file for an example of how to do this with poetry.
There was a problem hiding this comment.
I added hatch to handle dynamic versioning:
https://github.com/pypa/hatch/tree/master/backend
it has support for uv. Let me know what you think!
There was a problem hiding this comment.
This file looks important. What happened to it? Consider keeping it in place and focusing on only pip + conda installation for this PR.
There was a problem hiding this comment.
This file was moved to the notebooks. It likely got cached when I ported the helper functions into the utils. Preprocessing is not part of buscar.
There was a lot of moving parts when decoupling the notebooks and the buscar package. I am sorry.
There was a problem hiding this comment.
This and other changes don't directly pertain to the pip and conda packaging focus, distracting from the goals. I suggest avoiding this pattern in the future, it complicates review, maintenance, and history for the project.
There was a problem hiding this comment.
Should this be a poetry lockfile instead of uv?
There was a problem hiding this comment.
These were remnants of Buscar's initial Poetry setup, but I have since migrated to using uv. All should be supported by uv
There was a problem hiding this comment.
Sounds good, thanks for the clarification!
There was a problem hiding this comment.
When I checked out the code in this PR, installed, and ran tests I found there were failures: ERROR collecting tests/test_checks.py. I specifically used poetry, abiding the current pyproject.toml designations.
There was a problem hiding this comment.
I still show tests are not passing. I followed the directions in the contributing file. Did I miss a step? If not, are additional changes needed to ensure this is working?
user@machine buscar: uv sync --frozen --group dev
Audited 64 packages in 7ms
user@machine buscar: uv run --frozen pytest
============================================================================================================ test session starts =============================================================================================================
platform darwin -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/buntend/Documents/work/buscar
configfile: pyproject.toml
testpaths: tests
plugins: cov-7.1.0
collected 0 items / 3 errors
=================================================================================================================== ERRORS ===================================================================================================================
___________________________________________________________________________________________________ ERROR collecting tests/test_checks.py ____________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_checks.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_checks.py:5: in <module>
from buscar.checks import check_for_nans
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
___________________________________________________________________________________________________ ERROR collecting tests/test_metrics.py ___________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_metrics.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_metrics.py:4: in <module>
from buscar.metrics import calculate_buscar_scores, compute_earth_movers_distance
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
_________________________________________________________________________________________________ ERROR collecting tests/test_signatures.py __________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_signatures.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_signatures.py:4: in <module>
from buscar.signatures import identify_signatures
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
========================================================================================================== short test summary info ===========================================================================================================
ERROR tests/test_checks.py
ERROR tests/test_metrics.py
ERROR tests/test_signatures.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!Co-authored-by: Dave Bunten <ekgto445@gmail.com>
Co-authored-by: Dave Bunten <ekgto445@gmail.com>
|
Hey @d33bs Thank you for your thorough review. It's ready for a second round. Hopefully I was able to answer and catch all your comments. |
d33bs
left a comment
There was a problem hiding this comment.
Thanks for addressing some of the comments @axiomcura ! I am requesting changes again due to tests (CI within the context of this PR and local test failures) and a strong recommendation to include trusted publishing as part of the work in this PR (instead of a suggestion to manually upload builds).
| Publish to PyPI after configuring a PyPI token: | ||
|
|
||
| ```bash | ||
| poetry install | ||
| uv publish |
There was a problem hiding this comment.
I strongly urge removing this manual procedure from any guidance. It has has implicit security risks and generally might convey that the project is less mature or professional than it is. Trusted publishing is the recommended path for production-grade Python software and I highly recommend following this approach within the context of this PR (you're not far away from it). Concretely, this means adding another CI job definition following the guide here https://docs.pypi.org/trusted-publishers/using-a-publisher/ and configuring your PyPI account to receive the release on trigger from GitHub. Alongside this, you might consider adding more than one maintainer or the Way Lab group for access to the entry on PyPI (providing redundancy for access to the PyPI releases and avoiding a potential blocked future state).
There was a problem hiding this comment.
I still show tests are not passing. I followed the directions in the contributing file. Did I miss a step? If not, are additional changes needed to ensure this is working?
user@machine buscar: uv sync --frozen --group dev
Audited 64 packages in 7ms
user@machine buscar: uv run --frozen pytest
============================================================================================================ test session starts =============================================================================================================
platform darwin -- Python 3.13.5, pytest-9.0.3, pluggy-1.6.0
rootdir: /Users/buntend/Documents/work/buscar
configfile: pyproject.toml
testpaths: tests
plugins: cov-7.1.0
collected 0 items / 3 errors
=================================================================================================================== ERRORS ===================================================================================================================
___________________________________________________________________________________________________ ERROR collecting tests/test_checks.py ____________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_checks.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_checks.py:5: in <module>
from buscar.checks import check_for_nans
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
___________________________________________________________________________________________________ ERROR collecting tests/test_metrics.py ___________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_metrics.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_metrics.py:4: in <module>
from buscar.metrics import calculate_buscar_scores, compute_earth_movers_distance
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
_________________________________________________________________________________________________ ERROR collecting tests/test_signatures.py __________________________________________________________________________________________________
ImportError while importing test module '/Users/buntend/Documents/work/buscar/tests/test_signatures.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/importlib/__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tests/test_signatures.py:4: in <module>
from buscar.signatures import identify_signatures
src/buscar/__init__.py:9: in <module>
from buscar._data_utils import add_cell_id_hash as add_cell_id_hash
E ModuleNotFoundError: No module named 'buscar._data_utils'
========================================================================================================== short test summary info ===========================================================================================================
ERROR tests/test_checks.py
ERROR tests/test_metrics.py
ERROR tests/test_signatures.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!| on: | ||
| push: | ||
| branches: [main] | ||
| pull_request: |
There was a problem hiding this comment.
This doesn't seem to be triggering for this PR. I'd expect tests to show as passing for this PR based on the implied configuration here. Consider adding the main branch here as a specifier to help encourage the execution within the context of this PR.
| - name: Set up uv | ||
| uses: astral-sh/setup-uv@v8.1.0 | ||
|
|
||
| - name: Install dependencies | ||
| run: uv sync --group dev --frozen | ||
|
|
||
| - name: Run pre-commit (required) | ||
| run: uv run --frozen pre-commit run --all-files --show-diff-on-failure |
There was a problem hiding this comment.
Instead of using the dev environment for this could you leverage pre-commit-action to install pre-commit and run the checks? This might also free up the dependency within the pyproject.toml file.
|
|
||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
This and other actions are out of date (checkout is for example at v6). Consider setting up a dependabot configuration to help keep this updated automatically. See here for an example of what this might look like.
| - name: Upload coverage artifact | ||
| if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12' | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: coverage-xml | ||
| path: coverage.xml | ||
|
|
||
| - name: Upload HTML coverage artifact | ||
| if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12' | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: coverage-html | ||
| path: htmlcov |
There was a problem hiding this comment.
What purpose do these files serve (do you need them)? If you don't need them, consider ditching these steps. Otherwise, consider indicating within the contributing guide what they're to be used for. You could also consider a service like Codecov to help track and communicate code coverage as you move forward on a PR to PR basis.
| [tool.poetry] | ||
| [dependency-groups] | ||
| dev = [ | ||
| "pre-commit>=4.2.0", |
There was a problem hiding this comment.
Rebumping this comment. Please see the comments in the CI yml configuration as well.
There was a problem hiding this comment.
Sounds good, thanks for the clarification!
| shell: bash | ||
| run: | | ||
| uv pip install --force-reinstall dist/*.whl | ||
| uv run --frozen python -c "import buscar; print(buscar.__name__)" |
There was a problem hiding this comment.
After this point in this job you could add additional steps to upload the package to PyPI automatically using trusted publishing.
This PR prepares Buscar for distribution through standard Python packaging workflows, with updates aimed at making the project ready for both
pip/PyPI installation and conda-based installation through the existingrecipe/meta.yamlrecipe. It also refreshes the project documentation so users and contributors can clearly see package status, supported Python versions, test status, and coverage outputs.In this PR:
pyproject.tomlfor pip-installable distribution.uv.lockfor reproducible uv-managed environments.recipe/meta.yamlso the conda recipe matches the package dependencies and supported Python range.uv.twine checkThis closes #78 and #88