OSS Roundup

This is an ongoing series of articles giving weekly highlights of open-source software (OSS) activities here at Anaconda. We list achievements from the last week or so, link to interesting things, and give brief details of ongoing work and plans. Each team only writes something when they have something to say.

Conda (language-agnostic, multi-platform package management ecosystem)

We’ve released updates to conda-package-handling and conda-package-streaming to reduce memory usage. Once the PyPI admins free the name, this will be the first conda-package-handling release published to PyPI, using PyPI’s new organization feature.

Distdatacats Team (remote bytes, file formats, catalogs and data processing)

  • fsspec 2023.5.0 and friends are out
  • Reference Filesystem can now write references directly to parquet. This allows parquet-to-parquet combine operations, which should have a much smaller memory footprint for very large reference sets.
  • dask-awkward optimisations are finally in a good place: layer merging works and is fast, and we can run column optimisation on only one partition to avoid scaling issues. Upstream dask cull() remains an issue because it scales with the number of partitions, and we are looking at ways to avoid this. The high-energy physics workflows prompting this introspection are easily the biggest dask task graphs in existence.
  • We published an article on this very blog about benchmarking a particular dask-parquet-s3 workflow and what we learned.
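
For readers new to the idea behind the Reference Filesystem item above, here is a toy sketch of what a "reference" is: a key that points at a byte range inside some other file. It uses only the Python standard library and is not fsspec's API or implementation. The function names, key names, and the (path, offset, length) tuple layout are assumptions made for illustration.

```python
import os
import tempfile


def read_reference(references, key):
    """Return the bytes that `key` points to, reading only that byte range."""
    # Hypothetical layout: each reference is a (path, offset, length) tuple.
    path, offset, length = references[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def build_demo():
    """Write a small binary file and return (path, references) for two chunks in it."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        # A 6-byte header followed by two 9-byte chunks.
        f.write(b"HEADERchunk-onechunk-two")
    references = {
        "data/0": (path, 6, 9),
        "data/1": (path, 15, 9),
    }
    return path, references


if __name__ == "__main__":
    path, references = build_demo()
    try:
        for key in sorted(references):
            print(key, read_reference(references, key))
    finally:
        os.remove(path)
```

A real reference set can hold a very large number of such entries, which is why a compact columnar store such as parquet can be attractive compared with keeping every reference in memory at once.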

Numba (JIT-compiling Python code to make it fast)

Jupyter (in-browser IDE for Python and others)

The team has released version 1.0.0 of nbclassic, a package that allows the “classic” Jupyter Notebook (equivalent to Notebook 6.5) to be installed and used alongside JupyterLab or Notebook 7 in the same environment. We’ve also put out a release of jupyter-nbextensions-configurator that fixes some compatibility issues with nbclassic. This week the team will be at JupyterCon, so stop by the Anaconda booth and say hello!

BeeWare (deploy Python projects to mobile and elsewhere)

This week, the BeeWare team has been cleaning up after the PyCon US sprints. The sprints generated dozens of major and minor feature contributions, and we’ve been able to merge nearly all of them.