The Google Artifact Registry documentation page "Manage Python packages" says that when running pip install PACKAGE against a virtual repository:

If you request a version that is available in more than one upstream repository, Artifact Registry chooses an upstream repository to use based on the priority settings configured for the virtual repository.

In the "Virtual repositories overview", the section "How virtual repositories select an upstream repository" even explicitly discusses the case of pip:

For example, if you configure the Python pip tool to search PyPI and a virtual repository, your package might be downloaded directly from PyPI because pip will always choose the latest version of a package, regardless of which repository it comes from. If pip is configured to only search the virtual repository, you can then control the priority of all upstream repositories, including an upstream remote repository that acts as a proxy for PyPI.

This works as documented when specifying a package version with ==, but when using other requirement specifiers (or no requirement specifiers), pip installs the highest version that matches the specifier across all upstreams, completely ignoring priority settings.

How do I configure the virtual repository in order to get packages that exist in an upstream with a higher priority, only from that upstream, regardless of available versions in upstreams with lower priorities?

For instance, I have created a standard repo to store my own packages (python-repo), a remote repo to access PyPI (pypi-proxy), and a virtual repo (virtual-python-repo) that aggregates python-repo and pypi-proxy with priorities 100 and 10, respectively:

PROJECT_ID=my-project-123456
LOCATION=us-west1
# Standard repository for private packages
gcloud artifacts repositories create python-repo --repository-format=python \
    --location="$LOCATION" --description="local repo" \
    --project="$PROJECT_ID"
# Remote repository that proxies PyPI
gcloud artifacts repositories create pypi-proxy --repository-format=python \
    --location="$LOCATION" --description="PyPi proxy" \
    --mode=remote-repository --remote-repo-config-desc="PyPi" \
    --remote-python-repo=PYPI --project="$PROJECT_ID"
# Virtual repository that aggregates the two upstreams listed in policies.json
gcloud artifacts repositories create virtual-python-repo --repository-format=python \
    --location="$LOCATION" --description="Virtual repo" \
    --mode=virtual-repository \
    --upstream-policy-file=policies.json --project="$PROJECT_ID"

With policies.json:

[{
  "id": "python-repo",
  "repository": "projects/my-project-123456/locations/us-west1/repositories/python-repo",
  "priority": 100
 },{
  "id": "pypi-proxy",
  "repository": "projects/my-project-123456/locations/us-west1/repositories/pypi-proxy",
  "priority": 10
 }]

I am setting pip.conf and .pypirc to point to virtual-python-repo.

With this setup, if I create a new project named "sampleproject" (which already exists on PyPI with versions from 1.2.0 to 4.0.0), this is the behavior that I get:

If I set the version to 1.0.0, build, and push sampleproject-1.0.0 to python-repo:

  • pip install sampleproject==1.0.0 installs version 1.0.0 from python-repo
  • pip install sampleproject installs version 4.0.0 from PyPI

The desired behavior would be to always install sampleproject from python-repo and ignore the versions from PyPI. I know that this isn't possible with pip alone, but I was hoping that the virtual repo would make it possible to enforce such a policy.

Asked Nov 19, 2024 at 23:34 by Come Raczy; edited Nov 25, 2024 at 16:53.

Comments:
  • Have a look at this Documentation. – Sandeep Vokkareni Commented Nov 20, 2024 at 8:05
  • @SandeepVokkareni Thanks. Yes, the sentence at the end of that section makes me think that when pip is configured to search only the virtual repository, I should be able to control the priority of all upstream repositories, including an upstream remote repository that acts as a proxy for PyPI (paraphrasing the doc). But that's happening only when I am using '==' as a requirements specifier, which isn't much better than using the standard repo as "index-url" and pypi as "extra-index-url". Or did I miss something in that documentation? – Come Raczy Commented Nov 20, 2024 at 16:10
  • Are you following this Documentation for creating virtual repositories? If not please share the Documentation you are following to create the Virtual Repository. – Sandeep Vokkareni Commented Nov 21, 2024 at 10:04
  • @SandeepVokkareni yes, that's the documentation I used, in particular, the command line and policies.json that I included above were written according to the section Create a virtual repository using gcloud CLI – Come Raczy Commented Nov 21, 2024 at 19:23
  • It seems like it would require a project inspection to find what caused the issue. It’s better to contact Google support to find the resolution – Sandeep Vokkareni Commented Nov 26, 2024 at 12:17

1 Answer

After discussion with Google support, it appears that the behavior described above is expected. The virtual repository still collects the union of all the relevant versions across all upstreams, regardless of their priorities. The priority of the upstream repositories is only used to select a specific upstream when a specific version is available across multiple upstreams.
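The behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Artifact Registry's actual implementation: the upstream names and priorities mirror policies.json, the version lists are reduced to the ones mentioned in the question, and the helpers `parse`, `merged_index`, and `pip_choice` are hypothetical names introduced for illustration.

```python
# Toy model of the selection logic; NOT Artifact Registry's real code.
# Version parsing is simplified to plain 'X.Y.Z' (no PEP 440 handling).


def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# upstream id -> (priority, versions of sampleproject it can serve)
UPSTREAMS = {
    "python-repo": (100, ["1.0.0"]),
    "pypi-proxy": (10, ["1.2.0", "4.0.0"]),  # intermediate versions omitted
}


def merged_index(upstreams):
    """Union of all versions; priority only decides who serves a shared version."""
    index = {}
    for name, (priority, versions) in upstreams.items():
        for version in versions:
            current = index.get(version)
            if current is None or priority > upstreams[current][0]:
                index[version] = name
    return index


def pip_choice(index, pinned=None):
    """Mimic pip: an exact pin picks that version, otherwise the highest wins."""
    version = pinned if pinned is not None else max(index, key=parse)
    return version, index[version]


index = merged_index(UPSTREAMS)
print(pip_choice(index, pinned="1.0.0"))  # ('1.0.0', 'python-repo')
print(pip_choice(index))                  # ('4.0.0', 'pypi-proxy')
```

In this model the priority never removes a version from the merged index, which is why an unpinned install still resolves to the highest version served by the lower-priority upstream.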

The implication is that with Python virtual repositories, the only way to ensure that a package is installed from the private repo is to use "==" as the version specifier (and to make sure that the specified version is indeed in the private repo, and that the private repo has the highest priority).
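One way to apply this in practice is a small check that flags private packages that are not pinned with "==". The sketch below is an assumption-laden illustration: `unpinned_requirements` is a hypothetical helper, the set of private package names is supplied by hand, and the parsing is a deliberately strict regular expression rather than a full requirements parser (names are only lower-cased, not fully normalized).

```python
import re

# Matches 'name==version' with an optional extras block. Wildcards such as
# '1.*', environment markers, and multiple clauses are treated as not pinned.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*;,]+$")


def unpinned_requirements(lines, private_packages):
    """Return requirement lines for private packages that lack an exact '==' pin."""
    offenders = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before the first specifier character.
        name = re.split(r"[\[<>=!~;\s]", line, maxsplit=1)[0].lower()
        if name in private_packages and not PINNED.match(line):
            offenders.append(line)
    return offenders


requirements = ["sampleproject>=1.0", "requests", "sampleproject==1.0.0"]
print(unpinned_requirements(requirements, {"sampleproject"}))
# ['sampleproject>=1.0']
```

Such a check only verifies the specifier; it does not confirm that the pinned version actually exists in the private repo.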
