Setting up a private Python package repository on S3

Open-sourcing a tool to install Python packages with pip from your private repository.

22/10/2020
April 5, 2016

If you write software in Python, you’ve almost certainly used pip, Python’s package manager. Pip (a recursive acronym for “pip installs packages”) lets you install publicly available Python packages.


By default, pip searches and downloads from PyPI, the official package repository managed by the Python Software Foundation. PyPI is great for publicly available open-source packages, such as the one we have open-sourced, but it’s not the go-to choice for distributing private or closed-source packages. For that purpose, you should use a privately hosted package repository. There are a number of tools available, such as Gemfury (commercial) or pypiserver (open source). Commercial or open source, all of these options require you to either run your own server or pay someone else to run it for you.


At November Five, we use many of Amazon’s Web Services, such as EC2, RDS, ElastiCache, SQS, SNS, SES, Route 53, Lambda, DynamoDB, CloudFormation, CloudFront and, of course, S3. S3 (Simple Storage Service) is an online file storage service that can hold terabytes of data, but it can also be used for static website hosting. It has the double advantage of being both low-maintenance and cheap: you’d only pay a couple of cents for the storage and bandwidth used by your site.


The tool we’re open-sourcing, s3pypi, makes it easier to create your own package repository on S3.


Getting started

There are a few prerequisites when setting up a Python package repository on S3:

  • An AWS account. If you don’t have one already, go sign up.
  • A domain or subdomain, e.g. pypi.example.com. You should be able to create or modify the DNS record for the (sub)domain you want to use.
  • An SSL certificate for the domain you’re using.


AWS Setup

In your AWS account, you need to set up an S3 bucket configured for website hosting, as well as a CloudFront distribution that serves the content of your S3 bucket over a secure (HTTPS) connection, which pip requires by default.


We’ve created a CloudFormation template that configures these resources for you.

  • Download the s3pypi template here.
  • Upload your SSL certificate to your AWS account and keep note of the ServerCertificateId.
  • Open the AWS console in your preferred AWS region, select CloudFormation, and click “Create stack”.
  • Give the stack a meaningful name and enter the subdomain and ID of the server certificate.
  • Skip through the next steps – unless you want to tag your resources – and create the stack.
  • When finished, click the “Outputs” tab, and copy the value of the “CNAMERecordValue” output parameter.
  • Create a CNAME (or alias if you’re using Route 53) record for your subdomain and point it to this CloudFront distribution (CNAMERecordValue).


S3pypi installation

Install the s3pypi command line tool by running the following in your console (prefix the command with sudo if you are installing system-wide):

{% c-block language="ini" %}
$ pip install -U s3pypi
{% c-block-end %}


If everything goes well, you should now be able to run the s3pypi command line tool:

{% c-block language="js" %}
$ s3pypi -v
{% c-block-end %}



Using S3pypi to publish a package

Now you’re ready to publish your first Python package to your private repository. Make sure you have your AWS credentials set up in your environment, and that you have permission to upload files to the S3 bucket that you created in the previous step.
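
As a quick pre-flight check, the sketch below reports which of the standard AWS credential environment variables are missing. It assumes you supply credentials through environment variables; the AWS SDKs can also read them from a shared credentials file or an instance role, in which case this check does not apply. The helper name `missing_aws_env_vars` is hypothetical and not part of s3pypi.

```python
import os

# Standard environment variables read by the AWS SDKs and CLI.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_env_vars()
    if missing:
        print("Missing AWS environment variables: " + ", ".join(missing))
    else:
        print("AWS credentials found in the environment.")
```

Note that this only checks that the variables are set, not that the credentials are valid or allowed to write to the bucket.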

In order to upload your package to your repository, cd to the root directory of your project, and run:

{% c-block language="js" %}
$ s3pypi --bucket pypi.example.com
{% c-block-end %}
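
If you want to publish from a build or release script rather than by hand, a minimal sketch could wrap the same command in Python. It uses only the `--bucket` option shown above; the function names `build_publish_command` and `publish` are hypothetical, and the sketch assumes the s3pypi executable is on your PATH.

```python
import subprocess


def build_publish_command(bucket):
    """Return the s3pypi invocation as an argument list."""
    return ["s3pypi", "--bucket", bucket]


def publish(project_dir, bucket):
    """Run s3pypi from the project root.

    Raises subprocess.CalledProcessError if s3pypi exits with an error.
    """
    subprocess.run(build_publish_command(bucket), cwd=project_dir, check=True)
```

Calling `publish("/path/to/my-project", "pypi.example.com")` would then be equivalent to running the command from that directory.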


For this to work, your project should contain a setup script. The official Python documentation contains a detailed guide on creating the setup script, and on distributing packages in general.


Installing packages

Install your packages using pip by pointing the --extra-index-url option at your subdomain:

{% c-block language="js" %}
$ pip install my-project --extra-index-url https://pypi.example.com/
{% c-block-end %}
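
When debugging a failed install, it can help to know which URL pip looks at. Under the simple repository convention (PEP 503), pip requests a page at the index URL followed by the normalised project name. The sketch below computes that URL; the helper names are hypothetical, and it assumes your repository serves project pages in this layout.

```python
import re
from urllib.parse import urljoin


def normalize(name):
    """Normalise a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_page_url(index_url, project):
    """Return the URL of the project page under a simple-style index."""
    base = index_url if index_url.endswith("/") else index_url + "/"
    return urljoin(base, normalize(project) + "/")


print(project_page_url("https://pypi.example.com/", "My_Project"))
# prints https://pypi.example.com/my-project/
```

Opening that URL in a browser should list the files available for the project.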


Alternatively, you can configure the index URL in ~/.pip/pip.conf:

{% c-block language="ini" %}
[global]
extra-index-url = https://pypi.example.com/
{% c-block-end %}
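
If you provision developer machines or build agents with a script, the same configuration can be generated rather than typed. This is a minimal sketch using Python's standard `configparser` module; the function name `render_pip_conf` is hypothetical, and writing the result to the right file is left to you.

```python
import configparser
import io


def render_pip_conf(extra_index_url):
    """Return pip.conf text with a [global] extra-index-url entry."""
    # Disable interpolation so that a '%' in the URL is kept literally.
    config = configparser.ConfigParser(interpolation=None)
    config["global"] = {"extra-index-url": extra_index_url}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_pip_conf("https://pypi.example.com/"))
```

The output has the same section and key as the snippet above, so it can be saved as pip.conf directly.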


Security

Access control for publishing files to your brand-new PyPI repository is handled entirely through AWS Identity and Access Management (IAM). You can give an IAM user publish rights by attaching the managed IAM policy “PublishS3PyPIPackages”, which is created by CloudFormation, to that user.
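
To give an idea of what a publish policy grants, the sketch below builds an illustrative IAM policy document for a bucket. The exact statements in the managed policy are defined by the template and are not reproduced here; the actions chosen (`s3:GetObject` and `s3:PutObject`) are an assumption about the minimum a publisher would need, and `publish_policy` is a hypothetical helper.

```python
import json


def publish_policy(bucket):
    """Build an illustrative IAM policy document for publishing to a bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimum: read existing files, upload new ones.
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": "arn:aws:s3:::{}/*".format(bucket),
            }
        ],
    }


print(json.dumps(publish_policy("pypi.example.com"), indent=2))
```

Scoping the resource to a single bucket keeps publish rights from extending to anything else in the account.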

Pip supports basic authentication for authenticating against a private PyPI server. Unfortunately, S3 does not. This increases the risk of your private packages leaking. To reduce this risk, you can take some additional measures:


If these security measures are insufficient for your needs, you could take a look at the open source project s3auth.com, but you should also consider hosting your Python repository elsewhere.


All done? We hope private packages will never make you say ni again…

Written by

Ruben Van den Bossche

Chief Operating Officer

With his strong background in cloud computing and a PhD in Computer Science, Ruben knows his way around business after founding and selling his own startup. On top of that, he’s gradually becoming one of our premium food suppliers, since he’s often sharing his home-grown pumpkins, zucchini, and asparagus at the office.