
Setting up a private Python package repository on S3

Open-sourcing a tool to install Python packages with pip from your private repository.

14/11/2019

If you write software in Python, you’ve almost certainly used pip, Python’s package manager. Pip, a recursive acronym for “pip installs packages”, lets you install publicly available Python packages.


Pip defaults to searching in and downloading from PyPI, the official package repository managed by the Python Software Foundation. PyPI is great for publicly available open source packages, such as the one we have open-sourced, but it’s not the go-to choice for distributing private or closed-source packages. For that purpose, you should use a privately hosted package repository. There are a number of tools available, such as Gemfury (commercial) or pypiserver (open source). Commercial or open source, all of these options require you to either run your own server or pay someone else to run it for you.


At November Five, we use many of Amazon’s Web Services, such as EC2, RDS, ElastiCache, SQS, SNS, SES, Route 53, Lambda, DynamoDB, CloudFormation, CloudFront, and of course, S3. S3 (Simple Storage Service) is an online file storage service that can hold terabytes of data, but it can also be used for static website hosting. It has the double advantage of being both low-maintenance and cheap: you’d only be paying a couple of cents for the storage and bandwidth used by your site.


The tool we’re open-sourcing, s3pypi, allows you to more easily create your own package repository on S3.


Getting started

There are a few prerequisites when setting up a Python package repository on S3:


AWS Setup

In your AWS account, you need to set up an S3 bucket configured for website hosting, as well as a CloudFront distribution that serves the content of your S3 bucket over a secure (HTTPS) connection, which pip requires by default.


We’ve created a CloudFormation template that configures these resources for you.
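
For readers who want a feel for what such a template sets up, here is a rough sketch in Python that builds a CloudFormation template as a dictionary. It is not the template mentioned above: the logical names (`PackageBucket`, `PackageDistribution`), the helper `build_template` and the selection of properties are assumptions made for illustration. A working setup would also need, at the very least, a policy that makes the bucket contents readable, plus a certificate and an alias for your own domain.

```python
import json


def build_template(bucket_name):
    """Return a minimal CloudFormation template as a dict (illustrative only)."""
    # The bucket's WebsiteURL attribute looks like "http://<host>"; splitting on
    # "//" and taking the second part leaves the hostname CloudFront needs.
    website_host = {
        "Fn::Select": [
            1,
            {"Fn::Split": ["//", {"Fn::GetAtt": ["PackageBucket", "WebsiteURL"]}]},
        ]
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Hypothetical logical name for the bucket that holds the packages.
            "PackageBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": bucket_name,
                    "WebsiteConfiguration": {"IndexDocument": "index.html"},
                },
            },
            # Hypothetical logical name for the HTTPS front end.
            "PackageDistribution": {
                "Type": "AWS::CloudFront::Distribution",
                "Properties": {
                    "DistributionConfig": {
                        "Enabled": True,
                        "Origins": [
                            {
                                "Id": "s3-website",
                                "DomainName": website_host,
                                # S3 website endpoints only speak plain HTTP.
                                "CustomOriginConfig": {
                                    "OriginProtocolPolicy": "http-only"
                                },
                            }
                        ],
                        "DefaultCacheBehavior": {
                            "TargetOriginId": "s3-website",
                            "ViewerProtocolPolicy": "redirect-to-https",
                            "ForwardedValues": {"QueryString": False},
                        },
                    }
                },
            },
        },
    }


template_json = json.dumps(build_template("pypi.example.com"), indent=2)
print(template_json)
```

The point of the sketch is the division of labour: S3 stores and serves the files, and CloudFront sits in front of it to provide the HTTPS endpoint that pip talks to.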



S3pypi installation

Install the s3pypi command line tool by running

{% c-block language="js" %}
$ (sudo) pip install -U s3pypi
{% c-block-end %}


in your console. If everything goes well, you should be able to run the s3pypi command line tool now:

{% c-block language="js" %}
$ s3pypi -v
{% c-block-end %}



Using S3pypi to publish a package

Now you’re ready to publish your first Python package to your private repository. Make sure you have your AWS credentials set up in your environment, and that you have permission to upload files to the S3 bucket that you created in the previous step.
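
If you are unsure whether credentials are configured, a quick check along these lines may help. It is only a sketch: the function name `find_credential_sources` is ours, and it looks at just two common locations (exported access keys and the shared credentials file). It says nothing about whether the credentials are valid or allowed to write to your bucket, and it ignores other mechanisms such as instance roles.

```python
import os
from pathlib import Path


def find_credential_sources(environ=None, home=None):
    """List the places where AWS credentials appear to be configured."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    sources = []

    # Static keys exported directly in the environment.
    if environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")

    # Shared credentials file, at its default or overridden location.
    credentials_file = Path(
        environ.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials")
    )
    if credentials_file.is_file():
        sources.append(f"credentials file {credentials_file}")

    return sources


print(find_credential_sources() or "no AWS credentials found")
```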

In order to upload your package to your repository, cd to the root directory of your project, and run:

{% c-block language="js" %}
$ s3pypi --bucket pypi.example.com
{% c-block-end %}


For this to work, your project should contain a setup script. The official Python documentation contains a detailed guide on creating the setup script, and on distributing packages in general.
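
To give an idea of what ends up in the bucket: pip expects a “simple” repository layout (described in PEP 503), in which every project has an HTML page that links to its distribution files. The sketch below renders such a page. It illustrates the format and is not s3pypi’s actual code; the function name `render_project_index` and the file names are made up.

```python
from html import escape
from urllib.parse import quote


def render_project_index(project, filenames):
    """Render a PEP 503 style index page linking to each distribution file."""
    links = "\n".join(
        # The href is URL-quoted, the visible link text is HTML-escaped.
        f'    <a href="{quote(name)}">{escape(name)}</a><br>'
        for name in sorted(filenames)
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"  <head><title>{escape(project)}</title></head>\n"
        "  <body>\n"
        f"{links}\n"
        "  </body>\n"
        "</html>\n"
    )


page = render_project_index(
    "my-project",
    ["my-project-0.2.0.tar.gz", "my-project-0.1.0.tar.gz"],
)
print(page)
```

Because the repository is nothing more than static files like this one, a plain S3 bucket is enough to host it.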


Installing packages

Install your packages using pip by pointing --extra-index-url at the domain that serves your repository:

{% c-block language="js" %}
$ pip install my-project --extra-index-url https://pypi.example.com/
{% c-block-end %}


Alternatively, you can configure the index URL in ~/.pip/pip.conf:

{% c-block language="js" %}
[global]
extra-index-url = https://pypi.example.com/
{% c-block-end %}
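
With either configuration, pip finds a project by appending its normalized name to the index URL. A small sketch of that lookup, following the name normalization rule from PEP 503; the helper names are ours, and pip’s own implementation is more involved:

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_page_url(index_url, project):
    """Return the URL of the page that lists a project's files."""
    return f"{index_url.rstrip('/')}/{normalize(project)}/"


print(project_page_url("https://pypi.example.com/", "My_Project"))
# prints: https://pypi.example.com/my-project/
```

Keep in mind that --extra-index-url adds an index rather than replacing the default one, so pip still consults PyPI as well.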


Security

Access control for publishing files to your brand-new PyPI repository is handled entirely through AWS Identity and Access Management (IAM). You can give an IAM user publish rights by attaching the managed IAM policy “PublishS3PyPIPackages”, which is created by CloudFormation, to that user.
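
The exact contents of that managed policy are not reproduced here. As an assumption, a policy granting publish rights would at least need to allow reading and writing objects in the bucket, roughly as in the sketch below. The function name `publish_policy` and the chosen actions are illustrative; check the policy that CloudFormation actually creates before relying on this.

```python
import json


def publish_policy(bucket_name):
    """Build an IAM policy document allowing uploads to one bucket (assumed shape)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimum: read existing index pages, write new files.
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


print(json.dumps(publish_policy("pypi.example.com"), indent=2))
```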

Pip supports HTTP basic authentication against a private PyPI server. Unfortunately, S3 does not. This increases the risk of your private packages leaking. To reduce this risk, you can take some additional measures:


If these security measures are insufficient for your needs, you could take a look at the open source project s3auth.com, but you should also consider hosting your Python repository elsewhere.


All done? We hope private packages will never make you say ni again…




Written by

Ruben Van den Bossche


Chief Operating Officer

With his strong background in cloud computing and a PhD in Computer Science, Ruben knows his way around business after founding and selling his own startup. On top of that, he’s gradually becoming one of our premium food suppliers, since he’s often sharing his home-grown pumpkins, zucchini, and asparagus at the office.