Host Low-cost Private Maven Repository on AWS

Kunal Patil
Mar 3, 2020

Deploy a low-cost private Maven repository management solution using Apache Archiva, AWS EC2, and AWS S3.

Introduction

I was looking for an alternative to Nexus and Artifactory in AWS. I came across the open-source Apache Archiva project by chance while browsing the Apache projects list for something interesting. A quick demo setup of Archiva looked good to me, with its decent web console.

So, I took it further to see if I could use S3 as backend repository storage with Archiva. The idea of using S3 as a Maven repository has been explored in some blog posts; however, they lacked the sophistication provided by repository management products like Archiva.

Archiva provides a configuration option to change the destination directory for storing Maven artifacts. I thought I could override this setting to use an S3 bucket as storage if I could find a way to mount an S3 bucket as a file system. While searching for this, I came across this blog and found S3FS-Fuse to be the right fit for my purpose.

I tried combining Apache Archiva with S3FS and it worked like a charm! Below are the steps I followed to get this approach working.

So without further ado, let’s get into it!

Prerequisites

  • A running AWS EC2 instance with root access or a user with sudo access. I am using a t2.micro instance with Amazon Linux for this tutorial
  • The EC2 instance has a Security Group attached to it allowing access to SSH port 22 and TCP port 8080 from your machine/subnet CIDR
  • The EC2 instance should have an IAM role with read-write permissions to the S3 buckets in your AWS account
  • Git is installed on the EC2 instance
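
The Security Group prerequisite can also be set up from the AWS CLI instead of the console. The sketch below only prints the `aws ec2 authorize-security-group-ingress` commands so you can review them before running anything. The group ID `sg-0123456789abcdef0` and the CIDR `203.0.113.10/32` are placeholders, not values from my setup.

```shell
#!/bin/sh
# Print (do not run) the AWS CLI commands that open the given TCP ports
# to a single CIDR. Review the output, then run the commands you want.
# Usage: sg_ingress_commands <security-group-id> <cidr> <port>...
sg_ingress_commands() (
  group_id=$1
  cidr=$2
  shift 2
  for port in "$@"; do
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --cidr %s\n' \
      "$group_id" "$port" "$cidr"
  done
)

# Placeholder values: replace with your own group ID and machine/subnet CIDR.
sg_ingress_commands sg-0123456789abcdef0 203.0.113.10/32 22 8080
```

Keeping the CIDR as narrow as possible (a single /32 address here) matters because the Archiva console on port 8080 is served over plain HTTP in this setup.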

Steps

Installation of Apache Archiva

  1. SSH into your EC2 instance and run the following commands to create an installation directory and download the Archiva stand-alone distribution:
$ mkdir archiva && cd archiva
$ wget http://apache.mirrors.pair.com/archiva/2.2.4/binaries/apache-archiva-2.2.4-bin.tar.gz
$ tar -zxvf apache-archiva-2.2.4-bin.tar.gz

You can refer to the Apache Archiva installation instructions for more details, such as how to run Archiva on boot as a service.

2. Start Archiva

$ cd apache-archiva-2.2.4
$ ./bin/archiva console

Archiva runs on port 8080 by default.
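
Archiva can take a little while to start, so a small polling helper may be handy before you open the browser. This is a minimal sketch, assuming `curl` is installed and that the console answers its root URL with a 2xx or 3xx status; the function names are my own and not part of Archiva.

```shell
#!/bin/sh
# Build the console URL; the port defaults to Archiva's default of 8080.
# Usage: archiva_url <host> [port]
archiva_url() {
  printf 'http://%s:%s/' "$1" "${2:-8080}"
}

# Poll the URL until it answers with a 2xx/3xx status or attempts run out.
# Usage: wait_for_archiva <url> [attempts] [delay-seconds]
wait_for_archiva() {
  wfa_attempts=${2:-30}
  wfa_delay=${3:-2}
  wfa_i=0
  while [ "$wfa_i" -lt "$wfa_attempts" ]; do
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code
    wfa_code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")
    case "$wfa_code" in
      2*|3*) return 0 ;;
    esac
    wfa_i=$((wfa_i + 1))
    sleep "$wfa_delay"
  done
  return 1
}
```

From a second SSH session on the instance, `wait_for_archiva "$(archiva_url localhost)" && echo "Archiva is up"` would report once the console responds.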

3. Verify that the Archiva web console is up and running by browsing to the Archiva URL: http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/

Image-1: Apache Archiva Web Console

4. Create an Admin user as mentioned in the Apache Archiva QuickStart.

5. Check that Archiva is able to connect to the Maven Central repository using Archiva’s Proxy Connectors. This connector is configured by default. Browse to the following URL:

http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/repository/internal/junit/junit/3.8.1/junit-3.8.1.jar

This should download junit-3.8.1.jar into the ./repositories/internal directory, which is located in the base directory of your Archiva installation. You can verify the download by running the following command in another SSH connection:

$ tree ~/archiva/apache-archiva-2.2.4/repositories/internal
.
└── junit
    └── junit
        ├── 3.8.1
        │   ├── junit-3.8.1.jar
        │   ├── junit-3.8.1.jar.md5
        │   ├── junit-3.8.1.jar.sha1
        │   ├── junit-3.8.1.pom
        │   ├── junit-3.8.1.pom.md5
        │   ├── junit-3.8.1.pom.sha1
        │   ├── maven-metadata.xml
        │   ├── maven-metadata.xml.md5
        │   └── maven-metadata.xml.sha1
        ├── maven-metadata.xml
        ├── maven-metadata.xml.md5
        └── maven-metadata.xml.sha1

3 directories, 12 files
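
The URL from step 5 and the directory tree above both follow the standard Maven repository layout: the groupId with dots turned into slashes, then the artifactId, the version, and a file named artifactId-version.extension. Here is a small sketch of that mapping for release artifacts. The function names are my own, and the second call uses coordinates chosen purely for illustration.

```shell
#!/bin/sh
# Map Maven coordinates to the path used by the standard repository layout:
#   groupId (dots become slashes) / artifactId / version / artifactId-version.ext
# Usage: maven_path <groupId> <artifactId> <version> [extension]
maven_path() {
  printf '%s/%s/%s/%s-%s.%s\n' \
    "$(printf '%s' "$1" | tr '.' '/')" "$2" "$3" "$2" "$3" "${4:-jar}"
}

# Prefix the path with an Archiva repository URL.
# Usage: archiva_artifact_url <base-url> <repository-id> <groupId> <artifactId> <version> [extension]
archiva_artifact_url() {
  printf '%s/repository/%s/%s\n' "${1%/}" "$2" "$(maven_path "$3" "$4" "$5" "${6:-jar}")"
}

maven_path junit junit 3.8.1
maven_path org.apache.commons commons-lang3 3.9 pom
```

The first call prints `junit/junit/3.8.1/junit-3.8.1.jar`, which matches both the URL path after `/repository/internal/` and the files shown in the tree.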

We can override a repository's Directory and Index Directory paths to point to a directory where the AWS S3 bucket is mounted as a file system. The next section describes the steps.

Image-2: Override Directory and Index Directory paths

Mount S3 bucket on EC2 Linux Instance

We will mount an S3 bucket to our EC2 instance as an S3FS file system. You can refer to this article for more details.

Steps

  1. Create a private S3 bucket named my-archiva-repo or another unique name
  2. Create an AWS IAM user with S3 Full Access permission by navigating to AWS Menu -> Your AWS Account Name -> My Security Credentials -> Users. If you have already created the user, ensure that it has S3 Full Access permission. You will need this user's access key ID and secret access key in the next step.
  3. Create the S3FS password file and set owner-only permissions. Make sure to replace ACCESS_KEY_ID and SECRET_ACCESS_KEY with your IAM user’s keys
$ echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs
$ chmod 600 ${HOME}/.passwd-s3fs
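
If you would rather script this step, the two commands above can be wrapped in a helper that tightens the umask first, so the file is never readable by other users even for a moment. This is just a sketch; the function name is my own.

```shell
#!/bin/sh
# Write an s3fs password file in ACCESS_KEY_ID:SECRET_ACCESS_KEY format.
# The function body runs in a subshell, so the stricter umask does not
# leak into the calling shell.
# Usage: write_s3fs_passwd <file> <access-key-id> <secret-access-key>
write_s3fs_passwd() (
  umask 077
  printf '%s:%s\n' "$2" "$3" > "$1"
  # Also fix permissions in case the file already existed with a looser mode.
  chmod 600 "$1"
)
```

For example, `write_s3fs_passwd "$HOME/.passwd-s3fs" "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY"` takes the keys from environment variables instead of literals typed on the command line.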

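It would have been great if S3FS could use the IAM role attached to the EC2 instance to access S3 instead of storing the keys in a password file. However, it didn't seem to be supported as of March 2020, although the s3fs man page does list an iam_role mount option that may be worth checking against your S3FS version.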
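
If S3 Full Access is broader than you want for the IAM user in step 2, a policy scoped to the single bucket may be enough. The sketch below writes such a policy document to a file. The output file name `archiva-s3-policy.json` is an assumption, and the action list is a starting point that I have not verified as the minimum S3FS needs.

```shell
#!/bin/sh
# Write an IAM policy document limited to one bucket:
# bucket-level actions on the bucket ARN, object-level actions on its keys.
# Usage: write_bucket_policy <bucket-name> <output-file>
write_bucket_policy() {
  cat > "$2" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::$1"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::$1/*"
    }
  ]
}
EOF
}

write_bucket_policy my-archiva-repo archiva-s3-policy.json
```

The generated file can then be pasted into the IAM console as an inline policy. If uploads fail with access errors, the policy likely needs more actions than the ones listed here.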

4. Set up S3FS on the EC2 instance by following the steps mentioned here. In my case, I followed the installation instructions for RHEL.

5. Create a mount point and mount the S3 bucket on it. Make sure to change the name of the bucket (my-archiva-repo) and the mount directory (~/my-archiva-repo) in the commands below to match your configuration.

$ mkdir ~/my-archiva-repo
$ s3fs my-archiva-repo -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 ~/my-archiva-repo
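
The uid=1001 in the mount command is specific to my instance, so here is a sketch that fills it in from `id -u` and skips the mount if the directory is already a mount point. It assumes `s3fs` and `mountpoint` (from util-linux) are installed; the function names are my own.

```shell
#!/bin/sh
# Build the s3fs option string used above as one comma-separated value,
# with the uid passed in rather than hard-coded.
# Usage: s3fs_options <uid> [cache-dir]
s3fs_options() {
  printf 'use_cache=%s,allow_other,uid=%s,mp_umask=002,multireq_max=5' "${2:-/tmp}" "$1"
}

# Mount the bucket unless the directory is already a mount point.
# Usage: mount_bucket <bucket> <mount-dir>
mount_bucket() {
  mkdir -p "$2" || return 1
  if mountpoint -q "$2"; then
    echo "$2 is already mounted"
    return 0
  fi
  # uid is set to the user running this function, e.g. the Archiva user.
  s3fs "$1" "$2" -o "$(s3fs_options "$(id -u)")"
}
```

Calling `mount_bucket my-archiva-repo "$HOME/my-archiva-repo"` as the user that runs Archiva should give that user ownership of the mounted files. Note that FUSE normally lets a non-root user pass allow_other only when user_allow_other is enabled in /etc/fuse.conf.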

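You can also mount on boot by adding one of the following lines to /etc/fstab. Note that fstab does not expand ~, so use the absolute path of the mount directory:

s3fs#my-archiva-repo /home/ec2-user/my-archiva-repo fuse _netdev,allow_other 0 0

or

my-archiva-repo /home/ec2-user/my-archiva-repo fuse.s3fs _netdev,allow_other 0 0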

Refer to this link for more examples.
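
Since /etc/fstab does not expand ~, a relative or home-relative mount directory is an easy mistake to make there. The sketch below generates the fuse.s3fs form of the entry and refuses anything that is not an absolute path. The function name is my own.

```shell
#!/bin/sh
# Print an /etc/fstab entry for an s3fs mount. fstab does not expand "~",
# so the mount directory must be an absolute path.
# Usage: s3fs_fstab_line <bucket> <absolute-mount-dir>
s3fs_fstab_line() {
  case "$2" in
    /*) printf '%s %s fuse.s3fs _netdev,allow_other 0 0\n' "$1" "$2" ;;
    *)  echo "mount directory must be an absolute path: $2" >&2
        return 1 ;;
  esac
}

s3fs_fstab_line my-archiva-repo /home/ec2-user/my-archiva-repo
```

Review the printed line before appending it to /etc/fstab with sudo. Boot-time mounts run as root rather than as your login user, so the keys in ${HOME}/.passwd-s3fs may not be picked up; the S3FS documentation describes a system-wide /etc/passwd-s3fs file for that case.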

Update Archiva Directory settings

Finally, let’s update Archiva’s Directory and Index Directory paths to point to the S3FS mount point, ~/my-archiva-repo.

Log in to the Archiva console and update the directory paths under the Repositories tab. Note that we need to use the absolute path to the directory:

Set Directory path = /home/ec2-user/my-archiva-repo/repositories/internal

Set Index Directory path = /home/ec2-user/my-archiva-repo/repositories/internal/.indexer

Save the changes.

Image-3: Update Repository Directory Path

Repeat the same changes for other Repositories as well, e.g. for Snapshots repository:

Set Directory path = /home/ec2-user/my-archiva-repo/repositories/snapshots

Set Index Directory path = /home/ec2-user/my-archiva-repo/repositories/snapshots/.indexer
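
The paths follow the same pattern for every repository, so they can be generated rather than typed. The sketch below prints both values for each repository and includes a quick write probe you could run against the mount before saving the settings in Archiva. The function names and the probe file name are my own.

```shell
#!/bin/sh
# Print the Directory and Index Directory values for a managed repository
# stored under the S3FS mount.
# Usage: archiva_repo_paths <mount-dir> <repo-name>
archiva_repo_paths() {
  printf 'Directory path = %s/repositories/%s\n' "${1%/}" "$2"
  printf 'Index Directory path = %s/repositories/%s/.indexer\n' "${1%/}" "$2"
}

# Return success if the current user can create and delete a file in <dir>.
# Usage: check_writable <dir>
check_writable() {
  cw_probe="$1/.write-test.$$"
  touch "$cw_probe" && rm -f "$cw_probe"
}

for repo in internal snapshots; do
  archiva_repo_paths /home/ec2-user/my-archiva-repo "$repo"
done
```

Running `check_writable /home/ec2-user/my-archiva-repo` as the user that runs Archiva is a cheap way to catch permission problems on the mount before Archiva hits them.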

Verify Archiva Repository Path changes

Point your browser to this URL:

http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/repository/internal/junit/junit/3.8.1/junit-3.8.1.jar

This should download the JUnit JAR again. However, this time the repository's backend Directory is the S3 bucket, so you can check the contents of your mount point:

$ tree ~/my-archiva-repo/
/home/ec2-user/my-archiva-repo/
└── repositories
    └── internal
        └── junit
            └── junit
                ├── 3.8.1
                │   ├── junit-3.8.1.jar
                │   ├── junit-3.8.1.jar.md5
                │   ├── junit-3.8.1.jar.sha1
                │   ├── junit-3.8.1.pom
                │   ├── junit-3.8.1.pom.md5
                │   ├── junit-3.8.1.pom.sha1
                │   ├── maven-metadata.xml
                │   ├── maven-metadata.xml.md5
                │   └── maven-metadata.xml.sha1
                ├── maven-metadata.xml
                ├── maven-metadata.xml.md5
                └── maven-metadata.xml.sha1

5 directories, 12 files

And the repository will be created in your S3 bucket!

Notice that the directory structure is created in the S3 bucket automatically, as shown in the screenshot below:

Image-4: S3 Bucket Contents
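
The listing above includes .md5 and .sha1 files next to each artifact. One way to sanity-check that a file reads back intact through the S3FS mount is to compare it with its .sha1 file. This is a minimal sketch, assuming GNU `sha1sum` is available; the function name is my own.

```shell
#!/bin/sh
# Compare a file's SHA-1 digest with the value stored in its .sha1 sidecar.
# Only the first whitespace-separated field of the sidecar is used, since
# some tools append the file name after the digest.
# Usage: verify_sha1 <artifact-file>
verify_sha1() {
  vs_expected=$(awk '{print $1; exit}' "$1.sha1") || return 1
  vs_actual=$(sha1sum "$1" | awk '{print $1}') || return 1
  [ -n "$vs_expected" ] && [ "$vs_expected" = "$vs_actual" ]
}
```

For example, `verify_sha1 ~/my-archiva-repo/repositories/internal/junit/junit/3.8.1/junit-3.8.1.jar && echo OK` checks the JAR downloaded earlier.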

Concluding Thoughts

This post describes how to create a low-cost private Maven repository in the AWS Cloud using the EC2 and S3 services. This could be a good alternative to Nexus and Artifactory. It may not be as feature-rich as the commercial products on the market, but Archiva has good support for common repository management tasks, making it a viable choice when you want to keep costs low.

To keep this post concise and focused on hosting a low-cost, open-source Maven repository management solution in AWS, I have not covered security and other aspects of configuring Archiva, S3, and EC2, such as integration with LDAP, high availability, and access control. Also, I would suggest having a look at the S3FS limitations.

Do share your feedback in the comments section.

Cheers!

-Kunal
