Host Low-cost Private Maven Repository on AWS
Deploy a low-cost private Maven repository management solution using Apache Archiva, AWS EC2, and AWS S3.
Introduction
I was looking for an alternative to running Nexus or Artifactory in AWS. I stumbled on the open-source Apache Archiva project while browsing the Apache projects listing for something interesting. A quick demo setup of Archiva looked good to me, with its decent web console.
So, I took it further to see if I could use S3 as the backend repository storage for Archiva. The idea of using S3 as a Maven repository has been explored in some blog posts; however, they lacked the sophistication provided by repository management products like Archiva.
Archiva provides a configuration option to change the destination directory for storing Maven artifacts. I thought I could override this setting to use an S3 bucket as storage if I could find a way to mount an S3 bucket as a file system. While searching for this, I came across this blog and found S3FS-Fuse to be the right fit for my purpose.
I tried combining Apache Archiva with S3FS and it worked like a charm! Below are the steps I followed to get this approach working.
So without further ado, let’s get into it!
Prerequisites
- Running AWS EC2 instance with root access or a user with sudo access. I am using a t2.micro instance with Amazon Linux for this tutorial
- The EC2 instance has a Security Group attached to it allowing access to SSH port 22 and TCP port 8080 from your machine/subnet CIDR
- The EC2 instance should have an IAM role which has read-write permissions to access S3 buckets in your AWS account
- Git is installed on the EC2 instance
Steps
Installation of Apache Archiva
1. SSH into your EC2 instance and run the following commands to create an installation directory and download the Archiva stand-alone distribution:
$ mkdir archiva && cd archiva
$ wget http://apache.mirrors.pair.com/archiva/2.2.4/binaries/apache-archiva-2.2.4-bin.tar.gz
$ tar -zxvf apache-archiva-2.2.4-bin.tar.gz
You can refer to the Apache Archiva Installation instructions for more details, e.g. how to run Archiva on boot as a service.
2. Start Archiva
$ cd apache-archiva-2.2.4
$ ./bin/archiva console
Archiva runs on port 8080 by default.
3. Verify that the Archiva web console is up and running by browsing to the Archiva URL: http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/
4. Create an Admin user as mentioned in the Apache Archiva QuickStart.
5. Check that Archiva is able to connect to the Maven Central repository using Archiva’s Proxy Connectors. This connector is configured by default. Browse to the following URL:
http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/repository/internal/junit/junit/3.8.1/junit-3.8.1.jar
This should download junit-3.8.1.jar into the ./repositories/internal directory, which is located in the base directory of your Archiva installation. You can verify the download by running the following command in another SSH connection:
$ tree ~/archiva/apache-archiva-2.2.4/repositories/internal
.
└── junit
    └── junit
        ├── 3.8.1
        │   ├── junit-3.8.1.jar
        │   ├── junit-3.8.1.jar.md5
        │   ├── junit-3.8.1.jar.sha1
        │   ├── junit-3.8.1.pom
        │   ├── junit-3.8.1.pom.md5
        │   ├── junit-3.8.1.pom.sha1
        │   ├── maven-metadata.xml
        │   ├── maven-metadata.xml.md5
        │   └── maven-metadata.xml.sha1
        ├── maven-metadata.xml
        ├── maven-metadata.xml.md5
        └── maven-metadata.xml.sha1

3 directories, 12 files
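The URL in step 5 and the directory listing follow the standard Maven repository layout: dots in the group ID become directory separators, followed by the artifact ID, the version, and the file name. As a rough sketch of that mapping, the two helper functions below (artifact_path and artifact_url, both hypothetical names of mine and not part of Archiva) build the path and the download URL from Maven coordinates. The base URL http://localhost:8080 is only a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: map Maven coordinates to the relative path used in a
# Maven repository layout (dots in the group ID become directory separators).
artifact_path() {
    group_id=$1; artifact_id=$2; version=$3; ext=${4:-jar}
    group_dir=$(printf '%s' "$group_id" | tr '.' '/')
    printf '%s/%s/%s/%s-%s.%s\n' \
        "$group_dir" "$artifact_id" "$version" "$artifact_id" "$version" "$ext"
}

# Hypothetical helper: download URL of an artifact in an Archiva repository.
# Arguments: base URL, repository ID, then the same arguments as artifact_path.
artifact_url() {
    base_url=$1; repo_id=$2; shift 2
    printf '%s/repository/%s/%s\n' "${base_url%/}" "$repo_id" "$(artifact_path "$@")"
}

# Example: the junit 3.8.1 JAR in the "internal" repository (placeholder host).
artifact_url "http://localhost:8080" internal junit junit 3.8.1
```

With the EC2 public address as the base URL, the result can be passed to wget or a browser to trigger the same proxy download as in step 5.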
We can override a repository's Directory and Index Directory paths to point to a directory that mounts the AWS S3 bucket as a file system. The next section describes the steps.
Mount S3 bucket on EC2 Linux Instance
We will mount an S3 bucket to our EC2 instance as an S3FS file system. You can refer to this article for more details.
Steps
1. Create a private S3 bucket named my-archiva-repo or some other globally unique name.
2. Create an AWS IAM user with S3 Full Access permission by navigating to AWS Menu -> Your AWS Account Name -> My Security Credentials -> Users. If you have the user created already, ensure that it has S3 Full Access permission.
3. Create the S3FS password file and set owner-only permissions. Make sure to replace ACCESS_KEY_ID and SECRET_ACCESS_KEY with your IAM user’s keys:
$ echo ACCESS_KEY_ID:SECRET_ACCESS_KEY > ${HOME}/.passwd-s3fs
$ chmod 600 ${HOME}/.passwd-s3fs
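One small caveat with the two commands above: the file exists with default permissions for a moment before chmod runs, and the keys end up in your shell history. A minimal sketch of an alternative is shown below. The function name write_passwd_file is hypothetical, and the usage line assumes the keys are already exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, which is my assumption and not part of the original steps.

```shell
#!/bin/sh
# Hypothetical helper: write an "ACCESS_KEY_ID:SECRET_ACCESS_KEY" file that is
# readable and writable by its owner only from the moment it is created.
write_passwd_file() {
    target=$1; key_id=$2; secret=$3
    (
        umask 077    # newly created files get mode 600; the subshell keeps this local
        printf '%s:%s\n' "$key_id" "$secret" > "$target"
    )
    chmod 600 "$target"    # also tightens a file that already existed
}

# Usage (assumes both environment variables are set):
# write_passwd_file "$HOME/.passwd-s3fs" "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY"
```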
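It would have been great if S3FS could use the IAM role attached to the EC2 instance to gain access to S3 instead of storing the keys in a password file. As of March 2020 I did not find a way to do this, though the s3fs documentation lists an iam_role mount option (for example, -o iam_role=auto) that may cover this case.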
4. Set up S3FS on the EC2 instance by following the steps mentioned here. In my case, I followed the installation instructions for RHEL.
5. Create a mount point and mount the S3 bucket on it. Make sure to change the name of the bucket (my-archiva-repo) and the mount directory name (~/my-archiva-repo) in the commands below to match your configuration.
$ mkdir ~/my-archiva-repo
$ s3fs my-archiva-repo -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 ~/my-archiva-repo
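The uid=1001 value is specific to my instance; the option should carry the numeric ID of the user that runs Archiva, which you can look up with id -u. As a small sketch, the hypothetical helper below (s3fs_mount_command is my own name) only prints the mount command with the uid filled in, so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the s3fs mount command with the uid of the given
# user (default: the current user) filled in. It prints only; nothing is mounted.
s3fs_mount_command() {
    bucket=$1; mount_point=$2; user=${3:-$(id -un)}
    uid=$(id -u "$user") || return 1    # fail if the user does not exist
    printf 's3fs %s -o use_cache=/tmp -o allow_other -o uid=%s -o mp_umask=002 -o multireq_max=5 %s\n' \
        "$bucket" "$uid" "$mount_point"
}

# Usage: review the printed command, then run it yourself.
# s3fs_mount_command my-archiva-repo "$HOME/my-archiva-repo"
```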
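You can also mount on boot by adding one of the following lines to /etc/fstab. Note that /etc/fstab does not expand ~, so use the absolute path of the mount point:
s3fs#my-archiva-repo /home/ec2-user/my-archiva-repo fuse _netdev,allow_other 0 0
or
my-archiva-repo /home/ec2-user/my-archiva-repo fuse.s3fs _netdev,allow_other 0 0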
Refer to this link for more examples.
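Because /etc/fstab is not processed by a shell, a mount point written with ~ is not expanded there. A minimal sketch of a guard against that is shown below: the hypothetical helper fstab_entry (my own name) prints an entry in the fuse.s3fs form and refuses anything that is not an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab entry for an s3fs mount.
# Rejects mount points that are not absolute paths (e.g. "~/my-archiva-repo").
fstab_entry() {
    bucket=$1; mount_point=$2
    case $mount_point in
        /*) ;;    # absolute path: fine
        *)  echo "mount point must be an absolute path: $mount_point" >&2
            return 1 ;;
    esac
    printf '%s %s fuse.s3fs _netdev,allow_other 0 0\n' "$bucket" "$mount_point"
}

# Usage: print the line, review it, then append it to /etc/fstab as root.
# fstab_entry my-archiva-repo /home/ec2-user/my-archiva-repo
```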
Update Archiva Directory settings
Finally, let’s update the Archiva Directory and Index Directory paths to point to the S3FS mount point, ~/my-archiva-repo.
Log in to the Archiva console and update the directory paths under the Repositories tab. Note that we need to use the absolute path to the directory:
Set Directory path = /home/ec2-user/my-archiva-repo/repositories/internal
Set Index Directory path = /home/ec2-user/my-archiva-repo/repositories/internal/.indexer
Save the changes.
Repeat the same changes for other repositories as well, e.g. for the Snapshots repository:
Set Directory path = /home/ec2-user/my-archiva-repo/repositories/snapshots
Set Index Directory path = /home/ec2-user/my-archiva-repo/repositories/snapshots/.indexer
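If the bucket is not mounted at this point, the mount directory is just an ordinary local directory, and anything written under these paths would land on the instance's disk instead of S3. As a rough sketch of a pre-check, the hypothetical helper below (is_mount_point is my own name) looks for the directory in the second field of /proc/mounts. It assumes Linux and a path without spaces or a trailing slash.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory is listed as a mount
# point (second field) in the mounts table. The optional second argument
# lets you point it at a different file; the default is /proc/mounts.
is_mount_point() {
    dir=$1; mounts_file=${2:-/proc/mounts}
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts_file"
}

# Usage:
# is_mount_point /home/ec2-user/my-archiva-repo || echo "bucket is not mounted" >&2
```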
Verify Archiva Repository Path changes
Point your browser to this URL:
http://<YOUR_EC2_PUBLIC_IPV4_ADDRESS>:8080/repository/internal/junit/junit/3.8.1/junit-3.8.1.jar
This should download the JUnit JAR again. However, this time the repository's backend Directory is the S3 bucket, so you can check the contents of your mount point as follows:
$ tree ~/my-archiva-repo/
/home/ec2-user/my-archiva-repo/
└── repositories
    └── internal
        └── junit
            └── junit
                ├── 3.8.1
                │   ├── junit-3.8.1.jar
                │   ├── junit-3.8.1.jar.md5
                │   ├── junit-3.8.1.jar.sha1
                │   ├── junit-3.8.1.pom
                │   ├── junit-3.8.1.pom.md5
                │   ├── junit-3.8.1.pom.sha1
                │   ├── maven-metadata.xml
                │   ├── maven-metadata.xml.md5
                │   └── maven-metadata.xml.sha1
                ├── maven-metadata.xml
                ├── maven-metadata.xml.md5
                └── maven-metadata.xml.sha1

5 directories, 12 files
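The listing shows a .sha1 file next to each artifact. Since the files now travel through S3FS, comparing an artifact with its checksum file can serve as a cheap sanity check. The sketch below is only an illustration: verify_sha1 is a hypothetical helper of mine, and it assumes the .sha1 file holds the hex digest as its first whitespace-separated field and that sha1sum is available on the instance.

```shell
#!/bin/sh
# Hypothetical helper: compare an artifact with the digest stored in the
# "<artifact>.sha1" file next to it. Succeeds only when both digests match.
verify_sha1() {
    artifact=$1
    # First field of the first line, lower-cased; fails if the file is missing.
    expected=$(awk 'NR == 1 { print tolower($1) }' "$artifact.sha1") || return 1
    actual=$(sha1sum "$artifact" | awk '{ print $1 }') || return 1
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage:
# verify_sha1 ~/my-archiva-repo/repositories/internal/junit/junit/3.8.1/junit-3.8.1.jar \
#     && echo "checksum OK"
```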
The repository will also be created in your S3 bucket!
Notice that the S3 bucket directory structure is created automatically, as shown in the screenshot below.
Concluding Thoughts
This post describes how to create a low-cost private Maven repository in the AWS Cloud using the EC2 and S3 services. This could be a good alternative to Nexus and Artifactory. It may not be as feature-rich as the commercial products available in the market, but Archiva has good support for common repository management tasks, making it a viable choice when you want to keep costs low.
To keep this post concise and focused on hosting a low-cost, open-source solution for Maven repository management in AWS, I have not touched on security and other aspects of configuring Archiva, S3, and EC2, for example integration with LDAP, high availability, and access control. Also, I would suggest having a look at the S3FS limitations.
Do share your feedback in the comments section.
Cheers!
-Kunal