This is a quick post to announce that paprica, our pipeline to evaluate community structure and conduct metabolic inference, is now available on the cloud as an Amazon Machine Image (AMI). The AMI comes with all dependencies required to execute the paprica-run.sh script pre-installed. If you want to use it for paprica-build.sh you’ll have to install pathway-tools and a few additional dependencies. I’m new to the Amazon EC2 environment, so please let me know if you have any issues using the AMI.
If you are new to Amazon Web Services (AWS), the basic workflow is:
- Sign up for Amazon EC2 using your normal Amazon log-in
- From the AWS console, make sure that your region is N. Virginia (community AMIs are only available in the region where they were created)
- From your EC2 dashboard, scroll down to “Create Instance” and click “Launch Instance”
- Select “Community AMIs”
- Search for “paprica-ec2”, then select the AMI corresponding to the latest version of paprica (0.4.0 at the time of writing).
- Choose the type of instance you would like to run the AMI on. This is the real power of AWS; you can tailor the instance to the analysis you would like to run. For testing, choose the free-tier-eligible t2.micro instance. This is sufficient to execute the test files or run a small analysis (hundreds of reads). To use paprica’s parallel features, select an instance with the desired number of cores and sufficient memory.
- Click “Review and Launch”, and finish setting up the instance as appropriate.
- Log onto the instance, navigate to the paprica directory, and execute the test file(s) as described in the paprica tutorial. The AMI is not updated as often as paprica, so you may wish to re-clone the GitHub repository or download the latest stable release.
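
If you would rather script the console steps above, here is a minimal sketch using boto3, the AWS SDK for Python. It is an illustration rather than something shipped with paprica. It assumes that boto3 is installed, that your AWS credentials are configured, and that the AMI names contain “paprica-ec2” followed by a dotted version number (for example the hypothetical name `paprica-ec2-0.4.0`); check the actual names in the console before relying on the version parsing. The helper names are made up for this example.

```python
import re

REGION = "us-east-1"  # N. Virginia, the region named in the steps above


def version_key(name):
    """Pull a dotted version such as 0.4.0 out of an AMI name as a tuple of ints.

    Assumes the version appears in the name; returns () if none is found.
    """
    match = re.search(r"\d+(?:\.\d+)+", name)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def pick_latest(images):
    """Return the image dict whose Name carries the highest version number."""
    if not images:
        raise ValueError("no images to choose from")
    return max(images, key=lambda image: version_key(image.get("Name", "")))


def find_paprica_images(region=REGION):
    """List AMIs whose name contains 'paprica-ec2' (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_images(
        Filters=[{"Name": "name", "Values": ["*paprica-ec2*"]}]
    )
    return response["Images"]


def launch(image_id, key_name, instance_type="t2.micro", region=REGION):
    """Start one instance from the given AMI and return its instance ID."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,  # pick a larger type for parallel runs
        KeyName=key_name,            # name of an existing EC2 key pair
        MinCount=1,
        MaxCount=1,
    )
    return response["Instances"][0]["InstanceId"]
```

A typical session would be `image = pick_latest(find_paprica_images())` followed by `launch(image["ImageId"], key_name="my-key")`, where `my-key` is a placeholder for your own key pair. Remember to stop or terminate the instance when you are done, since anything beyond the free tier is billed.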
NOTE: We’ve stopped updating the AMI because it was kind of a pain, and it wasn’t clear how much use it was getting. We will consider updating it on request.