Nimbo config options
Valid fields for the nimbo config file.
By default, Nimbo looks for a config file called nimbo-config.yml in the current directory. You can choose a different path by setting the NIMBO_CONFIG environment variable.

Config file fields and values

cloud_provider

Currently must be AWS.

local_datasets_path

Path to the folder where datasets are stored, relative to the current project root. E.g., if you store your datasets in the project/data/datasets folder, you should set local_datasets_path: data/datasets.

local_results_path

Same as local_datasets_path but for the results folder. This will typically correspond to the folder where you write outputs of experiments, logs, models, checkpoints, etc.

s3_datasets_path

Path to the S3 folder where datasets are stored. This corresponds to the remote storage location from where datasets are pulled to the instance at run time, and to where datasets are pushed when you upload data from local_datasets_path using the nimbo push datasets command.

s3_results_path

Same as s3_datasets_path but for results. This is the remote folder where the outputs of your job and the Nimbo logs are stored. During a job (every 5 seconds) and once more after it ends, Nimbo backs up your results to this S3 folder. You can sync them to your computer (into local_results_path) with nimbo pull results or nimbo pull logs.
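As a rough sketch, the four path fields might sit together in nimbo-config.yml as shown here. The bucket and folder names are hypothetical, and the s3:// URI form is an assumption, since the descriptions above only say "path to the S3 folder".

```yaml
# Local paths are relative to the project root.
local_datasets_path: data/datasets
local_results_path: data/results   # hypothetical folder name

# Remote counterparts (hypothetical bucket and prefix; URI form assumed).
s3_datasets_path: s3://my-bucket/my-project/datasets
s3_results_path: s3://my-bucket/my-project/results
```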

aws_profile

The name of the AWS profile you want to use. This profile should exist in your ~/.aws/credentials file.

region_name

Code of the AWS region you want to use. The default is eu-west-1.

instance_type

Code of the instance type you want to use, e.g. p2.xlarge, g4dn.xlarge, p3.2xlarge, etc. Any instance type that exists in your region is supported (not limited to GPU instances).
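For illustration, the account, region and instance fields could look like the following. The profile name default is an assumption; use whichever profile exists in your ~/.aws/credentials file.

```yaml
aws_profile: default        # assumed profile name from ~/.aws/credentials
region_name: eu-west-1      # the documented default region
instance_type: p2.xlarge    # any instance type available in the region
```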

spot

Whether to use a spot instance. If spot: no, an on-demand instance is used. Specs and prices for on-demand GPU instances can be easily checked with the command nimbo list-gpu-prices. If spot: yes, a spot instance is used (assuming your account has the necessary permissions and quotas). Specs and prices for spot GPU instances can be checked with the command nimbo list-spot-gpu-prices.

spot_duration

Requests a fixed duration spot instance (in minutes), which will not be preempted for the duration of the request. Valid values are 60, 120, 180, 240, 300, 360.
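A minimal sketch of a fixed-duration spot request, assuming spot_duration is only meaningful when spot is enabled (the text above does not say what happens if it is set with spot: no):

```yaml
spot: yes
spot_duration: 120   # minutes; one of 60, 120, 180, 240, 300, 360
```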

image

Image name (for Nimbo-managed images) or AMI code for the image to use. We recommend using our default managed image, image: ubuntu18-latest-drivers. You can find more details in the Managed Images section.

disk_size

Size (in GB) of the instance's root volume.

conda_env

Name of the Conda environment YAML file that will be used for the remote job. This file must exist in the root folder of your project. E.g. if your environment file is located at project/environment.yml, set conda_env: environment.yml. Note: This environment will be used only in the remote instance. You do not need to use it to launch Nimbo. You can install and run nimbo from any local Conda/pip/venv environment you want.
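Put together, the image and environment fields might look like this. The disk size is an arbitrary illustrative value, not a recommendation from the Nimbo documentation.

```yaml
image: ubuntu18-latest-drivers   # the recommended managed image
disk_size: 128                   # illustrative value, in GB
conda_env: environment.yml       # must exist in the project root
```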

run_in_background

If run_in_background: yes, the job will run on AWS without any process running on your computer (you only need to keep an internet connection while the instance is being set up and code/environment files are synced). This option is useful if you are confident a job will run correctly and want to be able to turn off your internet/computer while jobs are running, or to run many jobs at once.
If run_in_background: no, the job logs will be streamed to your terminal session. Pressing Ctrl-C at any point will store any results/logs produced so far and, depending on the value of the persist field (see below), shut down the instance.
For both background and foreground jobs, failed or successful, full run logs are stored at s3_results_path/nimbo-logs/, which can be synced locally using nimbo pull logs.

persist

Whether to keep the instance running when there's an error or after the job finishes.
If persist: no, the instance will be automatically terminated if there is an error at any point during instance setup or job execution, or when the job finishes successfully.
If persist: yes, the instance will remain active in those cases. This is useful if you want to be able to log onto the instance (using nimbo ssh <instance-id>) to debug after an error. If you are confident the jobs will run correctly and you are using run_in_background: yes, we recommend using persist: no, to avoid more EC2 billing than necessary.
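The recommendation above for unattended jobs corresponds to this combination:

```yaml
# Fire-and-forget: no local process, instance terminated on success or error.
run_in_background: yes
persist: no
```

For a debugging session, the opposite pairing (run_in_background: no with persist: yes) streams logs to the terminal and leaves the instance up so you can connect with nimbo ssh <instance-id>.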

security_group

The name of the security group to use for the instance. Every AWS account comes with a default security group. Make sure the security group used has the necessary inbound rules to allow SSH from your computer into the instance.

instance_key

The name of the EC2 key pair file to be used by the instance. The name of the file should correspond to the name of the key as defined in the EC2 dashboard. E.g., if you created a key pair called my-ec2-key-pair in your EC2 dashboard, you should set instance_key: my-ec2-key-pair.pem.

role

The name of the role or instance profile to be used by the instance. The role gives permissions to the instance to access resources, e.g., S3 buckets.
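As a sketch, the permission-related fields could be filled in as follows. The role name is hypothetical, and default is assumed to be the name of the account's default security group.

```yaml
security_group: default               # must allow inbound SSH from your IP
instance_key: my-ec2-key-pair.pem     # key pair file named after the EC2 key
role: my-nimbo-role                   # hypothetical role / instance profile name
```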

Advanced options

We support additional configuration parameters in case you want to specify more advanced options. We recommend only changing these options if you have a good understanding of AWS.

disk_type

EBS volume type. Allowed values: standard, gp2, gp3, io1, io2, sc1, st1. Default: gp2. For more details on EBS volume types visit https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html.
Using a non-default value for disk_type with one of our managed images will cause the instance to take longer to finish setup.

disk_iops

EBS volume IOPS. This value must be specified for the io1 and io2 disk types. For more details on EBS volume IOPS visit https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html.
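For example, a provisioned-IOPS root volume might be requested like this. The IOPS figure is purely illustrative; AWS limits the allowed IOPS relative to volume size, so check the EBS documentation linked above before choosing a value.

```yaml
disk_type: io1
disk_iops: 3000   # illustrative; required because disk_type is io1
```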

ssh_timeout

The number of seconds after which Nimbo will give up trying to connect to your instance. The default is 120 seconds.

ip_cidr_range

CIDR range for the IP address that is added to the security group when running nimbo run.

telemetry

Nimbo collects very basic telemetry: how many times run_job was executed per AWS user. We use this to gauge how widely Nimbo is used. By default, telemetry is set to true.
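A short sketch of overriding two of these defaults. It assumes that setting telemetry to false opts out; the text above only states that the default is true.

```yaml
ssh_timeout: 300    # seconds; wait longer than the 120-second default
telemetry: false    # assumed opt-out value
```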