Tips for using DivBase programmatically
Below is a set of tips for users who want to use DivBase programmatically, for example in scripts or pipelines.
If you have tips or suggestions for this page, or features you would like to see, please let us know!
TODO: e.g. how to wait for a query to complete and download the query results programmatically.
Use Personal Access Tokens to authenticate programmatically
For scripts, pipelines, and HPC jobs, the recommended approach is to use a Personal Access Token (PAT). A PAT is a static bearer token that you create once via the website and pass to divbase-cli via an environment variable. When using a PAT, there is no login step and no password storage is required.
See the Personal Access Tokens section of Account Management for how to create and remove PATs.
Once you have a token, set the DIVBASE_API_PAT environment variable to it. divbase-cli will automatically use it in every request.
What if I have both an active login session and a Personal Access Token set?
divbase-cli prioritises an active login session over a PAT. If you have both, the CLI will use the active session and ignore the PAT. To use the PAT, you would need to run divbase-cli auth logout first.
export DIVBASE_API_PAT="divbase_pat_your_token_here"
divbase-cli files ls
When DIVBASE_API_PAT is set, divbase-cli does not need you to be logged in.
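In a script it can help to fail early with a clear message when the variable is missing, rather than letting a later divbase-cli call fail. The following is a minimal sketch; require_divbase_pat is a hypothetical helper name and is not part of divbase-cli. It only checks that the variable is non-empty and does not validate the token itself.

```shell
# Hypothetical helper (not part of divbase-cli): stop early with a clear
# message if no Personal Access Token is configured in the environment.
require_divbase_pat() {
    if [ -z "${DIVBASE_API_PAT:-}" ]; then
        echo "DIVBASE_API_PAT is not set; create a PAT and export it first." >&2
        return 1
    fi
}

# Example use at the top of a script:
#   require_divbase_pat || exit 1
#   divbase-cli files ls
```

Keep in mind that an active login session still takes priority over the PAT, so this check does not tell you which credential the CLI will actually use.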
Example: Slurm job script
The cleanest way to use a PAT in a Slurm job is to store the token in a file with restricted permissions and load it when the job starts:
echo "divbase_pat_your_token_here" > ~/.divbase_pat
chmod 600 ~/.divbase_pat # only readable/writeable by the owner
Then in your Slurm script:
#!/bin/bash
#SBATCH --job-name=my_divbase_job
#SBATCH --time=24:00:00
# ....
# Load the token from the restricted file
export DIVBASE_API_PAT="$(cat ~/.divbase_pat)"
# Download the files you need
divbase-cli files download my_data.vcf.gz
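As a variation on the cat line in the script above, the token could be loaded through a small function that also checks the file's permissions before exporting it. This is a sketch under stated assumptions: load_divbase_pat is a hypothetical helper name (not part of divbase-cli), and the default path simply mirrors the ~/.divbase_pat file used in this example.

```shell
# Hypothetical helper (not part of divbase-cli): read a PAT from a file,
# refuse files that group or others can access, and export the token.
load_divbase_pat() {
    local token_file="${1:-$HOME/.divbase_pat}"
    local perms token

    if [ ! -r "$token_file" ]; then
        echo "Token file not found or not readable: $token_file" >&2
        return 1
    fi

    # Characters 5-10 of the ls mode string are the group and other bits.
    perms=$(ls -ld "$token_file" | cut -c5-10)
    if [ "$perms" != "------" ]; then
        echo "Refusing $token_file: accessible by group or others (run chmod 600)" >&2
        return 1
    fi

    # Strip whitespace, including the trailing newline written by echo.
    token=$(tr -d '[:space:]' < "$token_file")
    if [ -z "$token" ]; then
        echo "Token file is empty: $token_file" >&2
        return 1
    fi

    export DIVBASE_API_PAT="$token"
}

# Example use in a job script:
#   load_divbase_pat || exit 1
#   divbase-cli files download my_data.vcf.gz
```

Failing the job when the file is too permissive is a deliberate choice in this sketch: on a shared cluster it surfaces a leaked-token risk at submission time rather than silently continuing.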
Scope your token to what the job needs and for how long it needs it
When creating the PAT, restrict it to the specific project(s) you need it for. Consider also setting an appropriate expiry date for the token. If needed, you can revoke the token immediately from the DivBase website.
Parse divbase-cli files ls/info output programmatically
- You can get the output of the divbase-cli files info and divbase-cli files ls commands in TSV format for easier parsing. Use the --tsv flag:
  divbase-cli files ls --tsv
  divbase-cli files info FILE_NAME --tsv
  You can do the same for any project versions you've created for your project:
  divbase-cli version ls --tsv
  divbase-cli version info VERSION_NAME --tsv
- Rather than first downloading a file, you can stream it from the command line and pipe it directly into other tools for processing, without saving it to disk:
  divbase-cli files stream my_file.vcf.gz | zcat | less
Info
BCFtools accepts stdin as input, so you can also pipe a VCF file directly into BCFtools without saving it first:
divbase-cli files stream my_file.vcf.gz | bcftools view -h -
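To pull a single field out of the --tsv output described above, a small awk wrapper can select a column by its header name. This is a sketch only: it assumes the TSV output begins with a header row, which this page does not confirm, and both tsv_column and the column name "name" used in the comment are hypothetical. Check the real output of your divbase-cli version before relying on it.

```shell
# Hypothetical helper (not part of divbase-cli): print the values of one
# named column from tab-separated text read on stdin.
# Assumes the first line is a header row.
tsv_column() {
    awk -F '\t' -v col="$1" '
        NR == 1 {
            # Find the position of the requested column in the header.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { print $idx }
    '
}

# Example use ("name" is an assumed column header):
#   divbase-cli files ls --tsv | tsv_column name
```

Selecting by header name rather than by position keeps a script working if columns are reordered, and it fails with a non-zero exit status when the expected column is absent.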