LabKey lists remote backup

EDIT – This is a solution for inexperienced PostgreSQL users. A much better solution is to use pg_dump to get a full backup of the “labkey” database. However, the pg_dump solution requires one to set up port forwarding in VirtualBox for the PostgreSQL port (5433 or 5432) and to modify a few configuration files to allow remote access to the database. The benefit is well worth the extra setup, though!
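For reference, here is a minimal sketch of that pg_dump route; the VM name, port numbers, database user and file name below are placeholders to adapt to your own setup:

# on the host, with the VM powered off and NAT networking in use:
# forward the guest PostgreSQL port (5432 here) to the host
VBoxManage modifyvm "labkey-vm" --natpf1 "pgsql,tcp,,5432,,5432"

# in the guest, allow remote access: set listen_addresses in postgresql.conf,
# add a line for the host in pg_hba.conf, then restart PostgreSQL

# from the host (or any machine that can reach the forwarded port):
# full, compressed dump of the "labkey" database
pg_dump -h localhost -p 5432 -U postgres -Fc labkey > labkey_backup.dump

# restore later with pg_restore, e.g.:
# pg_restore -h localhost -p 5432 -U postgres -d labkey labkey_backup.dump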

Recently, I began using LabKey [labkey.org] to manage shared lists of reagents and strains that previously lived in Excel/LibreOffice spreadsheets. The motivation for this switch was mostly the fact that it is relatively easy to damage a big table in Excel in ways that are not always easy to track or even to spot. LabKey may not have the best interface out of the box, but it gets the job done. I found that the easiest way to install it was on a GNU/Linux Ubuntu virtual server, currently running on a Windows machine (some notes on VirtualBox can be found in one of my previous posts).

Now, it is clearly easier to manage a spreadsheet in spreadsheet software; however, sharing spreadsheets between several users is not straightforward. LabKey lists solve this problem to a large extent: they are easy to build from .xls files and relatively easy to use. The interface lacks, for the moment, any kind of batch editing, but I hope this is a feature that will be implemented in future releases. Batch import is easy with xls or tsv/csv files.

The reason for this post is that I could not easily find a way to back up the data held on the LabKey web server. Doing SQL dumps and the like is in itself a whole affair, and I know from one of my previous adventures with WordPress and MySQL that SQL backups require work to restore. As a workaround, I found that LabKey responds very well to HTTP requests, via wget for example (the user, password, server address, project name, list name and output file are the variables that one has to change for each case):

/opt/local/bin/wget --server-response --http-user=a_user_mail \
  --http-password=the_labkey_password \
  "http://your_labkeyserver_address/labkey/query/Project_name/getQuery.api?schemaName=lists&query.queryName=list_name" \
  --output-document="some_file_name.json"

The complete path to wget on my system is shown here; it is useful to have it this way if the command is run by cron, for example.

This query can now be included in a short shell script run by cron to do weekly backups of a given list. Here is a template for the bash script:

#!/bin/bash
# Storage folder where backup files are kept
destination=some_absolute_path/LABKEYBACKUP
suffix="labkey" # or whatever you like
# temporary folder named after the current date (YYYYMMDD) plus the suffix
dest_folder_name=$(date +"%Y%m%d")$suffix
dest_folder="$destination/$dest_folder_name"
mkdir -p "$dest_folder"
cd "$dest_folder" || exit 1
# ONE OR SEVERAL OF THE ABOVE wget LINES
# zip the downloaded files; -j removes paths from the zip archive
zipname="$dest_folder.zip"
zip -rj "$zipname" "$dest_folder"
# remove the temporary folder with the downloaded files
rm -r "$dest_folder"
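To run it automatically, a crontab entry along these lines (added with crontab -e) would launch the script every Sunday night; the script path and log file are of course placeholders to adapt:

# weekly LabKey backup, Sundays at 03:00
0 3 * * 0 /some_absolute_path/labkey_backup.sh >> /some_absolute_path/labkey_backup.log 2>&1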

That’s it. This solution is far from perfect and only backs up the lists that one explicitly specifies, but it is easy to use, and the resulting JSON files are very easy to parse and transform into whatever you like.
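For instance, assuming jq is installed and that the response contains the top-level rows array that getQuery.api returned on my install (do check your own files), something like this gives a quick look at a downloaded list:

# number of records in the backed-up list
jq '.rows | length' some_file_name.json

# peek at the first record
jq '.rows[0]' some_file_name.json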