Column order and decimal point changes with awk and sed

Recently, I’ve been confronted with a simple problem that I usually solve in a spreadsheet application: in a text file, change the order of columns and switch the decimal separator from “,” to “.”. However, since I had many similarly formatted text files and wanted to speed up the conversion, I searched for tools that could do these simple tasks without too much hassle. Welcome awk, a programming language and tool for text processing, and sed, a line-by-line stream editor, both available by default in MacOS X and Linux. As usual, stackexchange questions and answers were extremely useful to quickly find a solution.

While I was awking happily around, I became aware of a problem that I did not expect: awk uses the line feed character (LF) as line terminator, and if the file comes from Windows, it has a carriage return (CR) followed by LF at the end of each line. Thus, the first step needed to get a clean file was to remove those annoying CRs (well visible in the following screenshot from Geany, a fantastic text editor):

inputfile

This can be done with the following awk command that removes CR characters while leaving LF in the file:

awk '{ sub(/\r$/,""); print }' infile.txt >outfile.txt

The result is, as expected:

crremoved
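
For those who prefer a check on the command line over a screenshot, here is a minimal sketch. It builds a small sample with Windows line endings (the file names crlf_sample.txt and lf_sample.txt are placeholders of mine), runs the same awk command and counts the CR characters before and after:

```shell
# Build a two-line sample with Windows (CRLF) line endings; file names are placeholders.
printf 'a\t1,5\r\nb\t2,5\r\n' > crlf_sample.txt

# Strip the trailing CR from every line, as in the command above.
awk '{ sub(/\r$/, ""); print }' crlf_sample.txt > lf_sample.txt

# Keep only the CR characters of each file and count them.
cr_before=$(tr -cd '\r' < crlf_sample.txt | wc -c | tr -d ' ')
cr_after=$(tr -cd '\r' < lf_sample.txt | wc -c | tr -d ' ')
echo "CR before: $cr_before, after: $cr_after"
```

If everything went well, the second count should be zero.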

Now, we can proceed with the following step, which is a change in column order. What I needed was to move the 4th column to the second position. Awk comes to the rescue here as well:

awk -F'\t' -v OFS='\t' '{print $1,$4,$2,$3}' infile.txt > outfile.txt

The result looks good, no more CRs and the order of columns is fine:

columnorderchanged
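
To see the reordering on something small, here is a sketch with an invented four-column line (the content and the file names are assumptions, not my real data). It quotes the tab separator and sets OFS with -v, which should behave the same in any awk that follows POSIX:

```shell
# Hypothetical tab-separated sample: id, two numeric columns, then a text column.
printf 'id1\t0,5\t1,25\tgene A, long form\n' > cols_sample.txt

# Print column 4 in second position; -F sets the input separator, OFS the output one.
awk -F'\t' -v OFS='\t' '{ print $1, $4, $2, $3 }' cols_sample.txt > cols_reordered.txt

cat cols_reordered.txt
```

The text column, commas included, should now sit right after the identifier.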

Finally, the numeric values that used “,” as decimal separator were not correctly interpreted by the clustering program. Simply changing all the commas to dots was not an option, because column 2 now contains text with meaningful commas. Sed provided a very simple command that replaces only the commas found between two digits:

sed 's/\([0-9]\),\([0-9]\)/\1.\2/g' < infile.txt > outfile.txt

To understand how sed does its thing, one must be familiar with regular expressions.
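
A short reading of that expression, as I understand it: each \([0-9]\) captures one digit, the comma between them is matched literally, and \1.\2 writes the two captured digits back with a dot in between; the final g repeats this for every match on the line. A sketch on an invented line (the sample content is an assumption):

```shell
# Hypothetical line: the text column has a comma after a letter, the numbers have decimal commas.
sample=$(printf 'id1\tgene A, long form\t0,5\t1,25')

# Only commas that sit between two digits are turned into dots.
converted=$(printf '%s\n' "$sample" | sed 's/\([0-9]\),\([0-9]\)/\1.\2/g')

printf '%s\n' "$converted"
```

The comma in “gene A, long form” is left alone because it follows a letter, not a digit.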

In the end, the file looks exactly as I wanted it to be:

commadotchanged

There is no need to keep intermediate files and these commands can be chained using the “|” pipe operator. Alternatively, they can be put together in a small shell script.
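
As a sketch of what such a chained version could look like, here are the three steps wrapped in a small shell function (convert_table and the sample file names are names I made up for the illustration):

```shell
# convert_table FILE: CRLF -> LF, move column 4 to position 2, decimal comma -> dot.
# Writes the result to standard output; no intermediate files are created.
convert_table() {
    awk '{ sub(/\r$/, ""); print }' "$1" |
        awk -F'\t' -v OFS='\t' '{ print $1, $4, $2, $3 }' |
        sed 's/\([0-9]\),\([0-9]\)/\1.\2/g'
}

# Try it on a hypothetical one-line file with Windows line endings.
printf 'id1\t0,5\t1,25\tgene A, long form\r\n' > raw_sample.txt
convert_table raw_sample.txt > clean_sample.txt
cat clean_sample.txt
```

For a whole folder, a loop such as for f in *.txt; do convert_table "$f" > "${f%.txt}_clean.txt"; done should do, assuming all files share the same layout.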


Any alternatives to LaTeX for collaborative manuscript writing in science ?

In a world in which we read mostly on screens and in which unprintable data like videos or high resolution images are part of published papers, it makes sense to think of new ways of producing, sharing and reading research results. Finding ways to easily collaborate with co-workers and to be able to keep a manuscript in a shareable and flexible format is an ongoing quest (see for example, datacite).

When looking for online tools that allow collaborative manuscript writing, I was very much impressed by Overleaf’s interface and gave it a try for writing a real manuscript. While some of my colleagues had no problem working with the system, it is still a little odd that, for example, including citations requires uploading a file in the .bib format. Some ‘infinite compiling’ errors scared one of my collaborators and were perceived as a lack of robustness of the system. Adding references and cross-references remains quite involved, and one needs to spend some time in the LaTeX innards to get a nice end result.

Lens Writer screenshot
An image of Lens Writer in action. Installing the package from github and launching a local server is very simple.

This post was motivated by my recent enthusiasm when reading about a JavaScript library called Substance that serves as a basis for several projects, including one that is designed to allow easy writing and sharing of scientific data: Lens Writer. The philosophy of this way of writing a scientific report is, from my very limited understanding of it, that everything revolves around web-based technologies, running with JavaScript. An early version of the editor, working with Node.js, provides a playground for those curious to test its capabilities. Definitely a project whose evolution will be very interesting to follow!

LabKey – manage shared lists of reagents, oligos, strains

LabKey is a very friendly system for lab scale, or larger, sharing of common data. In our own hands, LabKey replaced a series of spreadsheets giving a much better way to edit and view things; mostly lists of reagents. The software can be obtained from:

https://www.labkey.org/

Nelson EK, Piehler B, Eckels J, Rauch A, Bellew M, Hussey P, Ramsay S, Nathe C, Lum K, Krouse K, Stearns D, Connolly B, Skillman T, Igra M. LabKey Server: An open source platform for scientific data integration, analysis and collaboration. BMC Bioinformatics 2011 Mar 9; 12(1): 71.

http://www.biomedcentral.com/1471-2105/12/71

LabKey runs on an Apache Tomcat Java server; it is mostly Java on the server side, with some JavaScript-enhanced pages on the user side. A working relational database server is also required. Not perfect, but better than shared Excel files.

The installation we use is on a Windows machine with VirtualBox, on which Ubuntu 12.04 is installed. Installation was not painless, but I followed the steps detailed in the Python file install-labkey.py. I had some problems with exposing the Apache server to the outside world from within the virtual machine. A short description of how to install an Ubuntu machine on a Mac is here:
http://aboutfoto.bitbucket.org/rggobi_macosx.html

I just found a few notes on steps taken to configure a working LabKey on Linux (Ubuntu). These are useful especially if you are doing the install in a VirtualBox machine:

>sudo apt-get install build-essential
#installs compilers
>sudo usermod -G vboxsf -a yourusername
#adds the user to the vboxsf group to allow sharing of a folder
#install the server and database system (adjust numbers to actual versions)
>sudo apt-get install sendmail tomcat6 postgresql xvfb graphviz r-base
>cd /usr/local/
>sudo cp -R ~/Downloads/jre1.7.0_21 ./
>sudo ln -s /usr/local/jre1.7.0_21/ /usr/local/java
#configure the database user
>sudo -s -u postgres
>createuser -P
>sudo service tomcat6 stop
#help labkey user get access to the configuration in the tomcat server folder
>sudo chown -R tomcat6:tomcat6 labkey/
#make logs position a little bit easier to find.
>sudo ln -s /var/log/tomcat6/ /usr/share/tomcat6/logs

I’ve updated the LabKey server to the latest version (14.2, as of July 2014). While I thought it would be easy, I ended up upgrading:
– Java (to 1.7, both OpenJDK and Sun; I don’t really know which one is used, since the system reports Oracle but the tomcat7 user apparently has a mind of its own),
– PostgreSQL (to 9.3 from 9.1; I kept the database safe using a script from postgres-contrib, pg_upgrade, and indications from this blog post). The update failed if launched from the home directory. It worked when started from the /var directory, because the script wants to write a log file and needs write permissions.
– Tomcat from 6 to 7 (relatively painless). It works very nicely as a service and can be stopped and started at will with >sudo service tomcat7 start (or stop or restart).
With the update, I also changed the way I remotely back up the data from the LabKey database. Getting JSON files is good, but a pg_dump is much better. The only problem was that I did not know how to talk to the database: two configuration files needed to be changed to allow remote connections (postgresql.conf and pg_hba.conf, as explained here). In addition, another port forwarding rule needs to be added to the VirtualBox configuration.
The new LabKey version is improved compared to the older one. I feel more confident now about upgrading the thing.
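
For the record, a hedged sketch of the kind of changes involved in allowing remote connections. I did not keep the exact lines, so the network range, the database name and the user below are assumptions, not the values of our server:

```conf
# postgresql.conf: listen on network interfaces other than localhost
# (PostgreSQL has to be restarted after changing this setting)
listen_addresses = '*'

# pg_hba.conf: allow password-authenticated connections from a trusted network
# TYPE  DATABASE  USER    ADDRESS         METHOD
host    labkey    labkey  192.168.1.0/24  md5
```

With a VirtualBox rule forwarding a host port to port 5432 of the virtual machine, a remote backup could then look like pg_dump -h servername -p 5432 -U labkey -Fc -f labkey.dump labkey, where every name is a placeholder.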

ApE, a plasmid editor, installation on Linux (Debian)

UPDATE 2017-06: The easiest way to have a running ApE under Linux, which became critical for me after switching permanently to Linux at work, is to recover a Mac App bundle from the author’s web site: http://biologylabs.utah.edu/jorgensen/wayned/ape/Download/Mac/Ape_OSX_current.zip

After unzipping to a local folder, just launch the application (in my case):

wish /home/username/thefolder/ApeMac/Ape_8_6_0.app/Contents/Resources/Scripts/AppMain.tcl

Tcl/Tk needs to be installed to have the wish command available.

You can even alter the preferences of your file manager to open .ape files by using the same command followed by “%F” (on Ubuntu 17.04, MATE, Caja file manager).

 

OLDER CONTENT – safe to ignore follows:

ApE is the most useful DNA editor I know of. Although installation on Windows and Mac OS X is easy through pre-packaged binaries, Linux installation may be a little bit more complex. Don’t be discouraged: nice people, with help from the ApE author himself, discovered all there is to know about how to do it properly. Some useful information comes from the ApE wiki:

Information from http://pastebin.com/fJNjcW1G about how to install Ape on Linux (you might get some error results from wget, standard browser pointing to the address should work better):

# download latest windows version/package
 #
 >wget http://biologylabs.utah.edu/jorgensen/wayned/ape/Download/Windows/ApE_win_current.zip
 >unzip ApE_win_current.zip
 
 # download & setup tclkit (http://equi4.com/tclkit/index.html)
 #
 >wget http://www.equi4.com/pub/tk/8.5.1/tclkit-linux-x86.gz
 >gunzip tclkit-linux-x86.gz 

 #make the binary executable:
 >chmod +x tclkit-linux-x86
 
 # download SDX (Starkit Developer eXtension)
 # http://equi4.com/starkit/sdx.html
 wget http://equi4.com/pub/sk/sdx.kit
 
 # unwrap & run ApE
 #
 ./tclkit-linux-x86 sdx.kit unwrap ApE.exe
 ./tclkit-linux-x86 ApE.vfs/main.tcl

A recent version of the Tclkit can be recovered from:
https://code.google.com/p/tclkit/downloads/detail?name=tclkit-8.5.9-linux-ix86.gz

Two i386 libraries were required for ApE to work on a Debian install (wheezy, 7.7, 64 bit version):

>su - root
>apt-get install libxss1:i386
>apt-get install libxft2:i386

The old Ape version for Linux works with base Tcl/Tk, but lacks some of the nice features of the ApE 2 series.
Newer versions require some extensions from the tclkit.

Don’t forget to chmod +x the tclkit binary

Proof that it works:

ApE screen shot

🙂    🙂

IMPORTANT

EDIT: The 2.0.7 version is in fact available and works directly with wish. Once unzipped, just ‘cd’ to the ‘ApE Linux’ directory and from there > wish AppMain.tcl  . This even works under Mac OS X Mavericks with Tcl/Tk 8.5.9!

Inkscape and system fonts on MacOS X

For the last year, one of the biggest troubles with using Inkscape on MacOS X was that some of the system fonts, especially Helvetica, were not available in the drawings. I’m just pasting the information found on the InkscapeForum with the solution:

Re: Font trouble in 0.48 on Mac

by Ufdah » Sat Oct 13, 2012 11:13 am

I registered on the forum just so that I could answer your question because I was having the same issue and I know how frustrating it is…
Right click on the Inkscape.app and “Show Package Contents”
From there go to: ‘Inkscape.app/Contents/Resources/etc/fonts/’
Using TextEdit.app (or another nice text editor), open ‘fonts.conf’ and edit:
<!--<dir>/System/Library/Fonts</dir>-->
Remove the comments so that it looks like this:
<dir>/System/Library/Fonts</dir>
This allows Inkscape to use all the installed system fonts in the FontBook app…
I would add that it is a good idea to keep a copy of the fonts.conf file, just in case.
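
The backup and the edit can also be done from a terminal. Below is a minimal sketch, not part of the forum answer: enable_system_fonts is a name I made up, and the demonstration runs on a scratch file (fonts_demo.conf) rather than on the real fonts.conf.

```shell
# enable_system_fonts FILE: keep a copy as FILE.bak, then remove the comment markers
# around the system font directory line.
enable_system_fonts() {
    conf_file=$1
    cp "$conf_file" "$conf_file.bak" || return 1
    sed 's|<!--\(<dir>/System/Library/Fonts</dir>\)-->|\1|' "$conf_file.bak" > "$conf_file"
}

# Demonstration on a scratch file that mimics the relevant line.
printf '<fontconfig>\n<!--<dir>/System/Library/Fonts</dir>-->\n</fontconfig>\n' > fonts_demo.conf
enable_system_fonts fonts_demo.conf
cat fonts_demo.conf
```

On a real installation the argument would be the fonts.conf inside Inkscape.app/Contents/Resources/etc/fonts/; where the application bundle itself lives depends on your setup.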

Benefits of Android rooting

lockscreen
Customized lock-screen on Xperia V

I first heard about “rooting” an Android phone about 4 years ago but did not quite understand the benefits of spending hours with the complex procedure that was involved. For those of you unfamiliar with the term, “rooting” means gaining the ability to change the system of an Android phone or tablet in any way, including the ability to remove system files and to find the device dead on the next boot. Rooting is equivalent to becoming the “admin” of your phone. Once rooted, there is no special user name or password. Additional apps, like SuperSU, are used to block unauthorized access to the system.

My own motivation to “root” a phone was mostly anchored in the belief that I could throw away a large number of applications that belong to the “system” and could not be uninstalled. I can confidently say now that I found two major benefits to rooting, apart from the risky cleaning up of manufacturer-installed apps: extreme customization and extended battery life.

Just a few words about the boot loader and root, since the terms are frequently associated. An unlocked boot loader allows the installation of a system that is different from the original Android, a bit like a different Linux distribution. Such a system comes in the form of a custom ROM, and CyanogenMod is one of the best known custom ROMs. On Sony Xperia phones, a relatively straightforward procedure allows unlocking the boot loader. However, rooting does not require an unlocked boot loader, and an unlocked boot loader does not automatically give root access to the file system of the phone or tablet.

How does root open endless customization options ?

Xposed modules

The answer lies in a fantastic piece of software that warps and bends Android to the user’s (mostly programmer’s) will. Having “root” access allows the installation of the Xposed framework, which orchestrates its modules to do “things” to the Android user interface (and more). The various modules can be installed from within the “Downloads” section of the Xposed Installer app.

GEM Xperia Xposed – allows many improvements to be added to the stock Android launcher on, you guessed it, Xperia phones. Another Xperia-specific module is Xperia Flip Settings, which allows the use of the original Android quick settings display instead of the customized one from Sony.

Xposed Preference Injector adds the different modules to the “Parameters” of Android, a very convenient integration of Xposed into the system.

Unicon (read un-icon) – allows customization of the icons used for the interface, without the need to install a custom launcher.

The most impressive module I’ve tried and happily use is GravityBox (JB) – the JellyBean version. It allows an impressive and ever growing number of customizations to the interface, from the appearance of the quick settings:

quick_access

to the inclusion of a menu button in the navigation bar (a very convenient way to access menus in the absence of hardware buttons). GravityBox does plenty of other things, like, for example, changing the colors and positioning of status bar icons, making an L-type transparent navigation bar, etc.:

xperia_launcher

All in all, for people like me, squeezing the pixels to our liking for endless hours, the Xposed framework is a great and ever improving app.

How does rooting save battery life ? Llama and Greenify !

Having custom options for the interface is great, but saving the battery from unnecessary draining is much more important. I tried to use the “Stamina” mode of the phone. It stops some services and data connections while the screen is off. However, some services continue to wake the CPU and eat the battery alive… Enter Llama and Greenify. Greenify, in its root version, allows switching off the background applications that one chooses. Greenified apps lose the ability to continuously poke the processor or network services. Altogether, when the screen is off, the phone sleeps without draining much current.

Another important element in the battery saving battle is Llama (for location aware management of Android, I think). Llama can learn the radio antennas that are closest to my home, and allows some events to be programmed either when I come home or when I go out. My Llama, for example, turns off the PIN security lock screen when at home and turns it back on when on the street (just in case the telephone is stolen or lost). Most importantly, Llama can switch off any active WiFi or data connection when the screen is locked. Thus, such connections are only active when I need them. From 14 hours of battery life for the notoriously bad (for this) Xperia V, Greenify and Llama got me to this type of situation (about 2 days, with light usage):

battery_usage

Rooting is not without risks but, with the Xposed framework, great tweaks become possible on modern Android systems (Jelly Bean and up). For the Xperia V, the tweaked stock launcher is a very capable one and I don’t feel any need to change for another launcher.