bonohu blog

parallelized BLAT

Written by bonohu in misc on Tue 01 May 2018.

I needed to map assembled reads to a genomic sequence, so I used BLAT (the BLAST-like Alignment Tool). BLAT maps reads to a genomic sequence very quickly, but it can be slow when there are many reads. I thought it would be nice to have …

SPARQLthon67

Written by bonohu in DBCLS on Fri 27 Apr 2018.

Joined the SPARQLthon held at DBCLS Kashiwa. This was the 67th SPARQLthon.

Discussed the development of two search systems (DBCLS SRA and AOE) and their integration. The specification will be fixed after this year's INSDC meeting at NCBI.

I tried to add content to how to make …

Join files by key

Written by bonohu in shell on Sun 25 Mar 2018.

Two files can be joined on the key in their first column with the UNIX command below.

join -j 1 file1.txt file2.txt

This is a very useful command, but the output is space-delimited by default. To get tab-delimited output, the following option for …
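The excerpt stops before naming the option. As a minimal sketch, assuming tab-delimited inputs that are already sorted on the key, one common approach is to pass a literal tab to the `-t` option of join. The sample file contents and the `tmpdir`, `tab`, and `joined` variable names are hypothetical.

```shell
# Hypothetical sample data: two tab-delimited files sorted on column 1.
tmpdir=$(mktemp -d)
printf 'geneA\t10\ngeneB\t20\n' > "$tmpdir/file1.txt"
printf 'geneA\tkinase\ngeneB\tligase\n' > "$tmpdir/file2.txt"

# Store a literal tab so the command also works in shells without $'\t'.
tab=$(printf '\t')

# -t sets the field separator for both input and output; -j 1 joins on column 1.
joined=$(join -t "$tab" -j 1 "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$joined"
```

Note that `-t` applies to the inputs as well, so this sketch only fits files that are tab-delimited to begin with.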

Retrieve a subset of sequence dataset

Written by bonohu in shell on Sat 24 Mar 2018.

Several commands can extract a set of sequences from a FASTA-formatted file (nucleotide or peptide). In recent years, I have regularly used blastdbcmd from the NCBI BLAST suite. To run this command, the file must be indexed by makeblastdb with the option below …

uniq -c option

Written by bonohu in shell on Fri 23 Mar 2018.

When I want to count how many times each word occurs in a file (hoge.txt), I have used simple Perl code like this (count.pl).

#!/usr/bin/perl
while(<>) {
        my($word) = split;
        $num{$word}++;
}
foreach (sort keys %num) {
        print "$_\t$num …
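As the post title suggests, `uniq -c` can produce the same kind of count without a script. The following is a minimal sketch, assuming that only the first whitespace-delimited word of each line matters (as in the `split` call above). The sample contents of hoge.txt and the `tmpdir` and `counts` variable names are hypothetical.

```shell
# Hypothetical sample input: the first word on each line is what gets counted.
tmpdir=$(mktemp -d)
printf 'apple x\nbanana y\napple z\ncherry\nbanana\napple\n' > "$tmpdir/hoge.txt"

# 1. Keep the first field of each line.
# 2. Sort, because uniq only collapses adjacent duplicates.
# 3. Count each run with uniq -c.
# 4. Reorder the output to word<TAB>count.
counts=$(awk '{print $1}' "$tmpdir/hoge.txt" | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}')
printf '%s\n' "$counts"
```

The sort step is required; without it, repeated words that are not on adjacent lines would be counted separately.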

Align and estimate abundance

Written by bonohu in rnaseq on Wed 07 Mar 2018.

If an assembled transcriptome sequence set and the corresponding RNA-seq reads are available for an organism, we can align the reads and estimate transcript abundance by running the script (align_and_estimate_abundance.pl) in the Trinity software package. When the Trinity package is installed using Homebrew, that script is installed in /usr/local/Cellar …

Paper uploaded to BioRxiv for the first time

Written by bonohu in misc on Sat 24 Feb 2018.

A paper written in collaboration with Dr. Tanimoto at Hiroshima University in Japan was published.

Differentiated Embryo Chondrocyte plays a crucial role in DNA damage response via transcriptional regulation under hypoxic conditions DOI: 10.1371/journal.pone.0192136

This study included my work on the collective intelligence of hypoxic transcriptome from public …

SPARQLthon65

Written by bonohu in DBCLS on Thu 15 Feb 2018.

Joined the SPARQLthon held at DBCLS Mishima, located at the National Institute of Genetics. This was the 65th SPARQLthon.

Discussed the development of a search system for public high-throughput sequencing data in the Sequence Read Archive (SRA).

Increasing the usability of FANTOM5 data

Written by bonohu in misc on Tue 06 Feb 2018.

Last summer, our work on the use of FANTOM5 data was published in Scientific Data.

RefEx, a reference gene expression dataset as a web tool for the functional analysis of genes

This article was included in the FANTOM5 collection. It was a great pleasure to be published together with FANTOM5 data …

Perl in awk mode

Written by bonohu in shell on Mon 05 Feb 2018.

When I want to extract rows based on a numeric value in another column, Perl in awk mode can be useful. I often forget the Perl options for this, so I took a note this time.

perl -anle  'print "$F[0]\t$F[3]" if ($F …