ColorHOR - graphical algorithm for fast scan of alpha satellite
higher-order repeats

Vladimir Paar, Nenad Pavin, Marija Rosandić, Matko Glunčić, Ivan Basar, Robert Pezer and Sonja Durajlija Žinić
Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia
Department of Internal Medicine, University Hospital Rebro, Kišpatićeva 12, Zagreb, Croatia
Rugjer Bošković Institute, Bijenička 54, Zagreb, Croatia

Motivation: Alpha satellite higher-order repeats have previously been studied by restriction enzyme analysis. The scientific literature, however, falls short of providing a direct computational identification and analysis of higher-order repeats in GenBank sequence data. Given the rapid growth of sequence data for centromeric regions, efficient tools for such computational analysis are of increasing interest.

Results: We develop ColorHOR, a method with a graphical user interface for fast identification and analysis of higher-order repeats (HORs) in a given genomic sequence, without requiring a priori information on the composition of that sequence. ColorHOR is based on an extension of the key-string algorithm (KSA), with the key string chosen from the standard consensus alpha satellite. The program first constructs the alpha staircase, in which each alpha-satellite-containing segment of the sequence appears as a stair. It then draws colored bands at the position of each stair, providing direct visual identification of HORs (direct and/or reverse complement). We suggest that the HOR assignment obtained by ColorHOR be included in the GenBank database for complete genome sequences.
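The key-string step can be illustrated with a minimal Python sketch. It is not the ColorHOR implementation: the function names, the 171-bp monomer length, the tolerance and the minimum run length are assumptions made for illustration only. The sketch cuts a sequence at every occurrence of a key string and reports runs of consecutive fragments whose length is close to one alpha-satellite monomer, which is roughly how a "stair" could be located.

```python
def key_string_positions(sequence, key):
    """Return the start index of every occurrence of key (overlaps allowed)."""
    positions = []
    start = sequence.find(key)
    while start != -1:
        positions.append(start)
        start = sequence.find(key, start + 1)
    return positions


def fragment_lengths(positions):
    """Distances between consecutive key-string occurrences."""
    return [b - a for a, b in zip(positions, positions[1:])]


def alpha_like_runs(positions, monomer=171, tolerance=5, min_run=3):
    """Group consecutive fragments of roughly monomer length into runs.

    Returns a list of (run_start, run_end, fragment_count) tuples.
    monomer, tolerance and min_run are illustrative defaults (assumptions).
    """
    runs = []
    run_start, run_end, count = None, None, 0
    for a, b in zip(positions, positions[1:]):
        if abs((b - a) - monomer) <= tolerance:
            if run_start is None:
                run_start = a
            run_end = b
            count += 1
        else:
            # A fragment of a different length closes the current run.
            if run_start is not None and count >= min_run:
                runs.append((run_start, run_end, count))
            run_start, run_end, count = None, None, 0
    if run_start is not None and count >= min_run:
        runs.append((run_start, run_end, count))
    return runs


# Synthetic example: four 171-bp "monomers", each starting with a made-up key.
KEY = "CTTCG"  # hypothetical key string, not taken from the consensus
toy = "T" * 50 + (KEY + "A" * 166) * 4 + KEY + "T" * 40
toy_positions = key_string_positions(toy, KEY)
print(fragment_lengths(toy_positions))
print(alpha_like_runs(toy_positions))
```

In a real sequence the key string would be taken from the consensus alpha satellite, and fragment lengths would deviate from the monomer length wherever the key string is mutated or occurs additionally, so the tolerance and the handling of such fragments would need more care than shown here.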

Availability: The article is to appear in Bioinformatics (Oxford University Press), doi:10.1093/bioinformatics/bti072. The ColorHOR program with its graphical user interface is freely available at www.hazu.hr/KSA. It is developed with the wxPython GUI library, and at present a precompiled binary is available for MS Windows platforms. Full source code is available from the author upon request.

Technical Information:

setup.exe: Download the MS Windows installer application
Contact software author: Request the source code archive or make any other enquiry
How to Install: Installation description document
How to Use: Usage description document (including a sample application)

Supplementary Information

Table S1. 11mer HOR copies (complete or distorted) identified computationally in GenBank sequence BX248407.4 of chromosome 1

Table S2. 11mer HOR copies (complete or distorted) identified computationally in GenBank sequence BX248407.26 of chromosome 1

Table S3. Consensus sequence of the 1866-bp HOR (11mer) in chromosome 1

Table S4. Intervals of overlap between BX284928.3 and BX248407.4 in chromosome 1