Tools
grm2012
Glunčić M, Paar V, Jelovina D
What is grm2012.exe:
grm2012.exe is a command-line application for detecting (tandem) repeats in a given genomic sequence.
What the input file looks like:
The current version supports .fasta, .fa and .txt files. The program runs from the command line, and the input file name is passed as a command-line argument.
If the sequence is passed in a .txt file, it must be on the first line, with no other characters (including whitespace) before or after the sequence.
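The .txt requirement above can be met by flattening a FASTA record into a single line. The following is a minimal sketch, not part of grm2012; the function names fasta_to_single_line and first_line_is_clean are hypothetical helpers introduced here for illustration.

```python
def fasta_to_single_line(fasta_text: str) -> str:
    """Drop FASTA header lines and join the sequence lines without whitespace.

    Note: if the text holds several records, they are simply concatenated.
    """
    parts = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # header line, not part of the sequence
        parts.append("".join(line.split()))  # remove spaces and tabs
    return "".join(parts)


def first_line_is_clean(txt_content: str) -> bool:
    """Return True if the first line is non-empty and contains no whitespace."""
    first_line = txt_content.split("\n", 1)[0]
    return bool(first_line) and not any(ch.isspace() for ch in first_line)


if __name__ == "__main__":
    sequence = fasta_to_single_line(">seq1\nACGT\nTT GA\n")
    print(sequence)                       # ACGTTTGA
    print(first_line_is_clean(sequence))  # True
```

The check is deliberately strict: a leading space, a trailing space, or a stray carriage return on the first line all count as failures.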
How to start:
The program may need administrator privileges to write its output files.
Download and extract the files into a folder (e.g. C:\New Folder\).
Open a command prompt (Start -> Run, type cmd.exe, press Enter).
In the command prompt, change the current folder to the one where the files were extracted (e.g. cd "C:\New Folder").
Start the calculation by passing the full path of the input file as an argument to the program (e.g. grm2012.exe "C:\New Folder\file.fasta").
Advanced: additional arguments can be passed to the program to change its parameters (e.g. grm2012.exe -cntrl1 -cntrl2 "C:\New Folder\file.fasta").
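For scripted runs, the same invocation can be assembled programmatically. This is only a sketch: build_command and run_grm2012 are hypothetical helper names, and the argument order (options first, input file last) is assumed from the example command above.

```python
import subprocess


def build_command(exe, input_path, options=()):
    """Return the argument list: executable, then options, then the input file.

    The ordering is an assumption taken from the documented example.
    """
    return [exe, *options, input_path]


def run_grm2012(exe, input_path, options=()):
    """Run the program and raise CalledProcessError on a non-zero exit code."""
    # Passing a list (not a single string) lets subprocess handle paths
    # that contain spaces, such as "C:\New Folder\file.fasta".
    return subprocess.run(build_command(exe, input_path, options), check=True)


if __name__ == "__main__":
    print(build_command("grm2012.exe", r"C:\New Folder\file.fasta", ["-rtf"]))
```

Only build_command is exercised in the demo; run_grm2012 needs the executable to be present in the current folder or on the PATH.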
Available parameters:
parameter      | description
---------------------------------------------------------------------------------------
-rtf           | Output a colourized .rtf data file with reduced data (full colour output can be very large).
-nortf         | Do not output the colourized .rtf file with reduced data.
-txt           | Output a non-colourized .txt data file (full data output).
-notxt         | Do not output the non-colourized .txt data file (full data output).
-kslen N       | Set the key string length to N. N can range from 1 to 16. Default value is 8.
-partlen L     | Set the data breaking length to L. Default value is 1 Mbp.
-GRMfilter G   | Set the GRM data peak filter to G. Default value is 600.
-help          | Display this help.
-grm           | Write out a GRM file.
-grm-M         | Write out a GRM file with output data length M.
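When options are generated by a script, it can help to check values against the documented limits before launching the program. The sketch below covers only the output switches and -kslen; output_options is a hypothetical helper, and passing -kslen and its value as two separate tokens is an assumption based on the "-kslen N" notation in the table.

```python
def output_options(rtf=False, txt=True, kslen=8):
    """Build an option list from the documented switches.

    Defaults mirror the documented behaviour (.txt output only, key string
    length 8). Raises ValueError if kslen is outside the range 1 to 16.
    """
    if not 1 <= kslen <= 16:
        raise ValueError(f"kslen must be between 1 and 16, got {kslen}")
    return [
        "-rtf" if rtf else "-nortf",
        "-txt" if txt else "-notxt",
        "-kslen", str(kslen),  # assumed to be two separate tokens
    ]


if __name__ == "__main__":
    print(output_options(rtf=True, kslen=12))
```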
What the output looks like:
Detected tandem repeats are stored in a .txt and/or .rtf file (by default .txt only).
The files are stored in the same folder as the input file and are named:
OriginalFileName_Tandems.rtf
OriginalFileName_Tandems.txt
The output file is organized in three columns:
the first column contains the start positions of the copies in a tandem repeat,
the second column contains the length of each copy,
the third column contains the copy sequence (the .rtf file shows only the start and end of each copy).
Each tandem is separated from the next by a horizontal line.
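The three-column layout described above can be read back into a script along the following lines. This is a sketch under explicit assumptions: the exact file syntax is not specified here, so the code assumes whitespace-separated columns and a horizontal line made only of "-", "_" or "=" characters. The function name parse_tandems and the sample text are hypothetical, hand-made illustrations, not real program output.

```python
def parse_tandems(text):
    """Group (start, length, sequence) rows into tandems.

    Assumptions: columns are separated by whitespace, and tandems are
    separated by a line consisting only of '-', '_' or '=' characters.
    Lines that do not fit either pattern are skipped.
    """
    tandems, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if set(stripped) <= set("-_="):  # horizontal separator line
            if current:
                tandems.append(current)
                current = []
            continue
        fields = stripped.split(None, 2)
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            current.append((int(fields[0]), int(fields[1]), fields[2]))
    if current:
        tandems.append(current)
    return tandems


if __name__ == "__main__":
    # Hand-made sample for illustration only.
    sample = "10 4 ACGT\n14 4 ACGT\n--------\n100 3 TTA\n103 3 TTA\n"
    for tandem in parse_tandems(sample):
        print(len(tandem), "copies starting at", tandem[0][0])
```

If the real files use a different delimiter or include header lines, the split and separator checks would need adjusting.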
- Input case file Download input case file
- Output case file Download output case file
- grm2012.exe Download Windows grm2012.exe file
- grm2012.x Download Linux grm2012.x file
GRM-Total module
Glunčić M, Paar V, Basar I, Rosandić M
Computes the frequency vs. fragment length distribution for a given genomic sequence by superposing the results of consecutive KSA segmentations computed for an ensemble of all n-bp key strings (4^n key strings). In the GRM diagram, each pronounced peak corresponds to one or more repeats of that length, tandem or dispersed.
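The superposition described above can be illustrated with a short sketch. It is not the GRM-Total implementation: the function name grm_histogram is hypothetical, overlapping occurrences of a key string are assumed to count, and only key strings that actually occur in the sequence are visited (the others contribute no fragments). A fragment length is taken as the distance between two consecutive occurrences of the same key string.

```python
from collections import Counter


def grm_histogram(seq, n):
    """Return a Counter mapping fragment length -> frequency.

    For every n-bp key string, the distances between consecutive
    occurrences are counted; the counts of all key strings are summed.
    A single pass suffices because each position belongs to exactly one
    key string, whose previous occurrence is remembered in last_seen.
    """
    last_seen = {}
    hist = Counter()
    for i in range(len(seq) - n + 1):
        key = seq[i:i + n]
        if key in last_seen:
            hist[i - last_seen[key]] += 1
        last_seen[key] = i
    return hist


if __name__ == "__main__":
    # Four tandem copies of a 5-bp unit give a single peak at length 5.
    print(grm_histogram("ACGTT" * 4, 3))  # Counter({5: 13})
```

In this toy input every 3-bp key string recurs exactly 5 bp later, so the whole distribution collapses into one peak; real sequences produce a broad background with peaks at repeat lengths.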
- Input case file Download input case file
- Output case file Download output case file
- grm_total.exe Download grm_total.exe file
GRM-Dom module
Basar I, Glunčić M, Paar V, Rosandić M
Determines the dominant key string corresponding to the fragment length of each peak in the GRM diagram. An n-bp key string (or a group of n-bp key strings) that gives the largest frequency for the fragment length under consideration is referred to as the dominant key string.
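As an illustration of this definition, the sketch below keeps the fragment-length counts separate per key string and returns the key string or strings with the highest count for a chosen length. The names key_string_frequencies and dominant_key_strings are hypothetical, and the counting convention (distance between consecutive, possibly overlapping occurrences) is an assumption rather than the GRM-Dom implementation.

```python
from collections import Counter, defaultdict


def key_string_frequencies(seq, n):
    """Map fragment length -> Counter of key strings producing that length."""
    last_seen = {}
    per_length = defaultdict(Counter)
    for i in range(len(seq) - n + 1):
        key = seq[i:i + n]
        if key in last_seen:
            per_length[i - last_seen[key]][key] += 1
        last_seen[key] = i
    return per_length


def dominant_key_strings(seq, n, length):
    """Return the key strings with the largest frequency at `length`.

    A sorted list is returned because several key strings can tie.
    """
    counts = key_string_frequencies(seq, n).get(length)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(key for key, count in counts.items() if count == best)


if __name__ == "__main__":
    print(dominant_key_strings("ACGTT" * 4, 3, 5))  # ['ACG', 'CGT', 'GTT']
```

Returning a list rather than a single string mirrors the "group of n-bp key strings" case in the definition.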
- Input case file Download input case file
- Output case file Download output case file
- grm_dom.exe Download grm_dom.exe file
GRM-Seg module
Basar I, Glunčić M, Paar V, Rosandić M
Performs segmentation of a given genomic sequence into KSA fragments using the dominant key string from the GRM-Dom module. Any periodic segment within the KSA length array reveals the location of a repeat and provides the genomic sequences of the corresponding repeat copies.
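A minimal sketch of this segmentation step follows. It is not the GRM-Seg implementation: ksa_fragments and periodic_runs are hypothetical names, the sequence is cut in front of every occurrence of the key string, any text before the first occurrence is dropped, and "periodic" is simplified to a run of consecutive fragments with identical length (more complex periodic patterns are not detected).

```python
def ksa_fragments(seq, key):
    """Cut seq in front of every occurrence of key.

    Returns a list of (start_position, fragment) pairs. Text before the
    first occurrence of key is not returned.
    """
    starts = []
    pos = seq.find(key)
    while pos != -1:
        starts.append(pos)
        pos = seq.find(key, pos + 1)
    bounds = starts + [len(seq)]
    return [(s, seq[s:e]) for s, e in zip(bounds, bounds[1:])]


def periodic_runs(lengths, min_copies=3):
    """Find runs of equal consecutive values in the length array.

    Returns (first_index, number_of_copies, fragment_length) tuples.
    """
    runs = []
    i = 0
    while i < len(lengths):
        j = i
        while j + 1 < len(lengths) and lengths[j + 1] == lengths[i]:
            j += 1
        if j - i + 1 >= min_copies:
            runs.append((i, j - i + 1, lengths[i]))
        i = j + 1
    return runs


if __name__ == "__main__":
    seq = "ACGAAAAAAA" + "ACGTT" * 3
    frags = ksa_fragments(seq, "ACG")
    lengths = [len(fragment) for _, fragment in frags]
    print(lengths)                 # [10, 5, 5, 5]
    print(periodic_runs(lengths))  # [(1, 3, 5)]
```

The run index points back into the fragment list, so frags[1][0] gives the start position of the repeat (10 in this toy input) and the fragments in the run are the repeat copies.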
- Input case file Download input case file
- Output case file Download output case file
- grm_seg.exe Download grm_seg.exe file
ColorHOR application
Pavin N, Paar V, Rosandić M, Glunčić M, Basar I, Pezer R
Here we develop a graphical user interface method, ColorHOR, for fast identification and analysis of higher order repeats (HORs) in a given genomic sequence, without requiring a priori information on the composition of the sequence. ColorHOR is based on an extension of the key-string algorithm (KSA). The choice of key string is based on the standard consensus alpha satellite. The ColorHOR program first constructs the alpha staircase, identifying alpha-satellite-containing segments in a given sequence as stairs in the staircase, and then constructs colored bands at the position of each stair, providing a direct visual identification of HORs (direct and/or reverse complement). We suggest that the HOR assignment obtained by ColorHOR be included in databases of complete genome sequences.
- setup.exe Download MS Windows Installer Application
- Contact author Request for the Source Code Archive
- How to Install How to Install description document
- How to Use Description document (including sample application)