GRM tools

Tools

grm2012
GRM-Total module
GRM-Dom module
GRM-Seg module
ColorHOR

grm2012

Glunčić M, Paar V, Jelovina D

What is grm2012.exe:
grm2012.exe is a command-line application for detecting (tandem) repeats in a given genomic sequence.

What the input file looks like:
The current version supports .fasta, .fa and .txt files. The program runs from the command line, and file names are passed as command-line arguments. If the sequence is passed in a .txt file, it must be on the first line, with no characters (including whitespace) before or after it.
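The .txt requirement can be met by flattening a FASTA record into a single line. The following Python sketch is only an illustration: the function names are hypothetical, it keeps only the first record of the FASTA text, and it assumes that a trailing newline would also count as a character after the sequence, so none is written.

```python
def fasta_to_single_line(fasta_text: str) -> str:
    """Return the sequence of the first FASTA record as one unbroken line."""
    pieces = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_header or pieces:
                break  # a second record starts here; stop reading
            seen_header = True
            continue
        pieces.append(line)
    return "".join(pieces)


def write_txt_input(sequence: str, path: str) -> None:
    """Write the sequence alone on the first line, with nothing around it."""
    # Assumption: no trailing newline, to be safe with the
    # "no character before and after sequence" rule.
    with open(path, "w", newline="") as handle:
        handle.write(sequence.strip())
```

For example, `write_txt_input(fasta_to_single_line(text), "file.txt")` would produce a file that can then be passed to the program as a .txt input.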

How to start:
The program may need administrator privileges to be able to write its output.
Download and extract the files into a folder (e.g. C:\New Folder\).
Open a command prompt (Start->Run->cmd.exe, then press Enter).
In the command prompt, change the current folder to the one where the files were extracted (e.g. cd "C:\New Folder").
Start the calculation by passing the input file path as an argument to the program (e.g. grm2012.exe "C:\New Folder\file.fasta").

Advanced: additional arguments can be passed to the program to change its parameters (e.g. grm2012.exe -cntrl1 -cntrl2 "C:\New Folder\file.fasta").
Available parameters:

parameter	description
---------------------------------------------------------------------------------------

-rtf	Output a colourized .rtf data file with reduced data (full colour output can be very large).
-nortf	Don't output the colourized .rtf file with reduced data.
-txt	Output a non-colourized .txt data file (full data output).
-notxt	Don't output the non-colourized .txt data file (full data output).
-kslen N	Set the key string length to N.
	N can be in the range from 1 to 16.
	Default value is 8.
-partlen L	Set the data breaking length to L.
	Default value is 1Mbp.
-GRMfilter G	Set the GRM data peak filter to G.
	Default value is 600.
-help	Displays this help.
-grm	Write out the GRM file.
-grm-M	Write out the GRM file, with output data length M.
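When the program is driven from a script, the argument list can be assembled and checked before it is launched. The Python sketch below is a hypothetical helper, not part of grm2012: it uses only the -rtf/-nortf, -txt/-notxt and -kslen options listed above, and it assumes that the value N is passed as a separate token after -kslen, as the help text suggests.

```python
from typing import List, Optional


def build_grm2012_command(
    input_path: str,
    executable: str = "grm2012.exe",
    kslen: Optional[int] = None,
    rtf: Optional[bool] = None,
    txt: Optional[bool] = None,
) -> List[str]:
    """Build an argument list for grm2012; None leaves the program default."""
    args = [executable]
    if rtf is not None:
        args.append("-rtf" if rtf else "-nortf")
    if txt is not None:
        args.append("-txt" if txt else "-notxt")
    if kslen is not None:
        # The help text gives the valid range as 1 to 16.
        if not 1 <= kslen <= 16:
            raise ValueError("kslen must be in the range 1 to 16")
        # Assumption: the value follows the flag as its own argument.
        args += ["-kslen", str(kslen)]
    # The input file comes last, as in the examples above.
    args.append(input_path)
    return args
```

The resulting list can be handed to `subprocess.run(args, check=True)`; passing a list rather than a single string avoids quoting problems with paths that contain spaces, such as C:\New Folder\file.fasta.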

What the output looks like:
Detected tandem repeats are stored in a .txt and/or .rtf file (by default .txt only).
The files are stored in the same folder as your input file, and are named:
OriginalFileName_Tandems.rtf
OriginalFileName_Tandems.txt
The output file is organized in three columns:
the first column contains the start positions of the copies in a tandem repeat,
the second column contains the length of each copy,
the third column contains the copy sequence (the .rtf file shows only the start and end of each copy).
Tandems are separated from one another by a horizontal line.
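The three-column layout lends itself to simple post-processing. The Python sketch below reads the text of a _Tandems.txt file into groups of copies. The exact file syntax is not specified here, so the sketch rests on two assumptions: columns are separated by whitespace, and the horizontal line is a row made up only of '-', '_' or '=' characters. Lines that do not match the three-column pattern are skipped. The function name is hypothetical.

```python
from typing import List, Tuple

Copy = Tuple[int, int, str]  # (start position, copy length, copy sequence)


def parse_tandems(text: str) -> List[List[Copy]]:
    """Group copy rows into tandems, splitting at horizontal-line rows."""
    tandems: List[List[Copy]] = []
    current: List[Copy] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Assumption: a separator row contains only rule characters.
        if set(line) <= set("-_="):
            if current:
                tandems.append(current)
                current = []
            continue
        # Assumption: whitespace-separated columns, sequence in the third.
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            current.append((int(fields[0]), int(fields[1]), fields[2]))
    if current:
        tandems.append(current)
    return tandems
```

If the real files use a different separator or include header rows, only the two marked assumptions need to change.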

Input case file Download input case file
Output case file Download output case file
grm2012.exe Download windows grm2012.exe file
grm2012.x Download linux grm2012.x file

_________________________________________________________

GRM-Total module

Glunčić M, Paar V, Basar I, Rosandić M

Computes the frequency vs. fragment length distribution for a given genomic sequence by superposing the results of consecutive key string algorithm (KSA) segmentations computed for an ensemble of all n-bp key strings (4^n key strings). In the GRM diagram, each pronounced peak corresponds to one or more repeats at that length, tandem or dispersed.
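The idea of superposing segmentations over all key strings can be illustrated with a short Python sketch. It is not the grm_total.exe implementation: it assumes that a fragment length is the distance between the start positions of two consecutive occurrences of the same key string, that overlapping occurrences are counted, and that peak filtering is left out. The function name is hypothetical.

```python
from collections import Counter


def grm_histogram(sequence: str, n: int) -> Counter:
    """Count fragment lengths over the segmentations of all n-bp key strings."""
    last_seen = {}         # key string -> start of its most recent occurrence
    histogram = Counter()  # fragment length -> frequency
    for i in range(len(sequence) - n + 1):
        key = sequence[i:i + n]
        if key in last_seen:
            # Distance to the previous occurrence of the same key string
            # is one fragment of that key string's segmentation.
            histogram[i - last_seen[key]] += 1
        last_seen[key] = i
    return histogram
```

A single pass that remembers the last position of every key string gives the same counts as running one segmentation per key string and adding the results, without looping over all 4^n key strings. For the toy sequence ACGTACGTACGT with n = 2, all fragments have length 4, so the histogram has a single peak there.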

Input case file Download input case file
Output case file Download output case file
grm_total.exe Download grm_total.exe file

GRM-Dom module

Basar I, Glunčić M, Paar V, Rosandić M

Determines the dominant key string corresponding to the fragment length of each peak in the GRM diagram. An n-bp key string (or a group of n-bp key strings) that gives the largest frequency for the fragment length under consideration is referred to as the dominant key string.
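As a rough illustration of this definition, the Python sketch below counts, for one chosen fragment length, how often each n-bp key string produces a fragment of that length, and returns the key string or group of key strings with the largest count. It is a sketch under the same assumptions as a plain start-to-start distance between consecutive occurrences, not the grm_dom.exe code, and the function name is hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def dominant_key_strings(
    sequence: str, n: int, fragment_length: int
) -> Tuple[List[str], int]:
    """Return the key string(s) with the largest frequency at one length."""
    last_seen = {}      # key string -> start of its most recent occurrence
    counts = Counter()  # key string -> fragments of the requested length
    for i in range(len(sequence) - n + 1):
        key = sequence[i:i + n]
        if key in last_seen and i - last_seen[key] == fragment_length:
            counts[key] += 1
        last_seen[key] = i
    if not counts:
        return [], 0
    best = max(counts.values())
    # Ties are returned together, matching "a group of n-bp key strings".
    return sorted(k for k, c in counts.items() if c == best), best
```

For ACGTACGTACGT with n = 2 and fragment length 4, the key strings AC, CG and GT tie with two fragments each, while TA gives only one.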

Input case file Download input case file
Output case file Download output case file
grm_dom.exe Download grm_dom.exe file

citation...

GRM-Seg module

Basar I, Glunčić M, Paar V, Rosandić M

Performs segmentation of a given genomic sequence into KSA fragments using the dominant key string from the GRM-Dom module. Any periodic segment within the KSA length array reveals the location of a repeat and provides the genomic sequences of the corresponding repeat copies.
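The segmentation step can be sketched in Python as follows. This is a simplified illustration, not the grm_seg.exe implementation: the function names are hypothetical, fragments are measured between the start positions of consecutive key string occurrences, and "periodic segment" is reduced to its simplest case, a run of consecutive fragments that all have the same length.

```python
from typing import List, Tuple


def ksa_segment(sequence: str, key: str) -> Tuple[List[int], List[int]]:
    """Return key string start positions and the KSA length array."""
    starts = []
    pos = sequence.find(key)
    while pos != -1:
        starts.append(pos)
        pos = sequence.find(key, pos + 1)
    lengths = [b - a for a, b in zip(starts, starts[1:])]
    return starts, lengths


def periodic_runs(
    lengths: List[int], period: int, min_copies: int = 2
) -> List[Tuple[int, int]]:
    """Find runs of equal-length fragments as (first index, copy count)."""
    runs = []
    run_start = None
    # The trailing None closes a run that reaches the end of the array.
    for idx, length in enumerate(lengths + [None]):
        if length == period:
            if run_start is None:
                run_start = idx
        else:
            if run_start is not None and idx - run_start >= min_copies:
                runs.append((run_start, idx - run_start))
            run_start = None
    return runs
```

Given a run `(first, count)`, the repeat copies are the slices `sequence[starts[j]:starts[j] + period]` for `j` from `first` to `first + count - 1`, which gives both the location and the sequence of each copy.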

Input case file Download input case file
Output case file Download output case file
grm_seg.exe Download grm_seg.exe file

ColorHOR application

Pavin N, Paar V, Rosandić M, Glunčić M, Basar I, Pezer R

Here we develop a graphical user interface method ColorHOR for fast identification and analysis of higher order repeats (HORs) in a given genomic sequence, without requiring a priori information on composition of genomic sequence. Our graphical method ColorHOR is based on extension of the key-string algorithm (KSA). The choice of key-string is based on the standard consensus alpha satellite. ColorHOR program first constructs the alpha staircase, identifying alpha-satellite containing segments in a given sequence as stairs in alpha staircase, and then it constructs colored bands at positions of each stair, providing a direct visual identification of HORs (direct and/or reverse complement). We suggest that the HOR assignment obtained by ColorHOR be included into databases for complete genome sequence.

setup.exe Download MS Windows Installer Application
Contact author Request for the Source Code Archive
How to Install How to Install description document
How to Use Description document (including sample application)

citation...

Global Repeat Map based programs for genomic repeats

Department of Physics, Faculty of Science, University of Zagreb
