Grit Documentation
Description
Predicts transcription factor binding sites (TFBS) for
orthologous genes using mixed Student's t-test statistics.
Release
Source code 1.0.2 04/18/2023
Source code 1.0.1 04/18/2023
Document
Source code 1.0.0 08/14/2022
Document
Binary Windows, Mac OS, Linux
GitHub repository:
https://github.com/thua45/grit
Install
Go to the source folder and run:
g++ main.cpp cdflib.cpp grit.cpp -std=c++11 -lpthread -o grit
A binary named grit will be produced in the folder.
Under Windows, use this command instead:
g++ main.cpp cdflib.cpp grit.cpp -std=c++11 -static -lpthread -o grit.exe
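The two compile commands above differ only in the -static flag and the output name, so the choice can be scripted. The following Python sketch is not part of Grit: the names compile_command and build are hypothetical, and it assumes that platform.system() returning "Windows" is the right signal for choosing the Windows variant.

```python
import platform
import shutil
import subprocess

# Source files named in the Install section.
SOURCES = ["main.cpp", "cdflib.cpp", "grit.cpp"]


def compile_command(system=None):
    """Return the g++ argument list from the Install section for an OS name.

    `system` is a value such as "Windows", "Linux" or "Darwin"; when omitted,
    the current platform is used.
    """
    system = system or platform.system()
    if system == "Windows":
        # Windows variant: static linking and an .exe output name.
        return ["g++", *SOURCES, "-std=c++11", "-static", "-lpthread", "-o", "grit.exe"]
    return ["g++", *SOURCES, "-std=c++11", "-lpthread", "-o", "grit"]


def build(source_dir="."):
    """Compile inside source_dir; raises if g++ is missing or the build fails."""
    if shutil.which("g++") is None:
        raise RuntimeError("g++ not found on PATH")
    subprocess.run(compile_command(), cwd=source_dir, check=True)


if __name__ == "__main__":
    # Show the command that would be run on this machine without running it.
    print(" ".join(compile_command()))
```

Calling build() from the source folder runs the selected command; running the script directly only prints it.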
Requirement
Minimum 32 GB RAM. If you install from the source code,
the g++ compiler is also required.
Usage
grit -m motif -i homoseq -b bgseq -x species [-k min_sn] [-t pvalue] [-p pscore] [-c cpus] [-u seed] -o output
Options
-m PWMs for transcription factors
-i putative promoter sequences for orthologous genes
-b background sequences
-x species code: hsapiens for human, mmusculus for mouse, sscrofa for pig, ggallus for chicken, etc.
-k minimal number of ortholog sequences
-t p-value threshold; TFBS with a p-value less than the threshold will be reported, default = 0.05
-p p-score threshold; TFBS with a p-score less than the threshold will be reported, default = 0
-c number of CPUs, for multithreading
-u seed number; a value < 0 gives a random seed, default = -1
-o output file name
Data
motif file: Motiff_Jaspar-2022+HOCO-v11.txt
promoter seq file: promoter_seq_orhtorlog_-500_+50.zip, promoter_seq_orhtorlog_-1k_+100.zip
background seq file: rdm20000-550.txt, rdm20000-1100.txt
Data for Arabidopsis_thaliana.TAIR10
motif file: Jaspar-Plant-2022.txt
promoter seq file: homoseq-v56.txt
background seq file: bgseq-rdm20000.txt
results (Jaspar-2022): Arabidopsis_thaliana.TAIR10_v56_j22.zip
External Links
Grit Online: Search Grit result online
Flaver: Mining transcription factors using weighted rank correlation statistics
Example
An example run looks like this:
grit -m Motiff_Jaspar-2022+HOCO-v11.txt -i promoter_seq_orhtorlog_-1k_+100.txt -b rdm20000-1100.txt -x hsapiens -t 0.05 -p 0 -c 8 -u 12345 -o human-result-v200.txt
This command takes three input files:
Motiff_Jaspar-2022+HOCO-v11.txt, promoter_seq_orhtorlog_-1k_+100.txt, and
rdm20000-1100.txt. When the run finishes, it produces an output file named
human-result-v200.txt.
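Because the example depends on three input files being present, a short pre-flight check can catch a missing file before a run starts. The Python sketch below is not part of Grit: the names EXAMPLE_COMMAND, input_files, missing_inputs and run_example are hypothetical, and it assumes the grit binary is on the PATH and that the input files sit in the working directory.

```python
import os
import subprocess

# The example command from this section, split into an argument list.
EXAMPLE_COMMAND = [
    "grit",
    "-m", "Motiff_Jaspar-2022+HOCO-v11.txt",
    "-i", "promoter_seq_orhtorlog_-1k_+100.txt",
    "-b", "rdm20000-1100.txt",
    "-x", "hsapiens",
    "-t", "0.05", "-p", "0", "-c", "8", "-u", "12345",
    "-o", "human-result-v200.txt",
]

# Flags whose values are input files.
INPUT_FLAGS = ("-m", "-i", "-b")


def input_files(command):
    """Return the file names that follow the input flags -m, -i and -b."""
    return [command[i + 1]
            for i, arg in enumerate(command[:-1])
            if arg in INPUT_FLAGS]


def missing_inputs(command, directory="."):
    """Return the input files of `command` that do not exist in `directory`."""
    return [name for name in input_files(command)
            if not os.path.isfile(os.path.join(directory, name))]


def run_example(directory="."):
    """Run the example in `directory` after checking that all inputs exist."""
    missing = missing_inputs(EXAMPLE_COMMAND, directory)
    if missing:
        raise FileNotFoundError("missing input files: " + ", ".join(missing))
    subprocess.run(EXAMPLE_COMMAND, cwd=directory, check=True)
```

Calling run_example() raises FileNotFoundError with the names of any absent inputs instead of starting grit.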
Results:
Chicken bGalGal1.mat.broiler.GRCg7b: k30-s2-u12345-logrev3-ggallus-1100.zip, k30-s2-u12345-logrev3-ggallus-1100_RS0.3_E-3.bed; k30-s2-u12345-logrev3-ggallus-550.zip, k30-s2-u12345-logrev3-ggallus-550_RS0.3_E-3.bed
Tools
Convert Jaspar Motif to Grit Motif: jaspar2motif.py
References
Tinghua Huang, Hong Xiao, Qi Tian, Zhen He, Min Yao. Identification of upstream transcription factor binding sites in orthologous genes using mixed Student's t-test statistics. PLoS Computational Biology, 2022.
Tinghua Huang, Xinmiao Huang, Binyu Wang, Hao He, Zhiqiang Du, Min Yao, Xuejun Gao. Flaver: mining transcription factors in genome-wide transcriptome profiling data using weighted rank correlation statistics.
Contact
Dr. Tinghua Huang, thua45@126.com
Dr. Min Yao, minyao@yangtzeu.edu.cn
Dr. Jianwu Wang, wjw19802013@163.com