Grit Documentation
Description
Predict transcription factor binding sites for
orthologous genes using mixed Student's t-test statistics.
Release
Source code 1.0.2 04/18/2023
Source code 1.0.1 04/18/2023
Document
Source code 1.0.0 08/14/2022
Document
Binary Windows, Mac OS, Linux
Fork me on GitHub
https://github.com/thua45/grit
Install
Go to the source folder and run the following command:
    g++ main.cpp cdflib.cpp grit.cpp -std=c++11 -lpthread -o grit
A binary named grit will be produced in the folder.
Under Windows, try the following command instead:
    g++ main.cpp cdflib.cpp grit.cpp -std=c++11 -static -lpthread -o grit.exe
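The two build commands differ only in the -static flag and the name of the output binary. A minimal Python sketch of that choice is given here for illustration; the helper name build_command is hypothetical and not part of Grit, and it only assembles the documented g++ arguments without running the compiler.

```python
import sys

# Source files named in the install instructions.
SOURCES = ["main.cpp", "cdflib.cpp", "grit.cpp"]


def build_command(platform=sys.platform):
    """Return the documented g++ argument list for the given platform.

    Hypothetical helper: the platform test is a simplification that treats
    any sys.platform value starting with "win" as Windows.
    """
    cmd = ["g++", *SOURCES, "-std=c++11"]
    if platform.startswith("win"):
        # Windows variant: link statically and produce grit.exe.
        cmd += ["-static", "-lpthread", "-o", "grit.exe"]
    else:
        cmd += ["-lpthread", "-o", "grit"]
    return cmd


if __name__ == "__main__":
    # Print the command for the current platform instead of running it.
    print(" ".join(build_command()))
```

The returned list can be passed to subprocess.run(..., check=True) from inside the source folder if the build should be launched from Python.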
Requirement
Minimum 32 GB RAM. If you install from the source code,
the g++ compiler is also required.
Usage
grit -m motif -i homoseq -b bgseq -x species [-k min_sn] [-t pvalue] [-p pscore] [-c cpus] [-u seed] -o output
Options
-m PWMs for transcription factors
-i putative promoter sequences for orthologous genes
-b background sequences
-x species code: hsapiens for human, mmusculus for mouse, sscrofa for pig, ggallus for chicken, and so on
-k the minimal number of orthologous sequences
-t p-value threshold; TFBSs with a p-value less than the threshold will be reported, default = 0.05
-p p-score threshold; TFBSs with a p-score less than the threshold will be reported, default = 0
-c number of CPUs, for multithreading
-u seed number, a value < 0 means a random seed, default = -1
-o output, output file name
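When grit is launched from a script, the option list above can be turned into an argument list programmatically. The following is a minimal Python sketch under stated assumptions: the function grit_args and its keyword names are hypothetical, the range checks are illustrative rather than taken from the Grit source, and the flag letters follow the Options list (-p for the p-score threshold).

```python
def grit_args(motif, homoseq, bgseq, species, output,
              min_sn=None, pvalue=None, pscore=None, cpus=None, seed=None,
              exe="grit"):
    """Assemble a grit argument list from the documented options.

    Hypothetical helper: optional values left as None are omitted, so grit
    falls back to its own defaults for them.
    """
    # Illustrative sanity checks (assumptions, not Grit's own validation).
    if pvalue is not None and not 0 <= pvalue <= 1:
        raise ValueError("p-value threshold must lie in [0, 1]")
    if cpus is not None and cpus < 1:
        raise ValueError("cpus must be at least 1")

    # Required inputs first, in the order shown in the usage line.
    args = [exe, "-m", motif, "-i", homoseq, "-b", bgseq, "-x", species]

    # Optional flags are appended only when a value was supplied.
    optional = [("-k", min_sn), ("-t", pvalue), ("-p", pscore),
                ("-c", cpus), ("-u", seed)]
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]

    return args + ["-o", output]
```

The resulting list can be handed to subprocess.run(args, check=True); building a list rather than a single string avoids shell quoting problems with file names that contain + or other special characters.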
Data
motif file: Motiff_Jaspar-2022+HOCO-v11.txt
promoter seq file: promoter_seq_orhtorlog_-500_+50.zip, promoter_seq_orhtorlog_-1k_+100.zip
background seq file: rdm20000-550.txt, rdm20000-1100.txt
Data for Arabidopsis_thaliana.TAIR10
motif file: Jaspar-Plant-2022.txt
promoter seq file: homoseq-v56.txt
background seq file: bgseq-rdm20000.txt
results (Jaspar-2022): Arabidopsis_thaliana.TAIR10_v56_j22.zip
External Links
Grit Online: Search Grit result online
Flaver: Mining transcription factor using weighted rank correlation statistics
Example
An example run should look like:
    grit -m Motiff_Jaspar-2022+HOCO-v11.txt -i promoter_seq_orhtorlog_-1k_+100.txt -b rdm20000-1100.txt -x hsapiens -t 0.05 -p 0 -c 8 -u 12345 -o human-result-v200.txt
This command takes three input files:
Motiff_Jaspar-2022+HOCO-v11.txt, promoter_seq_orhtorlog_-1k_+100.txt, and
rdm20000-1100.txt. When the run finishes, it produces an output file named
human-result-v200.txt
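The example refers to promoter_seq_orhtorlog_-1k_+100.txt, while the Data section lists that file as a .zip archive. A minimal Python sketch of a pre-run check is shown here. It rests on an assumption that is not confirmed by this page: that each archive contains a member with the same base name and a .txt extension. The helper name ensure_input is hypothetical.

```python
import zipfile
from pathlib import Path


def ensure_input(path):
    """Return the input path, extracting it from a sibling .zip if needed.

    Assumption: a missing "name.txt" may be packed inside "name.zip" in the
    same folder, as a member called "name.txt".
    """
    target = Path(path)
    if target.exists():
        return target

    archive = target.with_suffix(".zip")
    if not archive.exists():
        raise FileNotFoundError(f"neither {target} nor {archive} was found")

    with zipfile.ZipFile(archive) as zf:
        if target.name not in zf.namelist():
            raise FileNotFoundError(f"{target.name} is not a member of {archive}")
        # Extract next to the archive so the path given to grit stays valid.
        zf.extract(target.name, path=target.parent)
    return target
```

Calling ensure_input on each of the three input files before starting grit lets a missing or still-zipped file fail early, before the run is launched.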
Results:
Chicken bGalGal1.mat.broiler.GRCg7b: k30-s2-u12345-logrev3-ggallus-1100.zip, k30-s2-u12345-logrev3-ggallus-1100_RS0.3_E-3.bed; k30-s2-u12345-logrev3-ggallus-550.zip, k30-s2-u12345-logrev3-ggallus-550_RS0.3_E-3.bed
Tools
Convert Jaspar Motif to Grit Motif: jaspar2motif.py
References
Tinghua Huang, Hong Xiao, Qi Tian, Zhen He, Min Yao.
Identification of upstream transcription factor binding sites in orthologous
genes using mixed Student's t-test statistics. PLoS Computational Biology, 2022
Tinghua Huang, Xinmiao Huang, Binyu Wang, Hao He,
Zhiqiang Du, Min Yao, and Xuejun Gao. Flaver: mining transcription factors in
genome-wide transcriptome profiling data using weighted rank correlation
statistics
Contact
Dr. Tinghua Huang, thua45@126.com
Dr. Min Yao, minyao@yangtzeu.edu.cn
Dr. Jianwu Wang, wjw19802013@163.com