SIN3A Super-track Settings
 
Publicly available ChIPseq data and predictions for SIN3A Tracks

ChIPseq of SIN3A  ChIPseq data of SIN3A in Cistrome DB and ENCODE  
Prediction of SIN3A in adrenal gland  Virtual ChIPseq predictions of SIN3A in adrenal gland  
Prediction of SIN3A in B-cell  Virtual ChIPseq predictions of SIN3A in B-cell  
Prediction of SIN3A in CD14 monocyte  Virtual ChIPseq predictions of SIN3A in CD14 monocyte  
Prediction of SIN3A in CD4 T-cell  Virtual ChIPseq predictions of SIN3A in CD4 T-cell  
Prediction of SIN3A in CD8 alpha-beta T-cell  Virtual ChIPseq predictions of SIN3A in CD8 alpha-beta T-cell  
Prediction of SIN3A in fibroblast of skin of abdomen  Virtual ChIPseq predictions of SIN3A in fibroblast of skin of abdomen  
Prediction of SIN3A in fore-limb muscle  Virtual ChIPseq predictions of SIN3A in fore-limb muscle  
Prediction of SIN3A in heart  Virtual ChIPseq predictions of SIN3A in heart  
Prediction of SIN3A in hind-limb muscle  Virtual ChIPseq predictions of SIN3A in hind-limb muscle  
Prediction of SIN3A in kidney  Virtual ChIPseq predictions of SIN3A in kidney  
Prediction of SIN3A in large intestine  Virtual ChIPseq predictions of SIN3A in large intestine  
Prediction of SIN3A in left kidney  Virtual ChIPseq predictions of SIN3A in left kidney  
Prediction of SIN3A in left lung  Virtual ChIPseq predictions of SIN3A in left lung  
Prediction of SIN3A in left renal cortex interstitium  Virtual ChIPseq predictions of SIN3A in left renal cortex interstitium  
Prediction of SIN3A in left renal pelvis  Virtual ChIPseq predictions of SIN3A in left renal pelvis  
Prediction of SIN3A in muscle of arm  Virtual ChIPseq predictions of SIN3A in muscle of arm  
Prediction of SIN3A in muscle of back  Virtual ChIPseq predictions of SIN3A in muscle of back  
Prediction of SIN3A in muscle of leg  Virtual ChIPseq predictions of SIN3A in muscle of leg  
Prediction of SIN3A in muscle of trunk  Virtual ChIPseq predictions of SIN3A in muscle of trunk  
Prediction of SIN3A in ovary  Virtual ChIPseq predictions of SIN3A in ovary  
Prediction of SIN3A in renal cortex interstitium  Virtual ChIPseq predictions of SIN3A in renal cortex interstitium  
Prediction of SIN3A in renal pelvis  Virtual ChIPseq predictions of SIN3A in renal pelvis  
Prediction of SIN3A in right lung  Virtual ChIPseq predictions of SIN3A in right lung  
Prediction of SIN3A in right renal cortex interstitium  Virtual ChIPseq predictions of SIN3A in right renal cortex interstitium  
Prediction of SIN3A in right renal pelvis  Virtual ChIPseq predictions of SIN3A in right renal pelvis  
Prediction of SIN3A in skin fibroblast  Virtual ChIPseq predictions of SIN3A in skin fibroblast  
Prediction of SIN3A in small intestine  Virtual ChIPseq predictions of SIN3A in small intestine  
Prediction of SIN3A in spinal cord  Virtual ChIPseq predictions of SIN3A in spinal cord  
Prediction of SIN3A in spleen  Virtual ChIPseq predictions of SIN3A in spleen  
Prediction of SIN3A in stomach  Virtual ChIPseq predictions of SIN3A in stomach  
Prediction of SIN3A in T-cell  Virtual ChIPseq predictions of SIN3A in T-cell  
Prediction of SIN3A in testis  Virtual ChIPseq predictions of SIN3A in testis  
Prediction of SIN3A in thymus  Virtual ChIPseq predictions of SIN3A in thymus  

Virtual ChIP-seq

Virtual ChIP-seq: Predicting transcription factor binding by learning from the transcriptome

Karimzadeh M, Hoffman MM. 2017. Virtual ChIP-seq: Predicting transcription factor binding by learning from the transcriptome. in prep; doi: https://doi.org/.

The free Virtual ChIP-seq software package efficiently predicts binding of 40 TFs in any cell type with RNA-seq and ATAC-seq (or DNase-seq).

Predicting transcription factor binding

Virtual ChIP-seq uses a multi-layer perceptron to predict binding of individual TFs. The model uses data on chromatin accessibility, genomic conservation, and binding characteristics of TFs from previous experiments in other cell types. It also learns from the association of gene expression and TF binding at different genomic regions. Because it incorporates existing ChIP-seq data, there is no longer a need to represent TF sequence preferences in the form of position weight matrices. For a new cell type with data on chromatin accessibility and gene expression, Virtual ChIP-seq predicts indirect TF binding, as well as binding of TFs without known sequence preference.
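To make the idea concrete, here is a minimal sketch of a multi-layer perceptron forward pass that scores one genomic bin from the kinds of features described above. It is an illustration only, not the Virtual ChIP-seq implementation: the feature encoding, the single hidden layer, and every weight below are assumptions invented for this example, whereas a real model learns its weights from training cell types.

```python
import math


def relu(x):
    """Rectified linear activation for hidden units."""
    return max(0.0, x)


def sigmoid(x):
    """Squash a real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_predict(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a perceptron with one hidden layer.

    Returns the predicted probability that the TF binds the bin.
    """
    hidden = [
        relu(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical feature vector for one genomic bin (all values made up):
# [chromatin accessibility, conservation,
#  fraction of other cell types with binding, expression-association score]
open_bin = [0.9, 0.6, 0.8, 0.4]
closed_bin = [0.0, 0.0, 0.0, 0.0]

# Illustrative weights only; these are NOT trained parameters.
HIDDEN_WEIGHTS = [[1.5, 0.5, 2.0, 1.0], [-1.0, 0.2, 0.5, -0.3]]
HIDDEN_BIASES = [-1.0, 0.1]
OUTPUT_WEIGHTS = [2.0, -1.0]
OUTPUT_BIAS = -2.0

p_open = mlp_predict(open_bin, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
p_closed = mlp_predict(closed_bin, HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
print(f"open bin: {p_open:.3f}, closed bin: {p_closed:.3f}")
```

With these made-up weights, the accessible bin with a history of binding scores higher than the bin with no signal. Note that no position weight matrix appears among the inputs.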

Accuracy of predictions

To build a generalizable classifier that performs well on new cell types with only transcriptome and chromatin accessibility data, we train the multi-layer perceptron on training cell types (A549, GM12878, HCT-116, HepG2, HeLa-S3). We assess the performance of the model in validation cell types (IMR90, K562, MCF-7, NHEK, H1, Ishikawa, BJ, T47D, PANC-1, Jurkat). Below, we report the median and standard deviation (S.D.) among validation cell types of three performance measures: area under the receiver operating characteristic curve (auROC), area under the precision-recall curve (auPR), and Matthews correlation coefficient (MCC).

TF        Median auROC  S.D. auROC  Median auPR  S.D. auPR  Median MCC  S.D. MCC
BACH1     0.977         0.00923     0.429        0.0508     0.384       0.0923
BHLHE40   0.918         0.00224     0.378        0.0325     0.398       0.0196
BRCA1     0.991         0.00388     0.356        0.0322     0.369       0.0223
CEBPB     0.965         0.0254      0.392        0.0735     0.371       0.042
CHD2      0.98          0.0213      0.462        0.0606     0.451       0.047
CREB1     0.98          0.107       0.519        0.164      0.448       0.109
CTCF      0.989         0.0385      0.81         0.101      0.605       0.15
E2F4      0.993         0.00786     0.502        0.0867     0.322       0.161
EGR1      0.974         0.034       0.418        0.186      0.456       0.176
ELF1      0.954         0.0374      0.496        0.0709     0.455       0.0403
ESRRA     0.939         0.0288      0.308        0.047      0.309       0.0185
FOS       0.858         0.00542     0.334        0.0152     0.369       0.02
FOXA1     0.966         0.0279      0.584        0.0133     0.453       0.0903
GABPA     0.978         0.0272      0.434        0.0605     0.414       0.0533
GATA3     0.916         0.0314      0.241        0.0627     0.312       0.0597
GTF2F1    0.991         0.0123      0.29         0.0709     0.341       0.0624
H2AZ      0.932         0.0728      0.304        0.141      0.317       0.129
HCFC1     0.988         0.00668     0.499        0.0419     0.44        0.0583
JUND      0.992         0.00984     0.319        0.18       0.346       0.142
MAFF      0.964         0.00405     0.361        0.0987     0.374       0.102
MAFK      0.983         0.00458     0.523        0.0958     0.478       0.0398
MAX       0.968         0.0269      0.459        0.115      0.416       0.0645
MAZ       0.987         0.00437     0.546        0.0798     0.455       0.063
MXI1      0.991         0.00456     0.426        0.0318     0.43        0.0305
MYC       0.978         0.114       0.312        0.191      0.319       0.154
NRF1      0.997         0.0127      0.72         0.0508     0.359       0.0593
RAD21     0.986         0.0135      0.75         0.0552     0.581       0.0952
REST      0.985         0.0181      0.562        0.126      0.439       0.0759
RFX5      0.971         0.0138      0.32         0.0461     0.305       0.0536
SIN3A     0.977         0.0095      0.413        0.0399     0.394       0.0384
SMC3      0.998         0.00005     0.779        0.0177     0.723       0.0184
SRF       0.971         0.0355      0.363        0.0833     0.398       0.0584
TAF1      0.992         0.0216      0.541        0.0558     0.484       0.0457
TBP       0.982         0.00548     0.365        0.111      0.387       0.0704
TEAD4     0.947         0.0367      0.392        0.0208     0.352       0.0445
USF1      0.917         0.0223      0.411        0.0858     0.401       0.0785
USF2      0.97          0.0128      0.471        0.0371     0.409       0.0893
YY1       0.93          0.0334      0.46         0.049      0.485       0.0665
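
As a rough illustration of how one row of such a summary could be produced, the sketch below computes the Matthews correlation coefficient (MCC) from a confusion matrix for each validation cell type, then takes the median and standard deviation across cell types. The cell-type labels and confusion counts are hypothetical placeholders, not results from Virtual ChIP-seq, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
import math
import statistics


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when it is undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical (tp, tn, fp, fn) counts for one TF in three validation cell types.
hypothetical_confusion = {
    "cell_type_A": (80, 9000, 120, 100),
    "cell_type_B": (60, 9100, 90, 150),
    "cell_type_C": (95, 8900, 160, 70),
}

mcc_by_cell_type = {
    cell_type: matthews_corrcoef(*counts)
    for cell_type, counts in hypothetical_confusion.items()
}

# Summarize across validation cell types, as in the table columns.
median_mcc = statistics.median(mcc_by_cell_type.values())
sd_mcc = statistics.stdev(mcc_by_cell_type.values())  # sample S.D. (assumption)
print(f"Median MCC = {median_mcc:.3f}, S.D. MCC = {sd_mcc:.3f}")
```

MCC stays informative when bound bins are a small minority of the genome, which is one reason it is a useful companion to auROC in a table like this one.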

Virtual ChIP-seq accepts chromatin accessibility data in narrowPeak format and RNA-seq data as a matrix in which rows are human gene symbols and columns are cell types (at least one column, for your cell type of interest). The RNA-seq measure must be normalized for length and library size (RPKM, FPKM, and TPM are accepted, but raw read counts are not). Generating the input tables for your TF of interest takes an average of 6 CPU hours (depending on the TF) and at least 8 GB of RAM. Applying the trained model takes less than 20 minutes for most TFs and datasets.
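
If you only have raw read counts, they need to be normalized before use. The sketch below shows one common way to convert counts to TPM and arrange them as a one-column matrix keyed by gene symbol. The counts, gene lengths, and column name are hypothetical, and the exact file layout Virtual ChIP-seq expects (delimiter, header) is not specified here, so check the software documentation before writing the matrix to disk.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM).

    counts: gene symbol -> raw read count
    lengths_kb: gene symbol -> gene length in kilobases
    """
    # Step 1: normalize each count by gene length (reads per kilobase).
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Step 2: normalize by library size so that values sum to one million.
    return {gene: rate / total_rate * 1_000_000 for gene, rate in rates.items()}


# Hypothetical raw counts and gene lengths for a single cell type.
raw_counts = {"SIN3A": 500, "GAPDH": 3000, "ACTB": 1800}
gene_lengths_kb = {"SIN3A": 5.0, "GAPDH": 1.5, "ACTB": 1.8}

tpm = counts_to_tpm(raw_counts, gene_lengths_kb)

# Rows are gene symbols; the single column is the cell type of interest.
matrix_rows = [("gene", "my_cell_type")] + [
    (gene, round(value, 2)) for gene, value in sorted(tpm.items())
]
for row in matrix_rows:
    print(*row, sep="\t")
```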

Track hub, file access, and software

UCSC Genome Browser

View the Virtual ChIP-seq track hub in the UCSC genome browser.

There are 40 supertracks, one for each transcription factor. Each supertrack contains two bigBed9 files: one showing genomic bins with TF binding in Cistrome DB datasets, and one showing Virtual ChIP-seq predictions in the Roadmap consortium datasets.
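
For readers who want to work with these files outside the browser, a bigBed9 file can be converted to nine-column BED text (for example with the UCSC bigBedToBed utility) and parsed line by line. The sketch below parses one such line using the standard BED9 field order. The example line is invented, and what this hub stores in the name and score fields is an assumption to verify against the actual files.

```python
from typing import NamedTuple


class Bed9Record(NamedTuple):
    """The nine standard BED fields, in order."""
    chrom: str
    start: int        # 0-based start of the genomic bin
    end: int          # end of the bin (exclusive)
    name: str
    score: int
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: str     # display color as "R,G,B"


def parse_bed9_line(line: str) -> Bed9Record:
    """Parse one tab-separated BED9 line into a typed record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 9:
        raise ValueError(f"expected 9 fields, found {len(fields)}")
    return Bed9Record(
        fields[0], int(fields[1]), int(fields[2]), fields[3], int(fields[4]),
        fields[5], int(fields[6]), int(fields[7]), fields[8],
    )


# Invented example line; real files from the hub may use other values.
example_line = "chr1\t1000\t1200\tSIN3A\t850\t.\t1000\t1200\t255,0,0\n"
record = parse_bed9_line(example_line)
bin_width = record.end - record.start
print(record.chrom, record.start, record.end, bin_width)
```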

Direct links

Download Virtual ChIP-seq predictions in the Roadmap datasets directly:

Software and documentation

Read the documentation for Virtual ChIP-seq software, which begins with a quick start.

Support

Please ask questions about Virtual ChIP-seq on our mailing list. To report a bug or request a feature, use the Virtual ChIP-seq issue tracker. We welcome all comments on the package, including how easy the installation and documentation are to use.

Source code

Credits

Virtual ChIP-seq was developed by Mehran Karimzadeh during his PhD in the Michael Hoffman Lab.