A: A motif has a positive activity in a sample of interest if it is predicted to upregulate its target transcripts/genes in this sample. This holds regardless of whether the upregulation is due to the strengthening of an activator or the weakening of a repressor. In other words, if you are sure that the TFs that bind your motif of interest are repressors, then high activity in one sample means that these repressors are "less repressing" than in other samples.
A: We included all the high-quality motifs (represented as weight matrices) from the Jaspar and Transfac databases to make a curated list of motifs. New motifs continue to be found, and old ones updated, as technologies and antibodies improve. If you have a weight matrix of your TF from a high-quality source, please let us know and we can include it in a future release of MARA.
A: Samples are ordered in alphabetical order by filename. If you want to make sure they are listed in a particular order (e.g. ordered in time), give the files appropriate names, e.g. 00_control.CEL, 01_perturbation_1h.CEL, 02_perturbation_4h.CEL, ...
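As a minimal sketch of the naming scheme above, the following Python snippet builds zero-padded prefixes for a list of files given in the order you want them displayed. The function name `ordered_names` and the sample file names are hypothetical and only serve as an illustration; the snippet computes new names and does not rename anything on disk.

```python
def ordered_names(filenames_in_desired_order):
    """Return (old_name, new_name) pairs with zero-padded numeric prefixes.

    Zero-padding matters: without it, "10_x.CEL" sorts before "2_x.CEL".
    """
    count = len(filenames_in_desired_order)
    # Use at least two digits, more if there are over 100 files.
    width = max(2, len(str(count - 1)))
    return [
        (name, f"{index:0{width}d}_{name}")
        for index, name in enumerate(filenames_in_desired_order)
    ]


# Hypothetical sample files, listed in the desired (time) order.
samples = ["control.CEL", "perturbation_1h.CEL", "perturbation_4h.CEL"]
renamed = [new for _, new in ordered_names(samples)]
```

Sorting `renamed` alphabetically gives back the same order, which is the order in which the samples should then be listed.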
A: There is only a limited amount of space for the labels. Give your files shorter names to make sure they will fit.
A: Yes, we can currently do this by hand if you email us and request it. We plan to automate this function on the web page in the future.
A: This should not happen. Please contact us if you see such an error.
A: The MARA research manuscript is in the final stage of preparation. Please check this page for an update before submitting your research paper. If you are in a hurry, please cite our Nature Genetics article: http://www.nature.com/ng/journal/v41/n5/full/ng.375.html
A: It will soon be possible to see a scatter plot of activities vs. expression levels (mRNA levels) of TFs.
A: MARA correlates the expression of a promoter with the activity of a motif in order to predict whether the promoter is a target. The more samples you have, the more reliable the prediction is. A correlation computed from two points is always ±1, so with two samples MARA cannot say anything beyond the fact that there is a binding site in the promoter. With five or more samples the target predictions are quite reliable. These statements do not apply to the activity predictions; for those, two samples are just fine.
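The two-sample limitation can be checked with a small calculation. The sketch below uses a plain Pearson correlation as a stand-in; this is an assumption for illustration, and the exact statistic used by MARA may differ. The activity and expression numbers are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant,
    because the correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Two samples: any two distinct points lie on a line, so |r| is 1.
r_two = pearson([0.2, -0.4], [7.1, 6.8])

# Three samples: the correlation can now take intermediate values.
r_three = pearson([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

With two samples the result carries no information about how well expression follows activity, which is why target predictions need more samples.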
A: Where to put a cutoff is always a matter of choice. Usually the top motifs should be clearly separated from the rest. However, you should probably not take into consideration motifs with z < 2.0.
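A minimal sketch of applying such a cutoff is shown below. The function name `significant_motifs`, the motif names, and the z-values are all hypothetical; only the threshold of 2.0 comes from the answer above.

```python
def significant_motifs(z_by_motif, cutoff=2.0):
    """Keep motifs with z >= cutoff, sorted from highest to lowest z."""
    kept = [(motif, z) for motif, z in z_by_motif.items() if z >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up z-values for illustration only.
example = {"MOTIF_A": 5.3, "MOTIF_B": 1.4, "MOTIF_C": 2.8, "MOTIF_D": 0.6}
top = significant_motifs(example)
```

Here `top` holds MOTIF_A followed by MOTIF_C; the other two fall below the cutoff. Looking at the sorted list also makes it easy to see whether the top motifs are separated from the rest.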
A: NCBI does not enforce a standard data format. Even if the platform is the same for two datasets, the actual files might contain differently processed expression levels and be written in different formats. MARA accepts unprocessed microarray files, as well as reads aligned to the mouse (mm9) or human (hg18) genome. These can be packaged or compressed with tar, zip, gzip, or bzip2.
A: We keep each processed dataset for 7 days. After this time it is deleted to save space. During this time you can download the report from the "download" section. If you want to keep the site visible for your collaborators for longer, please let us know.
A: We support the most popular microarrays. If you need another one, please let us know, although we cannot guarantee that it will be added.
A: For microarray expression data, the waiting time is usually under one hour. However, if you have a lot of samples it might take longer. The time needed to process NGS data varies, and it is always longer than for microarray data. Please be patient.
A: The Affymetrix Mouse Gene 1.0 ST and Human Gene 1.0 ST arrays, or sequencing data.
A: The full list can be downloaded from the download menu (on the left). The files are compressed, and the format consists of tab-delimited fields: promoter, z-value, motif, and target RefSeq transcript list (if any transcripts are associated with the promoter). Each element of the RefSeq list is itself a "|"-separated list with the fields: transcript, gene symbol, GenBank gene ID, and gene name.
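The format described above could be read with a sketch like the following. It assumes an already decompressed line of text. The character that separates the elements of the RefSeq list is not stated here, so it is left as a required parameter (`element_sep`); the names `Transcript`, `parse_transcript`, and `parse_target_line`, as well as the example line, are hypothetical.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    transcript: str
    gene_symbol: str
    gene_id: str
    gene_name: str


def parse_transcript(element):
    """Split one "|"-separated element into its four fields."""
    # maxsplit=3 keeps any further "|" inside the gene name.
    parts = element.split("|", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got: {element!r}")
    return Transcript(*parts)


def parse_target_line(line, element_sep):
    """Parse one tab-delimited line: promoter, z-value, motif, transcripts.

    element_sep is the separator between list elements; it is an
    assumption supplied by the caller, not something documented here.
    """
    fields = line.rstrip("\n").split("\t")
    promoter, z_value, motif = fields[0], float(fields[1]), fields[2]
    # The transcript list is optional (promoters without a RefSeq transcript).
    raw_list = fields[3] if len(fields) > 3 else ""
    transcripts = [parse_transcript(e) for e in raw_list.split(element_sep) if e]
    return promoter, z_value, motif, transcripts


# Made-up line for illustration; ";" as the element separator is a guess.
line = "promoter_1\t3.7\tMOTIF_A\tNM_000001|GENE1|1234|example gene 1\n"
promoter, z_value, motif, transcripts = parse_target_line(line, element_sep=";")
```

Check the actual downloaded file to find the real element separator before relying on this.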
A: Large networks are hard to analyze and hard to plot in a readable way.
A: No. All normalization and the calculation of promoter-level expression are performed during the MARA run in a controlled manner, so spreadsheet tables are not supported.
A: Usually it involves some follow-up experiments. However, if you find that only one TF from the list is expressed and that its expression is correlated with the activity, you can be fairly sure that this is the one.
A: No. Only BED files are supported at the moment.
A: Yes, it should not be a problem. We have successfully tested MARA with 20 GB of uploaded data.
A: A TSC is a set of neighboring, co-expressed Transcription Start Sites (TSSs). For detailed background information please look at: http://genomebiology.com/content/10/7/R79 A promoter can consist of a TSC with an associated transcript, or of a RefSeq transcript start that does not have any TSC. We look for TF binding sites in a symmetric region of 1000 bp around each promoter.
A: No. It is possible that there will be a yeast version in the future. Interested? Please write to us.
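To make the 1000 bp symmetric region concrete, here is a minimal sketch. It assumes the region is centered on a single reference position of the promoter (500 bp on each side), uses 0-based half-open coordinates, and clips at the chromosome ends; all three choices, and the name `search_window`, are assumptions for illustration rather than a description of the MARA implementation.

```python
def search_window(position, length=1000, chrom_length=None):
    """Return (start, end) of a symmetric window around `position`.

    Coordinates are 0-based and half-open: start is included, end is not.
    The window is clipped at 0 and, if given, at chrom_length.
    """
    half = length // 2
    start = max(0, position - half)
    end = position + half
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end


window = search_window(10_000)          # (9500, 10500)
clipped = search_window(200)            # (0, 700), clipped at the start
```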
A: All the true replicates that we have looked at so far look very similar in terms of activity profiles. If this is not the case for your data, please first make sure that the expression levels in your replicates are close.
A: Successful strategies include knocking down the TF of interest, overexpressing it, or doing a ChIP experiment against the TF. Please use your expert knowledge.
A: Please write to us at: swissregulon@gmail.com