- Create the world’s largest perturbation-driven gene expression dataset by treating a group of cell lines with small-molecule compounds and with gene knock-down, knock-out, and overexpression. Process and share these transcriptional signatures with the community so scientists can use them to make discoveries in their specific fields.
- Leverage the perturbational dataset to discover new insights into biology. CMap can be used to discover new drugs (and repurpose old ones), understand the transcriptional effect of gene mutations, and gain insight into gene regulatory networks.
My work on CMap is centered around interpreting the perturbational signatures in the context of patient-derived gene expression data. The Cancer Genome Atlas (TCGA) and other projects have made a wealth of DNA and RNA sequencing data publicly available. Transcriptional data from patients with different cancers and genetic alterations just begs to be compared with CMap. I’m looking to answer questions like:
- Do tumor cells with mutations in a given gene have transcriptional signatures similar to those from our knock-down or inhibition experiments on that gene?
- Can comparison with CMap be used to annotate sets of patients with similar transcriptional programs who might not otherwise appear related?
- Can the transcriptional signature from a patient be used to predict which pathways are active and driving tumorigenesis?
- Tumors that become resistant to drugs should activate different transcriptional pathways. Can we pick this up with CMap?
So far, the answer to most of these questions looks like yes! I’ve presented this research at a few different conferences; you can see my poster here.
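To make the idea of comparing a patient signature with perturbational signatures concrete, here is a minimal sketch in Python. It is not CMap's actual connectivity scoring: it swaps in a plain Spearman rank correlation as a stand-in similarity measure, and the gene names, signature values, and function names (`average_ranks`, `spearman`, `rank_perturbations`) are all made up for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) < 2:
        return 0.0
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rank_perturbations(patient_sig, perturbation_sigs):
    """Score each perturbation against the patient signature on shared genes.

    Returns (name, score) pairs sorted from most similar to most opposed.
    """
    scores = {}
    for name, sig in perturbation_sigs.items():
        shared = sorted(g for g in patient_sig if g in sig)
        scores[name] = spearman(
            [patient_sig[g] for g in shared],
            [sig[g] for g in shared],
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical differential-expression values (e.g. z-scores) per gene.
patient = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 0.3, "GENE_D": -0.2}
perturbations = {
    "mimic": {"GENE_A": 1.8, "GENE_B": -2.0, "GENE_C": 0.5, "GENE_D": -0.1},
    "reverser": {"GENE_A": -1.0, "GENE_B": 2.2, "GENE_C": -0.4, "GENE_D": 0.6},
}

ranked = rank_perturbations(patient, perturbations)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

In this toy setup, a perturbation that scores near +1 moves genes in the same direction as the patient signature, while one near -1 moves them in the opposite direction. A real analysis would cover thousands of genes and many more signatures, and would need a proper statistical treatment of the scores.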
If you are interested in the Connectivity Map project, you can learn more and explore the dataset at http://apps.lincscloud.org/