Looking for hematopoietic data has been a common theme lately at A2IDEA, driven by the push for immunotherapeutics in cancer. There is a wealth of leukemia and lymphoma datasets, including very large compendia such as TCGA. However, finding good healthy controls for these disease states can be challenging. That is, until we found BloodSpot, described in a 2016 Nucleic Acids Research paper by F. Otzen Bagger, D. Sasivarevic, S.H. Sohi, et al., and backed by a great website. BloodSpot plots microarray gene expression data at different maturation stages and disease states. It is primarily focused on AML, but it includes other datasets as well. Start with the search bar, which accepts a gene name, a gene alias, or even a gene signature name from the MSigDB database, and click away.

What’s really great about this web database is the visualizations. There’s a survival plot for the AML data that displays a Kaplan-Meier analysis, bar or violin plots of gene expression for many datasets, and, what we find really cool, hierarchical tree views that show the expression of any gene as a hematopoietic cell differentiates and matures. Check out this example: the expression of STYK1, a novel oncogene with kinase activity, along the hierarchical differentiation tree of normal human cells. The gene shows high expression in hematopoietic stem cells, decreases in progenitor cells, and then increases again, but solely in the promyelocyte bone marrow compartment. In a different dataset, STYK1 expression is high only in NK cells selected as both CD56 and CD16 negative. So, be sure to select the dataset you are interested in!

For the bioinformatics geeks out there: yes, the data has been batch-corrected with ComBat and consists mostly of Affymetrix HG-U133 Plus 2.0 arrays. You can select different t-test filtering, run gene correlations, and even upload your own HG-U133 Plus 2.0 data. Cool tool for sure!
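
If you are curious what a gene correlation or a t-test filter boils down to, here is a minimal Python sketch. To be clear, this is not BloodSpot's code: the function names, the choice of Pearson correlation and Welch's t statistic, and the expression values are all our own assumptions, made up purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def welch_t(group_a, group_b):
    """Welch's t statistic for one gene measured in two sample groups."""
    na, nb = len(group_a), len(group_b)
    ma, mb = sum(group_a) / na, sum(group_b) / nb
    # Sample variances (n - 1 in the denominator)
    va = sum((v - ma) ** 2 for v in group_a) / (na - 1)
    vb = sum((v - mb) ** 2 for v in group_b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

# Hypothetical log2 expression values for two genes across six samples.
# The first three samples stand in for stem cells, the last three for progenitors.
gene_x = [8.1, 7.9, 8.4, 5.2, 5.0, 5.5]
gene_y = [6.0, 5.8, 6.3, 3.9, 3.6, 4.1]
hsc = gene_x[:3]
progenitor = gene_x[3:]

r = pearson(gene_x, gene_y)      # how closely gene_y tracks gene_x
t = welch_t(hsc, progenitor)     # how strongly gene_x differs between groups
print(f"Pearson r = {r:.3f}, Welch t = {t:.2f}")
```

A large absolute t value means the gene separates the two groups well, and an r close to 1 means the two genes rise and fall together across samples. On real data you would use a statistics library and correct for multiple testing rather than rolling your own.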
