Direct to Consumer Genetic Testing

If you didn’t catch this, it’s true.  23andMe was given FDA approval for direct-to-consumer genetic testing services, the FIRST.TIME.EVER.  Interestingly, the FDA hopes that this ruling will allow consumers to make lifestyle changes related to the 10 diseases 23andMe currently reports on.  

I say interesting because, for many of those diseases, lifestyle choices have not been proven to affect the outcome of the genetic alteration. Seriously, how does one change the outcome of Gaucher’s disease, early-onset primary dystonia, hereditary hemophilia or hemochromatosis through personal lifestyle choices?  And really, wouldn’t you already know from your physician, through signs and symptoms, that you have one of these diseases, often from infancy?  Maybe you decide not to have children, but I sincerely hope you talk with a genetic counselor first to learn the difference between X-linked and autosomal recessive or dominant inheritance.  

Other interesting tidbits… this is major PRECEDENT.  According to the FDA, it intends to offer further exemptions to premarket review…  

“to exempt additional 23andMe GHR tests from the FDA’s premarket review, and GHR tests from other makers may be exempt after submitting their first premarket notification. A proposed exemption of this kind would allow other, similar tests to enter the market as quickly as possible and in the least burdensome way, after a one-time FDA review.”

The agency’s primary concern is “to help ensure that they [Genetic Health Risk tests] provide accurate and reproducible results.”  The FDA will not provide exemptions for “Diagnostic Tests.”  

I appreciate the accurate and reproducible results, but really:

How many people know the difference between a Diagnostic Test and a Genetic Health Risk test?

Much educating needs to be done. Here are the links:

MIT Review

FDA Press Release

And oh… 23andMe sells the data… ALL.THE.DATA!

MIT Review

The Science of Collaboration

“Alone we can do so little; together we can do so much” – Helen Keller.

A new paper published in Nature Communications made the above quote pop into my head. The paper, “Accelerating the search for the missing proteins in the human proteome,” describes a new database, MissingProteinPedia, that will hopefully aid the effort to find all of the “missing proteins” in Homo sapiens.

The goal of MissingProteinPedia is to help speed up the process of classifying proteins as “real” proteins. The Human Proteome Project (HPP) is a major project that classifies human proteins. It uses a ranking system of PE1–PE5, where PE1 proteins are those whose existence has been confirmed through mass spec, solved X-ray structures, antibody verification and/or sequencing via Edman degradation. The PE2–PE4 groups are proteins that have evidence of existence at the transcript level (PE2), are inferred to exist based on homology (PE3), or are just flat-out predicted to exist (PE4); PE5 is reserved for proteins of uncertain or dubious existence.
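To make the tiers concrete, here is a minimal Python sketch of the PE levels as described above. The dictionary and helper function are purely illustrative and not part of any HPP or neXtProt API.

```python
# A toy model of the HPP protein-evidence (PE) tiers described above.
# The descriptions and this helper are illustrative, not an HPP API.
PE_LEVELS = {
    "PE1": "confirmed at the protein level (mass spec, X-ray, antibody, Edman sequencing)",
    "PE2": "evidence at the transcript level only",
    "PE3": "inferred from homology",
    "PE4": "predicted to exist",
    "PE5": "uncertain or dubious",
}

def is_missing(pe_level: str) -> bool:
    """PE2-PE4 proteins are the 'missing proteins' MissingProteinPedia targets."""
    return pe_level in {"PE2", "PE3", "PE4"}

print(is_missing("PE2"))  # True
print(is_missing("PE1"))  # False
```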

Now, although the HPP is great at ensuring protein data are quantitative and high-stringency, those two factors can at times hinder the categorization of proteins as PE1. For example, the authors bring up multiple proteins with REPRODUCIBLE evidence of their impact on humans (such as prestin and interleukin-9) that are nonetheless relegated to PE2–PE4 status. Because there are no data “confirming” the existence of these proteins in line with HPP requirements, they will not be elevated to PE1 status. This is where MissingProteinPedia comes in.

The goal of MissingProteinPedia is two-fold. First, it should be a database where anyone can both deposit and access information. Second, the hope is that this collaborative data can be used as a platform to help researchers generate the data required by the HPP to elevate these proteins to PE1 status.

NOW, are there certain things to be wary of? Of course. The authors of the paper openly admit that there is no check on the quality of the data in the database, and the data can come from a wide variety of sources, including unpublished work. Call-out to REPLICATE and VALIDATE.

Currently, there are just under 1500 proteins in the MissingProteinPedia database. The website itself is easy to use and has some great information. You can narrow your search to a specific gene, or you can also search by chromosome. Clicking on a protein gets you a short description of the protein, as well as all relevant data, including homology, known domains, and references.

Additionally, there are some great characteristics of the database that make it more user-friendly:

  1. The data provided for proteins include BLAST results for both sequence similarity and functional annotation. This combination is unique amongst databases.

  2. MissingProteinPedia pulls in mass spectra from two of the best mass spec databases, PRIDE and GPM.

  3. The database is schema-less, making it more flexible. Without any rigid requirements for formatting or structuring of data, it is much more open and inclusive of different data types.

  4. It incorporates text-mining. This allows researchers to retrieve more information, as the database sifts through text to identify other possibly related and/or relevant information.
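As a rough illustration of what keyword-based text mining does, here is a toy Python sketch; this is NOT MissingProteinPedia’s actual pipeline, and the example abstract and keywords are invented.

```python
import re

def find_mentions(text: str, protein: str, keywords: set) -> list:
    """Return sentences mentioning the protein together with any keyword.
    A crude stand-in for the kind of text mining described above."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    hits = []
    for s in sentences:
        low = s.lower()
        if protein.lower() in low and any(k in low for k in keywords):
            hits.append(s.strip())
    return hits

# Invented mini-abstract for illustration only.
abstract = ("Prestin is a motor protein of cochlear hair cells. "
            "It localizes to the lateral membrane. "
            "Mutations in prestin are linked to hearing loss.")
print(find_mentions(abstract, "prestin", {"mutation", "hearing"}))
```

A real pipeline would work over full-text articles and abstracts at scale, but the idea is the same: surface sentences that tie a protein to other relevant concepts.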

Although the quantity and quality of data vary between proteins, there is plenty of information to give a researcher a head start on characterizing these proteins. And isn’t that what we do as scientists? We constantly build off each other and look to take everything one step further. You never know whom your limited data will help, or what big discovery someone else’s piece of information will spark you to make.

When Science isn’t an Exact Science.

Nature recently published an article that highlights one of the uglier aspects of science that at times tends to plague students, postdocs and P.I.s alike: reproducibility.

The Nature editorial article focused on the work of the Reproducibility Project: Cancer Biology, which is a group dedicated to replicating experiments from over 50 papers published in big name journals like Science and Cell. While we always hope that replication studies go smoothly, that isn’t always the case.

The editorial spent a good chunk of its time discussing the attempt made to reproduce a 2010 paper that reported breakthroughs in tumor penetration of cancer drugs:

Ruoslahti et al., Coadministration of a Tumor-Penetrating Peptide Enhances the Efficacy of Cancer Drugs, Science 21 May 2010 : 1031-1035

Unfortunately, the reproducibility group got different results than the original paper. And when I say different results, I mean that the replication study found no statistical significance where the original study found great significance, for the following end-points:

  1. The permeability or penetrance of doxorubicin was not enhanced when it was co-administered with the iRGD peptide.

  2. Tumor weights showed no statistically significant difference.

  3. No difference was seen in TUNEL staining.
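For readers less familiar with what “no statistically significant difference” looks like in practice, here is a toy sketch using Welch’s t statistic, in pure-stdlib Python. The tumor-weight numbers are purely invented for illustration; they are NOT data from either study.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Purely illustrative tumor weights (grams); invented, not from either study.
control = [1.10, 0.95, 1.20, 1.05, 0.98]
treated = [1.08, 1.01, 1.15, 0.97, 1.04]

t = welch_t(control, treated)
# With |t| well below ~2, a difference of this size would not reach
# significance at the usual 0.05 level for samples this small.
print(abs(t) < 2.0)  # True
```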

So what do we make of a result like this?

Well, we do believe it is important to state what we should NOT do. We shouldn’t entirely disregard the results of the 2010 paper. As stated previously, REPLICATING is not REPRODUCING. In order to properly reproduce evidence-based science, there need to be different methods and multiple observations under diverse conditions. The reproducibility project seemed to use mostly the same conditions, and one would think that these experiments should be reproducible… but they weren’t, IN.THIS.CASE.

However, maybe we should not be focusing solely on the issue of reproducibility and should instead ask whether the effects of the iRGD peptide are similar to the findings of the 2010 paper when it is tested with other chemotherapeutics and/or cancer models.  If the effects seen with the peptide reflect a true biochemical effect, the enhanced permeability and penetration of chemotherapeutics co-administered with the peptide should be seen across the board, regardless of the model.

To this end… there are currently 51 articles in PubMed that can be found with a simple search for “Tumor Penetrating Peptides”. Most of these 51 papers are not from the lab that published the 2010 paper.

NOW: Should we disregard this line of investigation and view it as bunk due to the failure to replicate? Thankfully, no. The 51 papers on PubMed indicate that this field of study is an active and growing body of research.

Unfortunately, in our click-bait society, people will only read the headline and a select few sentences before drawing a conclusion. In fact, Nature spent most of the editorial on this one failure despite mentioning that 10 other labs have already validated the findings of the original 2010 paper. If 10 independent labs are able to reproduce the findings and only 1 lab has failed to do so, that’s science.

And truth be told, isn’t that the purpose of peer-reviewed publications? Put yourself and your scientific ideas out there for the world to comment on, replicate and reproduce? And then the body of evidence, and with it our knowledge, moves forward.  

Science, keeping us all on our toes.

Two news stories hit the press this morning that struck me as interesting, mostly because they are extreme opposites.  

The first: a former Pfizer senior scientist has agreed to retract 6 papers.  Six high-impact primary research papers, each with different first authors on slightly different subject matter.  OUCH!  The cause of the retractions appears to be ‘suspicions of data manipulation’.  In this case, images within each of the six papers were duplicates of one another but were presented as though they had been obtained independently.  

Here are the citations of the papers:  

  1. Nassirpour et al., miR-221 Promotes Tumorigenesis in Human Triple Negative Breast Cancer Cells, 8(4) PLOS ONE, (2013);

  2. Baxi et al., Targeting 3-Phosphoinositide-Dependent Kinase-1 to Inhibit Insulin-Like Growth Factor-I Induced AKT and p70 S6 Kinase Activation in Breast Cancer Cells, 7(10) PLOS ONE (2012);

  3. Mehta et al., A novel class of specific Hsp90 small molecule inhibitors demonstrate in vitro and in vivo anti-tumor activity in human melanoma cells, 300 Cancer Letters 30 (2011);

  4. Mehta et al., Effective Targeting of Triple-Negative Breast Cancer Cells by PF-4942847, a Novel Oral Inhibitor of Hsp 90, 17(6) Clinical Cancer Research 5432 (2011); and

  5. Nassirpour et al., Nek6 Mediates Human Cancer Cell Transformation And Is A Potential Cancer Therapeutic Target, 8(5) Molecular Cancer Research 717 (2010).

I actually took 20 minutes to see if I could spot the duplications, which are within blots, not between blots or between papers.  I could not see them without direction from PubPeer.  This makes me wonder: who are these people that review papers for PubPeer, and how do they do it?  From their website, it appears to be crowdsourced and anonymous.  Why didn’t the reviewers assigned to EACH of these papers catch this prior to publication?  

There are several voices in this argument and I won’t get into it; suffice it to say that Replicating is not Reproducing.  True science, in silico or not, requires reproduction of results prior to stating conclusions as facts: different methods, multiple observations carried out under diverse conditions.  And all under TRANSPARENCY.


Which leads me to the second interesting news article of the morning, in The Guardian:  “Evidence Suggests Woman's Ovaries Can Grow New Eggs.”   WHAT??!!?!?!

This title screams media abuse.  However, ALL the authors quoted are cautious and humbled by the findings. Senior author Evelyn Telfer is quoted as saying,

“There’s so much we don’t know about the ovary,”
“We have to be very cautious about jumping to clinical applications.”

David Albertini is quoted,

“Honestly, I think there are too many other ways to explain the results, [only] one of which is that new eggs were made,”  

And Nick Macklon said that the work

“raises more questions than it answers”.

If there’s one thing I can glean positively from these larger-than-life research stories, it is that they are under the most scrutiny.  I’m sure PubPeer will be on it, looking at all the histology panels with magnifying glasses.   

And if there’s some fact to this dogma-shattering finding, HAIL.

Science, keeping us all on our toes.  

Proteomics is stretching analytical boundaries

This month, two particularly useful and imaginative papers were published in the proteomics world.  The first, from Neil Kelleher’s lab (Molecular & Cellular Proteomics 15: 10.1074/mcp.M114.047480, 45–56, 2016), uses an integrative approach combining top-down and bottom-up proteomics with RNA sequencing to describe a model of breast cancer.  The second, published by Claus Jørgensen’s lab (Tape et al., 2016, Cell 165, 1–11, May 5, 2016), combines global proteomics with multivariate phosphoproteomics to describe the effect of an oncogenic KRAS mutation on cellular signaling inside tumor cells and the tumor’s external stromal cells.   Here’s our take on the first paper; hopefully we’ll get time to write a summary of the second in short order.  

Ioana Ntai et al. set out with the large goal of comparing the ability of top-down (TD) and bottom-up (BU) proteomics approaches to characterize different proteoforms, including post-translational modifications (PTMs), single nucleotide polymorphisms (SNPs) and novel splice junctions (NSJs) informed by transcripts.  They completed three different experiments to this end.  Interestingly, any proteoform sequence that was identical between human and mouse was removed from the summary counts.   

Not surprisingly, BU proteomics outperforms TD in the number of proteins identified.  BU also identified more NSJs and SNPs, as confirmed with transcripts.  However, the authors show a nice example where TD identified SNPs and protein products from heterozygous alleles.  Most interestingly, in the TD approach, closely related proteoforms, such as those differing by a SNP, co-elute into the mass spectrometer and thus can be quantified quite precisely.  This allows relative quantification to confirm heterozygous alleles.  Although the TD approach is currently limited to lower-molecular-weight proteoforms, over 1,000 proteoforms were still quantified with low FDR values between the two PDX model systems compared.  The trend of greater spreads in fold-change estimates, coupled with higher confidence in their differential expression with TD, is exciting.  It begs the question: how much relative quantification information is lost through inference from peptide-level quantification?   

Comforting is the fact that, for proteins detected by both methods, the relative quantifications are largely in agreement on relative abundance.  Disagreements often arise from changes in PTM stoichiometry, so they say.  Speaking of which, the TD approach really shines at detecting multiple isoforms and complex PTM stoichiometries that are completely lost with BU approaches. Finally, the authors document in all seriousness the effect that using less stringent thresholds for identification statistics has on the entire study’s quantification confidence.  We see this effect all the time, but it’s SO very great to see it in action in the academic world, with examples.  
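Since the paper leans on FDR control for its identification statistics, here is a minimal sketch of the standard Benjamini–Hochberg procedure that underlies most FDR thresholds. This is a generic illustration of the textbook algorithm, not the authors’ exact pipeline, and the p-values are invented.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a parallel list of booleans: True where the p-value is
    significant under the Benjamini-Hochberg FDR procedure at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k/m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            k_max = rank
    # Everything at or below that rank is called significant.
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            significant[i] = True
    return significant

# Invented p-values: raising alpha (a less stringent threshold) lets
# weaker identifications through, which then feed into quantification.
pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.60]
print(benjamini_hochberg(pvals, alpha=0.05))  # [True, True, False, False, False, False]
```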

Well done.