I hate the expression “Experimentally Validate…” or “Validate Experimentally…”, particularly when applied to computational biology studies.

There is an assumption in the biological sciences that the only way to be sure of a result suggested by computational analysis is to perform an experiment.  However, if you have used sophisticated computational methods to analyse the results of thousands of experiments published over decades to arrive at a conclusion, it makes little sense to then run a handful of experiments to “validate” your findings!

In biology, the systems being studied are complex, so it is hard to reduce the variables to the single one under investigation.  As a result, it is usually very difficult for a single experiment to “prove” a finding.  The relatively lenient threshold of statistical significance accepted in biology is a consequence of this problem.  Whereas in most physical sciences you need a 5 or 6 sigma result to publish, in biology (and medicine) the threshold is about 1.65 sigma (one-tailed), i.e. p = 0.05.  In other words, there is roughly a 1 in 20 chance of seeing such a result by chance alone even when there is no real effect.
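To illustrate the gap between these two thresholds, here is a minimal sketch (assuming a Python environment with scipy, which the post does not otherwise depend on) converting a sigma level into the corresponding one-tailed tail probability:

```python
from scipy.stats import norm

def sigma_to_p(sigma: float) -> float:
    """One-tailed tail probability for a given sigma (standard-normal) threshold."""
    return norm.sf(sigma)

print(f"1.65 sigma -> p = {sigma_to_p(1.65):.3f}")  # ~0.05, the usual threshold in biology and medicine
print(f"5 sigma    -> p = {sigma_to_p(5):.1e}")     # ~2.9e-07, the 'discovery' threshold in particle physics
```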

Accordingly, experimental biology works by starting with a hypothesis and then gathering evidence for and against it by carrying out a series of experiments that address the question from different angles.  At some point, enough evidence accumulates to support the original hypothesis, or a hypothesis modified in the light of the experimental data gathered in the laboratory and of work published or otherwise communicated from laboratories world-wide.  That may be enough to justify writing the study up for publication, with conclusions based on all the evidence accumulated in favour of the hypothesis.  Although rarely combined into a single statistic, multiple consistent lines of evidence provide confidence that the result is real and not a chance artefact.
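Such evidence is indeed rarely combined into a single statistic, but there are standard ways to do so when the lines of evidence are statistically independent; Fisher’s method is one.  The sketch below uses scipy and invented p-values purely for illustration:

```python
from scipy.stats import combine_pvalues

# Hypothetical p-values from independent lines of evidence bearing on the same hypothesis
# (these numbers are invented for illustration only).
p_values = [0.04, 0.03, 0.08, 0.02]

# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution with 2k degrees
# of freedom under the null, yielding a single combined p-value.
statistic, combined_p = combine_pvalues(p_values, method="fisher")
print(f"Fisher chi-square = {statistic:.2f}, combined p = {combined_p:.1e}")
```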

Of course, many experiments don’t work, give ambiguous results, or give results that contradict what all the other experiments on the system seem to indicate.  Contradictory results suggest further investigation is necessary: “Can I believe these contradictory results?  Do I trust that single experiment more than all the others?”  Further experiments might be designed to test this, or, in the worst case, the investigator might put the contradictory results to one side as untrustworthy and push ahead to publish the positive data.  This is not necessarily wrong if the excluded experimental method is generally known to be unreliable.  It is also rarely possible to explore every angle in a single study; time and money are not limitless, and there is value in publishing results even if they are not completely conclusive.  However, the current publishing model treats only positive results as valuable, so scientists become skilled at putting a positive “spin” on their data and conclusions when writing up.  It also means readers of the scientific literature must “read between the lines” of that positive spin to judge the true confidence in a result.

Referees may spot potential deficiencies in the justification of the results given the experimental data, and so suggest further experiments or analysis.  Authors may have to do these experiments and present results that satisfy the referees in order to get their work published.  In this way, the final published account should be as good a representation of current understanding as is possible with the resources and minds that have been applied to it.

Of course, it may turn out that the one experiment that did not fit the hypothesis was actually revealing something fundamentally wrong with it.  This may not be obvious immediately, and may only emerge years later, after substantial resources have been spent building on the erroneous findings.

Good experimentalists are highly skilled at identifying possible flaws in their own experiments and those of others.  They are superb at suggesting further experiments to help eliminate possible artefacts in a study.  This critique of their own work and that of others is what drives science forward and, while many results will be misleading and contradictory, it ultimately leads to a greater understanding of biology.

“Validation” in experimental terms might mean using multiple technologies that explore different aspects of the biological system: for example, NGS to look at transcript expression, proteomics to probe the protein complement, the response of both to chemical probes known to affect specific processes and pathways, the effect of “knocking down” a specific gene, and so on.

However, “validation” is not really the right word.  What is being done is seeking support from complementary methods.  A better word would be “consistency”: that is, the results of analysing experiments from multiple techniques are consistent with an underlying hypothesis.  The scientific process is to “Seek consistency” towards a clearer understanding.

Computational analysis is no different in this respect.  It is fine to do some new experiments based on the computational study to look for consistency.  Indeed, the goal of an analysis is often to suggest new avenues of experimentation.  However, this is rarely a true validation of the computational analysis.