Thursday, November 14, 2013

The FDA, statistics and fundamentalism

Yesterday the FDA panel heard and voted on various matters that may or may not lead to approval of Lemtrada, a multiple sclerosis drug. [The press is now reporting in a way that makes you think approval is likely...]

Lemtrada had various features that made it hard - if not impossible - to design a proper double-blind test of its effectiveness. [Both Lemtrada and the counter-factual drug Rebif have substantial and identifying side-effects. The patient would know which drug they were on.]

Instead Sanofi/Genzyme designed a test in which the patients were not blind as to which drug they were taking, but the people who rated their MS symptoms were. [This is the so-called "rater-blind" test.]

On Friday the FDA staff put out an advisory note on Lemtrada which (a) noted the bizarrely nasty side-effects of the drug, and (b) noted some anomalies in the statistics which Sanofi/Genzyme presented to support the application for approval. This staff note was widely reported (see Bloomberg for an example) and it was thought that the drug might not be approved.

Safety was no surprise. The statistical anomalies were.

Today the FDA Panel heard the case. The best coverage came from Twitter - and I am grateful to Sarah Karlin for her wonderful Twitter account. The panel voted on many things, but I want to highlight just a couple:

Question 1: are the trials adequate and well-controlled?

Vote: 11 no, 6 yes, one abstained.

Question 2: Has the applicant provided substantial evidence of effectiveness of alemtuzumab [Lemtrada] for the treatment of patients with relapsing forms of MS?

Vote: 12 yes, 6 no.

On question #2, after the panel voted, the FDA staff asserted that if you voted no on question #1, you can't vote yes on #2.

The committee chair said they voted yes based on gut feeling, which I am not sure is adequate - but the FDA staff assertion is wrong and, I believe, incompetent. It is perfectly acceptable to say that the trials are not "well-controlled" but that there is substantial evidence of effectiveness. I present to you, yet again, the famous "parachute paper".

In that paper Gordon Smith and Jill Pell observe (with appropriate irony) that there has never been a well-controlled double-blind study of parachutes as a device to "prevent death and major trauma related to gravitational challenge". Indeed, because an adequate and well-controlled test is a requirement for approval, parachutes are not approvable.

This is bunk.

We know why a parachute works.

We understand the physics well.

A single person who jumps out of an airplane and lands unharmed with a parachute is a convincing test that parachutes are effective.

Bluntly, you only need double-blind stats when you don't understand the physics.

The FDA staff have (as parodied in the parachute paper) raised their statistical mystique to a cult - as if it were the only valid form of epistemology. But there are other ways of knowing, and they are more valid (as per the parachute example). There is a deep underlying incompetence in the FDA cult of double-blind statistics and in the assertion highlighted above.

We are living in an age where it is possible to have deep knowledge as to how the body works. We know, for instance, that there is a gene which codes for a protein necessary in the metabolism of certain selective serotonin reuptake inhibitor drugs, and that the presence or absence of that gene has effects. Very small sets of statistics that confirm the effects would be valid epistemology just like a very small set of statistics about people who jump out of planes with parachutes would enable you to conclude that parachutes are a good idea if you want to jump out of planes.

I read the statistics paper that the FDA staff put out on Lemtrada. It was not fun reading. It also convinced me that we should be circumspect about both our knowledge and our methods of determining what we know.

Circumspection about epistemological process, however, is hard for a fund manager. It is even harder for a government department.


PS. I wrote that an interest in epistemology would be used as a criterion in assessing job applicants at Bronte Capital. I was not kidding.

Disclosure: there is a traded contingent value right whose value depends on approval for and sales of Lemtrada [ticker: GCVRZ]. We own a small quantity. We also roughly doubled the position after the FDA staff paper was published.


Tyler said...

The post is still up. Should the world conclude Bronte has not expanded?

RandomFundamentalist said...

Do you have a differential opinion compared to consensus on sales potential for Lemtrada versus other drugs on the market? The milestones laid out in the CVR seem quite the stretch when compared to consensus estimates for peak sales. Do you have high confidence that the drug, if approved, generates $400mm in the first four calendar quarters of launch? I'm sure you do, if you own these, but I'm curious what you see that others don't.

Dan Davies said...

The view that randomised controlled trials are a short way of getting round all the messy problems of development economics is also about to start doing some significant damage in the world ...

John Hempton said...

The reason why Lemtrada probably does OK relative to REBIF is that the dosing is essentially two sets over a lifetime - not continuous.

That matters. They get to charge more. Hurdle should not be difficult.

Martin Barry said...

John, I'm curious how you can wave away the statistical deficiencies with "the science is well understood" when section 4.4.1 of the FDA notes clearly states "The mechanism by which alemtuzumab exerts its therapeutic effects in MS is unknown."?

John Hempton said...

Oh, the statistics problems are real. My objection is to the word "can't".

It is possible that the method sucks and the evidence is compelling. The FDA staff assert otherwise, which speaks to a deep and ugly fundamentalism.

The statistics paper is imbued with that the whole way through.


Two endpoints, one endpoint is a clear issue here... it's just that my back got up with the assertion, which is obviously wrong.


pelicans said...

Data are the only things that matter. I think the FDA position makes sense, as the physiopathology of MS is extremely poorly understood. The parachute paper is fun because it illustrates that RCTs are not the only form of data, but parachute effectiveness is supported by a tremendous amount of observational data both on parachutes themselves and, as you point out, the underlying physics that explain their (simple) action. None of this data exists for MS. For years beta-blockers were contra-indicated for heart failure, because it seemed obvious that for people whose hearts weren't pumping adequately, a negatively inotropic drug would make things worse. Once we had real clinical data, this was clearly seen to be wrong, and beta-blockers are now among first-line therapies. We still don't even know how paracetamol works in any detail, but we know it does work due to RCTs.

I know this isn't the argument you were making, but after really enjoying the parachute paper on first read, I've grown to hate it because it's always used to make sweeping 'See, I told you science is wrong' arguments.

Martin Barry said...

John, I guess the real issue is that good trial design and execution, along with robust statistical analysis, is designed to get us from the "X appears to Y" end of the confidence continuum to the "X is very likely to Y" end, which then allows the use of X in a wider, less controlled context. The FDA staff's job is to attest to where on the spectrum the current application sits. The panel's job is to decide if that is "good enough". In the end they both appear to be doing their job, even if some of the language reeks of CYA posturing.

dearieme said...

"you only need double-blind stats when you don't understand the physics."

It's claimed that Rutherford said that if you need to apply statistics to your experiment you should design a better experiment.

Alex said...

A few points here:

1. The difference between questions 1 and 2 appears to revolve around the definitions of adequate and substantial. Can you provide substantial evidence of effectiveness without adequate and well controlled tests?

2. Our 'deep knowledge' of the body is still far from complete. With any new drug any number of things can go wrong. For example:
- there may be unforeseen interactions which prevent a drug reaching its target.
- the drug may react with other chemicals in the stomach/blood/etc, changing the molecular structure.
- the drug may cause side effects which negate the drug's efficacy.
As a result, we need well designed drug trials to test for effects which have not been foreseen. Therefore, when approving drugs, 'adequate' trials are important.

I agree that we only need double-blind stats when we do not understand the physics well. But, we do not understand the body anywhere near as well as classical physics. We do not know everything about the body, and that is important. We cannot predict what will happen to a new drug once ingested in the same way that we can predict what happens when a skydiver pulls the cord on a parachute.

This is clear from the number of drugs that still fail clinical trials. If it was predictable, then why test a drug that was predicted to fail?

Therefore, until we can predict what will happen to a drug in the body with the same level of certainty as a parachute falling to the ground, don't we need double-blind stats?

3. Coming back to the question, perhaps it should be:

Can the FDA approve a drug where substantial evidence of effectiveness has been provided, but adequate and well controlled trials have not been completed?

For the reasons above, I think the answer is no.

John Hempton said...

All I am pointing out is that to say that you can't know without an appropriate double-blind test is fundamentalism based on ignorance.

If you wish to suggest otherwise I suggest you enrol in a double-blind test of parachute effectiveness.


jmacdon said...

John, you are arguing apples and oranges here, in a completely misleading way. You don't need a double blind test to know if a parachute is effective because the result of dropping to the ground from a great height is deterministic. In other words, we can calculate exactly what will happen, because we can accurately compute cause -> effect.

On the other hand, the results from taking this drug are not deterministic, they are probabilistic. There is no clear cause -> effect, so the question at hand is 'does this drug in aggregate appear to be more effective than the conventional treatment'.

Although you claim that the researchers were blinded, that is not true. From the pdf you link:

"The three trials were unusual among pivotal trials for FDA-approved MS treatments because there was no blinding of either the trial subject or the treating physician for the two primary clinical outcomes. The degree of bias introduced by the lack of blinding depends on the objectivity of the outcome measures and the consistency and thoroughness of the trial procedures for collecting the outcome data."

Since nobody was blinded, and there is a huge wealth of studies that show both efficacy of placebos and biases inherent in unblinded studies, there may be biases in this study as well. There may not be biases too - we just don't know.

Because we don't know if the apparent efficacy of this drug is due to real biological effect, or due to the placebo/rater effect, we cannot say for sure if the drug is truly effective or not.

It then comes down to a logical question. If you believe

1.) That you cannot distinguish the apparent effectiveness of a drug from true biological effect or a placebo/rater effect.

then how can you say

2.) That you think the drug is effective?

This isn't fundamentalism, it is simple logic. And confusing deterministic outcomes with probabilistic outcomes is just silly.

horatius said...

"Very small sets of statistics that confirm the effects would be valid epistemology just like a very small set of statistics about people who jump out of planes with parachutes would enable you to conclude that parachutes are a good idea if you want to jump out of planes."

This is essentially the idea behind Bayesianism. Deep knowledge allows you to establish a strong prior, which enables inference with a limited (if any) set of data.
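The point about priors can be made concrete with a minimal sketch, here using a Beta-Binomial model. The model choice, the function name `beta_posterior_mean` and all the numbers are assumptions for illustration only; none of them come from the post or from the Lemtrada trials. The sketch shows how the same three successes move the estimate very differently depending on how much is believed beforehand.

```python
from fractions import Fraction


def beta_posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior.

    With a Beta prior and binomial data, the posterior is
    Beta(prior_a + successes, prior_b + failures), whose mean is a / (a + b).
    """
    a = prior_a + successes
    b = prior_b + failures
    return Fraction(a, a + b)


# Hypothetical numbers, chosen only for illustration.
# Strong prior ("we understand the physics"): Beta(99, 1).
strong_prior = beta_posterior_mean(99, 1, successes=3, failures=0)
# Flat prior (no prior knowledge): Beta(1, 1).
flat_prior = beta_posterior_mean(1, 1, successes=3, failures=0)

print(float(strong_prior), float(flat_prior))
```

With the strong prior, three observations barely matter because the conclusion was already nearly certain; with the flat prior, the same three observations leave considerable doubt. Whether a strong prior is justified for a given drug is exactly what the rest of this thread is arguing about.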

John Hempton said...

It is not simple logic -

There are convincing things other than statistics. A small sample of people who seem miraculously cured when those are extreme outliers in normal data would suggest that it is effective FOR SOME PEOPLE.

I am utterly unconvinced by the statistical fundamentalism shown here. Indeed I believe it to be a cult.

Also - for reference - the counterfactual was not a placebo. It was Rebif - a fairly effective drug with massive side-effects.

Against placebo it would do very nicely thank you very much.
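The outlier argument in the comment above can be sketched as a simple binomial tail calculation: how likely would a cluster of dramatic recoveries be if the drug did nothing? The background rate, the sample size and the function name `binomial_tail` are hypothetical, picked purely for illustration, and the sketch assumes patients are independent and that outcomes are recorded without bias (the point other commenters dispute).

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials,
    each with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical: dramatic recovery happens in 1% of untreated patients,
# and 4 of 10 treated patients show it.
p_value = binomial_tail(n=10, k=4, p=0.01)
print(p_value)
```

Under these assumed numbers the chance result is far rarer than one in a hundred thousand, which is why a handful of extreme outliers can be persuasive. The calculation says nothing about whether the recoveries were measured fairly.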

Anonymous said...

In my view, there is/should be a continuum on bias and efficacy. There can be slight bias and huge efficacy, or vice versa. Therefore, it is possible for there to be both bias and efficacy. It seems like most of the panel believes there was strong bias, but also believes there is "likely" efficacy for the sickest patients, while realizing that is more of a "feel" than something proven (due to bias).

Anonymous said...

So John, is your thesis on the CVRs based on achieving the sales milestones then? I'm assuming you don't expect the FDA to approve before March 31, even if they eventually do later, so there goes a slightly discounted $1 off the valuation and a slightly greater discount on whatever future $2-12 might be coming eventually, right?

What's your view of peak sales?

jmacdon said...

We'll have to agree to disagree then. But before I go back to more important things, might I make some observations?

First, you are misusing the term counterfactual. A counterfactual is something that didn't happen, that you want to compare to something that did happen.

Second, you are using a straw man argument to make your point. The FDA are 'statistical fundamentalists' who belong to a 'cult' and are therefore not to be trusted. Whether or not there is a cult at work here is irrelevant. The question being confronted is whether or not Lemtrada is safer or more effective than Rebif in the population of patients to whom it may be given. It may well be a miracle drug for some small segment of the population, but if it is devastating for everybody else do we really care?

I assume you do, because you have a bet going that it will be approved and are angry that these nasty cultists are blocking your way.

Since we cannot know a priori if a patient is one of the few for whom this drug is a miracle drug, it doesn't matter that this may be the case. It comes down to risk/reward. If the risk is higher for Lemtrada than it is for Rebif, and the reward is confined to some small subset of MS patients, then allowing doctors to blindly give it to their patients is nonsensical.

Third, fundamentalism implies that people are sticking to beliefs for which there is little or no evidence, rather than allowing observational data to inform their thinking. So on the one hand we have the FDA who know that

A) If you give people something that they think will help, then they tend to get better. This is the well known placebo effect. Placebo in this case can refer to either the conventional treatment or a mock treatment. There are any number of studies that back this assertion.

B) If you are a researcher, and are testing a new treatment that you hope will be an improvement, you will tend to be biased towards thinking it is in fact better. This is simple human nature, and is also backed by a huge amount of studies. You might know this as confirmation bias.

In this case, the evil FDA looks at this clinical study and knows that there may well be two huge sources of bias that might make Lemtrada appear better than Rebif, since the subjects and the observers both know who got what, and may all be biased towards thinking Lemtrada is better.

They are then asked if this study is flawed, and the majority agree that it is. And the flaw that they agree exists is that Lemtrada may appear better than Rebif simply because everybody knew who got what.

They are then asked if they think Lemtrada is effective. The logical answer is that they can't know because there is no way to say if the results are wishful thinking or a real biological result.

It is logically inconsistent to say that the clinical trial was crap, but that you think they proved something. I assume you are familiar with the term garbage in garbage out?

Anonymous said...

I completely agree with John that the insistence on a double-blind study in all cases is a non-scientific orthodoxy. In this particular case, the study compares two drugs, Lemtrada vs. Rebif, with widely different administration schedules. Lemtrada is given for 5 consecutive days initially and 3 consecutive days a year later, while Rebif is injected 3 times a week. Furthermore, and more ominously, both drugs have severe side effects that are well-known by both MS patients and treating doctors, ranging from injection site infections to thyroid problems. It is foolhardy to pretend that these differences can be ignored in a well-designed double-blind study.

Reading the 300-page FDA staff report leaves me with the distinct impression that the staff is upset that Genzyme went with its approach for study design despite the agency's objections. The review is extremely harsh for the given set of results and facts. Nobody is denying that there are severe side effects with this drug. However, arguing that Lemtrada's doubling of response rate over the gold standard MS therapy Rebif could be due to bias and placebo effect is quite ridiculous.

I don't know how the vote on question 1 breaks down with respect to committee members' specialties, but I strongly suspect that the majority of the no votes must have come from non-clinicians- epidemiologists, biostatisticians, etc. These people indeed would never have approved the parachute as a life-saving device without a double-blind study.

One final point. Making a distinction between deterministic physics and probabilistic biology as my comment writers have done, is an attempt to find a distinction that falls flat on its face in my view. In both cases, a robust double-blind study seems nearly impossible and unnecessary.

pelicans said...

"There are convincing things other than statistics. A small sample of people who seem miraculously cured when those are extreme outliers in normal data would suggest that it is effective FOR SOME PEOPLE."

How are these data not statistics? You've just cited both a cure rate in a sample of treated patients, and a cure rate from a 'normal' (presumably untreated) sample - essentially a non-equivalent comparison group. The conclusions you could draw from these data are weaker than you could from a well-designed RCT.

I don't think anyone is claiming that you can't draw any information at all from these data. The FDA are saying that the level of certainty you can obtain is not sufficient to justify drug approval (in the context of a condition with multiple existing therapies). Quoting the FDA paper - "The issues arisen from the two studies are beyond the scope of statistics and cannot be solved by any statistical methods. It is not about the appropriateness of statistical methods or inflation of α, and p-values from analyses, large or small, are irrelevant. The only way to solve the issues raised in this review is to conduct fresh new studies that are adequately designed and well controlled."

gv said...

"We are living in an age where it is possible to have deep knowledge as to how the body works".


Bruschettaboy said...

It is at least a little bit amusing that as any discussion with a RCT fundamentalist drags on, it is more or less inevitable that they will refer to randomised controlled trials as "The Gold Standard" of scientific evidence.

Anonymous said...

What is a "counter-factual drug"?

Anonymous said...

John, thanks for the article and the link to the parachute paper.
The problem with this drug is that it doesn't work straight away. It seems to work with a delay, which can probably be scientifically explained.

So in the 2nd year of treatment there is a significant health benefit. I read that this doesn't count. For any MS drug the standard measure seems to be the efficacy measured over the full trial period. In that sense the FDA might be called dogmatic.

Anonymous said...


I think this epistemological example is one of the smartest things I have read on an investing blog in a long time. Technocrats and investors alike deny the importance of judgment. Judgment has become a dirty word because if we just run more controlled experiments, we can have "proof."

The FDA staff assertion is the "death of judgment" and is a deep flaw found in current technocratic processes. It is a deep mis-use and mis-understanding of science. I call it "naive science" . . . the idea that the only things we can believe must be demonstrated by repeated (large n) controlled experiments . . . the idea that the process of inductive science is akin to deductive math . . . Euclidean science.

I see this all the time in investing as well. Analysts think they are going to "prove" that stock X is cheap or rich rather than simply creating the logical structure and describing the various judgments made.

Buffett has described this when he discussed how investing is deeply personal: the final step almost always involves deeply personal judgments that are neither provable nor even explainable at times. I find that this is why investing in large teams is difficult, because the ideas that rise to the top are the ideas that are clean and technical, but these are rarely the best ideas.

milkchaser said...

It is not true that the observers were not blind. In fact, the raters were blind as to who got which drug. They were tasked with rating improvement in MS symptoms (or lack thereof). They were not told whether the patients were getting Lemtrada or Rebif or scotch.

Anonymous said...

I am a CVR holder, so I am certainly biased. But I have a question I would like the FDA side of the argument folks to answer. Your 33 year old daughter has MS. She has failed Rebif. She tests positive for JC virus. Do you wish Lemtrada was an option for your family? I do not mean to be confrontational, but rather to truly learn how this affects your view, if at all.

Anonymous said...

This link provides some great thoughts on the ethics of a placebo trial when an effective treatment is also available. How do you run a double dummy trial here when the vials of the active Rebif clearly state REBIF on them?

Anonymous said...

I'll answer the question about Lemtrada for a 33-year old daughter with MS. I'd never put her on a drug with inadequate trials. If Rebif failed, and she were JC+, I'd go with Tecfidera or Gilenya (both of which are nearly as effective as Lemtrada CLAIMS to be). If those failed, I'd try Aubagio, Avonex, Betaseron or Copaxone. If those failed, I'd risk PML with Tysabri. No way would I take the plethora of risks associated with Lemtrada with no proven benefit whatsoever.

Anonymous said...

This post and some of the comments seem to imagine that the FDA committee naively relies upon RCTs due to blind adherence to methodology. However, the panelists worked through the role of incentives in the presentation of results:

"For example, patients who knew they were on the study drug might under-report relapses because they wanted to stay on the drug and continue getting its benefits; the study's consent form included a standard clause saying that if the patient's disease worsens during the study, they may be told they are being dropped from the study and that alternative treatment will be considered, he noted. Conversely, patients who knew they were on the comparator drug might over-report relapses in hopes of getting out of the study or being switched to alemtuzumab."

Anonymous said...

Given that it has been used for years and years as Campath, not only for CLL but also for induction in transplant recipients, I would MOST certainly prefer my 33 yo daughter receive Lemtrada for 5 days one year, 3 days the following year and be done with it, especially with a REMS program in place. The only way this drug doesn't get approved is dirty politics and that is the truth.

Anonymous said...

John - Any idea when the FDA is supposed to follow up on the Lemtrada discussion? Also curious if there are any free news sources that do a timely and good job of updating this story?

Anonymous said...

Best place to keep updated on the Lemtrada story is Chris Demuth's forum on it.

Fishhawk Investing said...

Although this has nothing to do with the investment thesis of GCVRZ, it stood out to me as highly uninformed to state that the physical processes of drug action (or of any drug's action) are as well understood as a parachute. I strongly doubt that a good scientist who studies pharmacology would ever make such a claim. I submit to you that in fact many drugs are being studied currently because of their curious properties to discover the pathways that scientists assume they must signal through. This is because most drugs were discovered by serendipitous experimentation, and in many cases those with interesting effects were studied later. In fact, many signaling systems in the brain were discovered by studying recreational drugs. Entire signaling systems were discovered relatively recently. There are drugs that have been actively studied by scientists for decades if not centuries. Almost every drug has properties about it that are not fully understood. And even if all of that were not a problem, a lot of the time drugs will be known to act at certain receptors or enzymes or molecules that are themselves not at all or not very well studied. Particularly with high throughput screens these days people pull up target genes or target proteins that are altered by drug treatments and many of the targets will be things that no one studies or that only a few studies have ever examined. Some of the really obscure ones have names that can be alphanumeric strings 25+ characters long.

The idea that drugs can be designed rationally based on understanding of biology is relatively recent. It's far from a sure thing at this point in time, and only really provides a jumping off point. There's still plenty of trial and error. You narrow things down to the district that you want to feel around in the dark inside of. Which is arguably a vast improvement over the previous situation.

(I like the investment idea, though.)

General disclaimer

The content contained in this blog represents the opinions of Mr. Hempton. Mr. Hempton may hold either long or short positions in securities of various companies discussed in the blog based upon Mr. Hempton's recommendations. The commentary in this blog in no way constitutes a solicitation of business or investment advice. In fact, it should not be relied upon in making investment decisions, ever. It is intended solely for the entertainment of the reader, and the author.  In particular this blog is not directed for investment purposes at US Persons.