Does Rossignol et al. show HBOT’s effective?

“Hyperbaric treatment for children with autism: a multicenter, randomized, double-blind, controlled trial” by Daniel A. Rossignol, Lanier W. Rossignol, Scott Smith, Cindy Schneider, Sally Logerquist, Anju Usman, Jim Neubrander, Eric M. Madren, Gregg Hintz, Barry Grushkin, and Elizabeth A. Mumper appeared as an online publication on 13 March 2009 and will appear in print in BMC Pediatrics. The article URL is http://www.biomedcentral.com/1471-2431/9/21

The recently published study by Rossignol and colleagues about hyperbaric oxygen therapy (HBOT) for autism has generated lots of commentary and is sure to lead to more. Because it is a treatment study and employs more careful methods than are common in studies of many of the therapies promoted these days, I sat up and said, “Hmm. I ought to read this one.”

So I did. And I found it to be, indeed, a cut above much of the ersatz research that’s passed off as evidence in the autism arena. But I found some concerns, too.

Those concerns led me to poke about a bit on the Internet to see whether there were any others who were raising questions. There are. And I still have some more poking to do. But, I thought I ought to record my concerns. Thus this post.

Others’ concerns

Steven Novella of the New England Skeptical Society (and Evidence-based Medicine) discussed several issues about the study. He questions whether

  • The sample is large enough to merit robust statements about effects. One reason for employing large samples is to ensure that researchers do not overlook small but significant effects; in this case, there were substantial effects on some measures despite the small number of participating children. Although this point mitigates the importance of small size of the sample in this study, there is a pesky issue about the variables on which the researchers found effects.
  • The participating parents may actually have been aware of whether their children were receiving the active treatment; that is, they may not have been “blind” to the treatment. Although this is a problem in itself, it is compounded by the fact that some of the measures of improvement in children (rating scales) were provided by the parents.
  • Any true effect would be sustained, given the brevity of the time period in the study. That is, the study could have discovered true-but-temporary improvements.

Over at Leftbrain Rightbrain, Do’C noted a concern with the nature of the independent variable (i.e., “treatment”). Do’C examined the prevailing barometric pressure at the locations of some of the sites where the therapy was administered and found that at higher-altitude sites it was likely that the children and their parents were subjected to lower levels of pressure than are reported in the study. To some extent, this may simply be a matter of rounding error; more importantly, it may indicate that even lower dose levels may result in therapeutic effects. (Note that I used the word “may” in a neutral, plain way there.)

My concerns

These are important and reasoned considerations when reviewing the Rossignol et al. (2009) study. There are other issues on which other commentators may have touched, but that I’ve missed. Among those other concerns are several that I develop in the following paragraphs.

The participants included 62 children (and their families) drawn from 66 who were evaluated. They were drawn from 6 clinics apparently affiliated with the authors. I’d like to know more about this sample. Because many of the authors of the study are affiliated with clinics offering HBOT, it is likely that the sample is composed mostly of families that sought treatment at these clinics. The extent to which these families are representative of the families of children with autism is unknown. To the extent that they represent a specially selected sample of the population, the participants in this study make it impossible to generalize the results of the study to the full population. It would be helpful if the authors would explain how the participants were selected and provide additional data about the characteristics of the applicants for their services. In addition, it is important that the research team disclose fully their relationships with the participants’ families.

The measures used in the HBOT study are reasonable choices. The Clinical Global Impression (CGI) Scale is a widely used rating system. The Aberrant Behavior Checklist-Community (ABC) is an extension of an instrument originally developed to capture diagnostic indicators in in-patient settings where professional caregivers made the ratings; in this instance, it was rated (with precedent) by parents or primary caregivers as an outcome measure. In contrast, the Autism Treatment Evaluation Checklist (ATEC) was designed to provide an estimate of treatments’ effectiveness; the developers’ Web site (Autism Research Institute) reports overall reliability greater than .90 (which is good) and variable but pretty high reliability for the subscales (.82-.94). I have not had the time to complete a careful review of the psychometric characteristics of these instruments (reliability, validity), but my general impression is that they are basically trustworthy. (If someone would drop a comment based on reviews from strong sources, that would help.) Perhaps the most substantial limitation of all of these instruments is that they are ratings made by people who are familiar with the participating children; they are essentially subjective rather than objective. To be sure, children in both the treatment and the control groups are rated subjectively, which reduces the potential problems with the subjectivity (the ratings are presumably equally subjective), but such ratings do not provide the kind of objective data that produces powerful results.

Rossignol and colleagues employed procedures to mask the presence or absence of the active treatment (e.g., simulated HBOT for the controls; covering control switches). They also tested the conditions with a small number of adults and reported that these people did not report they could distinguish the conditions. These steps increase the trustworthiness of the manipulations.

Measure           Experimental Group Mean    Comparison Group Mean
ABC Post-test     46.4 ± 24.7                45.5 ± 17.3
ATEC Post-test    65.9 ± 16.4                70.1 ± 21.9

Table 1. Means (with SEM) for total or overall score on two of the measures.

Figure 1. [Bar graph of results on total scores.]

I am not a biostatistician, so my reflections on the procedures for analyzing the data are only observations, not a formal critique of the analysis. (Where are the epidemiologists on this? Epi Wonk? Photon?) Because the study was a randomized trial, it seems to me that the foremost analyses should be those comparing the levels of the dependent variables between the two groups. However, for each of the dependent variables, the authors lead their results with changes from pre- to post-test. The differences between the groups on the total scores (i.e., the most reliable scores) on the ABC and the ATEC are not significant. Table 1 shows the means and Figure 1 represents them graphically (with error bars); the data are from a supplemental table in Rossignol et al.
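As a rough illustration of that between-group comparison, here is a minimal Python sketch that works only from the post-test means shown in the table above. It is not the authors' analysis. It first takes the ± values as standard errors, as the table caption says, and then repeats the calculation as if they were standard deviations. The second case needs group sizes, which are not given here, so `ASSUMED_N` (an even split of the 62 children) is purely a hypothetical placeholder, as are all of the function names.

```python
import math

# Post-test totals as (mean, spread) pairs copied from the table in the post:
# (experimental group, comparison group).
POST_TEST = {
    "ABC": ((46.4, 24.7), (45.5, 17.3)),
    "ATEC": ((65.9, 16.4), (70.1, 21.9)),
}

# Hypothetical group size: 62 children split evenly. The post does not
# report the actual group sizes, so this is an assumption for illustration.
ASSUMED_N = 31


def z_from_sems(mean_a, sem_a, mean_b, sem_b):
    """z statistic for a difference in means, given each group's standard error."""
    return (mean_a - mean_b) / math.sqrt(sem_a ** 2 + sem_b ** 2)


def two_sided_p(z):
    """Two-sided p-value under a normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


def compare(measure, spread_is_sd=False):
    """Return (z, p) for one measure; optionally treat the spread as an SD."""
    (m_e, s_e), (m_c, s_c) = POST_TEST[measure]
    if spread_is_sd:
        # Convert standard deviations to standard errors using the assumed n.
        s_e = s_e / math.sqrt(ASSUMED_N)
        s_c = s_c / math.sqrt(ASSUMED_N)
    z = z_from_sems(m_e, s_e, m_c, s_c)
    return z, two_sided_p(z)


for measure in POST_TEST:
    for as_sd in (False, True):
        z, p = compare(measure, spread_is_sd=as_sd)
        label = "as SD (assumed n)" if as_sd else "as SEM (per caption)"
        print(f"{measure:5s} {label:21s} z = {z:+.2f}, p = {p:.2f}")
```

Under either reading of the ± values, the sketch yields p-values well above .05 for both total scores, which is consistent with the statement that the between-group differences on the totals are not significant. A normal approximation is used only to keep the example dependency-free; a t-test on the raw data would be the more appropriate tool.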

It appears that the authors depended on the gain scores to make their case. In addition, they conducted many statistical tests, thereby increasing the chances that they would obtain at least some significant results. Do they have a study-wise control for the number of significance tests they ran?
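To see why the number of tests matters, here is a minimal Python sketch of the study-wise (family-wise) error rate. It assumes the tests are independent and that every null hypothesis is true, which is a simplification. The test counts in the loop are hypothetical, not the number of tests Rossignol et al. actually ran, and the function names are illustrative.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


# Hypothetical numbers of significance tests, for illustration only.
for m in (1, 5, 10, 20):
    fwer = familywise_error_rate(m)
    per_test = bonferroni_alpha(m)
    print(f"{m:2d} tests: family-wise error = {fwer:.2f}, "
          f"Bonferroni per-test alpha = {per_test:.4f}")
```

With 20 independent tests at the .05 level, the chance of at least one spurious "significant" result is roughly 64%. The Bonferroni adjustment is only one of several possible controls, and a conservative one.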

Some thoughts

Here are my tentative conclusions:

  1. Does this study demonstrate that the therapy is effective? No.
  2. Is this study so fatally flawed that it should be ignored? No.
  3. Should parents seek and professionals recommend HBOT based on this study? No.
  4. Does the therapy merit additional studies using rigorous methods? Yes.

Some research suggestions

To establish that HBOT provides benefits to children with autism, it will be important to have studies of it that

  1. Examine representative samples of children with enough participants to permit testing of both alternative hypotheses (i.e., multiple control groups) and potential participant characteristics that might interact with the HBOT treatment (e.g., degree of involvement).
  2. Collect objective measures of behavior of importance. (This is not to say that those in the Rossignol et al. study are not important, just that additional measures would help understand the effects of the therapy.) Among those that should probably be considered would be indicators of learning performance (e.g., trials to criterion on some sensible tasks), objective measures of the qualities of social interaction, frequencies of stereotypies, and so forth.
  3. Incorporate controls that examine critical alternative explanations (e.g., parents are not blind to treatment conditions). Although it was valuable to include what appears to be a placebo therapy in the original study, to assure scientists that the active ingredient is, indeed, the HBOT, it is necessary to have conditions that discount other features of the therapy.
  4. Are conducted by independent researchers, those who will receive no financial gain if the therapy proves effective.

Some sources

References about the utility of the ABC

  • Brown, E., Aman, M., & Havercamp, S. (2002, January). Factor analysis and norms for parent ratings on the Aberrant Behavior Checklist-Community for young people in special education. Research in Developmental Disabilities, 23(1), 45-60. Retrieved March 21, 2009, doi:10.1016/S0891-4222(01)00091-9
  • Marshburn, E., & Aman, M. (1992, September). Factor validity and norms for the Aberrant Behavior Checklist in a community sample of children with mental retardation. Journal of Autism and Developmental Disorders, 22(3), 357-373. Retrieved March 21, 2009, doi:10.1007/BF01048240
  • Aman, M., Singh, N., Stewart, A., & Field, C. (1985, March). Psychometric characteristics of the Aberrant Behavior Checklist. American Journal of Mental Deficiency, 89(5), 492-502. Retrieved March 21, 2009, from PsycINFO database.
  • Aman, M., Singh, N., & Turbott, S. (1987, September). Reliability of the Aberrant Behavior Checklist and the effect of variations in instructions. American Journal of Mental Deficiency, 92(2), 237-240. Retrieved March 21, 2009, from PsycINFO database.
  • Rojahn, J., Aman, M., Matson, J., & Mayville, E. (2003, September). The Aberrant Behavior Checklist and the Behavior Problems Inventory: Convergent and divergent validity. Research in Developmental Disabilities, 24(5), 391-404. Retrieved March 21, 2009, doi:10.1016/S0891-4222(03)00055-6
  • Aman, M., Singh, N., Stewart, A., & Field, C. (1985, March). The aberrant behavior checklist: A behavior rating scale for the assessment of treatment effects. American Journal of Mental Deficiency, 89(5), 485-491. Retrieved March 21, 2009, from PsycINFO database.
  • Aman, M., Richmond, G., Stewart, A., & Bell, J. (1987, May). The Aberrant Behavior Checklist: Factor structure and the effect of subject variables in American and New Zealand facilities. American Journal of Mental Deficiency, 91(6), 570-578. Retrieved March 21, 2009, from PsycINFO database.
  • Aman, M., Burrow, W., & Wolford, P. (1995, November). The Aberrant Behavior Checklist-Community: Factor validity and effect of subject variables for adults in group homes. American Journal on Mental Retardation, 100(3), 283-292. Retrieved March 21, 2009, from PsycINFO database.
  • Karabekiroglu, K., & Aman, M. (2009, March). Validity of the Aberrant Behavior Checklist in a clinical sample of toddlers. Child Psychiatry & Human Development, 40(1), 99-110. Retrieved March 21, 2009, doi:10.1007/s10578-008-0108-7

Updated 17 November 2009: Corrected typographical and style errors.

    6 Responses to “Does Rossignol et al. show HBOT’s effective?”


    • To some extent, this may simply be a matter of rounding error; more importantly, it may indicate that even lower dose levels may result in therapeutic effects.

      Or, it may simply suggest that the largest effect is placebo-effect-by-proxy, and that pressures (whatever pressures) under an added .3 ATM are largely irrelevant.

    • John L

      Personally I have no preconceived notions on the effectiveness, or not, of HBOT in treating autism disorders.

      I think though I will wait for professional articles and critiques of the study and its limitations before reaching the definitive conclusions that you have arrived at.

    • Howdy, folks. Thanks for the comments.

      Do’C, I agree that there’s a strong chance for proxy effects in this study. That alternative seems strengthened by the improvement in the placebo group scores. In fact, that’s part of the reason that I’m still pretty reluctant to make much from the gain scores…well, that and what Cronbach & Furby clarified about problems with gain scores. [Cronbach, L. J., & Furby, L. (1970). How should we measure “change” – or should we? Psychological Bulletin, 74, 68-80.]

      Harold, I have to admit to a substantial measure of skepticism regarding just about any intervention. To me, that’s part of the responsibility of intervention researchers: design studies so that they are clean and powerful tests of interventions. To be sure, I’d be hard pressed to point to any individual study that I’ve conducted that meets the standards I recommend, but I’m hoping to keep on plugging away at the task. And, by aggregating studies (i.e., by conducting coordinated series of studies and then integrating them), I think it’s possible to get pretty close to some definitive conclusions.

    • I can understand the desire for “professional” critiques of the study, but what about the study itself? Which of the authors of this “professional” HBOT for autism study are trained, and board-certified in:

      A: Developmental-Behavioral Pediatrics
      B: Undersea and Hyperbaric Medicine
      C: Both of the above

    • As a physician trained in both Undersea/Hyperbaric Medicine and Clinical Epidemiology (but definitely not any kind of Paediatrics!), I am very interested in and grateful for this discussion.

      I agree with the tentative conclusions and unlike a number of other efforts to persuade us of the use of HBOT in a variety of neurological conditions, this trial is of sufficient rigour to demand our serious attention. It is interesting to see the use of pre-post ‘within group’ scores versus between group scores – we have seen this before often enough to treat such analyses with heightened scepticism.

      I think another problem for interpretation of this trial might lie in the nature of the interventions. It is not at all necessary to invoke HBOT as a therapeutic modality in this study. 24% oxygen at 1.3 ATA is an equivalent dose of oxygen to 31.2% at 1 ATA. This can easily be administered without the help of the sponsors’ ‘low pressure’ hyperbaric chamber. Even if we were to accept the difference between the groups as real, we cannot know from this study if the benefit derived from pressure, oxygen or a combination of the two. We need a study group on 31% oxygen with sham compression to sort that out.

      Finally, it is worth noting that the great majority of the potential mechanisms of action discussed in this paper and elsewhere apply to much higher doses of oxygen (between 2 and 3 ATA 100% oxygen). There is minimal biological plausibility for the effect suggested in this trial.

      Extraordinary claims require extraordinary proofs….

      Mike Bennett, Sydney
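
The oxygen-dose arithmetic in the comment above can be checked with a short Python sketch based on Dalton's law (partial pressure = gas fraction × total pressure). The 24% at 1.3 ATA and the 2 ATA 100% oxygen figures come from the comment; the 21% figure for room air is a standard value added here for comparison, and the function names are illustrative only.

```python
def oxygen_partial_pressure(fraction_o2, pressure_ata):
    """Partial pressure of oxygen in ATA: gas fraction times total pressure."""
    return fraction_o2 * pressure_ata


def equivalent_fraction(target_po2, pressure_ata):
    """Oxygen fraction needed at a given total pressure to reach target_po2."""
    return target_po2 / pressure_ata


# Treatment condition described in the comment: 24% oxygen at 1.3 ATA.
treatment_po2 = oxygen_partial_pressure(0.24, 1.3)

# Same dose delivered at normal pressure (1 ATA), with no chamber.
surface_fraction = equivalent_fraction(treatment_po2, 1.0)

# Comparison points: room air (standard 21% value, not from the post) and
# the low end of the 2 to 3 ATA 100% oxygen range mentioned in the comment.
room_air_po2 = oxygen_partial_pressure(0.21, 1.0)
high_dose_po2 = oxygen_partial_pressure(1.0, 2.0)

print(f"Treatment pO2: {treatment_po2:.3f} ATA")
print(f"Equivalent oxygen fraction at 1 ATA: {surface_fraction:.1%}")
print(f"Room air pO2: {room_air_po2:.2f} ATA")
print(f"2 ATA, 100% oxygen pO2: {high_dose_po2:.1f} ATA")
```

The treatment dose works out to 0.312 ATA of oxygen, which matches the 31.2% at 1 ATA figure and sits much closer to room air than to the 2 ATA condition.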

    • Dr. Bennett, thanks for dropping this comment. Your points help cement the case that this study needs to be taken seriously, but not as definitive proof of treatment effects.
