Saturday, 22 August 2015

What is medicine's 5 sigma?

Another piece exploring the same theme as the previous post: the poor quality, and high risk of false positive results, in much of the scientific literature. This time my source is an editorial from the Lancet, a leading general medical journal. You can read it all here.

Richard Horton, the editor of the Lancet, attended a meeting at the Wellcome Trust in London. The theme of the meeting was the current state of the biomedical research literature and the endemicity of bad research. As one speaker put it: poor methods get results. Another stated that too many scientists sculpt their data to fit their preferred theory of the world, or retrofit their hypotheses to fit the data (see the previous post about arrows and targets).
The current estimate is that about half of studies contain false positive results. So when proponents of the use of colour to treat visual stress (and any number of other conditions) say that research in the peer reviewed literature has shown that ............ (you can fill in the blank), that is not enough. These studies have to be looked at in detail.

A number of factors were identified that put studies at high risk of producing false positive associations. These were

1) Small sample sizes
2) Tiny effects
3) Invalid exploratory analysis
4) Flagrant conflicts of interest
5) Pursuing fashionable trends of dubious importance

Much of the research on visual stress in dyslexia ticks all of these boxes. While this does not remove the need to read each study in detail and with a critical eye, the statement that research in the peer reviewed literature has shown that........... is not nearly as impressive as it sounds.
Onwards with more reviews.

Sunday, 16 August 2015

More negative studies and why it is a good thing

An important study has appeared in the journal PLoS One that shows that fewer trials in the field of cardiovascular medicine are reporting positive outcomes. Why is this good news? And what does it have to do with the treatment of visual stress?

First, some background. As many as 50% of published studies contain false positive results, and the situation is particularly bad in the fields of neuroscience and psychology. While some false positive results are inevitable, the numbers at present suggest some systematic biases in the literature.
For example, in the field of fMRI studies more positive findings are reported than the study designs can support. Trials have a certain power, or ability to detect significant differences, which depends on the sample size and the variability of the population being studied. John Ioannidis has shown evidence of bias (in the statistical sense of the word) operating. Small studies with lower power to detect signal changes are finding as many positive results as larger, more powerful studies. Naturally, you would expect smaller studies, with less statistical power, to find fewer positive effects. You can download the study here.
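To illustrate why sample size matters so much, here is a minimal sketch in Python using the textbook normal approximation for the power of a two-group comparison. The effect size of 0.5 standard deviations and the group sizes are hypothetical, chosen only for illustration; they are not taken from the study mentioned above.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    Uses a normal approximation. effect_size is the assumed true difference
    between groups in standard-deviation units (a hypothetical input).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # about 1.96 for alpha = 0.05
    shift = effect_size * sqrt(n_per_group / 2)  # expected size of the test statistic
    return z.cdf(shift - z_crit)                 # ignores the negligible opposite tail


# Hypothetical group sizes, for illustration only
for n in (10, 20, 64):
    print(f"n = {n:3d} per group: power ~ {approx_power(n, 0.5):.2f}")
```

Under these assumptions a study with 10 subjects per group would detect a real, moderate effect only about one time in five, whereas 64 per group gives roughly 80% power. If small studies report positive findings as often as large ones, something other than the true effect is likely to be driving the results.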
So how has this state of affairs come about? Not through fraud, as you might think, although that can happen. Human nature and the subjective biases that affect us all are probably the culprits.
The first problem is that studies reporting positive outcomes are more likely to be published: so-called publication bias. One of the things you have to do when reviewing the literature in a systematic fashion is to look out for unpublished material which may not have found a home because negative findings were being reported.
Another important factor is a flexible approach to data analysis. If you do an exploratory study in which you measure multiple variables, and analyse your data in multiple different ways, you are quite likely to find some positive results which pass the arbitrary criterion for statistical significance of p < 0.05. For example, you could stratify your groups in multiple different ways, or you could have multiple endpoints and not declare which was the primary endpoint.
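To put a rough number on this: if each analysis has a 5% chance of producing a false positive, and the analyses are independent (a simplifying assumption, since real endpoints are usually correlated), the chance of at least one spurious 'significant' result grows quickly with the number of analyses. A minimal sketch in Python; the function name is purely illustrative.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests
    when there is no real effect at all (assumes independence)."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Illustrative numbers of analyses
for k in (1, 5, 10, 20):
    print(f"{k:2d} analyses: {familywise_error_rate(k):.0%} chance of a false positive")
```

With ten analyses the chance is about 40%, and with twenty it is roughly two in three, which is why an undeclared, flexible analysis can so easily yield a 'positive' finding.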
This has been compared to the man driving past a barn who sees lots of targets with arrows in the centre and assumes the farmer must be a pretty good shot. Then, as he drives a little further, he sees another wall of the barn where the farmer is painting targets around arrows he has already fired into the wall! It's a bit like that with trials if you allow a flexible post-hoc approach to data analysis.
This may be acceptable for early exploratory studies within a field but such studies are at best hypothesis generating and the results need to be confirmed by properly conducted RCTs.
To get round this problem many funding bodies insist that researchers pre-register the trial, stating what the outcome measures will be, what subjects are going to be studied and what statistical tests will be used. This, in crude terms, is equivalent to painting the target on the wall before the arrow is fired. With this has come a reduction in positive findings, and that is a good thing. False positive studies waste resources and can endanger human life.
So what has this got to do with the treatment of visual stress? Well, many of the studies of treating 'visual stress' in dyslexia show the hallmarks of a flexible approach to data interpretation: dividing the subjects into multiple small groups and studying lots of different outcome measures, then pouncing on those that appear positive and ignoring the rest.
Before anyone starts to take these ideas seriously we need a randomised controlled trial with pre-registered protocols and outcome measures.


Wednesday, 29 July 2015

Postscript to -The trials they don't want you to know about (1)

As a follow-up to their paper published in 2011, Ritchie and colleagues published a study looking at the subjects of their first trial, one year on.

Ritchie SJ, Della Sala S, McIntosh RD. Irlen colored filters in the classroom: A 1-year follow-up. Mind Brain Educ. 2012 Jun;6(2):74–80.

Though not as rigorous as their previous study, outlined in my last post, it does contain some fascinating data that does nothing to support the use of colour to treat the overlap group with poor reading and visual stress. I suspect it was a bit of an afterthought, put together following criticisms of their first study: that benefits would not be seen without longer term follow-up.
In the first study an Irlen screener diagnosed 47 out of 61 poor readers with 'visual stress'.
Those 47 children were then tested with a prescribed overlay and a placebo overlay, and no difference was found in the rate of reading using the WRRT or in reading naturalistic text measured with the Gray Oral Reading Test (GORT).
After 12 months 22 out of 47 of the original sample were still using colour in some form and 18 of these were still available to study. So this was a highly selected group who passed two criteria for visual stress: the Irlen screening tests and voluntary sustained use of overlays. According to their teachers this group had used overlays or lenses for a substantial part of the time. For this reason, if coloured overlays are effective, you would expect to see real world improvements in reading in this group.
Unfortunately the controls were a group of poor readers without visual stress from the previous study. The authors acknowledge that the ideal control group would have been individuals with visual stress who had been treated with a placebo overlay.
The visual stress group was tested with the WRRT using prescribed filter, placebo filter and clear filter. In short, this part of the study was a crossover or within-subject design.
The other test used was the Gray Oral Reading Test, which is a standardised reading test that measures accuracy and rate of reading and, crucially, includes questions to test comprehension. The outcome is standardised for age to produce the Oral Reading Quotient (ORQ). This means that if a subject progresses normally for age the ORQ should remain the same. If there is a catch-up in reading, as predicted by proponents of the treatment of visual stress, the ORQ should improve and then perhaps stabilise.
Results
The WRRT. No surprises here really. There was no significant difference in reading rate between prescribed overlay, placebo overlay and clear overlay.

GORT Oral Reading Quotient ORQ
Figure two on the right shows that for both groups, the Irlen wearers and the controls, the ORQ declined over the study period. This occurred in spite of improvements in WRRT for both groups, calling into question the validity of the WRRT as a reading test.



Conclusions
The authors are good scientists who are careful not to over-interpret their study, and they also acknowledge its weaknesses, in particular the small sample size and the lack of a visual stress control group. Nonetheless it is one of the few studies to take a longer term look at outcomes in a group where a difference might be expected to be found: that is, those with visual stress diagnosed by an Irlen screener and who pass the criterion of voluntary sustained use over 12 months. In spite of perceived subjective benefit, no improvement in real world reading was seen.

Saturday, 25 July 2015

The trials they don't want you to know about (1)

Not only do proponents of the treatment of visual stress with coloured overlays and lenses misrepresent their own trials, they keep quiet about a number of negative studies. This study is arguably the most rigorous trial to date and was performed by reputable scientists with no financial interest in coloured lenses and overlays.
The paper?
Ritchie SJ, Della Sala S, McIntosh RD. Irlen colored overlays do not alleviate reading difficulties. Pediatrics. 2011 Oct;128(4):e932–8.

Participants
75 below average readers were selected from a school setting in Glasgow.
These children were assessed by a qualified Irlen screener. 13 were unable to co-operate with the test and one moved out of the area. Therefore 61 children were assessed and 47 (77%) were diagnosed with Irlen syndrome (visual stress). Although the study has been criticised for the high number diagnosed, it is entirely consistent with the case control studies published by Kruk et al and Kriss and Evans. Similarly, the observational study published by Tyrrell and colleagues in 1995 found that 100% of well below average readers and 75% of below average readers chose an overlay.
The study had a crossover design and subjects received a prescribed overlay, an overlay of a complementary colour and a clear sheet.
Testing was with the Mini Mental State Exam (MMSE), Wilkins Rate of Reading Test (WRRT), and Gray Oral Reading Test.

The trial was rigorously conducted. Even though it was a crossover study, sequence generation was adequate and allocation concealment was good, so testers could not have foreseen whether subjects would receive the experimental or placebo overlay first.
Steps were also taken to ensure that subjects did not know which overlay was being used.
Data collection was complete and the analysis was on an intention-to-treat basis. In short, there were no obvious sources of bias.

The results
Irlen overlays had no demonstrable effect on reading compared to placebo overlays.
It can be seen from the figure on the left that the test-retest variability for the WRRT was high: some individuals read more words per minute and some read fewer with their chosen overlay, some as much as 30% fewer.
Interestingly, the two outliers shown as triangles both knew what their chosen tint was. Strong evidence for a placebo effect.




WRRT reading rates for the Irlen group, the non-Irlen group and 3 children for whom the treatment was not masked
The pooled data doesn't look any better. Neither the Irlen group nor the 'normals' read any better with overlays. The only group to do better were those who were not masked. Strong evidence for the placebo effect.

What about reading naturalistic text with the Gray Oral Reading Test? Here the study design was slightly different: 22 of the subjects with visual stress received their chosen colour and 22 a clear overlay. There was no difference between the two groups.

Conclusions
The authors were unable to demonstrate any improvement in reading the WRRT or naturalistic text using Irlen overlays prescribed by a qualified Irlen practitioner.


There was also a one-year follow-up, which will be discussed in a future post:

Ritchie SJ, Della Sala S, McIntosh RD. Irlen Colored Filters in the Classroom: A 1-Year Follow-Up. Mind Brain Educ. 2012 Jun;6(2):74–80.

Thursday, 28 May 2015

Prevalence of visual stress in dyslexia and controls (3)

Henderson LM, Tsogka N, Snowling MJ. Questioning the benefits that coloured overlays can have for reading in students with and without dyslexia. J Res Spec Educ Needs. 2013 Jan;13(1):57–65.

It is a little unfair to include this very good paper here because it does not really claim to be an epidemiological study examining the prevalence of visual stress in dyslexia and controls.
The important conclusion of this study, which will be discussed in more detail in a future post, is that improvements in reading the Wilkins Rate of Reading Test (WRRT) obtained with coloured overlays are neither sustained nor matched by improvements in reading naturalistic text.
However, in the process of conducting the study, the authors tested for visual stress using the WRRT and intuitive overlays in students at a higher education establishment. There were 16 students with dyslexia and 26 controls with no reading impairment.
56% of the dyslexic students had previously been exposed to coloured overlays, illustrating the problem of conducting epidemiological research in the area. Because of enthusiastic marketing of coloured overlays, treatment-naive subjects are now hard to find.

The diagnostic criterion for visual stress was reading the WRRT 5%, 8% or 10% faster using the chosen overlay.

Results
It can be seen that the 95% confidence intervals are very wide and straddle an odds ratio of 1, no matter which criterion is used.

Conclusion
The data could be consistent with visual stress being more common among controls or subjects with dyslexia. This reflects the small sample size.
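To see how a small sample produces such a wide interval, here is a minimal sketch in Python of an odds ratio with an approximate 95% confidence interval, using the standard log (Woolf) method. The counts are hypothetical: they match the group sizes of 16 and 26 but are not the paper's actual results, which are not reproduced here.

```python
from math import exp, log, sqrt


def odds_ratio_ci(cases_pos, cases_neg, controls_pos, controls_neg, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf's log method)."""
    odds_ratio = (cases_pos * controls_neg) / (cases_neg * controls_pos)
    se_log_or = sqrt(1 / cases_pos + 1 / cases_neg
                     + 1 / controls_pos + 1 / controls_neg)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# HYPOTHETICAL counts, not the paper's data: 6 of 16 students with dyslexia
# and 7 of 26 controls meeting a visual stress criterion.
estimate, lower, upper = odds_ratio_ci(6, 10, 7, 19)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With groups this small the interval runs from well below 1 to well above it, so the same data are compatible with visual stress being less common, equally common or much more common in dyslexia.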
Because of the possible sources of bias in this study:

  • 56% of dyslexics previously exposed to overlays
  • Screeners not blinded to the reading status of the individuals

this data could not be included in a meta-analysis.
In contrast to the previous study, the authors are open about the shortcomings as well as the strengths of their data.

Monday, 25 May 2015

Prevalence of visual stress in dyslexia and controls (2)

If the first study was reasonably rigorous this study is the opposite. First, the personal biases of the authors are made transparent in the citation distortions which litter the paper. Second, for reasons which I will outline, the study is at high risk of bias in the statistical sense of the word.
The paper?
Kriss I, Evans BJW. The relationship between dyslexia and Meares-Irlen Syndrome. J Res Read. 2005 Aug;28(3):350–64.

I will be listing the more blatant citation distortions in the appendix to the post. In the meantime, on with the paper.

Participants
  • Controls: 32 children selected from a classroom setting who had a reading age that was appropriate for their chronological age.
  • Cases: 32 children with dyslexia recruited from classrooms in various state schools and dyslexia clubs. It can be seen straight away that there is ample room for bias to creep in. Cases and controls are not drawn from the same population. Children attending dyslexia clubs are probably not representative of the generality of poor readers and are more likely to have been exposed to the belief that coloured overlays and lenses may be beneficial. The authors state that an additional criterion was to have been labelled as having dyslexia by an educational psychologist. It is not clear what criteria the educational psychologist was using. Was the assessment part of the study or was it simply reported by the parents of the children? It is well recognised that it is possible to 'shop around' among educational psychologists to get a diagnosis of dyslexia. The selection of cases puts this study at high risk of bias and contrasts unfavourably with the study by Kruk and colleagues described in the previous post.
Diagnostic criteria
Visual stress was diagnosed using the Wilkins Rate of Reading Test (WRRT) which has been previously described in this blog. It does not consist of naturalistic text. Instead, commonly used words are presented in random order in a small font with closely spaced lines. Subjects had to read 5%, 8% or 10% faster using their chosen overlay to be diagnosed with visual stress. The problem was that the screeners were not blinded to the reading status of the subjects. As a result ascertainment bias is likely.

The potential sources of bias in this study were
  • Cases and controls not selected from the same population
  • Diagnostic criteria for dyslexia not outlined
  • Screeners not blinded to reading status of subjects
Results

Although the odds ratios for reading 5%, 8% and 10% faster on the WRRT are all above one, the 95% confidence intervals were very wide and the results do not reach statistical significance. Given the positive spin people like to put on data, the results have been described as approaching statistical significance. How do you know? They could just as well have been heading in the opposite direction.




Conclusions
There are so many sources of bias, not even acknowledged by the authors, that this study cannot be used in a meta-analysis. Even if the data is taken at 'face value' it remains unproven that visual stress, as measured with the WRRT, is more prevalent in the population with dyslexia.

Appendix -citation distortions

Page 351 Paragraph 1 ..to date there have been two rigorous double masked randomised controlled trials (Wilkins et al 1994; Robinson & Foreman 1999). These trials support the existence of this syndrome and validate the treatment with individually prescribed coloured filters.
See Holy Trinity one for a review of Wilkins et al 1994 and Holy Trinity three for a review of Robinson and Foreman 1999. These studies cannot be described as rigorous, nor do they support the use of coloured filters.

Page 351 Paragraph 1 This accounts for a great deal of the controversy in the literature: studies using individually prescribed filters tend to be positive whilst those that test all participants with the same colour or a very limited range of colours tend to be negative (Evans 2001)
Again, not true. The reference is to a narrative review by one of the authors. RCTs consistently show that individual tints are no better or worse than placebo (1-5).

Page 351 paragraph 2 The first double-masked randomised placebo-controlled trial found that individually prescribed coloured filters (precision tinted lenses) brought about a significant reduction in symptoms of eyestrain and headache compared with control lenses of a similar but different colour (Wilkins et al., 1994). 
Oh no it didn't - data was only available for 36 out of 68 participants. The loss to follow up was so high that no valid interpretation of the results is possible. They also neglect to mention that the slightly more robust conclusion with a follow up of 45/68 was that there was no improvement in reading rate, accuracy or comprehension with optimal tint compared to control tint.



Page 351 paragraph 3 The second double-masked randomised-controlled trial investigated the effects of coloured filters on reading speed, accuracy, comprehension and self-perception of academic ability, with the widely used Neale Analysis of Reading Test (Robinson & Foreman, 1999). A total of 113 participants were divided into three groups either using placebo filters, standard blue filters or optimal (individually prescribed) filters. Compared with the other groups, the group using optimal filters increased markedly in reading accuracy and comprehension, but not in speed (see below) 
Again, not true. There was no difference between optimal tint, placebo tint and blue tint for the three months that were actually an RCT. The difference was relative to the untreated control group.


Page 361 paragraph 4 The current ‘gold standard’ treatment is precision tinted lenses that have been individually prescribed after systematic testing with a wide and comprehensive range of colours, for example using the Intuitive Colorimeter
Irrelevant to the subject of this paper. Not supported by RCTs (1,3). Professor Evans gives lectures paid for by Cerium, who manufacture the 'Intuitive Colorimeter'.

There is plenty more like this. It is widely accepted that making up results is scientific fraud. In my opinion a change of culture is required to stop this sort of misrepresentation of the research literature. It is quite prevalent and viewed as 'fair game' by some; it is therefore unfair to single the authors out for criticism in this regard. See my post on citation distortion.


1. Wilkins AJ, Evans BJ, Brown JA, Busby AE, Wingfield AE, Jeanes RJ, et al. Double-masked placebo-controlled trial of precision spectral filters in children who use coloured overlays. Ophthalmic Physiol Opt J Br Coll Ophthalmic Opt Optom. 1994 Oct;14(4):365–70.
2. Robinson GL, Foreman PJ. Scotopic sensitivity/Irlen syndrome and the use of coloured filters: a long-term placebo-controlled study of reading strategies using analysis of miscue. Percept Mot Skills. 1999 Feb;88(1):35–52.
3. Mitchell C, Mansfield D, Rautenbach S. Coloured filters and reading accuracy, comprehension and rate: a placebo-controlled study. Percept Mot Skills. 2008 Apr;106(2):517–32.
4. Ritchie SJ, Della Sala S, McIntosh RD. Irlen colored overlays do not alleviate reading difficulties. Pediatrics. 2011 Oct;128(4):e932–8.
5. Harries P, Hall R, Ray N, Stein J. Using coloured filters to reduce the symptoms of visual stress in children with reading delay. Scand J Occup Ther. 2015 Mar;22(2):153–60.