Errors and Omissions in Experimental Trials - 1b
THE EVANSTON STUDY
The United Kingdom Mission (1953), after having observed the Evanston study, described it as "one of the most elaborate investigations." Hill et al. (1950) considered that they had planned the study so "as to measure every variable that might exert an influence and obscure the findings." It is the only trial in which bite-wing examinations were made for all subjects examined.
The importance of X-ray examinations. Blayney and Greco (1952) reported that in this trial "the X-ray disclosed 53.84 per cent of the total number of carious lesions observed by both clinical and X-ray methods". They said: "We believe it extremely important to employ both clinical and X-ray techniques in any study program which is directed toward the determination of the prevalence or the control and reduction in the rate of caries attack." This result must throw considerable doubt on the accuracy of the caries attack rates which were reported from the test and control areas in the other studies considered; for in these, X-ray examinations were incomplete or absent.
The ideal control community. The authors of the study stated that "It seemed logical to think of Oak Park, Illinois, as the ideal control community because of its close similarity to the study area" (Blayney and Tucker, 1948). The manner in which that city resembled Evanston was not stated. The United Kingdom Mission (1953) made the important observation that in Evanston the economic level was high, and "dental care was outstandingly good."
Lower caries rates in control community. It soon became apparent that Oak Park could not be called "the ideal control community", for Hill et al. (1951) stated that "Comparison of the caries rates of all children in the study area (Evanston, Ill.) and the control area (Oak Park, Ill.) prior to the addition of sodium fluoride to the communal water supply of the study area indicated a lower caries rate for school children of the control area."
Different rates in student groups. The authors continued:
In an effort to find the source of these differences in caries prevalence, it was found to be due largely to differences in the make-up of the student groups examined in the two areas. While in the study area 22.2 per cent of the children examined were attending parochial schools, no such children were included in the control area: and while 5.6 per cent of the children in the study area were Negro children, only 0.1 per cent of the children in the control area were Negro. Statistically significant differences were found to exist between the caries rates of Negro and parochial school children on one hand and public white school children on the other hand. Generally the caries rates of parochial school children were found to be higher and those of Negro children lower than those of white children in public schools.
Exclusion of data. Hill et al. (1951) continued:
Therefore, comparisons of caries rates for the study group and the control group are based on the caries experience of public white school children only, while such comparisons involving children in only the study area are based on the caries experience of all children in total. The caries rates for the Evanston white school children in the 1946 survey and the Oak Park white school children in the 1947 survey were very similar.
Six lines later, it was stated: "In further comparing the rates for Oak Park (control) and Evanston (study area) it is apparent that the baseline figures are very similar."
The only comparisons that can be made from the paper which has just been mentioned are the figures for the children aged twelve, thirteen and fourteen years. Negro and parochial school children constituted 27.8 per cent of the Evanston children. By excluding this part of the data the rates in that city were then considerably lower than those in the control city, the rates (Table IV) being 707.51, 946.17 and 1133.33 in every 100 of the Evanston public school white children for the ages twelve, thirteen and fourteen years; those in Oak Park being 774.29, 970.00 and 1194.64 for the same three ages.
An altered explanation. A different, but, at first sight, a reasonable explanation for the exclusion of the data of Negro and parochial school children, when making comparisons with data from Oak Park, was given in the XV Report (Hill et al., 1957a): "As the control area (Oak Park) examinations included only public school white children it was necessary to evaluate the Evanston data on the basis of school groups, public white, parochial, and Foster (Negro) to make comparisons of like groups." It can be seen that in that paper the exclusion of data was attributed, not to the fact that this process was undertaken because there was "a lower caries rate for school children of the control area" (Hill et al., 1951), but to the different racial composition of, and type of school attended by the children in the two cities. Hill et al. (1950) mentioned that one of their seven "other objectives" was "to compare the dental caries experience of white with that of Negro school children." No reference was made to the possibility of a difference being found between the rates of white public and parochial school children. However, the original statement (Hill et al., 1951) makes it clear that the different school groups were taken into account only after the unsatisfactory results of the first Oak Park examination became apparent: "In an effort to find the source of these differences in caries prevalence." In assessing the accuracy of the second (1957a) explanation, it should be realized that in the younger age group "comparisons of like groups", or even the dissection of the data into the three school groups, were not published in the reports dealing with that age group, namely the 1950, 1952, 1954, 1956 and 1957b papers, or even in the XV Report (Hill et al., 1957a) which dealt with both age ranges, but showed this dissection for the children of the older age group only.
Furthermore, when, after a delay of more than ten years, the 1947 Oak Park rates for the younger children were published for the first time by Hill et al. in 1958, no "comparisons of like groups" were made by them. The reader is prevented from making this comparison by the fact that, even now, the dissection of this age range into the three school groups has not been published, despite the statement by Hill et al. in 1951 that the rates for "school children" were significantly different in each type of school.
"Correction" of data. When making comparisons with the control city, the authors excluded from the three groups of data obtained in the test city the two which diverged most from the rates of the children in the control city (Hill et al., 1951). This process should be considered in connection with the following statement (Hill et al., 1950):
In order to be able to generalize from our findings, we must be certain that any such variables as effect caries experience are represented in our study to the same extent as in the population. Before drawing any ultimate conclusions, we will, therefore, correct our data in such a manner as to include only those groups of children which are representative of the population, with respect to dental caries experience. We feel that this precaution is necessary to allow the ultimate findings to be considered valid and reliable.
However, the process which they described - the arbitrary selection of a section of the data, which is then termed "representative" - instead of making "the ultimate findings to be considered valid and reliable", would render a report based on this selected data unfit for serious consideration.
"Population" sampled. It is not clear what the authors meant by the term "the population." If the population referred to was that of Evanston, the sample of children examined in this study - if properly drawn - provided an unbiased estimate of the dental condition of the population of that city; if only some of the data are included, the results will be biased. If this term "population" was intended to refer to the general population of the U.S.A., it should be realized that the results from Evanston can represent only a stratum of the country as a whole, varying as to climate and racial composition, to mention only two variables.
It will be recalled that the caries rates were said to be significantly different, even between children attending the different types of school in Evanston; and also that the rates in that city were considerably different from those in Oak Park, which was at first stated to be "the ideal control community" for Evanston (Blayney and Tucker, 1948). These differences emphasize the fact that caution should be exercised when applying results obtained in a test city to a wider population, of which the test city may not be representative.
Altered methods in latest report. In the latest report (Hill et al., 1958) which shows the findings for the permanent teeth of children in the control city of Oak Park, the authors have published in the same tables as the results of the control groups, the DMF rates, not of the public school white children, but of the total sample of Evanston children. This is strange in view of their statement that "comparisons of caries rates for the study group and the control group are based on the caries experience of public white school children only" (Hill et al., 1951). It would appear that they no longer held the opinion which they stated the previous year (Hill et al., 1957a) that it is necessary "to make comparisons of like groups."
As a result of this change in procedure the differences between initial caries rates in Evanston and Oak Park are diminished. In children aged twelve to fourteen years, the pre-fluoridation rates reported for the 1,226 public school white children in Evanston were far closer to the values found in the Oak Park children than were either the rates of the 96 Negro, or of the 379 parochial school children (Hill et al., 1957a, 1958). However, the rates of the Negro children were lower, and the rates of the parochial school students were considerably higher than those of the public school white children. By adopting the authors' latest (1958) method, which is to add the results of the three groups, it is found that the pre-fluoridation rates of the twelve and fourteen-year-old children are considerably less divergent from those of the initial examinations in Oak Park - and those of the thirteen white children of those ages. Whether this situation arises with regard to the six-, seven- and eight-year old children cannot be determined, for no dissection into the rates prevalent in the three school groups has been published.
Late examination in the control city. The United Kingdom Mission (1953) stated: "Before fluoridation started a dental survey was made of 4,375 children in the selected groups in Evanston and of 2,493 children in Oak Park. Further examinations have been carried out each year since 1947 and will continue until 1962." However, the examinations in Oak Park were not commenced until after the fluoridation of the Evanston water supply on 11 February 1947, for Blayney and Tucker (1948) stated: "The study in Oak Park was instituted on Feb. 26, 1947". Also, at the time of the United Kingdom Mission Report (1953), no further examinations had been conducted in Oak Park; even in Evanston only one age group was examined during each year, as can be seen by inspecting the "schema for study" published by Blayney and Tucker in 1948, and reproduced in several subsequent reports.
Only two examinations in the control city. This "schema" indicates that the design of the trial provided for only two examinations - eleven years apart - to be made in the control city. It would appear that the authors did not anticipate changes in the caries rates of the control, such as were reported in Muskegon (Arnold et al., 1953), and, as will be seen later, in Sarnia (Brown et al., 1954b), and in Kingston (Ast, Finn and Chase, 1951). The first examination was made in 1947, and the second, although not scheduled until 1958, was commenced in 1956 when it became apparent that the water supply of Oak Park would be fluoridated (Hill et al., 1956). This examination was completed on 14 November 1956, soon after the fluoridation of the Oak Park water on 1 August (Hill et al., 1958).
A ten-year delay in the publication of data. Caries attack rates for the six-, seven- and eight-year-old children which were obtained in Oak Park in 1947 (Blayney and Tucker, 1948) have only recently been published by Hill et al. (1958). This great delay is inexplicable and is particularly unfortunate, because it is in regard to these younger children that the major claims are made for reduction of dental caries as a result of fluoridation. No explanation was offered for this delay, and the members of the United Kingdom Mission (1953) did not comment on this strange omission, merely saying that "The incidence of caries among the children aged 6-8 years is compared with the baseline data of Evanston itself while caries experience of children aged 12-14 years is compared with that of Oak Park."
Gross differences in initial caries rates. The latest report (Hill et al., 1958) reveals that in the younger children there were gross differences between the initial caries attack rates in Evanston and Oak Park. The rates were: 46.85, 26.89 for age six years; 153.49, 102.63 for age seven years; and 249.93, 222.44 for age eight years in Evanston and Oak Park respectively.
In regard to the great difference between the pre-fluoridation rate for the six-year-old children in Evanston and the initial one for children of that age in Oak Park, 46.85 and 26.89 respectively, a footnote to Table I (Hill et al., 1958), referring to the former rate, stated: "This figure results from the very high DMF rate of 87.91 found in one school in 1946." However, as the children were drawn "from 24 schools in the study area" (Blayney and Greco, 1952), it is probable that the rates for six-year-old children in most schools approached the figure of 46.85, unless the school with the high DMF rate also happened to provide a disproportionately large number of six-year-old children.
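The footnote's explanation can be tested with a rough calculation. The sketch below assumes, purely for illustration, that each of the 24 schools contributed the same number of six-year-old children (the reports quoted here do not give the numbers per school), and asks what the mean rate of the remaining 23 schools would then have been. The function name is hypothetical.

```python
# Rates are DMF teeth per 100 six-year-old children, as quoted in the text.
EVANSTON_RATE = 46.85     # pre-fluoridation rate, all schools (Hill et al., 1958)
OAK_PARK_RATE = 26.89     # initial examination, 1947 (Hill et al., 1958)
HIGH_SCHOOL_RATE = 87.91  # the single school named in the footnote
N_SCHOOLS = 24            # Blayney and Greco (1952)


def mean_without_one(overall_mean, outlier, n_groups):
    """Mean of the remaining groups, ASSUMING all groups are the same size."""
    return (overall_mean * n_groups - outlier) / (n_groups - 1)


remaining = mean_without_one(EVANSTON_RATE, HIGH_SCHOOL_RATE, N_SCHOOLS)
excess_over_oak_park = remaining - OAK_PARK_RATE

print(f"Mean rate of the other 23 schools: {remaining:.2f}")
print(f"Excess over the Oak Park rate:     {excess_over_oak_park:.2f}")
```

Under the equal-size assumption the other schools would still average about 45 DMF teeth per 100 children, so that one school alone would not account for the gap between the two cities. A different result is possible only if that school supplied a disproportionate share of the six-year-old children.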
It should be noted that no comment on the magnitude of this rate of 46.85 was made in any of the four reports in which it had been shown previously (Hill et al., 1950, 1952, 1956, 1957a); all of which were published before the rate of 26.89 for Oak Park was released, and therefore before a comparison with it could be made. The rate of 46.85 was used in all those papers - and even in their latest report (1958) - in calculating the "% reduction", and in computing the "Probability of difference due to chance."
Much unpublished data. The members of the Evanston Dental Caries Study devoted most of the years 1947 and 1956 to the collection of data from children in Oak Park (Blayney and Tucker, 1948; Hill et al., 1958). Despite this fact, the major part of each of the two tables shown in the XVIII Report (Hill et al., 1958) was devoted to a re-presentation of data obtained in Evanston, although this report was said to have as its purpose the comparison of the permanent teeth dental caries experience rates in children examined in Oak Park in 1947 and 1956. The Oak Park data were restricted to four lines of figures showing the DMF rates in permanent teeth. No report was made of other findings such as those which had been shown in reports on Evanston children. For instance, in the XV Report (Hill et al., 1957a), no fewer than eight tables relating to the twelve-, thirteen- and fourteen-year-old children only were devoted to these other findings. This very incomplete presentation of the data obtained in Oak Park is unaccountable.
Figure 3. Gross differences in initial caries rates in Evanston and its control city of Oak Park. The Oak Park rates remained unpublished for over ten years.
Disagreements between results. In their XVIII Report, Hill et al. (1958) stated: "The DMF rates and percentage reduction from year to year for the Evanston children of all age groups shown in Tables I and 2 have been published in previous reports." However, four of the figures for the year 1955, shown in Table I of the 1958 Report, are different from "the rates and percentage reduction" given, for the same year, in Table I and the text of the XVI Report (Hill et al., 1956). The DMF rates at age seven years were only slightly different (40.95 and 40.92, in the XVI and the XVIII Reports respectively), but at age eight years the two rates were 114.04 and 120.32. It is very improbable that these different rates are due to typographical errors, for they were confirmed by the "per cent reduction from 1946", which was given in the summary and in Table I of the respective reports as 73.32 and 73.34 for children aged seven years, and as 54.37 and 51.85 for those that were eight years of age. This "reduction" was shown in the XVIII Report as 85.96 for the six-year-old children, but in the XVI Report it was given as "80 per cent" in the findings and as "85.96 per cent" in the summary.
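The statement that the differing rates were "confirmed" by the percentage reductions can be checked by recomputing each reduction from the 1946 baseline rates quoted elsewhere in this section (46.85, 153.49 and 249.93). The sketch below is an illustration only: the dictionary layout and function name are assumptions, and the figures are those cited in the text, not taken independently from the reports.

```python
# DMF rates per 100 children, as quoted in the text.
BASELINE_1946 = {6: 46.85, 7: 153.49, 8: 249.93}

# 1955 rates as given in the two reports (age six is quoted from the 1956 report).
RATES_1955 = {
    "XVI Report (1956)": {6: 6.58, 7: 40.95, 8: 114.04},
    "XVIII Report (1958)": {7: 40.92, 8: 120.32},
}


def percent_reduction(baseline, later):
    """Percentage by which `later` lies below `baseline`."""
    return (baseline - later) / baseline * 100.0


reductions = {
    report: {age: percent_reduction(BASELINE_1946[age], rate)
             for age, rate in rates.items()}
    for report, rates in RATES_1955.items()
}

for report, by_age in reductions.items():
    for age, value in sorted(by_age.items()):
        print(f"{report}, age {age}: {value:.2f} per cent")
```

Each recomputed value agrees with the corresponding published percentage to within about 0.01, which is consistent with the view that the two sets of 1955 rates were each used in the authors' own calculations and are not misprints.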
Disagreement between tables. The DMF rate in terms of tooth surfaces was given only twice in this study (Hill et al., 1955, Table X and 1957a, Table XII). In both papers the "DMF rate per 100 surfaces" for children aged fourteen years was 14.82 in 1949 and 13.94 in 1952. However, in the former report this rate was given as 15.09 in 1946, but in the latter one, for children of the same age in the same year, the figure shown was 15.92. As a result of this change, the "% differences from 1946" were altered from 1.78 to 6.85 (1949) and from 7.62 to 12.44 (1952). By using these new rates it can be said that "all 3 methods, namely; per hundred children, per hundred teeth, and per hundred surfaces all express approximately the same proportion of percentage differences in rates" (Hill et al., 1957a). This result is a good illustration of the comment made on the method most commonly used in these studies to express changes in caries experience, that "relatively small variations in the baseline values will produce substantial alterations in the percentage reduction obtained" (Part One, p. 137).
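The sensitivity of the percentage difference to the baseline value can be shown directly from the surface rates just quoted. The following is a minimal sketch; the function and variable names are hypothetical, and the only inputs are the four rates cited above.

```python
# "DMF rate per 100 surfaces" for fourteen-year-old children, as quoted in the text.
BASELINES_1946 = {"Hill et al., 1955": 15.09, "Hill et al., 1957a": 15.92}
LATER_RATES = {1949: 14.82, 1952: 13.94}  # identical in both reports


def percent_difference(baseline, later):
    """Percentage by which `later` lies below `baseline`."""
    return (baseline - later) / baseline * 100.0


differences = {
    source: {year: percent_difference(baseline, rate)
             for year, rate in LATER_RATES.items()}
    for source, baseline in BASELINES_1946.items()
}

for source, by_year in differences.items():
    for year, value in by_year.items():
        print(f"Baseline from {source}, {year}: {value:.2f} per cent")
```

Raising the 1946 baseline by 0.83 surfaces per 100 (about 5.5 per cent) more than triples the computed difference for 1949 and raises that for 1952 from about 7.6 to about 12.4 per cent, although the 1949 and 1952 rates themselves are unchanged.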
It may be mentioned that the "total tooth surfaces considered" for thirteen-year-old children in 1954 (Table XII, Hill et al., 1957a) should be 58,325 not 58,352; and that for fourteen-year-old children in 1949, in the column of that table giving the "% differences from 1946", the figures shown should be 6.91 not 6.85. In their XI Report (Table IX) and their XV Report (Table XI), Hill et al. (1955, 1957a) showed different figures for children aged twelve years examined in 1952. Although both tables show the same total number of teeth considered, in the former table children were shown as examined, with a "DMF rate per 100 teeth" of 25.76, and a difference from 1946 of 19.50 per cent. In the latter table, the figures were 516, 25.60 and 20.00 per cent respectively. In 1953 Hill et al. published the figure of 19.50 per cent.
No data for deciduous teeth. The authors have not published any data regarding the deciduous teeth of children in the control city, either for the first (1947) or the second (1956) examination. The most important omission, the def rates, could have been shown by adding only two lines to Table I in Hill et al. (1958). This omission is particularly unfortunate in view of the fact that in the deciduous teeth in Evanston during the first four years of fluoridation the def rate of the six to eight years group was considerably higher than the initial one (Hill et al., 1952). It was not until nine years after the commencement of the study that a significant reduction in this rate was reported.
In 1950, Hill et al. stated that the caries rate for deciduous teeth in these children "does not indicate any trend", despite the fact that in Table I of that report the initial rise in this rate during the first two years of fluoridation was shown by them to be statistically significant (P = 0.005). Two years later these authors altered their opinion of the significance of this rise. In 1952 they re-published the same data for children aged six, seven and eight years in 1946 and 1948, but computed different rates for the combined age group six to eight years. The rise in the def rate was then said to be not statistically significant.
Variations in caries rates in control. The meagre data regarding caries attack rates in Oak Park which have been published are included in Tables I and 2 of Hill et al. (1958). Of the six age groups shown, between the years 1947 and 1956 the authors reported a significant increase in the DMF rate of children aged seven years, and non-significant upward trends in the rates of those aged eight and thirteen years, and downward ones in the caries attack rates in children aged six, twelve and fourteen years. (The question of "significant" changes in the rates in control cities will be considered later.) The authors said: "The children 12, 13 and 14 years of age, Table 2, have only minute differences between the 1947 and 1956 rates. These are not considered to be significant." The footnote to that table is more definite, in each comparison stating: "Difference is not statistically significant." Although these differences of 61.20, 34.96 and 58.87 DMF teeth, for children aged twelve, thirteen and fourteen years respectively, were termed "minute differences", those seen in the rates of the twelve and fourteen-year-old children are approximately a third the size of the absolute drop in the rates recorded for the same age groups in Evanston since the inception of fluoridation. It cannot be assumed that the fluctuations in the rates during the intervening period of nine years, when no examinations were made, did not exceed the differences between the initial and final rates. It will be recalled that considerable variations occurred in Muskegon (see Figs 1 and 2).
Inadequacy of the control. Blayney and Tucker (1948) realized that "A study of this nature must have an adequate control." Therefore, it is strange that in the "schema" which they published there was provision for only two examinations, eleven years apart, to be made in the control area. It should have been obvious that the usefulness of data gathered in such a manner would be, at most, very limited. The explanation given by the authors for their failure to examine the children in the control city "every year" (instead of only twice) was the strange one that "It was not necessary to do so in as much as Evanston and Oak Park are subjected to the same advertising campaigns, have a similar economic level, participate in comparable educational programmes, and so forth" (Hill et al., 1958). It is extraordinary that the authors advanced this explanation and that they adhered to such a plan, despite the marked disparity in caries rates disclosed in the first examinations in Evanston and Oak Park (Hill et al., 1958), which makes it obvious that the latter city was a poor choice in seeking an "adequate control" for the former one.
Differences between school groups. Hill et al. (1951) stated that "statistically significant differences were found to exist [in 1946] between the caries rates of Negro and parochial school children on one hand, and public white school children on the other hand." However, they made a further statement that "the caries rates of parochial school children were found to be higher and those of Negro children lower than those of white children in public schools" (Hill et al., 1951). These two statements are inconsistent. The first appears to mean that the comparisons between Negro children and white children in public schools, and that the comparison between white children attending parochial schools and those attending public schools, were both statistically significant in 1946.
"Nearly comparable" or significantly different? The XV Report (Hill et al., 1957a) stated that "In 1946 and 1954 the public school white children and the Foster School (Negro) children maintained nearly comparable DMF rates". The actual rates (per 100 children) in 1946 for twelve, thirteen and fourteen-year-old white children attending public schools were 707.51, 946.17, and 1133.33; for the Negro children of the same ages they were 658.82, 861.76 and 1035.71. (The rates of each school group of younger children were not published.)
It is not understood how the same authors could on one occasion (Hill et al., 1951) state that there were "statistically significant differences" between the two series of rates, and later (Hill et al., 1957a) describe them as "nearly comparable DMF rates". It may be thought that the word "maintained" referred to a comparison between the DMF rates of the white children in public schools, and of the children in the Negro school, between 1946 and 1954. However, this cannot be the case, for the authors claimed for these twelve, thirteen and fourteen-year-old children "a reduction of approximately 21.96 per cent in dental caries-experience rates of the permanent teeth" (Hill et al., 1957a). (In this study, percentages were frequently shown "approximately" to two decimal places.) Table IV of that paper shows that both the Negro and the public school children participated in the reductions reported.
Decline in eruption rate. An observation of considerable interest is obtainable from Tables V and VI of the X Report (Hill et al., 1952). The former table shows the rates per 100 six, seven and eight-year-old children that had occlusal surface pit and fissure caries or fillings in their first permanent molars; the latter one, the number of these teeth which were free from those defects. The mean number of erupted first permanent molars per 100 children may be obtained, in each age group, by adding these two rates to that showing the extracted and congenitally missing permanent molars. It is probable that the number of congenitally missing teeth was negligible and that the number of permanent molars which had been extracted in these young children was small, particularly in the six years age group (five and a half to six and a half years). Therefore, it would be expected that, in each age group, the mean number of erupted molars per 100 children would be similar at the time of each examination. This was the case in children aged eight years; the figures for the examinations made in 1946 (pre-fluoridation), 1948, 1950 and 1951 being (to the nearest whole number) 387, 387, 384 and 386 respectively. At age seven years the numbers erupted were 330, 336, 320 and 315; but in the six-year-old children, the number of erupted molars showed a marked and progressive decline: 189, 156, 140 and 132 during the period covered by those four examinations.
The question naturally arises whether the eruption rate of these teeth had decreased; a possibility of extreme importance in interpreting the results of a fluoridation trial. However, further consideration of this matter is prevented by the authors' failure to publish this type of data when they reported the results of the two later examinations (conducted in 1953 and 1955) which were made of children of these ages; and the "schema for study" indicates that children aged six to eight years will not be examined again until 1960.
This failure to publish this type of data for the 1953 and 1955 examinations is extraordinary, especially in view of the fact that the authors continued to show similar data for the permanent molars of the older age group (Hill et al., 1955, 1957a); the latter report, the only one showing results for both age groups, gave the prevalence of occlusal pit and fissure caries and fillings in the molars of the older, but not of the younger age group.
In considering the eruption of teeth, the odd method of assessment used in this study must be taken into account. Hill et al. (1955) said: "Only teeth which were 50 per cent or more erupted were considered. A carious or filled tooth was, of course, considered regardless of its stage of eruption."
Figure 4. Suggestion of a progressive decline in the number of erupted first permanent molar teeth in six-year-old children in Evanston. The results obtained in the examinations conducted in 1953 and 1955 were omitted from the published reports.
Strange superiority of artificial fluoridation. The authors of this study compared the Evanston DMF rates per child with those of children in Aurora, Illinois (Dean et al., 1950) in the expectation that after sufficient time had elapsed for all the erupted teeth to have been formed since fluoridation commenced "the Evanston rate will closely approach the Aurora rate" (Hill et al., 1957a). It is surprising that this parity between the rates of Aurora and Evanston was expected, because in the Aurora survey only clinical methods of examination were used, but in the Evanston examinations X-ray surveys were used routinely. Hill et al. (1951) stated: "We find our baseline figures for caries experience in Evanston and Oak Park approximately 32 per cent higher than those of Dean and his co-workers for Evanston and Oak Park in 1941. We assume this may be explained partially by differences in the techniques of examination, particularly in the use of X-ray in the current investigation." The United Kingdom Mission (1953) stated that in this study "the minutest radiolucency was taken as indicating caries."
In view of these findings, it is even more strange that Hill et al. (1957a) were able to report: "The Evanston 6 and 7-year-olds of 1953 have a lower dental caries experience rate after 71 to 82 months of fluoridation than the Aurora 6 and 7-year-olds of 1945-1946 with lifetime exposure to water naturally fluoridated to 1.2 ppm." That this difference was not only "slightly below the 1945-1946 Aurora rate for children of the same age" (Hill et al., 1957a) can be seen by comparing the actual rates reported. In Evanston and Aurora respectively, the rates were 14.73, 28.0 at age six years and 53.35, 70.5 at age seven years (Hill et al., 1957a; Arnold et al., 1953). It should be noted that in Evanston two years previously (1951), after a shorter period of fluoridation, the rate for the six-year-old children was even lower, 12.36 (Hill et al., 1952) and was less than half the Aurora rate; in 1955 (Hill et al., 1956) it had become 6.58, less than a quarter of the Aurora rate. Blayney and Greco (1952) found that in children in the Evanston study, with regard to proximal caries "the 6-year-olds have the highest percentage (83.90) disclosed by X-ray findings only. In the 7-year-old group 79.04 per cent of proximal lesions were demonstrated by X-ray findings only". Therefore, if clinical methods of examination only had been used in Evanston, as was the case in Aurora, what may be thought to be a strange superiority of artificially over naturally fluoridated water as a means of reducing dental caries attack rates would have appeared to have been even more marked.
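The comparisons with Aurora can be expressed as simple ratios of the rates quoted above. The sketch below is illustrative only; the names are hypothetical and the figures are those cited in the text.

```python
# DMF rates per 100 children, as quoted in the text.
AURORA = {6: 28.0, 7: 70.5}  # 1945-1946, natural fluoride

# Evanston rates keyed by (year of examination, age in years).
EVANSTON = {
    (1951, 6): 12.36,
    (1953, 6): 14.73,
    (1953, 7): 53.35,
    (1955, 6): 6.58,
}

# Each Evanston rate as a fraction of the Aurora rate for the same age.
ratios = {key: rate / AURORA[key[1]] for key, rate in EVANSTON.items()}

for (year, age), ratio in sorted(ratios.items()):
    print(f"Evanston {year}, age {age}: {ratio:.2f} of the Aurora rate")
```

The ratios for the six-year-old children fall below one half in 1951 and below one quarter in 1955, in agreement with the statements made above.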
"Weighting" of results. The method of combining the results of the six, seven and eight-year-old children into one category introduces an important source of error when comparisons are made between the results obtained in the control city and in the test one, or between those found on different occasions in Evanston. Owing to the great differences in caries attack rates which are observed between children of these ages (the baseline DMF rates for these three ages in Evanston were 46.85, 153.49, and 249.93, according to Hill et al., 1950), the results may inadvertently be "weighted" by including a preponderance of young or of old children in the age group six to eight years. If this occurs, the average value will be lower or higher than it would have been if the three ages had been equally represented in the sample. In comparing the results of the control and the test cities, "weighting" of this nature could make it appear that large differences were present, when, in fact, they were either slight or absent, or the presence of actual differences could be hidden.
An example of "weighting". The results of the pre-fluoridation survey and of the first post-fluoridation survey at Evanston (Hill et al., 1950) clearly demonstrate the process of "weighting" and show that its occurrence is not merely a theoretical possibility. The numbers of children aged six, seven and eight years who were examined in 1946 were 461, 759 and 771 respectively; the corresponding numbers seen in 1948 were 756, 838 and 440. On both occasions the results of the three ages were combined, and a caries rate was computed for the age range six to eight years.
Significance tests and "weighting". Despite the rather obvious "weighting" in the examples which have just been cited, tests were applied to determine the significance of the difference between the caries attack rates found during the two examinations in the combined age range six to eight years. In regard to the permanent teeth, it was stated that "The probability of this difference being due to chance is 0.0000" (Hill et al., 1950). Curiously, in those teeth a decrease in the caries rate was reported, contrasting with the statement of a significant rise in the rate of the deciduous ones.
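The effect of the changed age composition can be illustrated by a hypothetical calculation. In the sketch below the baseline (1946) DMF rates quoted earlier for ages six, seven and eight are applied unchanged to both age mixes; this is an assumption made purely to isolate the effect of "weighting" and is not a calculation taken from the reports. The function names are illustrative.

```python
# Baseline (1946) DMF rates at ages 6, 7 and 8 (Hill et al., 1950).
baseline_rates = [46.85, 153.49, 249.93]

# Numbers of children examined at ages 6, 7 and 8 in each survey.
counts_1946 = [461, 759, 771]
counts_1948 = [756, 838, 440]


def weighted_mean(rates, counts):
    """Pooled rate: each age contributes in proportion to its sample size."""
    return sum(r * n for r, n in zip(rates, counts)) / sum(counts)


def unweighted_mean(rates):
    """Mean of the per-age rates, each age counted equally."""
    return sum(rates) / len(rates)


# Hypothetical: the SAME per-age rates are applied to both age mixes,
# so any difference between the pooled rates is due to weighting alone.
pooled_1946_mix = weighted_mean(baseline_rates, counts_1946)
pooled_1948_mix = weighted_mean(baseline_rates, counts_1948)
equal_weight = unweighted_mean(baseline_rates)

print(f"pooled rate, 1946 age mix: {pooled_1946_mix:.2f}")
print(f"pooled rate, 1948 age mix: {pooled_1948_mix:.2f}")
print(f"equal-weight mean:         {equal_weight:.2f}")
```

Under this assumption the pooled rate falls from about 166.1 to about 134.7 although the rate at each age is unchanged, simply because the 1948 sample contained relatively more six-year-olds and fewer eight-year-olds. The equal-weight mean is 150.09 in both cases.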
Random variation ignored. Hill et al. (1950) stated: "It is to be expected that the rate of caries in all teeth varies from year to year due to chance. A significant reduction of caries prevalence can therefore be assumed to exist only when the statistical analysis of the data provides almost absolute certainty that the observed differences are not due to chance." However, in a subsequent paper (Hill et al., 1956) these authors ignored the variations in the intervening years, even when these were as marked as those in Table 5 of that report, and stated: "Difference between 1946 and 1955 rates is statistically significant."
Original results altered. In the X Report (Hill et al., 1952), and in all the later ones, alterations were made to the rates shown for the years 1946 and 1948 in children of the combined age group six to eight years, which were published by Hill et al. in 1950 (Tables I to VI). The original rates were replaced by values which are the means of the mean rates for the children of each of the three ages six, seven and eight years (Hill et al., 1952, Tables II to IX).
System of computation changed. The change in the system of computation was explained by Hill et al. (1952) in these terms: "The group averages, shown in previous reports, represents weighted averages of the individual mean caries rates. Inasmuch as the composition of the groups of children with respect to the number of 6, 7 and 8-year-olds varies from year to year, it was felt that unweighted group averages form a more sound basis for comparison of group caries rates between years."
The new method of computation. In 1952 Hill et al. stated that "The new averages were obtained by taking a simple arithmetical mean of the individual caries rates of the 6, 7 and 8-year-old children." This description of the new method is apt to cause some confusion, for it is considered to describe accurately the old method. It was used by these authors in 1950, and then abandoned by them in favour of the new one. The results for 1950 and 1951 in Table IV of Hill et al. (1952), and those for 1953 in Table I of Hill et al. (1957a), and for 1955 in Table I of Hill et al. (1956) make it clear that in this new method of calculation, the rate per 100 children aged six to eight for each examination was obtained by taking a simple arithmetical mean of the mean rate for each of the three ages six, seven and eight years.
Errors in amended rates. The amended rates published by the authors (Hill et al., 1952) for the age group six to eight years need further amendment, and the difference between them is even less than that stated. The mean of the three values shown for 1948 in their Table IV (23.54, 103.58 and 194.09) is 107.07, not 92.07 as stated; also, the mean of the three values for 1946 (46.85, 153.49 and 249.93) is 150.09, not 149.76. These errors were repeated in the XV and the XVI Reports (Hill et al., 1957a, 1956).
The figure 149.76 was shown also in the XIV Report (Hill et al., 1954). In that report the rate for age six to eight years was said to be "65.82 in 1953." However, in Table I of the XVI Report (Hill et al., 1956) the rate for 1953 for age six to eight years was given as 63.52. The latter figure is the mean of the three mean rates shown for the six, the seven and the eight-year-old children.
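The two means can be recomputed directly from the per-age values quoted above. This is a minimal sketch with illustrative names; the "published" figures are those attributed in the text to Hill et al. (1952).

```python
# Per-age mean rates (ages 6, 7, 8) as quoted from Table IV of Hill et al. (1952).
rates_1946 = [46.85, 153.49, 249.93]
rates_1948 = [23.54, 103.58, 194.09]

# Combined rates for ages six to eight as published in that report.
published_1946 = 149.76
published_1948 = 92.07


def mean_of_age_means(rates):
    """Simple arithmetical mean of the three per-age mean rates."""
    return sum(rates) / len(rates)


recomputed_1946 = round(mean_of_age_means(rates_1946), 2)
recomputed_1948 = round(mean_of_age_means(rates_1948), 2)

# Fall in the rate between 1946 and 1948, as published and as recomputed.
published_fall = round(published_1946 - published_1948, 2)
recomputed_fall = round(recomputed_1946 - recomputed_1948, 2)

print(f"1946: published {published_1946}, recomputed {recomputed_1946}")
print(f"1948: published {published_1948}, recomputed {recomputed_1948}")
print(f"fall 1946 to 1948: published {published_fall}, recomputed {recomputed_fall}")
```

The recomputed means are 150.09 and 107.07, so the fall between the two years shrinks from 57.69 to 43.02, which is the sense in which the difference "is even less than that stated".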
The XIV Report (Hill et al., 1954) stated: "The combined 6 to 8-year-old children had a permanent tooth DMF rate of 149.76 per 100 children in 1946 and 65.82 in 1953. This is a difference of 60.38 per cent." In fact, by using their standard method of calculation, the "difference" is 56.05 per cent.
A confusing calculation. The situation is made even more confusing by the figures shown in Table 6 of the XVI Report (Hill et al., 1956). If the method commonly used in these trials is employed, when the difference between the DMF rates for 1946 and 1955, which is 95.90 (the rates being 149.76 and 53.86), is expressed as a percentage of 149.76, the "per cent difference" is 64.04, not 64.11 as shown. However, if the correct figure of 150.09 (which does not appear to have been mentioned in these reports) is substituted for 149.76, the "per cent difference" becomes 64.11 as shown in their Table 6.
Was sampling used? The six, seven, eight and twelve, thirteen, fourteen year age groups were chosen for study (Blayney and Tucker, 1948), but it was not stated whether all children of these ages (the ages were taken to the nearest birthday) were examined, or whether a sampling method was used. The VII Report of Hill et al. (1951) said that "0.1 per cent of the children in the control area were Negro." However, in the XV Report (Hill et al., 1957a) it was stated that "the control area (Oak Park) examinations included only public school white children". It is not clear whether the Negro children in that city were excluded from the examination by design, or by the chance of a sampling method. The former alternative is suggested by the statement of Hill et al. (1955) that "In the control village of Oak Park, only public school children were studied".
Were children "continuous residents"? It is not clear whether all the children included in the early reports (Blayney and Tucker, 1948; Hill et al., 1950, 1951, 1952, 1953, 1954) were "continuous residents". Although the questionnaires recorded the residence record of each child, it was not until the XI Report (Hill et al., 1955) that the statement was made that "The data given in this report are limited to those children whose entire lives have been on Lake Michigan water." The United Kingdom Mission (1953) stated that "The study includes only white children attending public schools in the city who have lived in the area continuously from birth." However, as the first part of that statement presents an incomplete description of the authors' method, doubt is raised as to the accuracy of the statement made in regard to continuous residence.
Disturbing disagreements. In the following paragraphs are cited some disturbing disagreements between the statements made regarding the number of children examined. No suggestion has been found that more than one series of examinations was conducted in Evanston in each year from 1946 onwards, and in Oak Park in 1947 and 1956. Therefore, although the situation is uncertain regarding sampling and continuous residence, it would be expected that all the reports would agree with regard to the number of subjects of each age that were examined in each individual year. The exception is the XVII Report (Hill et al., 1957b), which compares the caries rates of white with those of Negro children; for it was stated that "in this report no attempt has been made to limit the examinations to continuous resident children." Therefore, it would be expected that the sample sizes shown in this report may be larger than those published in other reports.
Gross discrepancies between sample sizes. The numbers of children of each of the ages twelve, thirteen and fourteen years that were examined in 1946, 1949, 1952 and 1954 were given in the second column of Tables XI and XII of the XV Report (Hill et al., 1957a), the same figures appearing in both tables. It is to be noted that in eleven out of the twelve cases, the sample sizes given there are different from those shown in Tables III, V, VI, VII, VIII, IX and X of the same report. In six cases the samples were larger in Tables XI and XII than in the other tables mentioned, and in five cases they were smaller. The largest discrepancy was between the number of children aged twelve years that were examined in 1949. Tables XI and XII showed this figure as 627, and the other tables gave 522 as the sample size. Similar discrepancies (for 1946, 1949 and 1952) are present between the sample sizes shown in Tables IX and X of the 1955 paper of these authors, and Tables I, III, IV, V, VI, VII and VIII of that report. The authors (Hill et al., 1957a) stated: "The number of teeth and surfaces associated with the DMF rates from 1946 through 1954 are shown in Tables XI and XII." In other tables mentioned in that report the "Rate per hundred children" was employed, but there appears to be no reason why the number of children examined should not be the same for both of these comparisons. No explanation for the different sample sizes was advanced by the authors.
Disparities in Negro sample sizes. Marked disparities are seen between the sample sizes shown for Negro children, for, judging from Table 10 of the XVII Report (Hill et al., 1957b), data from only about half of the Negro children aged twelve to fourteen years who were examined in 1946, and of less than a third of those examined in 1954, were included in the XV Report (Hill et al., 1957a). The number studied is given in Table IV of the latter paper as 96 in 1946, and as 79 in 1954. However, the XVII Report (Hill et al., 1957b, Table 10), shows that 188 Negro children of those ages were examined in 1946, and 250 in 1954.
The XI Report (Hill et al., 1955) also shows that 96 Negro children were examined in 1946. The VII and XVIII Reports (Hill et al., 1951, 1958), although they do not state the number of Negro children, indicate the same sample size, 1,701 children, as the XI and XV Reports (Hill et al., 1955, 1957a). In the last mentioned report, referring to the 1954 results, the authors said: "It is admitted that the Foster (Negro) school sample (79) was limited." Why, then, were so few of the 250 Negro children aged twelve to fourteen years that were examined in that year included in the report? Were less than a third of these children continuous residents?
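The "per cent difference" figures discussed in this and the preceding paragraph can be verified as follows. The sketch assumes, as the text does, that the difference is expressed as a percentage of the earlier (1946) rate; the function name is illustrative.

```python
def per_cent_difference(earlier_rate, later_rate):
    """Fall in the rate, expressed as a percentage of the earlier rate."""
    return (earlier_rate - later_rate) / earlier_rate * 100


# XIV Report: 149.76 in 1946 and 65.82 in 1953; stated difference 60.38 per cent.
xiv_report = round(per_cent_difference(149.76, 65.82), 2)

# XVI Report, Table 6: 149.76 in 1946 and 53.86 in 1955; stated difference 64.11 per cent.
xvi_report_published_base = round(per_cent_difference(149.76, 53.86), 2)

# Same comparison with the recomputed 1946 mean of 150.09 as the base.
xvi_report_corrected_base = round(per_cent_difference(150.09, 53.86), 2)

print(f"1946 to 1953, base 149.76: {xiv_report} per cent")
print(f"1946 to 1955, base 149.76: {xvi_report_published_base} per cent")
print(f"1946 to 1955, base 150.09: {xvi_report_corrected_base} per cent")
```

The results are 56.05, 64.04 and 64.11 per cent respectively, so the published figure of 64.11 is reproduced only when the unpublished value 150.09 is used as the base.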
The situation with regard to children aged six to eight years cannot be investigated, because the XVII Report is the only one in which the data of the younger age group of Negro children are shown separately from those of the white children.
Further unexplained differences. The position revealed in the last paragraph is further confused by the presence of large variations between the number of white children, aged twelve to fourteen years, whose data were shown in earlier reports, and the number given in Report XVII. In the former reports (Hill et al., 1955, Table II; 1957a, Table IV) the number of these children examined in 1946 (public plus parochial schools) is stated to be 1,605, but, according to the XVII Report (Hill et al., 1957b, Table 10) the number seen in that year was 1,368. In 1954 the examinations of white children totalled 1,247 (Hill et al., 1957a, Table IV), but the figure of 1,905 is shown in the XVII Report (Hill et al., 1957b).
In the younger children, as no dissection of the data into school groups has been published, only the total number inspected can be considered. The XVII Report (Table 10) states that 1,754 children were examined in 1946 and 2,952 in 1955; but Table I of the XVI Report (Hill et al., 1956) shows 1,991 and 1,376 examinations respectively. The two statements of sample sizes (XVII Report figures minus the XVI Report ones) therefore differ by -237 and +1,576 children.
It is possible that the larger sample sizes shown in the XVII Report for the examinations in 1954 and 1955 were due, despite the sizes of the increases (171 Negro and 658 white children aged twelve to fourteen years, and 1,576 children aged six to eight years), to the inclusion of all subjects, and not only those who were "continuous resident children". If, at the time of commencement of the study in 1946, children who had not lived in Evanston "continuously" since birth were excluded from the main study, an explanation can be found for the larger number of Negro children included for that year in the XVII Report. However, it is strange that that report, which included children who were not "continuous residents" (Hill et al., 1957b), should for 1946 be based on 237 fewer white children aged twelve to fourteen years and on 237 fewer white plus Negro children aged six to eight years than were included for that year in the other reports mentioned.
Incompatible statements. The authors made incompatible statements regarding the total number of children examined during the initial examinations in Evanston and Oak Park. In Report II (Blayney and Tucker, 1948) it was stated that the "baseline observations were made on 4,375 North Shore" (study area) "children and 2,493 Oak Park children." These figures were repeated in 1950 by Hill et al. However, Tables I to VI of the latter paper show that 1,991 children aged six to eight years were examined in Evanston in 1946; Tables I, II and III of Hill et al. (1951) indicate that 1,701 children aged twelve to fourteen years were examined in that year, that is, a total of 3,692 children. One or both of these figures (1,991 and 1,701) were repeated by the authors (or may be obtained by adding figures for individual yearly age groups) in 1952, 1955, 1956, 1957a and 1958.
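The differences discussed in the preceding paragraphs can be tabulated from the sample sizes quoted in the text. This is only a sketch; the group labels and the layout are illustrative, and "other" denotes the XI, XV or XVI Report as cited above for each figure.

```python
# (group, year): (sample size in the other reports, sample size in the XVII Report)
sample_sizes = {
    ("Negro, ages 12-14", 1946): (96, 188),
    ("Negro, ages 12-14", 1954): (79, 250),
    ("white, ages 12-14", 1946): (1605, 1368),
    ("white, ages 12-14", 1954): (1247, 1905),
    ("all, ages 6-8", 1946): (1991, 1754),
    ("all, ages 6-8", 1955): (1376, 2952),
}

# Difference taken as XVII Report figure minus the figure in the other reports.
differences = {key: xvii - other for key, (other, xvii) in sample_sizes.items()}

for (group, year), diff in differences.items():
    print(f"{group}, {year}: {diff:+d}")

# Share of the Negro children listed in the XVII Report who appear in the XV Report.
negro_share_1946 = 96 / 188
negro_share_1954 = 79 / 250
print(f"Negro children included: {negro_share_1946:.1%} (1946), {negro_share_1954:.1%} (1954)")
```

The same deficit of 237 appears in 1946 for both the white children aged twelve to fourteen and the combined group aged six to eight, while the later years show increases of 171, 658 and 1,576.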
Figure 5. Incompatible statements regarding the number of children inspected during the initial examinations in Evanston and its control city of Oak Park. Evanston statement A is from Blayney and Tucker (1948) and Hill et al. (1950). Statement B is from Hill et al. (1950, 1951, 1952, 1955, 1956, 1957a and 1958). Statement C is from Hill et al. (1957b). Oak Park statement D is from Blayney and Tucker (1948) and Hill et al. (1950), and statement E from Hill et al. (1958). See p. 211.
The third total sample size for Evanston in 1946 is shown in the XVII Report (Hill et al., 1957b). By totalling the figures in Table 10, it appears that 1,754 children aged six to eight years, and 1,556 aged twelve to fourteen years, were examined, a total of 3,310 subjects. From Tables 1 and 2 of Hill et al. (1958) it is deduced that a total of 2,051 children were examined in Oak Park in 1947 (see figure 5, p. 167).
Therefore, three very different sample sizes were given for the 1946 examination in Evanston: 4,375, 3,692 and 3,310; and two total sample sizes of 2,493 and 2,051 subjects examined in Oak Park in 1947. The smallest sample size for Evanston (3,310) was given in the XVII Report, despite the statement of the authors (Hill et al., 1957b) that "in this report no attempt has been made to limit the examinations to continuous resident children."
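The totals behind these incompatible statements can be reassembled from the component figures quoted above. The sketch below uses the statement letters A to E from the caption of figure 5; the dictionary layout is illustrative.

```python
# Evanston, 1946 examinations.
evanston = {
    "A": 4375,          # stated total (Blayney and Tucker, 1948; Hill et al., 1950)
    "B": 1991 + 1701,   # ages 6-8 plus ages 12-14 (Hill et al., 1950, 1951 and later)
    "C": 1754 + 1556,   # ages 6-8 plus ages 12-14 (XVII Report, Table 10)
}

# Oak Park, 1947 examinations.
oak_park = {
    "D": 2493,          # stated total (Blayney and Tucker, 1948; Hill et al., 1950)
    "E": 2051,          # deduced from Hill et al. (1958)
}

# Gap between the largest and smallest statement for each city.
evanston_spread = max(evanston.values()) - min(evanston.values())
oak_park_spread = max(oak_park.values()) - min(oak_park.values())

print("Evanston:", evanston, "spread:", evanston_spread)
print("Oak Park:", oak_park, "spread:", oak_park_spread)
```

The largest and smallest statements differ by 1,065 children for Evanston and by 442 for Oak Park.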
Remarkable changes in assessment of statistical significance. In the footnote to Table II in Hill et al. (1952) it was stated: "It should be noted that the caries rates per 100 children for the 6-8 year olds as a group shown in this report, vary slightly from those shown in previous reports." Although these were said to be slight variations, the remarkable fact emerges that, although based on the same data, the difference between the 1946 and the 1948 caries attack rates for the deciduous teeth of children of that age range, which was said to be statistically significant (the probability being given as 0.005) in the 1950 Report, was stated by the same authors, in 1952, to be "not statistically significant."
On reading the X Report (Hill et al., 1952), it appears that even more extraordinary changes of opinion with regard to the significance of results based on the same data occur in five comparisons between the rates of permanent teeth; significant differences (probability "0.0000") being altered to "not statistically significant." However, a correction (J. dent. Res., 31, 597) stated that the footnotes to Tables IV, V, VI, VII and VIII were incorrect, and that the statements: "Differences are not statistically significant" should have read "Differences are statistically significant". It is considered likely that the correction is incomplete, and that in the footnote to Table IX of that paper, the word "not" should be deleted. If this alteration is not made, that footnote indicates that the difference between the rates for 1946 and 1948 is "not statistically significant", although two years earlier, the difference computed from the same data was stated in the footnote to Table VI of Hill et al. (1950) to be significant (probability "0.0000").
At first sight, the employment of statistical terminology in the presentation of this study engenders confidence in the results reported, but the few examples which have been cited clearly indicate their unreliability.