Scales For Grading Social Conditions
Willis W. Clark
Department of Educational Research, Los Angeles
City Schools
Formerly Sociologist, California Bureau of Juvenile Research
Recent attempts to evaluate sociological data have been developing a method of research which may be expected to assist in social diagnosis. This method consists of scales and measurements for rating, grading, and evaluating social conditions. In many instances, these scales have developed in conjunction with, and are analogous to, the measurements which have been devised to test psychological development and educational progress.
Social rating scales are prepared for the purpose of providing a uniform and objective method of describing and determining the relative consequence of the factors under consideration. Any social situation which may be accurately described may be graded by means of a properly constructed scale. The method has already been applied to home conditions, neighborhood conditions, school buildings, moral character, leadership, juvenile offenses, isolation, social achievement, qualities of citizenship, etc. This article presents the criteria of useful scales and suggests briefly the technique in their construction.
An analysis of the factors involved in the condition to be graded is the first step in the preparation of a scale. Familiarization with important research studies and social surveys in related fields, field-work, and a careful outline-study of the problem are suggested as aids in this analysis. The social condition may then be classified by items which, taken as a whole, constitute the essential qualitative factors involved; e. g., necessities, neatness, size, parental conditions, and parental supervision have been considered the principal elements in home conditions. In certain cases it may not be necessary to sub-classify the data; e. g., a scale dealing with degree of truancy. The method of classification adopted should preferably be concurred in by three or more competent persons. It is not necessary that a scale contain every possible item related to the condition to be graded, but it should include enough items to allow any important differences in quality to influence the final score.
The most important feature of a scale is its provision for evaluating or grading the condition being studied, and it is in this function that most of our scales have their principal weakness. It is necessary that scales be valid, reliable, objective, and usable; the steps to be taken to secure these qualities are discussed in the following paragraphs.
A scale should measure the factors, abilities, or conditions which it is designed to measure. In order to obtain this feature it is necessary to secure sample descriptions which are representative of the entire quantitative range from poorest to best, least serious to most serious, most inferior to most superior, etc., for each of the qualitative items provided for in the classification. For example, in a scale for grading juvenile offenses,[1] the following samples were among those selected to provide a quantitative range for the item truancy:
1. Played hookey to attend a circus.
2. Played truant only on Fridays on which days he was required to memorize poetry.
3. Played truant intermittently for period of two years.
4. Frequently away from school and was finally transferred to a parental school because of truancy.
5. Brought before juvenile court three times in two years on account of truancy; will not go to school.
These descriptions should be in objective, non-relative terms in order to eliminate personal opinion and clarify meaning. The number of sample descriptions necessary depends entirely on the nature of the scale and might vary from a few to several hundred. For example, a scale for grading quality of associates would not call for as many samples as a scale for grading social achievement, or success record, which for adequate description would require data for all the various vocational, social, mental, physical, moral and spiritual factors involved.
If it were possible, the relative value of the various samples should be determined by intercorrelation with the true value of the conditions being graded. However, this method is seldom possible as the true value is usually unknown. The alternative is to secure ratings of the relative value or consequence of the various samples by persons competent to judge. Directions for rating, together with a clearly defined criterion (social consequence of offenses, quality of a home, legibility of handwriting) should accompany the samples when submitted for rating. The samples may be arranged in rank order if they are few in number or may be placed in groups the number of which would vary according to the degree of variability desired. Ordinarily a range of ten will permit satisfactory classification and correlation of samples.
The number of raters necessary to secure the desired reliability, determined by correlating the average ratings of one-half with the average ratings of the second half, will vary with the nature of the condition for which the scale is being prepared. If there is slight difference of opinion a small number of raters will be sufficient, while if there is a wide diversity of opinion a larger number will be required. The aim should be to secure self-correlation of +.90 or higher. The number of raters required may be determined by statistical formulae after a few ratings have been obtained and intercorrelated. Twenty raters of offenses of juvenile delinquents with an average standard deviation of 1.66 on a ten-point scale provided a correlation of +.927. Eighty raters on the same scale gave a correlation of +.970.
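The split-half check described above may be sketched as follows. This is a minimal illustration and not the writer's own computation: the function names and the layout of the rating table are hypothetical, and the Spearman-Brown formula is assumed here as one of the "statistical formulae" by which the required number of raters may be estimated; the text does not name the formula actually used.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation of two equal-length lists.

    Raises ZeroDivisionError if either list has no variation.
    """
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def split_half_reliability(ratings):
    """ratings[r][s] is the rating given by rater r to sample s (hypothetical layout).

    The raters are divided into a first and a second half, the ratings of
    each half are averaged sample by sample, and the two sets of averages
    are correlated.
    """
    half = len(ratings) // 2
    first, second = ratings[:half], ratings[half:]
    n_samples = len(ratings[0])
    avg_first = [mean(r[s] for r in first) for s in range(n_samples)]
    avg_second = [mean(r[s] for r in second) for s in range(n_samples)]
    return pearson(avg_first, avg_second)


def spearman_brown(r, factor):
    """Predicted correlation when the number of raters is multiplied by `factor`.

    Assumption: the Spearman-Brown formula is an adequate model here.
    """
    return factor * r / (1 + (factor - 1) * r)


def factor_needed(r, target):
    """Multiple of the present number of raters predicted to reach `target`."""
    return target * (1 - r) / (r * (1 - target))
```

If the +.927 reported for twenty raters is read as the reliability of that group, `spearman_brown(0.927, 4)` predicts roughly +.98 for eighty raters, somewhat above the +.970 actually obtained. The formula is an idealization and is best treated as a planning aid only.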
After the ratings have been obtained, the results should be tabulated and each sample description assigned a numerical value based on the average ratings as shown in Chart I. For most purposes this value may be taken as the score to be assigned the sample. However, in some cases it is desirable to compute probable error differences in merit between samples, thus arriving at scale values. A measure of variability such as standard deviation, quartile deviation, or probable error, should be obtained for each sample for the purpose of determining whether there is sufficient agreement among the raters to warrant its retention in the scale. In case the P. E. is not over one-fourth the range of the scale the sample may be safely retained.
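The tabulation just described may be illustrated by a short sketch. The function name, the returned field names, and the example ratings are hypothetical. Two assumptions are made and should be checked against the scale in hand: the probable error is taken as 0.6745 times the standard deviation, which holds for a normal distribution of ratings, and the range of a ten-point scale is taken as 10.

```python
from statistics import mean, pstdev

# Probable error as a multiple of the standard deviation (normal distribution).
PE_FACTOR = 0.6745


def summarize_sample(ratings, scale_range=10):
    """Scale value and variability for one sample description.

    `ratings` holds the ratings assigned to the sample by the several raters.
    `scale_range` is an assumption: 10 for a ten-point scale.
    """
    value = mean(ratings)        # average rating, taken as the score
    sd = pstdev(ratings)         # standard deviation of the ratings
    pe = PE_FACTOR * sd          # probable error
    return {
        "scale_value": value,
        "sd": sd,
        "probable_error": pe,
        # Retain when the P. E. is not over one-fourth the range of the scale.
        "retain": pe <= scale_range / 4,
    }


# Hypothetical ratings of two samples by six raters.
close_agreement = summarize_sample([4, 5, 4, 4, 5, 4])
wide_disagreement = summarize_sample([1, 10, 1, 10, 1, 10])
```

In this illustration the first sample is retained and the second is not, since the raters of the second sample divide between the two extremes of the scale.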
These samples, which may be considered standards, should be regrouped in order of scale value under the item classification to which they relate for the purpose of providing a standard score sheet. (See Chart I.) All samples having similar or equivalent score values should be assigned unit grades; e. g., samples having average score between 3.5 and 4.4 may be grouped as 4; between 4.5 and 5.4 as 5; etc. It may be necessary to eliminate certain of the original specimens in case they (1) are similar in content to other samples, (2) have a high degree of variability among ratings, or (3) have extreme rating values for a given grade.
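The grouping into unit grades may be sketched as below. The function names and the average scores attached to the truancy samples are hypothetical and serve only for illustration. Halves are rounded upward, so that 4.5 falls in grade 5 as in the example above; Python's built-in `round` would place 4.5 in grade 4 and is therefore avoided.

```python
from math import floor


def unit_grade(score):
    """Unit grade for an average score: 3.5 to 4.4 gives 4, 4.5 to 5.4 gives 5."""
    return floor(score + 0.5)


def score_sheet(samples):
    """Group (description, average score) pairs under their unit grades."""
    sheet = {}
    for description, score in samples:
        sheet.setdefault(unit_grade(score), []).append(description)
    return dict(sorted(sheet.items()))


# Hypothetical average scores for three of the truancy samples.
truancy = [
    ("Played hookey to attend a circus.", 1.8),
    ("Played truant intermittently for period of two years.", 4.5),
    ("Frequently away from school; transferred to a parental school.", 5.4),
]
sheet = score_sheet(truancy)
```

Here the second and third samples fall together under grade 5, and one of them might be eliminated if the two proved similar in content.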
The standard score sheet provides a uniform and objective method of evaluating the social condition under consideration. Data which have been accumulated may be compared with the standard samples and assigned values by one person which are approximately as reliable as if the data had been rated by the same number of persons as provided ratings for the standard samples. The average correlation obtained when three persons independently graded juvenile offenses by using a standard score sheet prepared by the writer was +.93. When similar specimen offenses were rated independently by the same persons an average correlation of only +.69 was obtained.
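An agreement figure of this kind may be computed as sketched below, on the assumption that "average correlation" means the mean of the correlations between each pair of graders; the text does not state the exact procedure. The function names and the grades shown are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def average_pairwise_correlation(grades_by_grader):
    """Mean correlation over every pair of graders.

    grades_by_grader[g][c] is the grade given by grader g to case c.
    """
    pairs = combinations(grades_by_grader, 2)
    return mean(pearson(a, b) for a, b in pairs)


# Hypothetical grades assigned by three persons to the same five offenses.
grades = [
    [2, 4, 5, 7, 9],
    [2, 5, 5, 6, 9],
    [3, 4, 6, 7, 8],
]
agreement = average_pairwise_correlation(grades)
```

The same function applied to grades assigned with and without the standard score sheet would give the two figures to be compared.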
In addition to furnishing a score for each of the items, scales for grading social conditions, as outlined above, provide a general index which is a numerical valuation of the relative quality, importance, or seriousness of the condition under investigation. This general index is obtained by adding together all of the item scores and is analogous to objective measures in current usage such as height, weight, temperature, mental age, etc. It is particularly valuable for analysis and correlation of interrelated social problems.
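The general index is a simple sum, as the following sketch shows. The item names are those given earlier for home conditions; the scores are hypothetical, and the function name is introduced only for illustration.

```python
def general_index(item_scores):
    """General index: the sum of all item scores for one case."""
    return sum(item_scores.values())


# Hypothetical item scores for one home.
home = {
    "necessities": 4,
    "neatness": 3,
    "size": 5,
    "parental conditions": 2,
    "parental supervision": 3,
}
index = general_index(home)  # 4 + 3 + 5 + 2 + 3 = 17
```

Indices so obtained for a number of cases may then be correlated with other measures, in the manner of any other numerical series.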
The usefulness of results obtained by applying scales for grading social conditions depends on the accuracy of the original data, and the use of such scales does not enable social case workers and research students to eliminate careful investigation. Rather, they foster the systematic collection of pertinent data and the keeping of accurate records and, in addition, aid in scientific interpretation and evaluation.
SELECTED REFERENCES
Clark, Willis W. Whittier Scale for Grading Juvenile Offenses. Whittier, Calif.: Bureau of Juvenile Research, Bulletin No. 11. 1922.
McCall, William A. How to Measure in Education. New York: Macmillan Co. 1922.
Rugg, Harold O. Application of Statistical Methods to Education. New York: Houghton Mifflin Co. 1916.
Thorndike, Edward L. An Introduction to the Theory of Mental and Social Measurements. New York: Teachers' College, Columbia University. 1913.
Williams, J. Harold. A Guide to the Grading of Homes. Whittier, Calif.: Bureau of Juvenile Research, Bulletin No. 7. 1918.