Measurement in Sociology
Floyd N. House
University of Virginia
American sociologists appear to agree that social science should be as quantitative as possible, that subjective phenomena can be measured only through objective indexes, that statistics can be used to verify or disqualify hypotheses, and that statistics may have great practical value; probably also that statistics may suggest some explanation. They agree also that non-quantitative methods should be used where quantitative methods have not yet been devised, but only there. The issues of the controversy are: (1) Can knowledge of social phenomena be completely reduced to quantitative expression? (2) Can we know other people except from behavioristic data? (3) Is there no ground of choice among research projects except the competence of their sponsors? The value of research depends partly on the need for knowledge for practical use; needed knowledge may be such as can be had only by non-quantitative methods. The term "science" may not be granted, in the long run, to non-quantitative knowledge, but in that case the aims of sociology should be stated to include other elements besides the results of scientific research. Scientific knowledge is built up from acquaintance knowledge, much more of which is needed before statistical inquiry is effective in some sociological problems. The data of physical science are taken from raw experience in such form that they can be measured, but to do this in social research practically destroys the character of the phenomena studied. Sociological knowledge is based, in part, on "insight," which is inference concerning what is not directly known to sense experience. Of four recent attempts at the measurement of social phenomena, one proves upon analysis not to be concerned with social phenomena in the strictest sense; the others are all attempts to measure attitudes. Attitudes are subjective, and it seems doubtful whether reliable knowledge of them can be had directly from objective indexes, or in highly quantitative form.
Rice's method of studying attitudes through the analysis of votes in actual elections is immune to certain criticisms, but is limited by the availability of data. It does not seem that quantitative techniques for studying attitudes have been such as to show how the knowledge of attitudes that is needed can be had except under certain favorable conditions.
The problems of measurement in sociology can be divided roughly into technical problems and logical or epistemological problems. The former are involved in the execution of operations of measurement and the calculation of correlations or other recondite mathematical inferences from the gross results. The latter are concerned with the nature of the phenomena to be measured, the determination of indexes of social phenomena which do not appear to be subject to direct measurement, and the capacity of measurements to provide adequate knowledge of the things studied. It is with problems of the latter sort that this paper is concerned.
There is, ostensibly, a fairly wide and even general agreement among American sociologists concerning the use of quantitative methods. Let us start, then, by enumerating some of the points of agreement. First, it seems to be agreed that it is inherent in the nature of all science, including social science, to seek to reduce its findings or conclusions, so far as may be, to quantitative formulation. Second, it is agreed that subjective phenomena can be measured, if at all, only indirectly, through objective indexes. Third, it is agreed that statistics afford verification or disqualification of sociological hypotheses. Fourth, no one denies that measures of the frequency of social phenomena are of great practical value; they tell us on what scale we have to be prepared to deal with those phenomena, and whether they are decreasing or increasing. Probably a majority of our fraternity agree also, fifth, that a body of statistical data, properly tabulated and correlated, frequently suggests some explanation of the facts in question.
As to the limitations of quantitative measurement, its most enthusiastic advocates seem to agree that, where an adequate method of quantification has not yet been devised for the study of some kind of phenomena of sociological interest, we may be permitted or even encouraged to do what we can to illuminate the matter by non-quantitative studies. In short, no one wishes to rule out non-quantitative methods from sociology altogether, although it seems that there are those who wish to see what they term "qualitative research" restricted to the smallest possible dimensions, quantification being taken as a mark of respectability and maturity of sociological wisdom.
If, now, there are some rather sharp differences of opinion among us concerning the use and limitations of quantitative measurement in sociology, what are the issues of this controversy? At least three can be defined: First, can knowledge of social phenomena be completely reduced to quantitative formulation? The issue, as I see it, concerns the possibility of resolving the things in which we are interested as sociologists into more ultimate factors in such a way that the final result of the analysis can be stated in purely quantitative terms. Must not the quantities always be quantities of something? It may be, of course, that our qualitative terms should ultimately give way to purely denotative symbols. But this is not the same thing as the resolution of qualitative into quantitative differences.
A second issue may be briefly expressed as follows: Is the source of our knowledge of other people's behavior to be found in that behavior and nowhere else? In other words, can we have reliable knowledge of other people except from "behavioristic" data? Lundberg, Bain, and most of the proponents of quantitative measurement of attitude would apparently answer these questions in the negative.
Finally, a third issue is one of research policy. Is it a matter of indifference which research projects we push first and most strongly? This is a real issue. If it be true that the choice of projects to be indorsed, supported with funds, and otherwise encouraged by our fraternity is of little importance, provided only that they are undertaken by men who are competent in their line, then other controversy concerning research methods is pointless. The obvious solution of all difficulties is to say, simply, "All methods are good; let each use the methods that appeal to him and choose projects to be indorsed solely on the basis of our judgment of the competence of their sponsors." If, on the other hand, it can be shown to be true that the varying competence of the men who execute researches is not the only difference of value between their projects, then evidently the problem of the use and limitations of quantitative methods in sociological research has more than academic significance. It is an assumption of this paper that the quest for scientific sociological knowledge derives its sanction in the last analysis from the need for guidance in practical human affairs, and that, accordingly, choices of research projects to be supported should be based upon the existing needs for different kinds of sociological knowledge. This involves the possibility that some of the kinds of knowledge needed may be such as can be developed only by non-quantitative methods.
The questions with which we are here concerned turn, in part, upon the concept of scientific knowledge, and the process by which it is created or established. According to one view, scientific knowledge is, first, derived solely from sense data and, second, essentially quantitative. A plausible case can be made out in support of this view; it is set forth clearly in a number of standard treatises on scientific method and is more or less familiar. It is possible, however, to entertain a somewhat different conception of scientific knowledge, or at any rate of the knowledge that is available and somewhat reliable for guidance in human affairs. The discussion of the matter is complicated by the common practice of using the terms "science" and "scientific" as epithets—evaluative terms referring to the only worth-while form of human knowledge. It is, of course, in the end a matter of indifference whether the term "science" is used broadly or narrowly, provided an evaluative judgment is not linked to the descriptive meaning of the term. If the majority is unwilling to call non-quantitative types of knowledge "scientific," then such forms of knowledge will in the long run receive other designations; but in that case it is a thesis of this paper that the aims of sociology should not be stated solely in terms of scientific research.
In the following paragraphs I shall seek to clarify the points at issue by a necessarily brief discussion of certain fundamental aspects of human knowledge, particularly knowledge of human society, and the process of its development.
If we examine the common-sense knowledge by which men so largely guide themselves in everyday life, and from which, as a point of departure, they develop more recondite forms of knowledge, we find that it all starts with, and rests upon, what has been termed "acquaintance knowledge." We may not properly be said to "know" anything in any useful sense unless we are acquainted with it, and acquaintance involves some insight into causation or process, as well as mere external apprehension based simply on sense experience. When we are acquainted with anything, or any person, we have some idea what behavior to expect from that person or thing. Scientific propositions must contain only terms with which we have some acquaintance, or knowledge derived from acquaintance knowledge; any other terms can have no real meaning for us. This suggests, among other things, that in present-day sociology we need to build up a great deal more acquaintance knowledge than we now have, concerning some of the things in which we are interested, before we can to advantage do much laborious or expensive statistical work on those problems.
The matter may be further clarified by a brief consideration of the nature of scientific data. Dewey makes the pertinent suggestion that the effect of the scientific or experimental method of studying things is to "substitute data for objects." In his conception, data are elements which we take from the objects of common sense, as means to further knowledge. In other words, data, in the sense in which the term is used in science, are not the stuff of acquaintance knowledge, but are objects constructed from that stuff for the purposes of further inquiry. A part of the purpose of scientific inquiry, he says, is to perform operations of measurement upon the data, hence they are so taken, or constructed, that they will lend themselves to measurement. But such resolution of objects into data which can be handled by numerical calculation does not imply that the objects of acquaintance knowledge are those measurable elements, or that they are composed of such elements. Dewey contends that it is just here that the physical and social sciences part company. Physical scientists are interested in doing things which are not precluded by the radical abstraction that is involved in the reduction of the objects of experience to numerical data. But in the social and humanistic studies the case is different; the use of measurable data only brings about a reduction of the actual stuff of experience to the physical. Park has expressed the same thought in a passage in which he says that "statisticians have applied their technique to social phenomena as if the social sciences did not exist, or as if they were mere compendiums of common sense." Sociological knowledge must be based on data, to be sure, but as Dewey so suggestively puts it, data are somewhat unfortunately named; they are in one sense "givens," but they are also "takens," and what must be taken from the raw materials of experience for the purposes of formulating sociological knowledge is, in part, a kind of elements which do not readily lend themselves to enumeration or measurement, though of course no complete repudiation of measurement is necessarily involved.
A part of the knowledge that we need for sociological purposes consists of what we variously call "insight" or "understanding." As Dewey has said, insight, as distinguished from sight, involves inferences regarding what is not seen, nor, we may add, otherwise known to sense experience. Perhaps we can go a step farther and mention a source of insight which Dewey seems reluctant to credit, namely, one's own introspection. This factor in the development of sociological knowledge has been described in the late Charles Horton Cooley's discussion of "spatial knowledge" and "social knowledge." He accepts quite candidly the characterization of this "social knowledge" as, in one sense of the term, subjective; in fact he refers to the process by which it is generated as "sympathetic introspection." But he points out that in this respect the distinction between social knowledge and spatial or material knowledge is at most one of degree only. As I understand him, Professor Lundberg, who has been one of the more vigorous proponents of quantification in sociological research, follows the reasoning of Cooley without difficulty but reaches a different conclusion. The inference he would draw is, let us define our terms as objectively as we can, and make as accurate measurements of the factors so defined as are possible. In one passage of an unpublished seminar paper which he has kindly furnished me, Lundberg seems to take issue with the proposition that knowledge of social phenomena is gained in part through other means than sense experience, which would be a flat refusal to accept the reasoning of Cooley and others were it not qualified by the remark that the sense experience is "conceptualized and organized into the patterns determined by our neuro-muscular system as conditioned by the culture in which we have lived, and now live." I find this terminology somewhat obscure, not to say awkward, but I take it that Lundberg means to concede the point made by Cooley in terms of "sympathetic introspection," "visualization," and "dramatization." If so, the difference seems to resolve itself into one of emphasis and terminology, but may be important for all that.
In the remainder of this discussion I shall try to illustrate briefly some of the considerations to which I have sought to call attention in the foregoing, by means of a necessarily brief examination of four experimental attempts to measure phenomena of sociological interest, namely, (1) Professor F. Stuart Chapin's living-room scale, (2) Professor E. S. Bogardus' technique for the measurement of "social distance," (3) Professor L. L. Thurstone's scales for the measurement of attitudes, and (4) Professor Stuart A. Rice's experiments in the measurement of mass attitudes in politics through the analysis of the votes cast in actual elections.
These four experiments fall into one class to the extent that they all undertake to secure measurement of social phenomena rather than mere statistical enumeration. When they are examined closely, however, and with particular reference to the nature of the phenomena which they undertake to measure, they fall into two categories. Bogardus, Thurstone, and Rice are all trying to measure attitudes. Chapin's living-room scale, on the other hand, seems to be designed to measure phenomena of the sort to which Cooley refers when he says that some facts commonly regarded as social are also material events, like marriage, and hence they can be precisely observed and enumerated. Chapin does not entirely evade the question of the adequacy of his scale as an index or measure of an intangible something called "socio-economic status," but he does not dwell upon this aspect of his investigation. In any case, one may ask whether socio-economic status, as Chapin employs the term, is not in the last analysis a matter of material and pecuniary differences between families. Material and pecuniary differences are, almost by definition, measurable; however, status, in a different sense of the term, is something constituted by the attitudes of other people toward the person or family in question, and the measurement of status in this sense would be another problem of measuring attitudes. Since Chapin is obviously making no attempt to measure status in the latter sense, we may dismiss his living-room scale without further comment as one that proves, upon analysis, to fall outside the scope of this paper.
As has been said, the other three experiments in sociological measurement referred to have in common the character of attempts to measure attitudes, or something closely related to attitudes. Attitudes are important to the sociologist, for the behavior of people is largely determined by what they think other people think and intend; in other words, social behavior is largely a process of the interaction of attitudes. In so far as social behavior displays any consistency at all (in so far, in other words, as it can be made the object-matter of a science), it is due to the relative stability of human attitudes. But an attitude is subjective, as is conceded by nearly all writers. Faris has pointed out that an attitude is subjective at least in the sense that one may be said to have a certain attitude "in between times," when he is not visibly acting upon it. Bain, however, contends that "feelings, sentiments, tendencies to act, wishes, attitudes, and so on, mean nothing, and worse than nothing, unless they are interpreted as overt behavior of some kind." From the point of view so indicated, he criticizes the use of verbal responses to questions as indexes to attitudes, and points out that there is little evidence to show that such responses are correlated with the overt behavior of the subjects. From such controversy as this, at least two fundamental questions emerge: (1) What are attitudes? and (2) By what means can reliable knowledge of attitudes be had? A third question is conditioned upon the answer to the second: Can knowledge of attitudes be reduced to fairly precise quantitative form? It is conceivable that valid knowledge of human attitudes may be had only by such methods of inquiry as will preclude establishment of accurate quantitative measurements. Attitudes, for example, may be so deep-seated in the personality that neither verbal responses to questions, marking of scales, and the like, nor overt acts which can be observed in some simple way, will serve as reliable indexes; the behavior of the person, verbal and nonverbal, may have to be studied extensively and over a rather long period of time before an investigator can have reasonably certain knowledge of that person's attitudes.
Bogardus does not deal with the question of the reliability of his technique for the measurement of social distance except by tests for the internal consistency of results. Thurstone, on the other hand, indicates frankly the possibility that expressions of "opinion" may not measure real attitudes, but contends that overt acts are no better index. He points out, however, that verbal reactions have at least this significance: they enable us to measure the attitudes which the subjects wish to make people believe that they have. This seems to be a point well taken, though Thurstone does not seem to be particularly concerned to interpret his findings in the way that this comment would suggest. He does, however, state a working limitation of his technique, namely, that an attitude scale is used only in situations in which one may reasonably expect people to tell the truth about their opinions. From this it appears that Thurstone's attitude scales, by his own account, are limited to cases in which, from criteria not ascertainable by the technique itself, we judge that verbal expressions, or the marking of one's preference among such expressions, actually measure the underlying attitude. There remains the task of dealing with attitudes which may be of great social significance, but which do not fall within the class so defined; also the task of objectifying the criteria by which one can judge whether, in a given case, the subjects tell the truth.
It is in response to such needs and difficulties as these that Rice has offered, as a partial solution, his methods for the measurement of attitudes by the analysis of the votes cast in actual elections. Rice's method tends to meet the objection, too, that in such techniques for the measurement of attitude as have been developed by Bogardus and Thurstone those who serve as subjects do not have the feeling that there is anything at stake in their answers to questions or marking of scales. At least, in purely experimental studies of this type, it is well nigh impossible to ascertain what consequences of his expression of opinion the subject may have in view. Rice's technique is relatively immune to such criticisms, for he takes as his data the returns of actual political elections. The data available for correlation with the votes cast in such elections are, however, none too abundant, and this is, accordingly, a case where the investigation is limited by the availability of data rather than by the nature of the interests to be served, unless one permits himself to go outside the framework of his quantitative procedure and seek to illuminate his findings by non-quantitative evidence and methods. On the whole, however, one cannot help being impressed favorably by this method. It appears to be a technique that may profitably be used and adapted as widely as available data can be found. It should be possible to discover many situations in which people indicate their attitudes by significant acts which can be enumerated and statistically correlated with other data.
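The kind of statistical correlation referred to here can be illustrated with a minimal sketch. It assumes that the investigator holds, for each of several districts, the share of the vote cast for some measure and one other enumerated index. The district figures, the variable names, and the choice of the Pearson product-moment coefficient are assumptions made for illustration only; they are not drawn from Rice's studies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a series with no variation has no defined correlation")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical figures for five districts (illustrative only, not real returns).
vote_share_for_measure = [0.42, 0.55, 0.61, 0.38, 0.70]
share_of_rural_residents = [0.30, 0.45, 0.50, 0.25, 0.65]

r = pearson_r(vote_share_for_measure, share_of_rural_residents)
print(f"r = {r:.3f}")
```

A coefficient near +1 or -1 would indicate only that the vote and the index vary together across districts; whether the index stands for the attitude in question remains the logical problem discussed in this paper.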
Do not quantitative techniques, of the more refined and critical sorts, in the very nature of their operations yield some criteria of the validity of the indexes of which they make use? Bain contends, by implication, at least, that they do not; he holds that the consistency of response to questionnaires and similar tests of attitude is no proof that such responses will be consistent with overt action. In the most recent of Thurstone's publications on this subject which I have seen, however, he cites a study of nationality preference by Eggan, similar to Bogardus' studies of social distance, in which it was found that the subjects would continue to express consistent preferential judgments of discrimination between nationalities after they had forgotten the exact question at the beginning of the schedule. Thurstone now defines attitude as the degree of affect about a psychological object, and contends that the attitude scale does measure this affect, and not simply surface rationalizations of it. Apparently, however, this claim must still be qualified by the stipulation that the subject shall have no strong reason to conceal or misrepresent his attitude. In most situations, people probably do not feel called upon to make any secret of their attitude of prejudice against certain races and nationalities; indeed, they are rather proud of such attitudes. If we need knowledge concerning attitudes which people are in the habit of concealing from others and even from themselves, for example, some of their sex attitudes, or their evaluations of themselves in comparison with other members of their own groups, we might find the verbal attitude scale less reliable. Possibly Thurstone's tests for consistency and relevancy would expose the limitations of such measurements, but I do not see that they would give any indication of a way by which the knowledge desired could be more reliably secured in quantitative form.
Concerning the desirability of establishing a body of knowledge about human attitudes and their formation, change, and operation, there is scarcely any difference of opinion among contemporary American sociologists. Nor will there be many to dispute the proposition that it is desirable to have such knowledge in quantitative form so far as possible. The crux of the matter seems to be contained in two questions: (1) Do the techniques for the measurement of attitudes which have been presented for our consideration up to now promise to afford with a fair degree of validity knowledge of all the different kinds of attitudes in which we, as sociologists, are interested? and (2) Is it good research policy to allow our inquiries to be directed and limited, to a large degree, by the availability of data suitable for quantitative treatment?