Methods Used for Measuring Public Opinion
Daniel D. Droba
Ohio State University
Five methods that have been used by investigators for measuring opinion about various public issues have been selected for review. The method of construction was chosen as a basis for classification. (1) The questionnaire method consists of a series of unscaled questions or statements selected by a few judges to represent the opinions. (2) In the ranking method a number of items representing either the object of opinion or the opinion itself is arranged in rank order. (3) The rating method refers to self-ratings or ratings by others on an arbitrary scale with respect to a certain opinion. (4) In the method of paired comparison two items of a pair of words, phrases, or sentences representing the opinion are compared by the subject. He is asked to indicate which of the two items is preferable. (5) The main principle involved in the method of equal appearing intervals is that statements representing the opinions are sorted into a number of piles, say 9 or 11, according to the degree of opinion expressed in the statements. Arrangement of the piles is such that the differences between the piles appear to the subject approximately equal.
Interest in opinion or attitude measurement seems to be growing steadily. Literature on this topic is being enlarged from year to year, and the number of requests for instruments for measuring public opinion is increasing. It seems appropriate that a summary of the methods used be made, partly for those who are interested in such methods in a general way, and partly for those who are interested in doing research in this field and are looking for one or more methods of measurement.
The methods chosen for review will somewhat depend on the definition of opinion or attitude. Unfortunately, however, there is wide disagreement as to what opinions and attitudes are. This paper does not permit a discussion of the nature of attitude, and no definition will here be submitted. However, in order to avoid too much misunderstanding, the writer will limit himself to opinions on public issues or public opinion. Only methods used for measuring opinions on such topics as prohibition, nationalities and races, war, politics, will be considered in this paper.
The classification of methods may be made by several criteria. One may classify on the basis of scoring the tests, according to the method of administering the test, or according to the method of construction. In this summary the last is used as a criterion for classifying the methods, on the assumption that the method of construction is a more important characteristic of a test than its method of administration or the method of scoring.
No complete bibliography is included. Bain has already given us a good service in this direction. Although he seems to extend his list beyond the so-called "attitudes" and "opinions," his report will suffice for some time. Selection of the illustrative samples here included is rather arbitrary, being guided perhaps by the type of opinion studied, the thoroughness of the investigation, and personal preference. About each investigation an answer to each of the following seven questions is given if reported by the author: purpose, method of construction, contents, administration, scoring, reliability, and validity.
In this paper only methods already used are described. Five methods have been employed by the various investigators in this field. These are: the questionnaire method, the ranking method, the rating method, the paired comparison method, and the method of equal appearing intervals.
1. THE QUESTIONNAIRE METHOD
The fundamental procedure in the questionnaire method is a series of questions or statements selected by a few judges to represent the opinions. The statements are not scaled. Sometimes they are divided into two groups to represent the favorable and unfavorable opinions. Four investigations illustrate this method.
Harper has made an attempt to measure Conservatism-Liberalism-Radicalism of American educators about various beliefs and public issues. Forty-one judges, doctors of philosophy or highly selected educators nearing that degree, were asked to pass judgments on 71 statements regarding the conservatism and radicalism of the statements. If the judge expected that a larger per cent of the conservatives than of the radicals would agree with the statement, he marked it with a "C." If he expected that a larger per cent of radicals would agree with the statement, he marked it with an "R." Twenty-five statements were marked by an "R," the rest by a "C." An average agreement of over 98 per cent was found among the judges.
The questionnaire, when given to 3000 educators, consisted of 71 statements such as "World conditions seem now to insure enduring peace among the nations," or "The power of huge fortunes in this country endangers democracy." The giving of the 71 statements takes 30 minutes. The subject is instructed to mark the statement with a plus sign if he agrees with it more fully than he disagrees. If the subject disagrees with the statement more fully than he agrees, he is asked to mark the statement with a minus sign. The raw score is the number of radical statements marked plus. The raw scores are transmuted into scaled scores ranging between 0 and 80 based on a distribution of attitudes of 675 representative educators.
The reliability of the questionnaire is as follows: Correlations between scores on halves of the questionnaire are .75, .78, and .81 for three different groups. Correlation between scores obtained from the questionnaire given for the first time and scores obtained from the questionnaire given three weeks later was found to be .90. To obtain a check on the inconsistency of marking the statements, 29 judges were asked to pass judgments upon the consistency of marking 30 groups of statements. If the first statement in a group was marked with a plus sign, the judge was instructed to mark the other statements in the group with a sign consistent with the first mark. The score of inconsistency was the number of statements marked according to the finding of the judges, plus one-third of that number, added to correct for the average number of inconsistencies avoidable through guessing.
Watson measured the attitudes of Occident toward Orient or opinions of Americans about China, Japan, and other Eastern nations. Statements representing the opinions were first formulated by Watson and Mr. Keeney. Then about 12 Americans and Orientals were asked to pass judgments on them. The resulting 300 items were criticized by 20 competent judges. One hundred best items were selected on the basis of frequency of choice by judges, statement containing a single idea, lack of ambiguity, popular language, and balance of items among different countries concerned, different issues, and the radical and conservative positions. Two sample statements are: "Japan's attitude in her relation with the United States in the last five years has been finer than our attitude toward her," and "We should be willing to let American investments in China be lost rather than be drawn into armed conflict in China."
The giving of the first part of the questionnaire takes 15 minutes, the giving of the second part 30 minutes. The subject is asked to check one of the five answers: absolutely true, probably true, doubtful, probably false, absolutely false. Scores are expressed in terms of percentages of the five answers, and profiles of opinion were plotted for each of the various American groups.
Neumann measured twelve types of international attitudes of high-school students such as racialism, nationalism, imperialism, and militarism. In constructing the questionnaire, the indicators or verbal statements were criticized by a seminar and the twelfth grade of a high school. The indicators used may be illustrated as follows: "Japan has demonstrated by her rapid rise to power that the yellow race is the equal of the white race" (racialism); "The United States has not always treated small nations justly" (nationalism); and "America ought to join heartily in international efforts to bring about disarmament" (militarism).
Two methods of marking were used. In the first part of the questionnaire a modification of Hart's method was applied. First, all statements with which the subject did not agree were marked by a minus sign. Statements with which he agreed were marked by plus signs. Ambiguous statements or statements which he did not know anything about were marked by a question mark. Then he went over the list of statements again and underlined those with which he most strongly disagreed or agreed. After this he read the underlined statements the third time and double underlined those with which he agreed or disagreed the most strongly of all. This method allows seven types of responses.
In the second part of the questionnaire each statement was marked by one of the five answers: R+, R, ?, W, and W+. For the purpose of scoring, number 2 was assigned to W+, number 3 to W, and so on. In the first part of the questionnaire, scores ranged from 0 to 8, the double underlined minus having a 0 and the double underlined plus having an 8. An individual score is the average of the values of responses to all the statements.
Zeleny measured social opinions of students. Her statements were phrased both in "forward" and "reverse" manner. Only those were finally used in testing that were consistently answered in both forward and reverse order. The statements were submitted to seven faculty members for criticism, and finally 34 were retained in two forms, making a total of 68 statements such as: "True patriots are always loyal to their political parties" (forward), "True patriots are sometimes disloyal to their political parties" (reverse), and "There should be a minimum wage law" (forward), "Minimum wage laws are unwise" (reverse).
Each statement was to be marked either true or false by underlining one of the phrases. If the subject is unable to express an opinion, he is instructed to draw no line. An individual score is the total number of right answers. The reliability of the questionnaire is .89.
2. THE RANKING METHOD
Two ranking methods may be distinguished. In the first type of ranking method the subject is asked to arrange in order a number of items, for example nationalities, representing the objects or issues toward or against which the attitude is directed. The arrangement is based on the degree of opinion or attitude with reference to the object.
In the second type of ranking method, items to be arranged in order do not represent the object or issue toward or against which the attitude is directed, but represent rather expressions of the attitude itself. For example, statements representing different degrees of "wetness" and "dryness" on the prohibition question are to be arranged on a scale running from the extremely wet statements through the neutral to the extremely dry statements. The arrangement of statements in the order of merit is again based on the degree of opinion or attitude with reference to the object.
The first type of ranking method was used by Bogardus in a study of the origins of social distance. Subjects were asked to classify a list of racial and language groups in three columns. In column 1 those races were to be put toward which a friendly feeling was felt; in column 2 the races toward which a feeling of neutrality was experienced; and in column 3 the races whose mention aroused feelings of antipathy and dislike. Each person was then requested to rearrange the three columns: in column 1 were to be put first those races toward which the greatest degree of friendliness was felt and the others in order. Column 2 was to be started off with the races toward which the most nearly perfect degree of neutrality was experienced, and so on. In column 3 were to be put first those races toward which the greatest antipathy was experienced and the others in order of decreasing antipathy. The list of races studied included Canadians, Czechoslovaks, Germans, Russians, Englishmen, and the like.
Allport and Hartman used the second type of method to measure the attitude of conservatism, liberalism, radicalism, and reactionism toward seven issues: the League of Nations, qualifications of President Coolidge, distribution of wealth, the legislative control of the Supreme Court, prohibition, Ku Klux Klan, and graft in politics. Statements about the seven issues were selected from the written descriptions of opinion of 60 students. Each statement was then ranked by six judges according to the degree of attitude expressed in it, and from these results seven tests were constructed. Samples of statements used are: "We should join the League with full responsibility to prevent aggression, but should first obtain sanction for this step by a popular referendum vote," and "A two-thirds decision on the part of the Supreme Court should be necessary in order to declare a law passed by Congress unconstitutional."
In administering the tests, the subjects were instructed to check the one statement (in the blank space in front of the statement) which most nearly coincided with their own view. For scoring purposes, each statement was assigned a number in order, and an individual score was the number of the statement checked.
Another example of the second type of ranking method is a recent investigation made by Gordon W. Allport. The purpose of the study was to measure political attitudes. A number of statements expressing political opinion were assembled first. Twenty-five professors of social science were then asked to rank the statements into four groups according to the degree of radicalism-conservatism revealed in them. In group 1 were put the most radical statements, in group 2 the less radical statements, in group 3 the less conservative ones, and in group 4 the most conservative statements. Other statements were ranked by the same twenty-five professors on a scale of 0, 1, and 2 indicating no prejudice, slight prejudice, and considerable prejudice, respectively.
The whole test, when ready to use, consisted of seven pages including statements of attitude, statements to detect amount of information and misinformation, and items to detect amount of prejudice about political questions. E.g., "Not so much public ownership as at present should be practiced," "No more public ownership than at present should be practiced," (statements of attitude); "The cultural background of Smith and his family disqualifies him for presidency" (statement of prejudice).
The administration of the test consists of checking that statement with which the subject is most in sympathy. For the purpose of scoring, each statement of attitude was assigned a value of 6 so that the range of scores is from 6 to 24. The individual score is the average of values of all the checked statements.
An elaborate statistical technique was applied to the second type of ranking method by Thurstone. He submitted Allport-Hartman's thirteen statements on prohibition to two hundred subjects for ranking. Then for each possible pair of statements (78) the proportion of the two hundred judges who considered one of the statements more strongly in favor of prohibition than the other was determined. The proportion for each such pair is expressed as $p_{B>A}$ or $p_{B<A}$. The standard deviation of the distribution of proportions is $\sigma_{B-A} = \sqrt{\sigma_B^2 + \sigma_A^2}$, $\sigma$ being the standard deviation of judgments about one statement, otherwise called the discriminal error. The discriminal errors for each pair of stimuli are considered equal. We get, therefore, from the above equation $\sigma_{B-A} = \sigma\sqrt{2}$. The discriminal error, $\sigma$, is chosen as the unit of measurement, hence $\sigma_{B-A} = \sqrt{2}$. The difference between the statements A and B will then be $S_B - S_A = x_{BA}\sqrt{2}$, where $x_{BA}$ is the sigma value corresponding to the proportion $p_{B>A}$.
After the statements were arranged in rank order on the basis of the proportions of judgments, the scale distances were determined from the above equation between $S_A$ and $S_B$, $S_B$ and $S_C$, and so on. From these scale separations are calculated the final scale values for each of the statements. The obtained spacing of the statements according to this technique differed markedly from the one obtained by Allport-Hartman. Some of the statements were found to be bunched in one place, others were far apart, and there was no statement to correspond to the neutral position.
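The chaining of adjacent scale separations described above may be illustrated by a short computational sketch. It is offered only as an illustration under stated assumptions and not as Thurstone's procedure in full detail: the function name `case_v_scale`, the statement labels, and the proportions are hypothetical, and the sigma value $x_{BA}$ is taken as the unit-normal deviate corresponding to the observed proportion.

```python
from math import sqrt
from statistics import NormalDist


def case_v_scale(order, prop):
    """Chain adjacent scale separations into scale values.

    order -- statements already arranged in rank order (lowest first)
    prop  -- prop[(a, b)] is the proportion of judges who placed b above a;
             every proportion must lie strictly between 0 and 1
    The discriminal error is the unit of measurement (assumed equal for
    all statements), so each separation is x_BA * sqrt(2).
    """
    inverse_normal = NormalDist().inv_cdf  # proportion -> sigma value
    scale = {order[0]: 0.0}                # lowest statement taken as origin
    for a, b in zip(order, order[1:]):
        x_ba = inverse_normal(prop[(a, b)])
        scale[b] = scale[a] + x_ba * sqrt(2)
    return scale


# Hypothetical proportions for three statements, for illustration only.
example = case_v_scale(
    ["A", "B", "C"],
    {("A", "B"): 0.70, ("B", "C"): 0.90},
)
```

A large proportion yields a wide separation and a proportion near .50 yields a narrow one, which is how statements may come out bunched in one place and far apart in another.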
3. THE RATING METHOD
There are two types of rating methods. The first type of rating method is called the self-rating method in which the subjects rate themselves with reference to an attitude or opinion. The second type of method is called the rating-by-others method in which the attitudes of persons are rated by their friends or acquaintances who have a definite knowledge of those attitudes. In both forms of the rating method, however, degrees of attitude or opinion are represented along a line, with steps indicated by descriptive words or phrases, or statements. The subject checks the phrase or statement which he thinks most nearly represents his or his friend's attitude.
The self-rating scale was used by Rice. His scale consisted of eight steps and four descriptive words: "Radicalism," "Liberalism," "Conservatism," and "Reactionism." The scale was intended to measure attitudes toward existing social conditions. If the subject thought he was a liberal he was instructed to put an "X" above the middle of the word "Liberalism," if he judged himself to be a radical liberal he was instructed to place an "X" above the left half of the term "Liberalism," and so on. Results were expressed in terms of frequencies of judgments for each of the eight steps on the scale.
A graphic rating scale was used by Thurstone and Chave in the measurement of attitudes toward the church. The rating scale was thought by the authors to be merely an incident to the method of equal appearing intervals used in constructing the statement scale. The graphic rating scale consisted of a horizontal line across the title page. At one end of the line was printed the phrase "Strongly favorable to the church," at the middle of the line was printed the word "Neutral," and at the other end of the line there was printed the phrase "Strongly against the church." The subject was instructed to indicate by a cross where he estimated his own attitude to be. A correlation was calculated between the scores on the constructed statement scale and the tenth of the line in which the self-rating check occurred and was found to be .67.
Droba used the self-rating scale to measure attitudes toward war. Scores on this scale were used as a possible criterion in calculating the validity of a statement scale. The self-rating scale consisted of a line on the bottom of the page on which the statement scale was printed. Degrees of attitude were designated both by phrases and by numbers. On the extreme left end of the line the word "Militarism" was printed, on the extreme right end of the line the word "Pacifism," and in the middle range the word "Neutral" was printed. Below the line, numbers were spaced equally from 0 to 21. The subject was asked to locate his attitude on the scale by placing a cross above the number that most nearly represented his attitude toward war. The correlation between the scores on the statement scale and the scores on the graphic self-rating scale was found to be .75.
The method of rating by others was used by Porter in studying student opinion on war. It was employed to ascertain the scale values of each of the five types of responses to a questionnaire. To each of the 150 statements about war, included in his questionnaire, five types of responses were allowed: "certainly right," "probably right," "doubtful," "probably wrong," and "certainly wrong."
In order to ascertain the amount of militarism in each statement and in each possible answer to it, Porter submitted the questionnaire to 100 students whose convictions on the issue were known to their friends. These 100 students were rated by 3 to 13 judges on a scale running in steps from an extreme anti-militarism of 0 to an extreme militarism of 10, the neutral being at 5. From the data thus obtained, a scatter diagram was prepared for each of the statements. The five answers were represented on the base line and the rating scale of ten degrees on the y axis. For a person who was rated 4 and answered the statement "doubtful," a check mark was placed in the appropriate square in the diagram. This procedure was followed until 100 check marks, one for each person, constituted the scatter diagram.
The calculation of scale values for each of the five answers to a statement was as follows: The numbers that were found in the column above an answer in the diagram were averaged. The number thus obtained was an average of ratings of all the persons who gave that particular answer. This average rating was assigned as the scale value of that particular answer. As a result, a range of gross total scores ranging between 288 and 756 was obtained. These gross scores were then reduced to final scores ranging from 0 to 10.
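The averaging step just described can be sketched in a few lines. The sketch is illustrative only: the function name `answer_scale_values` and the sample records are hypothetical, and each record is assumed to pair one student's answer to a single statement with the rating that student received from the judges.

```python
from collections import defaultdict


def answer_scale_values(records):
    """Average the ratings of all persons who gave each answer.

    records -- iterable of (answer, rating) pairs, one per rated student,
               all referring to the same statement
    Returns a dict mapping each answer that occurred to its scale value.
    """
    ratings_by_answer = defaultdict(list)
    for answer, rating in records:
        ratings_by_answer[answer].append(rating)
    return {
        answer: sum(ratings) / len(ratings)
        for answer, ratings in ratings_by_answer.items()
    }


# Hypothetical records for one statement (ratings run from 0 to 10).
sample = [
    ("doubtful", 4),
    ("doubtful", 6),
    ("certainly right", 9),
    ("probably wrong", 2),
]
values = answer_scale_values(sample)
```

An answer that no rated student gave receives no scale value in this sketch; how such cases were treated is not reported.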
Bogardus measured social distance, or "degrees and grades of understanding and feeling that persons experience regarding each other" by the method of ratings by others. Seven steps were designated on the scale by the following phrases: to close kinship by marriage, to my club as personal chums, to my street as neighbors, to employment in my occupation in my country, as visitors only to my country, would exclude from my country. The objects of attitude or opinion or social distance were races or nationalities. Each subject was instructed to place a cross under the classifications to which he would willingly admit members of the races concerned. The social contact range index (S.C.R.) is expressed by the number of classifications to which a race is admitted. The social contact distance index (S.C.D.) is represented by the arithmetic mean of ratings by the subject.
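The two indices may be sketched as follows. The sketch rests on an assumption not stated in the description above, namely that the "ratings" averaged for the S.C.D. are the step numbers (1 for the nearest step, 7 for the most distant) of the classifications the subject checked; the function name `social_contact_indices` is likewise hypothetical.

```python
def social_contact_indices(checked_steps):
    """Return (S.C.R., S.C.D.) for one subject and one race.

    checked_steps -- step numbers (assumed 1..7) under which the subject
                     placed a cross for the race concerned
    S.C.R. is the number of classifications checked; S.C.D. is taken here
    (an assumption) as the arithmetic mean of the checked step numbers.
    """
    steps = set(checked_steps)  # a step counts once even if listed twice
    if not steps:
        raise ValueError("no classification was checked")
    scr = len(steps)
    scd = sum(steps) / scr
    return scr, scd


# A hypothetical subject admitting a race to steps 1, 2, and 3.
scr, scd = social_contact_indices([1, 2, 3])
```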
4. THE METHOD OF PAIRED COMPARISON
The method essentially consists of comparing two items of a pair of words, phrases, or sentences representing the attitude or the object of the attitude. The subject is asked to indicate which of the two items is preferable or is the more nearly representative of his attitude. To use statements expressing the attitude itself would be too laborious. For this reason so far only words standing for the objects of attitudes were used.
Thurstone reports a study of nationality preferences. Two hundred and thirty-nine undergraduates were asked to underline one nationality of each pair that they would rather associate with, e.g.: Englishman-Swede. The subject was instructed to underline one of the two, even if he found it difficult to decide. There were 210 such pairs. To calculate the scale distances between the nationalities, an equation was used similar to the one applied by Thurstone to the ranking method. Proportions such as that 89.8 per cent preferred to associate with Americans rather than Englishmen were calculated. The rank order of the 21 nationalities was ascertained by a simple summation of the proportions. Sigma values were read off from appropriate tables for each proportion. Then the difference between the sigma values of two items in each pair was calculated. The scale separations between the sigma values of the adjacent items were obtained by getting the average of the sigma differences. The next step was to choose the scale value of one of these nationalities as an origin and to calculate the scale values of the other nationalities from this origin. Thurstone chose the American nationality for an origin. When finished, the first half of the scale consisted of seven nationalities in order as follows: American, Englishman, Scotchman, Irishman, Frenchman, German, and Swede. In the second half of the scale were found fourteen nationalities beginning with the South American and ending with the Negro.
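Two of the steps above lend themselves to a brief sketch: the count of pairs, which for n items is n(n-1)/2 (210 for 21 nationalities), and the rank order obtained by summing proportions. The function names are hypothetical, and except for the .898 reported for American over Englishman the proportions below are invented for illustration.

```python
from itertools import combinations


def all_pairs(items):
    """Every unordered pair of items; n items give n * (n - 1) / 2 pairs."""
    return list(combinations(items, 2))


def rank_by_summed_proportions(items, prefer):
    """Rank items by the sum of the proportions preferring each one.

    prefer -- prefer[(a, b)] is the proportion of subjects who chose a
              over b, keyed with a listed before b in `items`
    """
    totals = {item: 0.0 for item in items}
    for a, b in combinations(items, 2):
        p = prefer[(a, b)]
        totals[a] += p          # share of subjects choosing a
        totals[b] += 1.0 - p    # remainder chose b (a choice was forced)
    return sorted(items, key=lambda item: totals[item], reverse=True)


nationalities = ["American", "Englishman", "Swede"]
proportions = {
    ("American", "Englishman"): 0.898,  # reported in the study
    ("American", "Swede"): 0.90,        # hypothetical
    ("Englishman", "Swede"): 0.60,      # hypothetical
}
order = rank_by_summed_proportions(nationalities, proportions)
```

The same count explains the 78 pairs obtained from thirteen statements in the ranking study.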
A similar procedure was used to measure attitudes toward twenty-five races and nationalities such as Austrian, Belgian, Canadian, Chinese, etc. The purpose of the study was to see whether the scale finally obtained differs under different conditions of instruction. One group of students was instructed to underline the one of the two races or nationalities in a pair which they would prefer to have as a fellow-student. Another group of students was instructed to underline the one of the two races or nationalities in a pair which they would prefer to have as a neighbor, etc. The rank correlation between the scales obtained under the different conditions ranged between .97 and .99, which indicates that different conditions of instruction do not exert any noticeable effect on the rank order of nationalities obtained by the method of paired comparison.
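A rank correlation of this kind may be computed as in the sketch below. Which coefficient was used is not reported; Spearman's rank-difference formula is assumed here, and the function name `spearman_rho` and the ranks shown are hypothetical. The formula in this simple form applies only when there are no tied ranks.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties.

    rank_a, rank_b -- dicts mapping each item to its rank (1 = first)
                      under two conditions of instruction
    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    if rank_a.keys() != rank_b.keys():
        raise ValueError("both rankings must cover the same items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("at least two items are required")
    sum_d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical ranks under two instructions; two adjacent items trade places.
as_fellow_student = {"Austrian": 1, "Belgian": 2, "Canadian": 3,
                     "Chinese": 4, "Dane": 5}
as_neighbor = {"Austrian": 1, "Belgian": 3, "Canadian": 2,
               "Chinese": 4, "Dane": 5}
rho = spearman_rho(as_fellow_student, as_neighbor)
```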
5. THE METHOD OF EQUAL APPEARING INTERVALS
The principle involved in the method of equal appearing intervals is that statements representing attitudes are sorted into a number of piles, say 9 or 11, according to degree of attitude expressed by the statements. If in the pile at the extreme left are put the statements representing the most extreme attitude against the object in question, in the pile at the extreme right are put those statements representing the most extreme attitude in favor of the object or issue. In the pile found in the middle range are put statements expressing medium position on the issue. Arrangement of the statements is such that eventually the difference between pile 1 and 2 will appear to the majority of subjects about the same or equal to the difference between pile 2 and 3, and so on.
Smith used the method to study attitudes toward prohibition. A large number of statements about prohibition was collected from various sources. After eliminating a number of statements, 135 were left for experimental purposes. Three hundred college students were asked to classify the 135 statements into eleven piles ranging from extreme or absolute freedom to complete restriction that should be imposed on the individual's consumption of alcohol. As a result, 300 judgments were obtained for each statement. The frequencies of these judgments for each statement were then cumulatively added, and percentages of the total number of judgments were calculated for each of the resulting sums. The eleven groups or degrees of attitude toward prohibition were plotted against the frequencies. A point on the base line corresponding to the 50 per cent of judgments or the median gave the scale value of a statement. Ambiguity of statements was measured by quartile deviation.
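The median and the quartile deviation of a statement can be read from a plotted curve, as described, or computed by interpolation. The sketch below takes the second course and is illustrative only: the function names are hypothetical, pile k is assumed to occupy the interval from k - 0.5 to k + 0.5 on the base line, judgments are assumed to be spread evenly within a pile, and the quartile deviation is taken as half the distance between the first and third quartiles.

```python
def percentile_point(freqs, pct):
    """Point on the base line below which pct per cent of judgments fall.

    freqs -- freqs[i] is the number of judges who put the statement in
             pile i + 1; pile k is assumed to span k - 0.5 to k + 0.5
    """
    total = sum(freqs)
    target = total * pct / 100.0
    cumulative = 0.0
    for i, f in enumerate(freqs):
        if f > 0 and cumulative + f >= target:
            lower_limit = (i + 1) - 0.5
            # Linear interpolation within the pile that contains the target.
            return lower_limit + (target - cumulative) / f
        cumulative += f
    raise ValueError("no judgments were recorded")


def scale_value_and_ambiguity(freqs):
    """Return (median, quartile deviation) for one statement."""
    q1 = percentile_point(freqs, 25)
    median = percentile_point(freqs, 50)
    q3 = percentile_point(freqs, 75)
    return median, (q3 - q1) / 2.0


# Hypothetical sorting of one statement by 300 judges into eleven piles.
judgments = [0, 0, 0, 0, 75, 150, 75, 0, 0, 0, 0]
value, ambiguity = scale_value_and_ambiguity(judgments)
```

A statement on which the judges agree closely has a small quartile deviation and is therefore counted as unambiguous.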
Knowing the scale value and the ambiguity of each statement, 45 least ambiguous statements about equally spaced along the base line were selected to constitute the final scale. Statements such as the following were included in the scale: "Prohibition is needed to conserve the family," and "Prohibition should come as the result of education, not legislation."
There is no time limit in giving the scale, but the usual time spent by the subjects is 20 minutes. Instructions are to check those statements that express the subject's sentiment toward prohibition. An individual score is the average of the scale values of all the checked statements. The correlation between the two halves of the scale is .84. When the Spearman-Brown formula is applied .92 is obtained.
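The scoring rule stated above, the average of the scale values of all checked statements, may be sketched as follows. The function name `attitude_score`, the statement labels, and the scale values shown are hypothetical.

```python
def attitude_score(scale_values, checked):
    """Mean scale value of the statements a subject checked.

    scale_values -- dict mapping each statement to its scale value
    checked      -- statements the subject checked as expressing his view
    """
    if not checked:
        raise ValueError("the subject checked no statements")
    return sum(scale_values[s] for s in checked) / len(checked)


# Hypothetical scale values on an eleven-point base line.
values_by_statement = {"s1": 2.0, "s2": 4.0, "s3": 9.0}
score = attitude_score(values_by_statement, ["s1", "s2"])
```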
Thurstone and Chave applied the method to measuring attitudes toward the church. Their procedure was similar to the one described above. They had 300 students sort 130 statements about the church into 11 piles from highest appreciation to an extreme depreciation of the church. The final scale consisted of 45 statements such as "I believe in religion but I seldom go to church," or "I find the services of the church both restful and inspiring." The giving of the scale takes about 20 minutes. Instructions are to check every statement that expresses the subject's sentiment toward the church. An individual score is the average of the scale values of all the checked statements. The correlation between two forms of the scale was found to be .89. The estimated reliability of the two forms combined is .94.
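The estimate for the two forms combined agrees with what the Spearman-Brown formula gives for a test of doubled length. The sketch below assumes that this formula, named in connection with the prohibition scale, was the one applied here as well; the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r, k=2.0):
    """Estimated reliability of a test lengthened k times.

    r -- correlation between two comparable forms (or halves)
    k -- factor by which the test is lengthened; 2 for two forms combined
    Uses r_k = k * r / (1 + (k - 1) * r).
    """
    return k * r / (1.0 + (k - 1.0) * r)


# Correlation of .89 between the two forms of the church scale.
combined = round(spearman_brown(0.89), 2)  # 0.94
```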
Droba measured attitudes toward war with the same method. He collected 237 statements about war from books, magazines, newspapers, students' written statements, and his own resources. The longest and least clear statements were eliminated and 130 left for experimental purposes. The 300 students used were instructed to classify the 130 statements into 11 groups according to the degree of militarism and pacifism expressed in the statements. To the extreme left were to be put statements expressing the extreme of militarism and to the extreme right statements expressing the extreme of pacifism.
Finally, 44 statements were chosen on the basis of scale values and variabilities to constitute two forms of the scale, e.g., "War is the tonic of races" and "There is no justification for war."
The administration of the scale usually does not exceed 20 minutes. The subjects were instructed to mark with a plus sign all statements with which they agreed. If the subject did not agree with the statement he was asked to mark it with a minus sign. If the statement appeared to be an ambiguous one so that the subject could not decide either for or against the statement, he was asked to mark it with a question mark.
Scoring was based on equivalent numbers ranging from 0 to 21, number 0 being assigned to the most extremely militaristic statement and number 21 to the most extremely pacifistic statement. An individual score was the average of equivalent numbers of all the statements marked plus. The correlation between the two forms of the scale was found to be .83. The estimated reliability of the two forms combined was .90.
Purposely no critical comments are made in this paper. It is intended to be a descriptive review and not a critical summary of the methods since to include both would make the paper too long. A critical review of the methods is reserved for a separate publication.