Written by Nea Lulik, MSc in Psychology of Individual Differences
Psychology
has long sought measurements that accurately capture theoretical constructs.
Personality is one such construct: it is abstract and difficult to measure
accurately. Considering that personality is part of every individual, it would
be logical to assume that self-report is the best answer. In past decades
self-report has largely been used as the main and only measurement of
personality. However, people are biased in many ways, and that has led to a
re-evaluation of self-report as the fundamental measurement of personality.
Recently, interest has emerged in developing and using
alternatives to self-report, either on their own or in combination with it.
Alternatives to self-report (Schwarz,
1999)
are the informant method (Vazire, 2006), the behavioural method (Moskowitz,
1986),
the multiple method (McDonald, 2008) and the computer-based
personality judgement model (Youyou, Kosinski &
Stillwell, 2015).
These methods can measure various psychological constructs, traits and
processes, but in this article I will concentrate on personality measurement. All
four methods are described below, including their strengths and weaknesses.
Self-report
is a very common method of assessing a wide range of psychological constructs
and traits in psychological research (Schwarz,
1999).
It usually involves asking participants about their traits, feelings,
behaviour, attitudes, beliefs, personality… Self-report measures “self-observations of individuals collected
in the form of global ratings or responses to items on questionnaires” (Moskowitz,
1986, p. 299). Self-report
can be direct, indirect or open-ended (Paulhus &
Vazire, 2007).
Validity
and reliability are very important when it comes to scientific method. Reliability
means that a measure is self-consistent and that re-testing can be expected to
give the same results over time. A test is valid if it measures what it is
supposed to measure, and validity requires high reliability. This is especially
difficult when it comes to personality tests, because personality itself is an
abstract construct (Kline, 1983).
Construct validity is the most important in personality research (Bagozzi,
1993).
Construct validity is obtained when the questionnaire is formulated in the
context of a theory which can predict behaviours in relation to the
questionnaire (Loewenthal, 1996). It is also very
important, besides reliability and validity, to pay attention to other major
factors that might affect the quality of the data.
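To make the two senses of reliability described above more concrete, here is a minimal Python sketch. It assumes Cronbach's alpha as the index of self-consistency and a Pearson correlation between two administrations as the index of re-test stability; neither statistic is named in the sources cited here, and the function names and toy scores are purely hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Internal consistency of a questionnaire (assumed index, see lead-in).

    scores: one list of item scores per respondent (hypothetical layout).
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation, used here as a test-retest coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy data: three respondents answering a two-item scale.
items = [[1, 2], [2, 3], [3, 4]]
alpha = cronbach_alpha(items)

# Total scores of the same respondents at two (invented) time points.
time1 = [3, 5, 7]
time2 = [4, 5, 8]
retest = pearson_r(time1, time2)
```

Values of either coefficient close to 1 would indicate a reliable measure, but high reliability alone says nothing about whether the questionnaire measures the intended construct.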
Self-report
results can be affected by many factors: minor changes in question
wording, question format, or question context can result in great changes in
the results (Schwarz, 1999).
The phrasing of items can affect an individual's response through processes such as
the understanding of the question, the recall of a significant behaviour, and various others (Holden &
Troister, 2009). Open versus
closed response formats, frequency scales and reference periods also have the
power to affect results (Schwarz, 1999).
Rating scales can affect results too, depending on whether the numbers used in
the scale are positive or negative (Schwarz, Knauper, Hippler,
Noelle-Neumann & Clark, 1991).
Also, reversing the relation between the verbal and numerical labels, such that lower
numbers correspond to stronger agreement, can affect the results (Rammstedt
& Krebs, 2007).
Self-report
measurement has many advantages: it is practical, efficient, convenient, easy
to administer and inexpensive, it gives direct insight into very personal
information about the individual, individuals are usually motivated to respond,
and most response biases can be controlled (McDonald, 2008).
However, it also has many disadvantages (Paulhus &
Vazire, 2007).
Many sources have raised potential issues with the credibility of responses and
systematic errors (Wiechman,
Smith, Smoll & Ptacek, 2000)
in self-reports due to response biases such as social desirability, acquiescent
and extreme responding, non-response bias, the assumption that participants are
self-knowledgeable and have no distorted self-perception,
non-situation-specific language in questions, and cultural limitations (Mundia, 2011).
All these biases represent a threat to construct validity, which is why they
have to be controlled or minimized (Moskowitz, 1986).
Social
desirability is one of the main concerns regarding personality measurement (Bäckström, Björklund & Larsson, 2009),
especially in self-report, and it is described as “the tendency of subjects to endorse an item according to how socially
desirable a response is” (Kline, 1983, p. 19).
People who engage in this kind of response bias are motivated to provide a
positive self-presentation (Holden &
Troister, 2009): they want to
appear in a favourable light and are motivated by the approval of others.
Basically, they want to present themselves to the outside world in a very
positive way. This tendency is quite common in human beings, but the
extent of social desirability is not uniform and it reveals itself differently
in different situations (Mundia, 2011).
Paulhus (1984)
distinguished two components of social desirability: self-deception and
impression management. Self-deceptive positivity is characterized as an honest
but extremely positive view of oneself, and it has been shown to be
linked to adjustment (Paulhus,
1991). Impression management, meanwhile,
is more closely related to the standard characteristics of social
desirability, such as displaying socially desirable behaviours so that others see
the individual in a positive light (Paulhus, 1984).
Although some studies (Paunonen & LeBel, 2012)
argue that social desirability does not significantly influence results, because
the distribution of responses resembles the normal (Gaussian) curve (though biased toward
the positive end), many studies are still prone to significant social
desirability effects (Holden &
Troister, 2009; Soubelet & Salthouse, 2011),
which can mask important relations between variables. Social desirability therefore needs to
be controlled or minimized (Wiechman, Smith, Smoll &
Ptacek, 2000) with scales,
inventories or statistical techniques (Paulhus,
1991).
Another
type of response bias is acquiescent responding and it is described as “the tendency to agree with an item,
regardless of its content” (Holden &
Troister, 2009, p. 126).
People who engage in this kind of response bias can agree with two statements
even though they are mutually exclusive (Paulhus,
1991). Research has
shown that individuals are more likely to reply ’yes’ to a neutral statement
than to an extreme one (Knowles & Condon, 1999).
The authors concluded that the best way to control this response bias is to balance
the trait-indicating items with the trait-contraindicating items, and to balance
the ratio of assertions and negations, so that there is the same number of ’yes’ and
’no’ answers. Although there are ways to control and minimize this kind of
response bias, it can still occur and cause problems with the interpretation of
results, or make the results impossible to interpret at all.
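The balancing strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a hypothetical 1 to 5 agreement scale, an invented four-item questionnaire in which half of the items are trait-contraindicating, and the mean raw agreement as a rough acquiescence index. None of these details come from the cited authors.

```python
def reverse_score(response, low=1, high=5):
    """Flip a response on a low..high scale (5 becomes 1, 4 becomes 2, ...)."""
    return low + high - response


def score_balanced_scale(responses, reverse_keyed, low=1, high=5):
    """Score a balanced scale and estimate acquiescence.

    responses: raw answers, one per item (hypothetical 1-5 agreement scale).
    reverse_keyed: set of indices of trait-contraindicating items.
    Returns (trait_score, acquiescence_index).
    """
    keyed = [
        reverse_score(r, low, high) if i in reverse_keyed else r
        for i, r in enumerate(responses)
    ]
    trait_score = sum(keyed) / len(keyed)
    # Mean raw agreement, ignoring item direction: high values on a balanced
    # scale suggest agreeing regardless of content.
    acquiescence = sum(responses) / len(responses)
    return trait_score, acquiescence


# A respondent who agrees strongly with every item, including opposite ones.
yea_sayer = score_balanced_scale([5, 5, 5, 5], reverse_keyed={1, 3})

# A respondent who answers consistently with the trait.
consistent = score_balanced_scale([5, 1, 5, 1], reverse_keyed={1, 3})
```

Because the scale is balanced, indiscriminate agreement pulls the trait score toward the midpoint instead of inflating it, while the raw agreement index flags the response pattern.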
The
third response bias is extreme responding. Extreme responding represents “the tendency to use the extreme choices on
a rating scale, both positive and negative” (Holden & Troister, 2009, p.
126).
Extreme response bias was found to be stable over time and a consistent
individual difference (Paulhus &
Vazire, 2007). A study of extreme response style (Naemi, Beal & Payne, 2009)
found that intolerance of ambiguity is related to
extreme responding, and that decisiveness accounts for a significant
amount of the variance in extreme response style. The authors also found that
a quick response time to the questionnaire, combined with intolerance of ambiguity,
decisiveness and simplistic thinking, leads to an extreme response style. This kind
of response bias is difficult to interpret, because it is never really clear if
the individual's response is due to decisiveness of the answer, lack of
introspection regarding intensity or frequency, tendency toward extreme
ratings, or something else altogether.
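A simple way to quantify extreme responding, offered here only as an assumed illustration and not as the index used in the cited studies, is the proportion of a person's answers that fall on the endpoints of the rating scale. The 1 to 5 scale and the example answers are hypothetical.

```python
def extreme_response_index(responses, low=1, high=5):
    """Proportion of responses at either endpoint of the rating scale."""
    extremes = sum(1 for r in responses if r in (low, high))
    return extremes / len(responses)


# One respondent who favours the endpoints, one who avoids them.
extreme_style = extreme_response_index([1, 5, 3, 5])
moderate_style = extreme_response_index([2, 3, 4, 3])
```

As the paragraph above notes, a high index cannot by itself distinguish decisiveness from a general tendency toward extreme ratings, so it describes the response pattern without explaining it.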
There
is another matter regarding the credibility of responses in self-report,
and it concerns self-perception distortions. Self-report measurements
rely on participants' self-understanding and presume that participants are
self-aware and self-knowledgeable (McDonald,
2008). That
is not always the case, though, considering that self-reports measure a person's
self-perception of a psychological trait and not the actual trait (Roberts,
Zeidner & Matthews, 2001),
and even that assumes the information we seek is available to conscious
interpretation. Self-perception can therefore give rise to response biases,
either because people are predisposed toward self-enhancement (Smith, 2005)
or because they are trying to preserve an unrealistically positive image of themselves (McDonald, 2008).
In contrast with social desirability, distorted self-perception cannot be
controlled or measured by a lie scale, which is a limitation of the method.
In
self-report there are often issues with non-context-specific language in
questionnaire items (McDonald,
2008). Answers to
personality questionnaires are suggested to be influenced by many factors of
the semantic networks, such as the individual's ego ideal, the intention to be
semantically consistent over questions, and the representations of distinct
life experiences that might be retrieved when answering a question (Kagan, 2007).
Also, questionnaires are constructed such that a person can consider either
the trait itself or the context in which the trait is present, which can lead to
different answers to the same item being misinterpreted as implying
different behaviours.
Self-report
has been considered to have cultural limitations. Research on cultural
differences in response style and the role of dialectical thinking (Hamamura,
Heine & Paulhus, 2008)
found that East Asians exhibit more ambivalent and moderate responding
on self-report than North Americans. There has also been a debate in the literature (Smith, 2005)
about individualist and collectivist cultures and self-enhancement,
which suggests that there are indeed cultural differences in self-report.
Considering
all the above disadvantages of self-report, it would be logical and understandable
to consider other alternatives.
An alternative to self-report is informant
report.
Informant report is a method
of “obtaining peer reports in which a
number of other informants provide ratings about the individual” (McDonald,
2008, p. 5). The
observers can be the parents, friends, spouses, peers, or co-workers, basically
people who share a history with the participant (Moskowitz,
1986).
Generally
this method uses inventories to quantify the information from informants. This
information includes everyday functioning, the frequency and levels of specific
behaviours, and ratings on personality scales in the third-person perspective or on
scales that use a meta-perceptual wording approach (Simms,
Zelazny, Yam & Gros, 2010).
The meta-perceptual approach means assessing and rating informants’ perception of
the participants’ self-perception.
One
advantage of this method is that informants provide more objective information
about the individual (McDonald,
2008). Involving
more informants yields more data, which can in turn
improve the reliability of results. Informants can also provide situational insight
on behaviours (Hofstee, 1994).
Most importantly, there are no social desirability response biases in informant
report. In the past the traditional informant method was seen as
time-consuming, expensive and not valid due to dishonest responding, and
informants were believed to be uncooperative. However, in the twenty-first century,
with the development of the internet and technology, this view of the method should
be reconsidered. The informant report method has the potential to become more
practical, more convenient, less time-consuming and inexpensive when informant
questionnaires are administered over the internet (Vazire, 2006).
With the use of e-mail and internet questionnaires, informant reports become
inexpensive, and the researcher saves time on data entry and can easily keep
track of informants' participation. Results (Vazire, 2006)
also show that participants are willing to cooperate because internet
questionnaires are time-efficient, and that validity, along with consensus
correlations among informants' ratings and self-other agreement, would increase
if those involved are acquainted with each other. However, there are still some weaknesses
in using this method. Like self-report, it can have potential issues
with response biases such as acquiescent and extreme responding. Issues
can also arise in choosing the right informants, because they
can be biased by their relationship with the participant or by the research aims (McDonald, 2008).
This method can have difficulty in assessing a specific behaviour in a specific
situation (Berry, Carpenter & Barratt,
2012). Finally, the
most obvious weakness is that others cannot access certain personal information
about the individual, because only the individual has access to
it (Hofstee, 1994; Paulhus & Vazire, 2007).
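The two agreement indices mentioned above, consensus among informants and self-other agreement, can be illustrated with a short Python sketch. It assumes that both are computed as Pearson correlations across participants, with informant ratings averaged per participant; the function names and the ratings are invented for illustration and are not taken from Vazire (2006).

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def self_other_agreement(self_ratings, informant_ratings):
    """Correlate self-ratings with the mean informant rating per participant.

    informant_ratings: one list per informant position, each holding that
    informant's rating of every participant (hypothetical layout).
    """
    aggregated = [mean(per_participant) for per_participant in zip(*informant_ratings)]
    return pearson_r(self_ratings, aggregated)


def informant_consensus(informant_ratings):
    """Mean pairwise correlation between informants' ratings."""
    pairs = list(combinations(informant_ratings, 2))
    return mean([pearson_r(a, b) for a, b in pairs])


# Invented trait ratings for four participants on a 1-5 scale.
self_ratings = [2, 4, 3, 5]
informant_ratings = [
    [2, 4, 3, 5],  # first informant of each participant
    [3, 5, 4, 5],  # second informant
    [1, 3, 3, 4],  # third informant
]

agreement = self_other_agreement(self_ratings, informant_ratings)
consensus = informant_consensus(informant_ratings)
```

Averaging across informants before correlating is one design choice among several; it reflects the idea above that more informants provide more data per participant.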
Over
the years researchers have started relying more and more on self-report and
abandoning behavioural observation.
Behavioural
observation is a method in which the observer collects behavioural data through
observation of overt behaviour either in an artificially constructed laboratory
or in a naturally occurring situation (Moskowitz,
1986). This
measurement involves external judges viewing and coding the individual's behaviours (McDonald,
2008).
The
researcher can use various technological tools, such as cameras, microphones,
or the electronically activated recorder, to avoid missing important
information (Mehl, Gosling & Pennebaker,
2006).
Behavioural
observation in a laboratory setting involves the researcher observing and
measuring the participant's behaviour in one or two brief standardized
situations. The procedures used in the laboratory setting are
implemented to maximize the standardization of the
situation and the reliability with which responses are coded (Moskowitz, 1986). The strengths of this setting are that it
directly observes behaviour and that the observer is in control of the situation and
of the stimuli, which can be used to elicit the desired
overt behaviour. There are many weaknesses to behavioural observation in a
laboratory setting, such as being time-consuming, unethical and inconvenient (Baumeister,
Vohs & Funder, 2007).
Because it is performed in a laboratory it involves artificiality and gives
insight into only situation-specific behaviours (Kagan, 2007),
so generalizations cannot be made (McDonald, 2008).
Another possible problem in the laboratory setting is the participant's
awareness of being observed. Social desirability can occur in this type of
setting, as the participant responds while thinking about what the observer's
expectations are. As in self-report, social desirability can create
many problems of validity (Moskowitz,
1986).
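The reliability with which responses are coded, mentioned above for laboratory observation, is often checked by having two judges code the same behaviours independently. The sketch below assumes Cohen's kappa as the agreement index; this choice, the category labels and the codings are hypothetical and are not taken from Moskowitz (1986).

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders of the same events.

    Undefined (division by zero) when chance agreement is 1, for example
    when both coders use a single category throughout.
    """
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Agreement expected by chance from each coder's category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Two judges coding four observed behaviours (invented categories).
judge_1 = ["talk", "talk", "smile", "smile"]
judge_2 = ["talk", "smile", "talk", "smile"]

perfect = cohens_kappa(judge_1, judge_1)
chance_level = cohens_kappa(judge_1, judge_2)
```

Here the two judges agree on half of the behaviours, which is exactly what chance would produce given how often each uses the two categories, so the index treats their coding as unreliable.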
On
the other hand, behavioural observation in naturally occurring situations is
restricted to the use of one general setting in which data are collected on
various occasions. Situations might vary, depending on the setting that the
observer chooses. Most of the time it is possible to identify persistent
configurations of stimuli (Moskowitz,
1986). Because the
data collection occurs over a period of time, and not in one or two brief
situations, it gives the researcher insight into the more general behaviour of the
individual (Mehl, Gosling & Pennebaker,
2006). However, although the
presence of the observer might delay adaptation, a participant engaged
in ordinary activities will fall back on habitual modes of responding
(Moskowitz,
1986).
Overall,
the behavioural observation method has many strengths: it directly
examines behaviour, obtains situation-specific information, has fewer response
biases than self-report, and can be done in two settings (McDonald,
2008). On the other
hand, its weaknesses are that it lacks practicality and convenience,
requires complex coding of behaviour, is time-consuming and expensive, and
raises ethical concerns. Also, because of the nature of observational
studies, it is not completely clear where the line between overt behaviour and
a specific trait lies.
After
considering all the alternatives to self-report and their strengths and
weaknesses, it would be safe to conclude that a combination of these measurements will provide greater construct validity and results closer to the truth (Kagan, 2007; Williamson,
2007; Holden &
Troister, 2009).
Multiple-method
measurement can improve construct validity, provide more accuracy, and
enrich the data by setting methods against each other (Hofstee,
1994).
Combining
behavioural observation with self-report can lead to greater validity of
self-report responses, because the observational measures will additionally
support those responses (Fulmer & Frijters, 2009). This method
encompasses all the strengths of the individual measurements (McDonald,
2008). Although it
also carries all the weaknesses of the individual measurements, these
weaknesses will be less problematic because of the mixture of methods.
This type of method also requires more effort, expense, resources, time and
skill, but the greater construct validity and better results should be worth
it.
A
recent study (Youyou,
Kosinski & Stillwell, 2015)
has proposed another way to assess human personality, more
specifically, human personality judgements. The study compared the accuracy of
human and computer-based personality judgements.
Personality judgements
were obtained through self-report, through other-report by one or two of the
individual's friends, and from the Likes the individual had made on Facebook
(a digital footprint). The authors adopted the realistic approach with three key
criteria: self-other agreement, inter-judge agreement and external validity.
Results for self-other agreement showed that the computers' average accuracy
over the Big Five traits grows steadily with the number of Likes the individual
had on their profile. Results for inter-judge agreement showed that the
average consensus between computer models was considerably higher than the
consensus between the personality judgements of two of the individual's friends. As for
external validity, the computer's judgement was able to predict twelve out of
thirteen life outcomes, behaviours and traits related to behaviour, and its
accuracy was higher than that of human judges. This study has opened a new avenue for the assessment
of personality, and that is a huge strength. The approach might become commonly used, and
researchers would then have less trouble collecting data. It is also
inexpensive, accurate and less time-consuming. On the other hand, a weakness
could emerge with regard to ethics and the protection of people's privacy.
Finally,
after describing and evaluating self-report and all the alternative
measurements of personality, it is safe to say that the best method to assess personality
would be multiple-method measurement. The observational and informant methods have
proved to be very useful and, although they have disadvantages, they should be
considered by psychologists to be as valuable as self-report. The observational method in
laboratory settings has potential for good construct validity due to control
over the situation, though its reliability is poor. However, behavioural observation
in natural settings has good potential for both construct validity and
reliability, if data are collected on a satisfactory number of events. Informant
reports have high reliability and have been shown to improve the
validity of personality assessment. Self-report should absolutely still be
used in the assessment of personality because it provides unique information
about the individual that only that individual has access to. It should, however, definitely
be combined with another method, either observational or informant. Combining two
or more methods would achieve better construct validity and richer
data. Combining different measurements might also provide, in the future, a
measurement that accurately captures theoretical constructs such as personality.
More research should be done in this field, especially in combination with
ever-growing technology. The computer-based personality judgement model looks
promising, but it is relatively new and more research should be done in this
area. Nevertheless, it is a big step in the right direction.
References:
Bäckström, M., Björklund, F., & Larsson, M.
(2009). Five-Factor Inventories Have a Major General Factor Related to Social
Desirability Which Can Be Reduced by Framing Items Neutrally. Journal of
Research in Personality, 335-344.
Bagozzi, R. P. (1993). Assessing Construct Validity in
Personality Research: Applications to Measures of Self-Esteem. Journal of
Research in Personality, 49-87.
Baumeister, R. F., Vohs, K. D., & Funder, D. C. (2007).
Psychology as the Science of Self-Reports and Finger Movements. Whatever
Happened to Actual Behavior? Perspectives on Psychological Science,
396-403.
Berry, C., Carpenter, N., & Barratt, C. (2012). Do
Other-Reports of Counterproductive Work Behavior Provide an Incremental
Contribution Over Self-Report? A Meta-Analytic Comparison. Journal of
Applied Psychology, 613-636.
Fulmer, S. M., & Frijters, J. C. (2009). A Review of
Self-Report and Alternative Approaches in the Measurement of Student
Motivation. Educational Psychology Review, 219-246.
Hamamura, T., Heine, S., & Paulhus, D. (2008). Cultural
Differences in Response Style: The Role of Dialectical Thinking. Personality
and Individual Differences, 932-942.
Hofstee, W. K. (1994). Who Should Own the Definition of
Personality? European Journal of Personality, 149-162.
Holden, R., & Troister, T. (2009). Developments in the
Self-Report Assessment of Personality and Psychopathology in Adults. Canadian
Psychology, 120-130.
Kagan, J. (2007). A Trio of Concerns. Perspectives on
Psychological Science, 361-376.
Kline, P. (1983). Personality: Measurement and theory.
London: Hutchinson.
Knowles, E., & Condon, C. (1999). Why People Say
"Yes": A Dual-Process Theory Of Acquiescence. Journal of
Personality and Social Psychology, 379-386.
Loewenthal, K. M. (1996). An Introduction to Psychological
Tests and Scales. London: UCL Press.
McDonald, J. D. (2008). Measuring Personality Constructs: The
Advantages and Disadvantages of Self-Reports, Informant Reports and Behavioural
Assessments. Enquire, 75-94.
Mehl, M. R., Gosling, S. D., & Pennebaker, J. W. (2006).
Personality in Its Natural Habitat: Manifestations and Implicit Folk Theories
of Personality in Daily Life. Journal of Personality and Social Psychology,
862-877.
Moskowitz, D. (1986). Comparison of Self-Reports, Reports by
Knowledgeable Informants, and Behavioral Observation Data. Journal of
Personality, 294-317.
Mundia, L. (2011). Social Desirability, Non-Response Bias and
Reliability in a Long Self-Report Measure: Illustrations From the MMPI-2
Administered to Brunei Student Teachers. Educational Psychology,
207-224.
Naemi, B., Beal, D., & Payne, S. (2009). Personality
Predictors of Extreme Response Style. Journal of Personality, 261-286.
Paulhus, D. (1984). Two-Component Models of Socially
Desirable Responding. Personality Processes and Individual Differences,
598-609.
Paulhus, D. (1991). Measurement and Control of Response Bias.
In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of
Personality and Social Psychological Attitudes (pp. 17-59). San Diego, CA:
Academic Press.
Paulhus, D., & Vazire, S. (2007). The Self-Report Method.
In R. W. Robins, R. C. Fraley, & R. F. Krueger (Eds.), Handbook of
Research Methods in Personality Psychology (pp. 224-239). New York:
Guilford.
Paunonen, S., & LeBel, E. (2012). Socially Desirable
Responding and Its Elusive Effects on the Validity of Personality Assessments. Journal
of Personality and Social Psychology, 158-175.
Rammstedt, B., & Krebs, D. (2007). Does Response Scale
Format Affect the Answering of Personality Scales? European Journal of
Psychological Assessment, 32-38.
Roberts, R., Zeidner, M., & Matthews, G. (2001). Does
Emotional Intelligence Meet Traditional Standards for an Intelligence? Some New
Data and Conclusions. Emotion, 196-231.
Schwarz, N. (1999). Self-Reports. American Psychologist,
93-105.
Schwarz, N., Knauper, B., Hippler, B., Noelle-Neumann, E.,
& Clark, L. (1991). Rating Scales: Numeric Values May Change the Meaning of
Scale Labels. Public Opinion Quarterly, 570-582.
Simms, L. J., Zelazny, K., Yam, W. H., & Gros, D. F.
(2010). Self-Informant Agreement for Personality and Evaluative Person
Description: Comparing Methods For Creating Informant Measures. European
Journal of Personality, 207-221.
Smith, G. (2005). On Construct Validity: Issues of Method and
Measurement. Psychological Assessment, 396-408.
Soubelet, A., & Salthouse, T. (2011). Influence of Social
Desirability on Age Differences in Self-Reports of Mood and Personality. Journal
of Personality, 741-762.
Vazire, S. (2006). Informant Reports: A Cheap, Fast, and
Easy Method for Personality Assessment. Journal of Research in Personality,
472-481.
Wiechman, S., Smith, R., Smoll, F., & Ptacek, J. (2000).
Masking Effects of Social Desirability Response Set On Relations Between
Psychosocial Factors and Sport Injuries: A Methodological Note. Journal of
Science and Medicine in Sport, 194-202.
Williamson, A. (2007). Using Self-Report Measures in
Neurobehavioural Toxicology: Can They Be Trusted? NeuroToxicology,
227-234.
Youyou, W., Kosinski, M., & Stillwell, D. (2015).
Computer-Based Personality Judgments are More Accurate Than Those Made by Humans.
PNAS, 1036-1040.