A Short Version of the Big Five Inventory (BFI-20): Evidence on Construct Validity

Several measures were developed in the past decades to measure personality, focusing on the Big Five Factor Model (BFFM; Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism). Despite the relevance of their findings in different countries, a shared limitation of such measures is their length: they demand time from researchers and participants, which might cause boredom or fatigue and bias the final results. This research aimed to provide a shorter version of the 44-item Big Five Inventory (BFI) through two studies (total N = 8,119). The structure was assessed using a range of techniques (e.g., PAF analysis, Procrustes rotation). The best 20 items (4 per factor) were chosen to compose the final version, the BFI-20, which presented suitable psychometric evidence across the samples. Thus, given the growing need for shorter measures that do not lose psychometric quality, our findings indicate the adequacy of the 20-item BFI and its potential applicability in research contexts.


Introduction
Personality traits are stable, characteristic patterns of thoughts, feelings, and behaviors of each individual in their interaction with the environment (Dumont, 2010; Goldberg, 1993; Hall et al., 2000). The Big Five model is the most widely used taxonomy of personality traits. It was developed from the lexical approach, which uses trait-descriptive adjectives to identify the structure of personality traits, and it proposes five trait factors: Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism (Gurven et al., 2013; John et al., 2008; McCrae, 2011; Paunonen & Jackson, 2000; Silva & Nakano, 2011; Yarkoni, 2010; Wright, 2017).
Many psychometric measures have been developed to measure these five personality factors, comprising different sets of items and directly assessing the factors or their facets (e.g., Costa Jr. et al., 2001; Schmitt et al., 2007). However, most of the available measures comprise many items. When a research project must include multiple measures, or when the researcher has limited time available for data collection, the length of the instruments becomes an issue.
Hence, for certain research purposes long instruments are not desirable, as they cause fatigue and demotivation in respondents, making them less likely to take part in future studies (Credé et al., 2012). As an alternative to lengthy instruments, some researchers have proposed and defended shorter measures of the Big Five factors, which has increased the number of brief versions for assessing these personality traits (e.g., Ames et al., 2006; Denissen et al., 2008; Gosling et al., 2003).
Despite the many advantages of shorter measures of the Big Five, it is important to note their limitations. For instance, an instrument's reliability can be directly and negatively influenced by a small number of items (Carvalho et al., 2012), and short measures might not represent the construct adequately (Clark & Wilson, 1993; Yarkoni, 2010) and might lead to poor predictive validity (Credé et al., 2012). When proposing a shortened version of a personality measure, researchers should balance the length of the instrument against the quality of its psychometric parameters. In the current article, we present an effort to contribute to the measurement of personality, offering evidence on the construct validity (factorial validity and reliability) of a widely used measure for assessing the NEACO factors: the Big Five Inventory (Benet-Martínez & John, 1998; Schmitt et al., 2007). This investigation begins with a brief overview of the [...]
Revista Interamericana de Psicología/Interamerican Journal of Psychology, 2021

The Big Five Factors: Characteristics and Measures
The Big Five can be conceptualized as a hierarchical organization of personality traits: specific traits are clustered within facets, which in turn are clustered within the five main personality dimensions, indicating a structure in which most traits can be classified (McCrae, 2010; McCrae & John, 1992). The Big Five model is probably the most accepted model of personality in the literature, given the replicability of its five factors in diverse and cross-cultural samples (De Young et al., 2010; McCabe et al., 2013; Soto & John, 2012). Despite the lack of consensus about the labels for the Big Five factors (Silva & Nakano, 2011), the core of their traits is similar across approaches (Carvalho et al., 2012). Thus, each factor is named after a general trait, encompassing characteristics and semantics shared by the specific traits that form the corresponding dimension (Lima, 1997). As noted, the five general traits are Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism, which are often abbreviated with the acronym OCEAN. Based on available scholarship (De Young et al., 2010; Digman, 1990; Goldberg, 1993; McCrae, 1992), these broad personality traits can be summarized as follows. Openness to Experience: reflects the degree of intellectual curiosity, creativity, and preference for novelty and variety; Conscientiousness: indicates a tendency to show self-discipline, to act dutifully, and to aim for achievement; Extraversion: describes energy, positive emotions, assertiveness, sociability, talkativeness, and the tendency to seek stimulation in the company of others; Agreeableness: expresses a tendency to be compassionate and cooperative rather than suspicious and antagonistic towards others; and Neuroticism: reflects the tendency to frequently experience unpleasant emotions, such as anger, anxiety, depression, or vulnerability.
The Big Five personality taxonomy has produced several benefits, including the ability to better integrate and compare findings from several studies (Parks & Guay, 2009). The benefits of a widely accepted taxonomy of personality traits led to the development of several rating instruments in the 1990s (Ostendorf & Angleitner, 1994).
Perhaps the most comprehensive instrument is Costa and McCrae's (1992) [...]. Consequently, shorter instruments have been proposed, ranging from 5 (Sporrle & Bekk, 2013) to 10, 15 (Lang et al., 2011), 20 (O'Keefe et al., 2012), or 40 (Saucier, 1994) items. However, it is a great challenge to maintain the psychometric properties of an inventory with fewer items. For instance, in some cases, the Cronbach's alphas for the five dimensions are lower than recommended (e.g., .40 for Agreeableness and .45 for Openness; Gosling et al., 2003). In the following, we discuss the Big Five Inventory, the measure used in the present research, with a focus on available brief measures.

The Big Five Inventory
Many instruments for assessing the Big Five model of personality have been developed based on the pool of items from Goldberg's (1992) 100-item TDA (see, e.g., Goldberg et al., 2006; Saucier, 1994). Among these measures, John et al.'s (1991) 44-item Big Five Inventory (BFI) is one of the most widely used instruments in studies about personality and its correlates, mainly due to its clear factorial structure, acceptable reliability coefficients, and significant convergent validity (Soto & John, 2009).
Indeed, the BFI has been validated in more than 50 countries on all inhabited continents, including Brazil, Japan, Lebanon, New Zealand, Poland, South Africa, the United Kingdom, and the United States (Schmitt et al., 2007). Some substantial evidence of its psychometric parameters is detailed below.

Brief Measures of the Big Five Inventory
Many brief versions of the BFI have been proposed. Aiming to provide a psychometrically sound measure for contexts in which participant time is usually quite limited, Rammstedt and John (2007) abbreviated the Big Five Inventory to a 10-item version. Their results indicated that reducing the items yielded effect sizes that were lower than those for the full version, and the losses were more substantial for the [...] Five measures, one with 20 items and another with 32 items (Andrade, 2008) [...] authors argued that the study did not present a representative sample of the entire Brazilian population, as data were obtained from 554 subjects in two Brazilian cities.

The Present Research
As reviewed above, the BFI has been used in diverse cultures, showing evidence of factorial and convergent validity and of reliability. However, despite its popularity and usefulness in the research context, the measure has a large number of items, which can be problematic when time is short and/or many constructs are assessed (Denissen et al., 2008; Rammstedt & John, 2007). The present research contributes to a growing literature developing or evaluating the psychometric parameters of brief measures to assess personality traits (Denissen et al., 2008; Gosling et al., 2003; Rammstedt & John, 2007; Sporrle & Bekk, 2013; Woods & Hampson, 2005) by examining the psychometric properties of the BFI across Brazilian samples.
In particular, this article reports two studies examining the psychometric properties of the BFI-44 in Brazil and the development of a short version of the scale.
Study 1 examines the adequacy of the BFI-44 in a large sample of Brazilian participants, considering parameters reported by Schmitt et al. (2007). Study 2 examines the factorial structure of the proposed 20-item version of the scale in another large Brazilian sample, checking its congruence with the structure found in the previous study.

Participants
Participants were 4,995 Psychology/Education undergraduate students from all five Brazilian regions, covering 24 of the country's 27 states (see Table 1). Most of the participants were women (71%) and single (75.7%), with a mean age of 23.7 years (SD = 6.99, ranging from 16 to 67). This was a non-probabilistic convenience sample, including students who voluntarily agreed to participate.
[...] produced using the committee approach (Brislin, 1970) by three bilingual psychologists.

Data Analysis
Using the [...] BFI factorial structure. We then used Cronbach's alpha to assess the internal consistency of the five factors.
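As a minimal sketch of the internal-consistency step, Cronbach's alpha for one factor can be computed from a respondents-by-items score matrix. The function name and the small data set below are hypothetical illustrations, not the study's data or analysis script, and any reverse-keyed items are assumed to be recoded beforehand.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for one factor.

    items: 2-D array-like, rows = respondents, columns = items of the factor.
    """
    x = np.asarray(items, dtype=float)
    k = x.shape[1]                           # number of items in the factor
    item_vars = x.var(axis=0, ddof=1)        # sample variance of each item
    total_var = x.sum(axis=1).var(ddof=1)    # sample variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)


# Hypothetical responses of five people to a four-item factor (1-5 scale).
example = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(example)
```

The same function would be applied separately to the item columns of each of the five factors.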

Results
We first carried out a PAF analysis followed by a parallel analysis to identify the number of factors to extract across the full sample. Although the parallel analysis suggested up to eight factors, five clear factors with eigenvalues greater than 2 were observed, accounting for 35.7% of the total variance. Table 2 presents the factor structure of the BFI; inspection indicates that the structure is similar to the one reported by Schmitt et al. (2007).
[...] ranging from 18 to 73) (see Table 3). As in Study 1, this was a non-probabilistic convenience sample of undergraduate students who completed the measures voluntarily.

Instrument, Procedure and Data Analysis
This study is part of the same larger project investigating the personality correlates of human values in Brazil, but with a particular focus on the Northeast region of the country. Similar to Study 1, the survey package was posted to research collaborators, who collected data from their students during class time. The project followed ethics guidelines from the National Health Council in Brazil (resolution 466/12), and obtained ethics approval from the Federal University of Paraiba (approval number: CEP/HULW 257/10).

The survey questionnaire had the same measures as in Study 1, and the average completion time was 15 minutes. To provide evidence of discriminant validity, we examined the values measure in this study, which is composed of 18 marker values rated as guiding principles on a 7-point scale ranging from 1 (completely unimportant) to 7 (of the utmost importance). Gouveia (2003) argues there are six clusters of values based on their function of expressing basic needs and guiding behavior. Although participants completed the BFI-44, our analysis focused on the best 20 items identified in the first study. We used a similar data analytical approach, using Procrustes rotation to test the factorial congruence of the Northeastern matrix of the BFI-20 in relation to the national data from the first study. Cronbach's alpha (α) and McDonald's omega (ω) were also computed for each factor. In addition, convergent validity between the BFI-20 and the BVS was calculated (Pearson's correlations).
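As a rough illustration of the congruence check described above, the sketch below rotates one loading matrix toward a target with an orthogonal Procrustes rotation and then computes a per-factor congruence coefficient. Tucker's phi is used here as an assumption, since the text does not name the index; the function names and the small two-factor matrices are hypothetical and are not the study's data.

```python
import numpy as np


def procrustes_rotate(loadings, target):
    """Rotate `loadings` orthogonally so that it is as close as possible
    (in the least-squares sense) to `target`."""
    l = np.asarray(loadings, dtype=float)
    t = np.asarray(target, dtype=float)
    u, _, vt = np.linalg.svd(l.T @ t)   # SVD of the cross-product matrix
    rotation = u @ vt                   # orthogonal Procrustes solution
    return l @ rotation


def tucker_phi(a, b):
    """Tucker's congruence coefficient for each pair of matching columns."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return num / den


# Hypothetical target loadings: 4 items on 2 factors.
target = np.array([[0.7, 0.1], [0.6, 0.0], [0.1, 0.8], [0.0, 0.5]])

# The same structure expressed in a rotated frame (30 degrees).
theta = np.pi / 6
q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
sample = target @ q

rotated = procrustes_rotate(sample, target)
phi = tucker_phi(rotated, target)   # one coefficient per factor
```

In this constructed case the rotation recovers the target, so each coefficient is 1 up to rounding; with real samples the coefficients indicate how closely each factor replicates.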

Results
The PAF analysis resulted in five clear factors with eigenvalues greater than 1, accounting for 37.2% of the total variance. The Northeastern factor structure of the BFI was similar to the one presented in Study 1. The factorial structure of the BFI-20 is presented in Table 4, corresponding to the second column of each factor. As expected, the factor loadings of all the items were higher than |.30| on their respective theoretical factor. The lowest loadings (.31 for both) were for items 19 (Agreeableness) and 13 (Conscientiousness), and the highest were for items 16 (.80; Neuroticism) and 8 (.78; Agreeableness). The last five columns of the [...] previous findings (Roccas et al., 2002), Neuroticism did not show reliable associations with basic values.
The Big Five model is the most widely used taxonomy, suggesting Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism as the core general factors of personality traits (e.g., Gurven et al., 2013; John et al., 2008; McCrae, 2011; Wright, 2017). As a result, many instruments have been developed over the years to measure these personality factors, many of them using larger sets of items (e.g., Costa & McCrae, 1992; Goldberg, 1992 [...]
According to our results, the 20-item version of the Big Five Inventory (BFI-20) can be adequately used as a measure of the five basic factors of personality for research purposes in Brazil. Although Cronbach's alpha is expected to be negatively affected by a reduction in the number of items (Yuan & Bentler, 2002), even after eliminating up to 50% of the items, this most commonly used coefficient (Dunn et al., 2014) showed similar or better results than those found for the 44-item version in Schmitt et al. (2007), and mainly for Conscientiousness in Study 1. Perhaps Conscientiousness is a broader construct, involving more than one idea in the Brazilian context, comprising both a way of behaving (e.g., "Does things efficiently"; "Perseveres until the task is finished") and a personal characteristic (e.g., "Does a thorough job"; "Is a reliable worker").
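The expectation that reliability drops as items are removed can be illustrated with the classical Spearman-Brown formula. This is only a sketch under the idealized assumption of parallel items; the function name and the numbers are hypothetical and are not taken from the studies reported here.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability when test length is multiplied by `length_ratio`,
    assuming parallel items (an idealization)."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


# Hypothetical example: a 9-item factor with reliability .80 cut to 4 items.
predicted = spearman_brown(0.80, 4 / 9)
print(round(predicted, 2))  # prints 0.64
```

Observed alphas for a shortened scale can sit above such a prediction when the retained items are the strongest markers of the factor, since the formula treats all items as interchangeable.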
Notwithstanding the evidence of adequacy of the BFI-20, potential limitations of the studies can be pointed out. First, the samples comprised participants who are literate and urban, although we made an effort to include people from the countryside in Study 2, a less common practice in studies on personality traits (Gurven et al., 2013). Second, although the current version showed adequate psychometric parameters (evidence of factorial validity and reliability), its five subscales or factors were composed only of positively worded items, which can induce response bias (van Sonderen et al., 2013). Moreover, when a set of items is reduced, it may be less able to cover the full range of a construct. However, a set of four items per factor is in line with recommendations in the literature (Hair et al., 2010).

Finally, future studies should examine additional psychometric evidence for the BFI-20 in Brazil and test whether the same set of 20 items that adequately index the Big Five in Brazil would show similar adequacy in other cultural contexts, including those using a similar language (Portugal and Angola) and other languages with and without cultural similarities (e.g., Argentina, Finland). Furthermore, it will be important to assess the adequacy of its set of items using Item Response Theory, exploring how the items function individually and as a pool. Regarding the inventory itself, it is important to examine its convergent validity with alternative measures of the Big Five, including shortened ones such as the Ten-Item Personality Inventory. It is also necessary to investigate any potential response bias altering participants' scores, such as social desirability (discriminant validity), and to estimate the predictive power (predictive validity) of the brief version. Checking its temporal stability (test-retest) is equally important to ensure its usability in longitudinal studies, for instance.