Personality Questionnaire validity and reliability

Synopsis

We estimate that the Personality Questionnaire indicates an English-speaking adult's personality type accurately 85% of the time in a non-controlled environment (i.e. over the internet). Resulting types are repeatable 75% of the time in a non-controlled environment and 95% of the time in a controlled environment. Unless otherwise noted, these statistics were generated from a set of 100,000 subjects.

Techniques employed

Validation Technique: Best Approach method. This method incorporates "Content" and "Criterion-related" validity assessment. Best Approach was used during the development of the Personality Questionnaire to ensure that the instrument started from a sound foundation. We satisfied this requirement primarily by ensuring that the primary creator of the Personality Questionnaire was highly educated, with a thorough understanding of Psychological Type, and that this understanding was verifiable by other known experts in the field.

Validation Technique: Comparison method. This method was used during first-phase and second-phase testing and validation of the Personality Questionnaire, primarily to validate its end results. We compared Personality Questionnaire results against the results of other well-known instruments, namely the MBTI and Keirsey's Temperament Sorter. Our goal was to produce the same type as these comparable indicators at least 75% of the time in an uncontrolled environment. Subsequent revision after release raised the match rate to 85%.
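As an illustration of the Comparison method, the following minimal sketch computes how often two instruments assign the same type to the same subjects. The function name, the sample data, and the representation of types as four-letter codes are assumptions made for the example; they are not taken from the original validation work.

```python
def type_agreement_rate(pq_types, reference_types):
    """Fraction of subjects whose PQ type equals the reference instrument's type."""
    if len(pq_types) != len(reference_types):
        raise ValueError("both lists must cover the same subjects")
    if not pq_types:
        raise ValueError("at least one subject is required")
    matches = sum(1 for pq, ref in zip(pq_types, reference_types) if pq == ref)
    return matches / len(pq_types)


# Hypothetical sample: four subjects, three of whom receive the same type.
pq = ["INTJ", "ESFP", "ENFP", "ISTJ"]
mbti = ["INTJ", "ESFP", "INFP", "ISTJ"]
rate = type_agreement_rate(pq, mbti)
print(f"agreement: {rate:.0%}, meets 75% goal: {rate >= 0.75}")
```

A whole-type match is a strict criterion: a subject who differs on only one of the four preferences counts as a mismatch.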

Validation Technique: Averages method. We used the Averages method extensively to validate the individual questions that make up the Personality Questionnaire, and to check how closely the overall results lined up with what we know about Psychological Type distribution in the general population. This was used during the original implementation and all revisions of the questionnaire, and was our primary validation tool.

The Averages method was employed to validate that specific questions fell within expected norms. Our goal was twofold. First, we weighed the value of each question by how often its answer matched what was expected for the resultant type. Second, we measured how the answers overall fit the average measured types of the general population, looking for a result that fell within 10% of the expected norm. Individual questions that did not match the resultant type at least 75% of the time were discarded and replaced with more effective questions. Using data sets of 5,000 questionnaire results, we applied the Averages method until both standards were reached.
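The question-screening step described above might look something like the sketch below. It assumes each answer can be mapped to the preference letter it points to (for example "E" or "I"), and that an answer "matches" when that letter appears in the subject's resultant type. Both the data layout and the function names are hypothetical.

```python
from collections import defaultdict

KEEP_THRESHOLD = 0.75  # from the text: questions matching less often were replaced


def question_match_rates(results):
    """results: iterable of (resultant_type, answers), where answers maps a
    question id to the preference letter the subject's answer points to."""
    matched = defaultdict(int)
    asked = defaultdict(int)
    for resultant_type, answers in results:
        for question_id, letter in answers.items():
            asked[question_id] += 1
            if letter in resultant_type:  # answer agrees with the final type
                matched[question_id] += 1
    return {q: matched[q] / asked[q] for q in asked}


def questions_to_replace(results, threshold=KEEP_THRESHOLD):
    """Ids of questions whose match rate falls below the threshold."""
    rates = question_match_rates(results)
    return sorted(q for q, rate in rates.items() if rate < threshold)


# Hypothetical sample: q1 always agrees with the type, q2 only half the time.
sample = [
    ("ESTJ", {"q1": "E", "q2": "S"}),
    ("INFP", {"q1": "I", "q2": "S"}),
    ("ENTP", {"q1": "E", "q2": "S"}),
    ("ISFJ", {"q1": "I", "q2": "S"}),
]
print(question_match_rates(sample))
print(questions_to_replace(sample))
```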

Expected norms of the general population are as follows:
60% Extraverted
40% Introverted
75% Sensing
25% Intuitive
50% Thinking
50% Feeling
50% Judging
50% Perceiving
Using 5,000 questionnaire results per pass, we checked that each question rendered results within 10% of these expected norms.
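The norm check can be sketched as follows. The text does not say whether "within 10%" is absolute or relative, so this example assumes it means 10 percentage points. The function names and the sample data are illustrative only.

```python
# Expected norms as listed above, expressed as fractions.
EXPECTED_NORMS = {"E": 0.60, "I": 0.40, "S": 0.75, "N": 0.25,
                  "T": 0.50, "F": 0.50, "J": 0.50, "P": 0.50}
TOLERANCE = 0.10  # assumption: 10 percentage points, not 10% relative


def preference_shares(types):
    """Share of results showing each preference letter, from four-letter codes."""
    if not types:
        raise ValueError("at least one result is required")
    counts = {letter: 0 for letter in EXPECTED_NORMS}
    for code in types:
        for letter in code:
            counts[letter] += 1
    return {letter: n / len(types) for letter, n in counts.items()}


def out_of_norm(types, tolerance=TOLERANCE):
    """Preferences whose observed share deviates from the norm by more than tolerance."""
    shares = preference_shares(types)
    return {letter: share for letter, share in shares.items()
            if abs(share - EXPECTED_NORMS[letter]) > tolerance}


# Hypothetical pass of ten results (a real pass used 5,000).
types = ["ESTJ"] * 3 + ["ESFP"] * 3 + ["ISTJ"] * 2 + ["INFP"] * 2
print(preference_shares(types))
print(out_of_norm(types))
```

An empty result from `out_of_norm` means every preference fell inside the tolerance for that pass.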

Reliability

All reliability data was determined via Repetition.

Elements that will affect validity and reliability of the Personality Questionnaire:

Mastery of English. Personality Questionnaire (PQ) reliability goes down for subjects whose first language is not English.
Age. PQ validity is highest for subjects aged 20-30 and lowest for those under 16 or over 60. Results can be reliably repeated for all ages except those under 16.
State of mind. Individuals who are under extreme stress or under treatment for a mental condition may not be typed accurately.
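Because reliability was measured by repetition and varies with factors such as age, a repeat rate per subject group could be computed along these lines. This is a sketch under assumed inputs: the group labels, the record layout, and the function name are hypothetical, and the sample figures are not real results.

```python
from collections import defaultdict


def repeat_rate_by_group(retests):
    """retests: iterable of (group, first_type, second_type), one per subject.
    Returns the fraction of subjects in each group who received the same type twice."""
    same = defaultdict(int)
    total = defaultdict(int)
    for group, first_type, second_type in retests:
        total[group] += 1
        if first_type == second_type:
            same[group] += 1
    return {group: same[group] / total[group] for group in total}


# Hypothetical retest records, grouped by age band.
retests = [
    ("under 16", "ENFP", "INFP"),
    ("under 16", "ESTJ", "ESTJ"),
    ("20-30", "INTJ", "INTJ"),
    ("20-30", "ISFP", "ISFP"),
]
print(repeat_rate_by_group(retests))
```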