The K10 and K6 scales were developed with support from the U.S. government's
National Center for Health Statistics for use in the redesigned U.S. National
Health Interview Survey (NHIS). As described in more detail in Kessler et
al. (2003), the scales were designed to be sensitive around the threshold
for the clinically significant range of the distribution of nonspecific
distress in an effort to maximize the ability to discriminate cases of serious
mental illness (SMI) from non-cases. A small validation study carried out
in a convenience sample in Boston found evidence that the scales perform
quite well and that, in fact, the six-question scale is at least as sensitive
as the ten-question scale for the purpose of discriminating between cases
and non-cases of SMI. The K6 is now included in the core of the NHIS as
well as in the annual National Household Survey on Drug Abuse. The K10 is
in the Australian and Canadian equivalents of the NHIS. The K10 is also
included in the National Comorbidity Survey Replication (NCS-R) as well
as in all the national surveys in the World Health Organization's World
Mental Health (WMH) Initiative. We plan to refine the calibration rules
for the scales based on analysis of data in these surveys. At the moment,
though, the only calibration data come from the Boston validation study.
This web site will post information on expanded calibration as these data
become available.
Two versions of the scales are presented here, one for interviewer-administration
and the other for self-administration. Note that the K6 is merely a truncated
form of the K10 in which four questions are deleted. The question series
presented here include not only the six or ten Likert scale questions
in the scales, but also a number of other questions that we routinely
administer along with the scales to learn about persistence and impairment.
These additional questions are not required to score the K6 or K10.
Scoring Note: The K10 and the K6 scales are administered in Australia
using an alternate scoring system based on responses of "1-5" versus the
"0-4" system presented here. This alternate system results in a score
range of 6-30 for the K6 and 10-50 for the K10. The optimal cut point
on the K6 for this system is 6-18 versus 19+.
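The relationship between the two scoring systems can be sketched as follows. This is a minimal illustration, not official scoring software: the function names `score_k6` and `to_alternate_system` are hypothetical, and the example responses are made up. It assumes items are already coded 0-4 and shows that moving to the 1-5 system simply adds one point per item, so a total of 13 on the 0-24 scale corresponds to 19 on the 6-30 scale.

```python
def score_k6(responses):
    """Sum six item responses coded 0-4, giving a total in the range 0-24."""
    if len(responses) != 6:
        raise ValueError("the K6 has exactly six items")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each item must be coded 0-4")
    return sum(responses)


def to_alternate_system(total, n_items=6):
    """Convert a total based on 0-4 coding to the 1-5 coding.

    Each item is one point higher under 1-5 coding, so the total
    rises by the number of items (6 for the K6, 10 for the K10).
    """
    return total + n_items


# Hypothetical respondent, items coded 0-4.
responses = [3, 2, 2, 3, 1, 2]
total = score_k6(responses)
alternate_total = to_alternate_system(total)
print(total, alternate_total)
```

For the K10, the same conversion applies with `n_items=10`, which maps the 0-40 range onto 10-50.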
For a comprehensive list of articles on the K10 and K6, click here.
All versions of the K6 and K10 are available for download on our website.
Translated versions of the K6/K10 instruments:
Are you interested in translating and using the K10/K6 in a language other than English?
Researchers and clinicians from a number of countries have contacted us about translating and validating the K10/6 in their languages. We will post each such translation on our web page when we receive it along with an acknowledgement to the person or persons who made the translation and a citation to any publication that reports the results of the validation. This memo provides a brief overview of requirements for acceptable translations and validations.
The 10 questions in the basic form of the K10 are included in the WHO World Mental Health surveys. These surveys are being carried out in 30 countries around the world and the interview schedule is being translated into 35 different languages. Some, but not all, of the collaborators in the WMH surveys are also translating the K10/K6 self-administration forms into these same languages. We are posting these versions as they become available. For those of you who are interested in translating the K10/6 self-administration forms into your language, we are using the WHO translation and back-translation protocol for all official K10/K6 translations. Please write and ask us for this protocol. We can then tell you if we already have another group working on a translation in your language and, if not, we can discuss with you a time line for your translation.
K10/K6 validations are being done in primary care clinics, community mental health centers, and social welfare offices. The validation standards differ in each case, but our preference is for validation to be based on a semi-structured research diagnostic interview, such as the SCID, and to validate enough respondents to have the statistical power needed for evaluation. A minimal design would be to carry out a clinical evaluation of 50 people who have a positive K6 score (10 or more on the 0-24 scale) and 50 who have a score in the range 0-9, although larger numbers would be better.
To contact us about K10/K6 translation and validation, email Kate Peruzzini at firstname.lastname@example.org.
Is a formal request to use the scale needed? If yes, how?
No formal request is needed, but we would appreciate it if you cited the
following article (see below) when you use the scale and if you would
send us citations to all publications that use the scale.
Kessler, R.C., Barker, P.R., Colpe, L.J., Epstein, J.F., Gfroerer, J.C.,
Hiripi, E., Howes, M.J., Normand, S-L.T., Manderscheid, R.W., Walters,
E.E., Zaslavsky, A.M. (2003). Screening for serious mental illness in
the general population. Archives of General Psychiatry 60(2),
Q: Where can I find scoring rules for the K10 and K6?
A: Simple scoring is to convert the K6 to a 0-24 scale (each of the six
questions coded 0-4 and summed) and the K10 to a 0-40 scale. The calibration
study that was published recently in Archives of General Psychiatry
(see publications above) shows that a cut point of 13+ on the K6 is the
optimal cut point for assessing the prevalence of SMI in the national
population, where "optimal" means equalizing false positives and false
negatives.
However, as you might know, this cut point is optimal only in a population
that has the same prevalence as the total US population. As a result,
even though the 13+ rule will generally get you a fairly good estimate
of the prevalence of SMI in your population, that's not the correct way
to estimate the prevalence of SMI. The correct way is to use information
about the sensitivity and specificity of the scales in your population
to generate a prevalence estimate. In the absence of such information,
you might want to use sensitivity and specificity information from the
NCS-R in populations of various sorts (e.g., primary care populations,
low-income community populations, etc.).
Kessler, R.C., Green, J.G., Gruber, M.J., Sampson, N.A., Bromet, E., Cuitan,
M., Furukawa, T.A., Gureje, O., Hinkov, H., Hu, C.Y., Lara, C., Lee, S.,
Mneimneh, Z., Myer, L., Oakley-Browne, M., Posada-Villa, J., Sagar, R.,
Viana, M.C., Zaslavsky, A.M. (2010). Screening for serious mental illness
in the general population with the K6 screening scale: results from the
WHO World Mental Health (WMH) survey initiative. International Journal
of Methods in Psychiatric Research 19(S1), 4-22.
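One common way to turn sensitivity and specificity into a prevalence estimate, as the answer above recommends, is the Rogan-Gladen correction. The sketch below is offered as an illustration of that standard estimator, not as the method used in the publications cited here. The function name `corrected_prevalence` is hypothetical, and the sensitivity, specificity, and screen-positive rate in the example are made-up numbers, not published K6 values.

```python
def corrected_prevalence(screen_positive_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from a screening scale.

    screen_positive_rate: proportion of the sample at or above the cut point
    sensitivity, specificity: properties of the scale in this population
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("the screen must perform better than chance")
    estimate = (screen_positive_rate + specificity - 1.0) / denominator
    # Sampling error can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Hypothetical inputs: 10% of respondents score 13+ on the K6, and the
# scale is assumed to have sensitivity 0.80 and specificity 0.95 here.
estimate = corrected_prevalence(0.10, sensitivity=0.80, specificity=0.95)
print(round(estimate, 3))
```

With these illustrative inputs the corrected estimate is lower than the raw 10% screen-positive rate, because some of the people who screen positive are expected to be false positives.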
Q. Has the scale been used in surveys of racial and
ethnic minorities?
The K10 has been used in the WHO World Mental Health (WMH) surveys. WMH
includes surveys of nearly 250,000 people in 30 countries throughout the
world. Calibration of K6/K10 scores to clinical assessments is currently
being carried out and results will be posted when they become available.
In the US, a number of studies are using the K6 or the K10 in
minority samples.
Q. How does the scale work in minority populations?
Scale properties are stable in minority sub-samples of our large surveys.
Q. Has this scale been translated into languages other than English?
Yes, see http://www.hcp.med.harvard.edu/ncs/k6_scales.php.
Q. Have you ever used the CDC measure of "frequent mental
distress (FMD)" derived from the question "Thinking
about your mental health, for how many days during the past 30 days
was your mental health not good?" in the same survey where the K6
questions are used? If yes, to what extent do FMD and psychological
distress overlap? Do they measure different things?
The FMD question has a much more skewed distribution than the K6
or K10 and should only be used when you are interested in fleshing out
the top 1-2% of the population. The K6 is best to use when you want to
look at serious mental illness (about 5-8% of the population). The K6
was not designed to distinguish the top 1-2% from the rest of the population.
If you want to have sensitivity throughout the severity range, you can
use the K6 or K10 along with the FMD question. Remember, though, that
in order to have adequate statistical power for substantive analysis of the
top 1-2% of the population, you need a very large sample.
Q: A question has been raised by several colleagues about the likelihood
of seriously ill persons who are being successfully treated (manic depressives
with lithium, for example) being picked up by the K6. How do we handle
this?
A: Symptom screening scales miss people who are successfully treated.
It takes much more extensive instruments to find people who have, say,
bipolar disorder, but who are being successfully treated. A quick thing
you might want to do if you want to know this stuff in a rough and ready
way: You can ask people if they are in treatment and, if so, you can ask
them to think about the month before treatment and to tell you how they
would respond to the K10/K6 questions for that month. Note that this approach
will only expand your assessment to include people who might have high
distress right now were it not for current treatment (e.g., taking their
lithium every day). You will not pick up people with a history of SMI
who are currently doing well without treatment.
Q. I haven’t been able to find clear guidance regarding
the proper coding of records with missing values. Should one set to missing
in the final variable all records missing a value for even one of the
original variables, only records missing values for more than some minimum
subset of the original variables, or some other rule (clearly records
missing values for all six have to be coded to missing)?
A: We never made recommendations on how to handle missing values in our
publications on K6 scoring. There are lots of available options and the
best way is usually to handle missing values on the K6 in the same way
you handle missing values on the other variables in your survey. We would
normally use multiple imputation if we had other variables in the dataset
that were strongly related to K6 scores. If not, then you have no basis
for making imputations and that means you will get equivalent results
from deleting the cases, using hot deck imputation, or making a weighting
adjustment. Or you could be conservative in that case and recode missing
values to the mode, which will always be a score of "none of the
time" for items like this. But these are matters of taste and I don't
think it is appropriate for the developer of a scale to dictate when it
comes to matters of taste.
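Two of the simpler options mentioned above, deleting incomplete cases and recoding missing items to the mode, can be sketched as follows. This is only an illustration of those two rules, not a recommendation. The function name `score_k6_with_missing` and its `rule` parameter are hypothetical, and the sketch assumes items are coded 0-4 with "none of the time" coded 0 and missing items represented as `None`. Multiple imputation and hot deck imputation are not shown because they depend on the other variables in your dataset.

```python
NONE_OF_THE_TIME = 0  # assumed code for the modal response in 0-4 coding


def score_k6_with_missing(responses, rule="delete"):
    """Score six K6 items (coded 0-4, or None when missing).

    rule="delete": return None if any item is missing (case deletion).
    rule="mode":   recode missing items to "none of the time" and sum.
    A record with all six items missing is always scored as None.
    """
    if len(responses) != 6:
        raise ValueError("the K6 has exactly six items")
    if all(r is None for r in responses):
        return None
    if rule == "delete":
        if any(r is None for r in responses):
            return None
        return sum(responses)
    if rule == "mode":
        return sum(NONE_OF_THE_TIME if r is None else r for r in responses)
    raise ValueError("rule must be 'delete' or 'mode'")


# Hypothetical record with one missing item.
record = [1, 2, None, 0, 3, 1]
print(score_k6_with_missing(record, rule="delete"))
print(score_k6_with_missing(record, rule="mode"))
```

Note that recoding to the mode can only lower a respondent's total relative to their true score, which is why the answer above describes it as the conservative choice.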