How reliable are psychiatric diagnoses?

Is ADHD one of the only psychiatric conditions that can be diagnosed objectively?

It really depends on what you mean by “objective”, but the answer is “probably not”.

Since we do not understand the underlying causes of ADHD — or any major psychiatric disorder — we diagnose them based on clusters of symptoms.

In the United States and several other countries, a large number of psychiatrists use a book called Diagnostic and Statistical Manual of Mental Disorders (DSM).

DSM (now in its 5th, revised edition, DSM-5) essentially uses a system of checklists to enable a clinician to assess whether a person has a given disorder. This is a controversial book for various reasons, but for now it is what most psychiatrists use.

Instead of the complex philosophical question of ‘objectivity’, the usefulness of DSM can be assessed using statistical measures of ‘reliability’.

Given that one clinician uses DSM-5 to diagnose a patient with ADHD, how likely is a second clinician, independently assessing the same patient with DSM-5, to reach the same diagnosis? Measures of “test-retest reliability” capture this kind of agreement.

Here is a paper that explains the statistical measurement of reliability in some detail:

DSM-5: How Reliable Is Reliable Enough? [pdf]

There are conflicting reports on the reliability of DSM-5, but here is one paper that reports statistical assessments:

DSM-5 field trials in the United States and Canada, Part II: test-retest reliability of selected categorical diagnoses.

“There were a total of 15 adult and eight child/adolescent diagnoses for which adequate sample sizes were obtained to report adequately precise estimates of the intraclass kappa. Overall, five diagnoses were in the very good range (kappa=0.60–0.79), nine in the good range (kappa=0.40–0.59), six in the questionable range (kappa=0.20–0.39), and three in the unacceptable range (kappa values <0.20). Eight diagnoses had insufficient sample sizes to generate precise kappa estimates at any site.”


“Two were in the very good (kappa=0.60–0.79) range: autism spectrum disorder and ADHD.”

For more on the quantity reported here, kappa, see this paper:

Interrater reliability: the kappa statistic

The quantity kappa is 1 when the raters (clinicians in this case) agree perfectly, and 0 when they agree no more often than would be expected by chance. Negative values are possible, and mean that agreement was worse than chance.
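To make the idea concrete, here is a minimal sketch of Cohen's kappa for two clinicians rating the same patients. It compares the agreement actually observed with the agreement expected if each clinician handed out labels at random at their own base rates. The function name and the ratings are made up for illustration; they are not data from the field trials, and the field trials report an intraclass kappa, which is computed somewhat differently.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)

    # Observed agreement: fraction of cases where both raters gave the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # rater assigned labels independently at their own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: 10 patients, each assessed by two clinicians.
clinician_1 = ["ADHD"] * 5 + ["no ADHD"] * 5
clinician_2 = ["ADHD"] * 4 + ["no ADHD"] * 5 + ["ADHD"]

# The clinicians agree on 8 of 10 patients, but half of that agreement
# would be expected by chance alone.
print(round(cohens_kappa(clinician_1, clinician_2), 2))  # prints 0.6
```

In this toy example the two clinicians agree on 80% of patients, yet kappa is only about 0.6, because with a 50/50 split of labels they would agree half the time just by guessing.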

As I said before, the DSM is controversial — and not just because of reliability issues. Here is a sampling of papers and popular articles on the general topic:

Academic articles

Reliability in Psychiatric Diagnosis with the DSM: Old Wine in New Barrels

“However, the standards for evaluating κ-statistics have relaxed substantially over time. In the early days of systematic reliability research, Spitzer and Fleiss [4] suggested that in psychiatric research κ-values ≥0.90 are excellent; values between 0.70 and 0.90 are good, while values ≤0.70 are unacceptable. In 1977, Landis and Koch [5] proposed the frequently used thresholds: values ≥0.75 are excellent; values between 0.40 and 0.75 indicate fair to good reliability, and values ≤0.40 indicate poor reliability. More recently, Baer and Blais [6] suggested that κ-values >0.70 are excellent; values between 0.60 and 0.70 are good; values between 0.41 and 0.59 are questionable, and values ≤0.40 are poor. Considering these standards, the norms used in the DSM-5 field trial are unacceptably generous.”

The Reliability of Psychiatric Diagnoses: Point—Our psychiatric Diagnoses are Still Unreliable

“Today, 26 years later, did the DSM system succeed in improving the reliability of psychiatric diagnoses? Two answers exist. The DSM did improve the reliability of psychiatric diagnoses at the research level. If a researcher or a clinician can afford to spend 2 to 3 hours per patient using the DSM criteria and a structured interview or a rating scale, the reliability would improve. [13] For psychiatrists and clinicians, who live in a world without hours to spare, the reliability of psychiatric diagnoses is still poor. [2,3]”

Diagnostic Issues and Controversies in DSM-5: Return of the False Positives Problem.

“The fifth revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) was the most controversial in the manual’s history. This review selectively surveys some of the most important changes in DSM-5, including structural/organizational changes, modifications of diagnostic criteria, and newly introduced categories. It analyzes why these changes led to such heated controversies, which included objections to the revision’s process, its goals, and the content of altered criteria and new categories. The central focus is on disputes concerning the false positives problem of setting a valid boundary between disorder and normal variation. Finally, this review highlights key problems and issues that currently remain unresolved and need to be addressed in the future, including systematically identifying false positive weaknesses in criteria, distinguishing risk from disorder, including context in diagnostic criteria, clarifying how to handle fuzzy boundaries, and improving the guidelines for “other specified” diagnosis.”

Popular articles

The DSM-5 Controversy

“You will need to display fewer and fewer symptoms to get labeled with certain disorders, for example Attention Deficit Disorder and Generalized Anxiety Disorder. Children will have more and more mental disorder labels available to pin on them. These are clearly boons to the mental health industry but are they legitimate additions to the manual that mental health professionals use to diagnose their clients?”

DSM 5 Is Guide Not Bible—Ignore Its Ten Worst Changes

“This is the saddest moment in my 45 year career of studying, practicing, and teaching psychiatry. The Board of Trustees of the American Psychiatric Association has given its final approval to a deeply flawed DSM 5 containing many changes that seem clearly unsafe and scientifically unsound. My best advice to clinicians, to the press, and to the general public – be skeptical and don’t follow DSM 5 blindly down a road likely to lead to massive over-diagnosis and harmful over-medication. Just ignore the ten changes that make no sense.”

Normal or Not? New Psychiatric Manual Stirs Controversy

“Among the flashpoints: Asperger’s disorder will be folded into autism spectrum disorder; grief will no longer exempt someone from a diagnosis of depression; irritable children who throw frequent temper tantrums can be diagnosed with disruptive mood dysregulation disorder.

“One prominent critic has been Allen Frances, a professor emeritus of psychiatry at Duke University who chaired the DSM-IV task force.

“Frances charges that through a combination of new disorders and lowered thresholds, the DSM-5 is expanding the boundaries of psychiatry to encompass many whom he describes as the “worried well.””


A Clockwork Orange? (A brief musing on the concept of a neural “code”)

I was asked this question on Quora:

Are there many layers of neural codes from the human retina to the optic nerve and the optic nerve to the brain, or are they essentially same signals relayed?

Here’s how I responded:

Here’s a question: in a system composed of clockwork, is there a “code”?

I ask this because I find that the “code” metaphor is often misleading when thinking about biology. Codes are composed of symbols. But it is not clear that neurons communicate using symbols.

The way a neuron affects other neurons is more like how a gear affects other gears. There is no code — there is causality. An active neuron releases some neurotransmitter, and this in turn makes other neurons more active. It’s like a complex network of dominoes.

Does the idea of a “code” help us understand how one domino affects the next one in the chain?

I admit that by the time a human is thinking in terms of words and symbols, “code” is probably a useful metaphor for what is going on. But the origin of coding schemes remains a great mystery in neuroscience, cognitive science, and artificial intelligence. So I recommend starting with a much less loaded metaphor, such as clockwork or dominoes. Thinking in mechanical terms helps us realize what exactly neuroscience and AI research are trying to achieve.

For now, there is a fascinating gap in our understanding of what exactly codes are in the first place.

Anyway, if you are interested in the causal “domino effect” that starts at the retina, have a look at this answer:

Yohan John’s answer to In which format is information stored in the brain?


What are emotions?

What indeed?

My research is on cognitive-emotional interaction, so I suppose I am qualified to answer this question. 🙂

But my answer cannot be the answer, since there is actually no consensus among scientists concerning the definition of emotions.

[Illustration of grief from Charles Darwin’s book The Expression of the Emotions in Man and Animals.]


Dopamine is not the “feel good” molecule (and the very concept of a feel good molecule is meaningless)


Dopamine is not the feel good molecule or the basis of pleasure. The idea that any molecule considered in isolation could be the basis of a subjective experience is basically nonsense.

For people who can’t really reason through this idea, there is plenty of experimental evidence showing the complexity of each and every “celebrity” neurochemical — dopamine, serotonin, oxytocin, and so on.


What is neuroplasticity?

I was asked this question on Quora:

Can you explain to a layman what neuroplasticity entails?

Neuroplasticity is the umbrella term for all of the brain’s mechanisms for learning and memory.

Since the average layperson already knows about learning and memory, I’m not sure whether there are any interesting implications.

Unless of course you are surprised that the brain is involved in learning and memory. Then the implications are vast. 🙂


No New Neurons? No Problem!

This answer was written in response to the following Quora question:

New research has found no neurogenesis in human adults, could this mean there is none or could it mean that neural stem cells are undetectable with the used techniques? What are your thoughts on this?

It’s good that you’re thinking of such things, since that is exactly what researchers themselves have to do, and what reviewers do. In order to show that the method works, there have to be adequate controls as part of the experiment.

And this is in fact the case. The paper would not have been published without controls.


“The first rule of intelligence: Don’t talk about your intelligence”

That line is from an article in The Atlantic about how bad people are at assessing themselves:

People Don’t Actually Know Themselves Very Well

“The first rule of intelligence: Don’t talk about your intelligence. It’s something you prove, not something you claim. As comedian Patton Oswalt quipped about humor, the only person who goes around saying “I’m funny” is a not-funny person. If you were really funny, you’d just make people laugh.”

To me this kind of thing is pretty obvious, but I guess some people really need to be reminded of it.

Here’s another paragraph with several important reminders, particularly for people who blather about intelligence and cognitive biases:

“This is why people consistently overestimate their intelligence, a pattern that seems to be more pronounced among men than women. It’s also why people overestimate their generosity: It’s a desirable trait. And it’s why people fall victim to my new favorite bias: the I’m-not-biased bias, where people tend to believe they have fewer biases than the average American. But you can’t judge whether you’re biased, because when it comes to yourself, you’re the most biased judge of all. And the more objective people think they are, the more they discriminate, because they don’t realize how vulnerable they are to bias.”