How to Calculate an Inter-Rater Agreement

Inter-rater agreement is a measure of how consistent measurements are between two or more observers (or "raters"). For example, medical researchers often use a group of physicians to observe and categorize the effects of an experimental treatment. Measuring the consistency between observers is important because we want to be sure that any effects we see are due to the experiment and not to differences between the observers. The primary statistic used to measure inter-rater agreement is called "kappa."

Instructions

    • 1

      Calculate percent agreement by dividing the number of observations on which the raters agree by the total number of observations. Consider the example of an experiment involving 100 patients given a treatment for asthma. Two physicians are asked to categorize each patient into one of two categories: controlled (yes) or not controlled (no) asthma symptoms. The results might look like this:

      Both Raters Yes 70

      Both Raters No 15

      Rater #1 Yes/Rater #2 No 10

      Rater #1 No/Rater #2 Yes 5

      Total observations: 100

      In this example, the percent agreement (Pa) would be the number of observations in which the raters agree (70+15) divided by the total observations (100), or 85 percent. This will be used in the numerator in the final equation.
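
      This arithmetic is easy to script. Below is a minimal Python sketch of the calculation, using hypothetical variable names for the four cells of the agreement table above.

      # Hypothetical cell counts from the 2x2 agreement table above
      both_yes = 70   # both raters: controlled
      both_no = 15    # both raters: not controlled
      yes_no = 10     # rater 1 yes, rater 2 no
      no_yes = 5      # rater 1 no, rater 2 yes

      total = both_yes + both_no + yes_no + no_yes   # 100 observations
      pa = (both_yes + both_no) / total              # percent agreement
      print(pa)                                      # 0.85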

    • 2

      Calculate the expected agreement due to random chance. Some of the observed agreement will be due to pure luck. We can calculate how much agreement would be seen if the observations were completely random.

      In our example:

      Both Yes 70

      Both No 15

      #1Yes/#2 No 10

      #1 No/#2 Yes 5

      Total observations: 100

      Dr. #1 rated Yes 80/100 or 80 percent of the time (.80).

      Dr. #1 rated No 20/100 or 20 percent of the time (.20).

      Dr. #2 rated Yes 75/100 or 75 percent of the time (.75).

      Dr. #2 rated No 25/100 or 25 percent of the time (.25).

      If the observations were random, the probability of both rating Yes would be (.80)x(.75), or 60 percent, and the probability of both rating No would be (.20)x(.25), or 5 percent.

      So the total probability of agreement due to chance (Pe) would be the sum of these: (.60)+(.05), or 65 percent. This will be used in both the numerator and the denominator of the final equation.
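
      Continuing the Python sketch from Step 1 (the counts are repeated here so the snippet runs on its own), the chance agreement follows from each rater's marginal rates:

      both_yes, both_no, yes_no, no_yes, total = 70, 15, 10, 5, 100

      # Marginal "yes"/"no" rates for each rater
      r1_yes = (both_yes + yes_no) / total   # 0.80
      r1_no = (both_no + no_yes) / total     # 0.20
      r2_yes = (both_yes + no_yes) / total   # 0.75
      r2_no = (both_no + yes_no) / total     # 0.25

      # Agreement expected if the raters answered independently at these rates
      pe = r1_yes * r2_yes + r1_no * r2_no   # 0.60 + 0.05 = 0.65
      print(pe)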

    • 3

      Calculate kappa by using the following equation:

      k = (Pa - Pe) / (1 - Pe)

      In our example,

      k = (.85 - .65) / (1 - .65) = .57, or 57 percent
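
      Plugging the two values from Steps 1 and 2 into the same Python sketch:

      pa, pe = 0.85, 0.65            # from Steps 1 and 2
      kappa = (pa - pe) / (1 - pe)   # Cohen's kappa
      print(round(kappa, 2))         # 0.57

      If the raw per-patient labels are available as two lists, the same statistic can also be computed directly with a library routine such as scikit-learn's cohen_kappa_score.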

    • 4

      Evaluate kappa to determine if inter-rater agreement is strong. A good rule of thumb is that kappas above 80 percent are excellent and above 60 percent are good. Anything below 60 percent is considered less than optimal agreement.

      In our example, 57 percent is on the borderline of good inter-rater agreement, so study results should be interpreted with this in mind.

    • 5

      Evaluate kappa for more than two raters by using Fleiss' kappa calculations. This is a much more complicated calculation that should be done by computer.
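
      A minimal sketch of that computer calculation, assuming the Python statsmodels package is installed (its inter_rater module includes a Fleiss' kappa routine); the ratings matrix below is made up for illustration, with one row per patient and one column per rater:

      import numpy as np
      from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

      # Hypothetical ratings: 1 = controlled, 0 = not controlled
      ratings = np.array([
          [1, 1, 1],
          [1, 0, 1],
          [0, 0, 0],
          [1, 1, 0],
          [0, 1, 0],
      ])

      table, _ = aggregate_raters(ratings)   # per-patient counts in each category
      print(fleiss_kappa(table))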
