Inter-rater reliability
If you want an inter-rater reliability measure for dichotomous ratings made by more than two raters, where not every rater rated every item, Fleiss and Cuzick (1979) is the reference you will find.
For example, we asked researchers which models they consider to be process models (a dichotomous rating). We asked about 60 researchers (more than two raters), and not everyone was familiar with every model (not all raters rated all items).
Fleiss and Cuzick proposed a kappa measure that ranges from a minimum value, which depends on the average number of raters per item, up to 1. Here is a function to calculate their kappa measure in R.
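For orientation, it may help to see the statistic written out first. This is a sketch of the estimator as we understand it from Fleiss and Cuzick (1979). The notation is our own assumption, not taken from the text above: N is the number of items, n_i is the number of raters who rated item i, and x_i is the number of those raters who gave the positive rating. Please check it against the paper before relying on it.

```latex
\bar{n} = \frac{1}{N}\sum_{i=1}^{N} n_i, \qquad
\bar{p} = \frac{\sum_{i=1}^{N} x_i}{N\,\bar{n}}, \qquad
\bar{q} = 1 - \bar{p}

\hat{\kappa} = 1 - \frac{\sum_{i=1}^{N} x_i\,(n_i - x_i)/n_i}{N\,(\bar{n} - 1)\,\bar{p}\,\bar{q}}
```

Under this form, an item on which all raters agree contributes zero to the numerator, so perfect agreement on every item gives a kappa of 1. If every item is split exactly in half, the expression reduces to -1/(n̄ - 1), which is the minimum value as far as we can tell.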