primeqa.mrc.metrics.tydi_f1.tydi_eval.compute_macro_f1

primeqa.mrc.metrics.tydi_f1.tydi_eval.compute_macro_f1(answer_stats, prefix='')

Computes F1, precision, and recall for a list of answer scores.

This computes the language-wise macro F1. For minimal answers, a partial-match score based on F1 is also computed, and it enters this computation through answer_stats.

Parameters
  • answer_stats – List of per-example scores.

  • prefix – Prefix to prepend to each key of the score dictionary. Defaults to the empty string.

Returns

Dictionary mapping measurement names to scores.
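
Example

The following is a minimal, self-contained sketch of how such a computation might look; it is not the library's implementation. It assumes each entry of answer_stats is a tuple of the form (has_gold, has_pred, f1, score), which this page does not specify. The names macro_f1_sketch and safe_divide, the output keys, and the empty-input handling are likewise illustrative assumptions.

```python
from collections import OrderedDict


def safe_divide(x, y):
    """Return x / y, or 0.0 when the denominator is zero."""
    return x / y if y != 0 else 0.0


def macro_f1_sketch(answer_stats, prefix=""):
    """Hypothetical stand-in for compute_macro_f1 (not the library code).

    Assumes each element of answer_stats is a tuple
    (has_gold, has_pred, f1, score); this layout is an assumption.
    """
    if not answer_stats:
        # Assumed behaviour for empty input: report zeros.
        return OrderedDict(
            [(prefix + "n", 0), (prefix + "f1", 0.0),
             (prefix + "precision", 0.0), (prefix + "recall", 0.0)]
        )

    has_gold, has_pred, f1, _ = zip(*answer_stats)

    # Per-example F1 is summed as (possibly partial) credit, then divided
    # by the number of predictions (precision) or gold answers (recall).
    precision = safe_divide(sum(f1), sum(has_pred))
    recall = safe_divide(sum(f1), sum(has_gold))
    overall_f1 = safe_divide(2 * precision * recall, precision + recall)

    return OrderedDict(
        [(prefix + "n", len(answer_stats)),
         (prefix + "f1", overall_f1),
         (prefix + "precision", precision),
         (prefix + "recall", recall)]
    )


# Four made-up examples: (has_gold, has_pred, f1, score).
stats = [
    (True, True, 1.0, 0.9),    # exact match
    (True, True, 0.5, 0.8),    # partial match
    (True, False, 0.0, 0.0),   # gold answer exists, nothing predicted
    (False, True, 0.0, 0.4),   # prediction made, no gold answer
]
result = macro_f1_sketch(stats, prefix="dev_")
```

With this made-up input, precision and recall are both 1.5 / 3, so every score in result is 0.5 and result["dev_n"] is 4. The prefix lets one dictionary hold several groups of scores without key collisions.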