primeqa.mrc.metrics.nq_f1.nq_eval.score_answers

primeqa.mrc.metrics.nq_f1.nq_eval.score_answers(gold_annotation_dict, pred_dict, skip_missing_example_ids: bool = False, long_non_null_threshold: int = 2, short_non_null_threshold: int = 2)

Scores all answers for all documents.

Parameters
  • gold_annotation_dict – a dict mapping example ids to lists of NQLabels.

  • pred_dict – a dict mapping example ids to lists of NQLabels.

  • skip_missing_example_ids – if True, score only the example ids present in both the gold and prediction dicts.

  • long_non_null_threshold – minimum number of non-null long answer spans among the gold annotations before the question is considered to require a non-null long answer.

  • short_non_null_threshold – minimum number of non-null short answer spans among the gold annotations before the question is considered to have a non-null short answer.

Returns

  • long_answer_stats – list of scores for long answers.

  • short_answer_stats – list of scores for short answers.

Return type

Tuple of (long_answer_stats, short_answer_stats)
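A minimal usage sketch, not taken from the library's documentation: it assumes gold_annotations and predictions have already been built elsewhere as dicts mapping example ids to lists of NQLabels (NQLabel construction is library-specific and omitted here).

```python
from primeqa.mrc.metrics.nq_f1.nq_eval import score_answers

# Assumption: these dicts are built elsewhere (e.g. from NQ gold files and
# model predictions); each maps an example id to a list of NQLabels.
gold_annotations = {}  # example_id -> [NQLabel, ...]
predictions = {}       # example_id -> [NQLabel, ...]

# Score only the example ids present in both dicts, and treat a question as
# having a non-null gold answer once at least 2 annotated spans are non-null.
long_answer_stats, short_answer_stats = score_answers(
    gold_annotations,
    predictions,
    skip_missing_example_ids=True,
    long_non_null_threshold=2,
    short_non_null_threshold=2,
)
```

The two returned lists can then be aggregated into precision/recall/F1 figures for long and short answers respectively.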