primeqa.calibration.confidence_scorer.ConfidenceScorer
- class primeqa.calibration.confidence_scorer.ConfidenceScorer(confidence_model_path=None)
Bases: object
Class for confidence scoring.
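A minimal instantiation sketch, assuming only the constructor signature shown above; the model path below is a hypothetical placeholder, not part of the documented API.

```python
from primeqa.calibration.confidence_scorer import ConfidenceScorer

# Load a previously trained confidence model from a hypothetical path.
scorer = ConfidenceScorer(confidence_model_path="output/confidence_model")

# confidence_model_path defaults to None, e.g. when no model has been trained yet.
untrained_scorer = ConfidenceScorer()
```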
Methods
Make confidence features from the predictions (top-k answers) of an example.
Make training data from prediction file and reference file for confidence model training.
Check if the confidence model exists.
Compute confidence score for each answer in the top-k predictions.
Calculate the F1-style overlap score between ground truth and prediction.
- classmethod make_features(example_predictions) → list
Make confidence features from the predictions (top-k answers) of an example.
- Parameters
example_predictions – Top-k answers generated by the postprocessor ExtractivePostProcessor. Each prediction contains 'example_id', 'cls_score', 'start_logit', 'end_logit', 'span_answer': {'start_position', 'end_position'}, 'span_answer_score', 'start_index', 'end_index', 'passage_index', 'target_type_logits', 'span_answer_text', 'yes_no_answer', 'start_stdev', 'end_stdev', and 'query_passage_similarity'.
- Returns
List of features used for confidence scoring.
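A hedged sketch of calling make_features(). The keys follow the ExtractivePostProcessor output listed above, but every value below is made up for illustration.

```python
from primeqa.calibration.confidence_scorer import ConfidenceScorer

# Illustrative top-k predictions for one example; values are invented,
# only the keys follow the structure described above.
example_predictions = [
    {
        "example_id": "q-0001",
        "cls_score": -1.2,
        "start_logit": 5.3,
        "end_logit": 4.8,
        "span_answer": {"start_position": 17, "end_position": 42},
        "span_answer_score": 9.1,
        "start_index": 17,
        "end_index": 42,
        "passage_index": 0,
        "target_type_logits": [0.1, 0.7, 0.1, 0.05, 0.05],
        "span_answer_text": "in 1997",
        "yes_no_answer": 0,
        "start_stdev": 0.4,
        "end_stdev": 0.5,
        "query_passage_similarity": 0.83,
    },
    # ... further top-k candidates with the same structure
]

# make_features is a classmethod, so no trained confidence model is needed here.
features = ConfidenceScorer.make_features(example_predictions)
print(len(features))  # one feature entry per top-k answer (assumed)
```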
- classmethod make_training_data(prediction_file: str, reference_file: str, overlap_threshold: float = 0.5) → tuple
Make training data from prediction file and reference file for confidence model training.
- Parameters
prediction_file – File containing the QA results generated by evaluate() of the MRC trainer (i.e. eval_predictions.json).
reference_file – File containing the ground truth generated by evaluate() of the MRC trainer (i.e. eval_references.json).
overlap_threshold – Threshold for deciding whether a prediction is accepted as a correct answer.
- Returns
X – Array of features. Y – Array of class labels (0: incorrect, 1: correct).
- Return type
tuple
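A sketch of building training data from the MRC trainer's evaluation output; the output directory is a placeholder, and the file names follow the defaults mentioned above.

```python
from primeqa.calibration.confidence_scorer import ConfidenceScorer

# Build (X, Y) from the evaluation files written by the MRC trainer's evaluate().
X, Y = ConfidenceScorer.make_training_data(
    prediction_file="output/eval_predictions.json",
    reference_file="output/eval_references.json",
    overlap_threshold=0.5,  # threshold for accepting a prediction as correct
)

print(len(X), len(Y))  # one feature row and one 0/1 label per prediction
print(set(Y))          # {0, 1}: 0 = incorrect, 1 = correct
```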
- model_exists() → bool
Check if the confidence model exists.
- predict_scores(example_predictions) → list
Compute confidence score for each answer in the top-k predictions.
- Parameters
example_predictions – Top-k answers generated by the postprocessor ExtractivePostProcessor.
- Returns
List of scores for each of the top-k answers.
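A sketch combining model_exists() and predict_scores(). The model path is a placeholder, and example_predictions is assumed to be a top-k answer list structured as in the make_features sketch above.

```python
from primeqa.calibration.confidence_scorer import ConfidenceScorer

scorer = ConfidenceScorer(confidence_model_path="output/confidence_model")  # placeholder path

# example_predictions: the top-k answer list for one example, produced by
# ExtractivePostProcessor (same structure as in the make_features sketch above).
if scorer.model_exists():
    scores = scorer.predict_scores(example_predictions)
    # One confidence score per top-k answer; here used to pick the most reliable one.
    best_score, best_answer = max(zip(scores, example_predictions), key=lambda pair: pair[0])
    print(best_answer["span_answer_text"], best_score)
else:
    print("No trained confidence model found at the given path.")
```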
- classmethod reference_prediction_overlap(ground_truth, prediction) → float
Calculate the F1-style overlap score between ground truth and prediction.
- Parameters
ground_truth – List of ground truth spans, each containing "start_position" and "end_position".
prediction – Prediction containing “start_position” and “end_position”.
- Returns
Overlap score between ground truth and prediction.
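A small sketch of the overlap check; the start/end positions are made-up offsets and the resulting score is not asserted.

```python
from primeqa.calibration.confidence_scorer import ConfidenceScorer

# Hypothetical gold spans and one predicted span, each given as start/end positions.
ground_truth = [
    {"start_position": 17, "end_position": 42},
    {"start_position": 16, "end_position": 42},
]
prediction = {"start_position": 20, "end_position": 42}

overlap = ConfidenceScorer.reference_prediction_overlap(ground_truth, prediction)
print(overlap)  # F1-style overlap; compared against overlap_threshold in make_training_data
```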