primeqa.qg.models.hybrid_qg.path_sampler.PathSampler

class primeqa.qg.models.hybrid_qg.path_sampler.PathSampler(lang)

Bases: object

Samples hybrid chains from a hybrid context of table and text. The sampled chains serve as input to HybridQG inference, which generates relevant questions from them.

Example chain:

<answer> Mactan-Cebu International Airport </answer> <chain> Mactan-Cebu International Airport located on Mactan Island , is the second busiest airport in the Philippines . <sep> The Island is Cebu. <hsep> The Population is 3,979,155.</chain>

Each sampled chain has two parts: a named entity or a cell text taken from the hybrid context as a possible answer, and a set of sentences from the context that refer to that named entity. Named entities are extracted with the NER component of the Stanza library. Currently ONLY English is supported!
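The layout of the example chain can be illustrated with a small parser. This is a minimal sketch that relies only on the markers visible in the example (`<answer>`, `<chain>`, `<sep>`, `<hsep>`); the helper name `parse_chain` is hypothetical and is not part of PrimeQA, and the exact meaning of each separator is not assumed here.

```python
import re

# Hypothetical helper, not part of PrimeQA: it only mirrors the layout
# of the example chain shown above.
CHAIN_PATTERN = re.compile(
    r"<answer>\s*(?P<answer>.*?)\s*</answer>\s*<chain>\s*(?P<chain>.*?)\s*</chain>",
    re.DOTALL,
)


def parse_chain(chain_str):
    """Split a chain string into its answer and its context segments."""
    match = CHAIN_PATTERN.search(chain_str)
    if match is None:
        raise ValueError("string does not follow the <answer>/<chain> layout")
    # Split the chain body on either separator marker and drop empty pieces.
    segments = [
        seg.strip()
        for seg in re.split(r"<sep>|<hsep>", match.group("chain"))
        if seg.strip()
    ]
    return match.group("answer"), segments


example = (
    "<answer> Mactan-Cebu International Airport </answer> "
    "<chain> Mactan-Cebu International Airport located on Mactan Island , "
    "is the second busiest airport in the Philippines . "
    "<sep> The Island is Cebu. <hsep> The Population is 3,979,155.</chain>"
)

answer, segments = parse_chain(example)
print(answer)
for segment in segments:
    print("-", segment)
```

Splitting on both separators at once keeps the sketch neutral about how text sentences and table-derived statements are delimited; a stricter parser would need the separator semantics confirmed against the PrimeQA source.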

Methods

aggregate_data

create_chains

create_qg_input

Creates input for QG inference.

sample_paths

tokenize_text

create_qg_input(data_list, num_questions_per_instance=5, id_list=[])

Creates input for QG inference. Samples named entities or cell text as possible answers from a hybrid context, i.e., a table and its linked text passages.

Parameters
  • data_list (list) – List of tuples; each tuple contains two elements: a table and its passages.

  • num_questions_per_instance (int) – Number of chains to sample from each hybrid input.

Returns
  • input_str_list (list) – List of hybrid chains sampled.

  • ans_list (list) – List of answers sampled.