primeqa.qg.models.hybrid_qg.path_sampler.PathSampler
- class primeqa.qg.models.hybrid_qg.path_sampler.PathSampler(lang)
Bases: object
Samples hybrid chains from a hybrid context consisting of a table and its linked text. The sampled chains serve as input to HybridQG inference, which generates relevant questions from them.
- Example chain:
<answer> Mactan-Cebu International Airport </answer> <chain> Mactan-Cebu International Airport located on Mactan Island , is the second busiest
airport in the Philippines . <sep> The Island is Cebu. <hsep> The Population is 3,979,155.</chain>
Each sampled chain has two parts: a named entity or a cell text taken from the hybrid context as a possible answer, and a set of sentences from the context that refer to that named entity. Named entities are extracted with the NER component of the Stanza library. Currently only English is supported.
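The two-part layout described above can be made concrete by taking a chain string apart. The following is a minimal sketch, not part of PrimeQA: `parse_chain` is a hypothetical helper, and the assumption that `<sep>` and `<hsep>` simply delimit the individual sentences is inferred from the single example chain shown above.

```python
import re

# Hypothetical helper (not part of PrimeQA): splits a chain string into
# its answer and its list of context sentences.
CHAIN_RE = re.compile(
    r"<answer>(.*?)</answer>\s*<chain>(.*?)</chain>", re.DOTALL
)


def parse_chain(chain_str):
    """Return (answer, sentences) for a string in the example chain format."""
    match = CHAIN_RE.search(chain_str)
    if match is None:
        raise ValueError("string does not look like a hybrid chain")
    # Collapse runs of whitespace left over from tokenization or line breaks.
    answer = " ".join(match.group(1).split())
    # Assumption: both <sep> and <hsep> act as sentence delimiters.
    parts = re.split(r"<sep>|<hsep>", match.group(2))
    sentences = [" ".join(part.split()) for part in parts if part.strip()]
    return answer, sentences


example = (
    "<answer> Mactan-Cebu International Airport </answer> "
    "<chain> Mactan-Cebu International Airport located on Mactan Island , "
    "is the second busiest airport in the Philippines . "
    "<sep> The Island is Cebu. <hsep> The Population is 3,979,155.</chain>"
)
answer, sentences = parse_chain(example)
```

Here `answer` holds the candidate answer and `sentences` holds the three context sentences, one from the passage and two derived from table cells.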
Methods
aggregate_data
create_chains
create_qg_input
Creates input for QG inference.
sample_paths
tokenize_text
- create_qg_input(data_list, num_questions_per_instance=5, id_list=[])
Creates input for QG inference. Samples named entities or cell text as possible answers from a hybrid context, i.e., a table and its linked text passages.
- Parameters
data_list (list) – List of tuples; each tuple contains two elements: a table and its passages.
num_questions_per_instance (int) – How many chains to sample from the hybrid input.
- Returns
input_str_list (list) – List of hybrid chains sampled.
ans_list (list) – List of answers sampled.
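As an illustration of how the two returned lists might be consumed, here is a minimal sketch. `sample_chains` is a hypothetical wrapper, not a PrimeQA function. It assumes that `create_qg_input` returns the two lists together and that they are parallel (one answer per chain); neither is confirmed beyond the return description above. The commented-out construction, including the `"en"` language code, is likewise an assumption.

```python
def sample_chains(sampler, data_list, num_chains=5):
    """Pair each sampled answer with its chain string.

    `sampler` is any object exposing create_qg_input with the signature
    documented above, e.g. a PathSampler instance.
    """
    # Assumption: the method returns (input_str_list, ans_list).
    input_str_list, ans_list = sampler.create_qg_input(
        data_list, num_questions_per_instance=num_chains
    )
    # Assumption: the lists are parallel, one answer per sampled chain.
    if len(input_str_list) != len(ans_list):
        raise ValueError("expected one answer per sampled chain")
    return list(zip(ans_list, input_str_list))


# Possible usage (requires PrimeQA and Stanza; "en" is an assumed value):
# from primeqa.qg.models.hybrid_qg.path_sampler import PathSampler
# sampler = PathSampler("en")
# pairs = sample_chains(sampler, data_list, num_chains=5)
```

Keeping the wrapper independent of the concrete class makes it easy to exercise with a stand-in object before wiring in the real sampler.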