Zeroshot models

Base zeroshot model

class src.models.zeroshot.base.ZeroshotModel(config: Config, logger: Logger, model_id: str, taxonomy: Taxonomy)[source]

Bases: Model

Base model for zeroshot approaches

_load_model(model_id: str) None[source]

[Adaptation] Logic for loading zeroshot models

This function enables custom model preparation before the classification is executed

Parameters:

model_id – model_id to pull

Returns:

classify(text: str, multi_label: bool, **kwargs) dict[str, float][source]

Abstract function that executes the text classification

Parameters:
  • text – the text to classify

  • multi_label – boolean to identify if it is a multilabel problem

  • kwargs – potential extra vars

Returns:

the classification results, as a mapping from label to score
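As a rough illustration of how multi_label could change the returned dict[str, float], the sketch below turns per-label logits into scores. The function name normalize_scores and the use of a sigmoid for the multi-label case are assumptions made for this example; the project's actual scoring may differ.

```python
import math


def normalize_scores(entailment_logits: dict[str, float], multi_label: bool) -> dict[str, float]:
    """Hypothetical helper: map per-label logits to a label -> score dict."""
    if not entailment_logits:
        return {}
    if multi_label:
        # Assumption: each label is scored independently, so scores need not sum to 1.
        return {
            label: 1.0 / (1.0 + math.exp(-logit))
            for label, logit in entailment_logits.items()
        }
    # Single-label case: softmax across all labels, so the scores sum to 1.
    peak = max(entailment_logits.values())
    exps = {label: math.exp(logit - peak) for label, logit in entailment_logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}
```

With multi_label=False the labels compete with each other; with multi_label=True several labels can receive a high score at the same time.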

nli_infer(premise: str, hypothesis: str)[source]

Low-level implementation showing how zeroshot models can also be used

Parameters:
  • premise – input text

  • hypothesis – parsed label text

Returns:

score for the prediction being one of 3 hardcoded labels
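The sketch below shows how raw NLI logits for a premise/hypothesis pair could be mapped to scores for three fixed labels. The label names and their order (entailment, neutral, contradiction) and the helper name nli_scores are assumptions for illustration; the hardcoded labels in the actual implementation are not stated here and depend on the loaded model.

```python
import math

# Assumed label order; the real order depends on the loaded model's configuration.
NLI_LABELS = ("entailment", "neutral", "contradiction")


def nli_scores(logits: list[float]) -> dict[str, float]:
    """Hypothetical helper: softmax over three NLI logits, keyed by label."""
    if len(logits) != len(NLI_LABELS):
        raise ValueError(f"expected {len(NLI_LABELS)} logits, got {len(logits)}")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return {label: value / total for label, value in zip(NLI_LABELS, exps)}
```

In a zeroshot setting, the premise is the input text and the hypothesis is a sentence built from a candidate label, so the entailment score can be read as the score for that label.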

Child based zeroshot model

class src.models.zeroshot.child_labels.ChildLabelsZeroshotModel(config: Config, logger: Logger, model_id: str, taxonomy: Taxonomy)[source]

Bases: ZeroshotModel

Zeroshot approach that implements the label merging of all child labels

_prep_labels(taxonomy: Taxonomy) None[source]

The function that prepares the labels; it converts them to the format required for further processing with a model.

Parameters:

taxonomy – Taxonomy object whose labels are used

Returns:

_text_formatting(taxonomy_node: Taxonomy) str[source]

Custom text formatting logic

Parameters:

taxonomy_node – taxonomy node to format text for

Returns:

classify(text: str, multi_label: bool, **kwargs) dict[str, float][source]

Executes the text classification using the merged child labels

Parameters:
  • text – the text to classify

  • multi_label – boolean to identify if it is a multilabel problem

  • kwargs – potential extra vars

Returns:

the classification results, as a mapping from label to score
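One way to picture the merging of child labels is to flatten a node and all of its descendants into a single label text. The Node dataclass and merged_label_text function below are hypothetical stand-ins written for this example; they are not the project's Taxonomy class or its _text_formatting logic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a taxonomy node (not the project's Taxonomy)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def merged_label_text(node: Node) -> str:
    """Join a node's label with the labels of all its descendants (depth-first)."""
    labels = [node.label]
    # Children are pushed in reverse so they are visited in their original order.
    stack = list(reversed(node.children))
    while stack:
        current = stack.pop()
        labels.append(current.label)
        stack.extend(reversed(current.children))
    return ", ".join(labels)
```

The resulting string could then serve as the label text that the zeroshot model compares against the input.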

Chunked zeroshot model

class src.models.zeroshot.chunked.ChunkedZeroshotModel(config: Config, logger: Logger, model_id: str, taxonomy: Taxonomy)[source]

Bases: ZeroshotModel

Chunked zeroshot implementation, based on the regular approach, but the text is split into chunks of a maximum length

classify(text: str, multi_label: bool, **kwargs) dict[str, float][source]

[Adaptation] Text length can be predefined with kwargs (using max_length)

Abstract function that executes the text classification

Parameters:
  • text – the text to classify

  • multi_label – boolean to identify if it is a multilabel problem

  • kwargs – potential extra vars

Returns:

the classification results, as a mapping from label to score
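A minimal sketch of the chunked approach follows, assuming that max_length counts whitespace-separated words and that per-chunk scores are merged by taking the maximum per label. Both choices, and the names chunk_words and classify_chunked, are assumptions for illustration; the actual implementation may count tokens or aggregate differently.

```python
from typing import Callable


def chunk_words(text: str, max_length: int) -> list[str]:
    """Split text into chunks of at most max_length whitespace-separated words."""
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    words = text.split()
    return [" ".join(words[i:i + max_length]) for i in range(0, len(words), max_length)]


def classify_chunked(
    text: str,
    max_length: int,
    classify_chunk: Callable[[str], dict[str, float]],
) -> dict[str, float]:
    """Classify each chunk and keep the highest score seen for every label."""
    merged: dict[str, float] = {}
    for chunk in chunk_words(text, max_length):
        for label, score in classify_chunk(chunk).items():
            merged[label] = max(score, merged.get(label, 0.0))
    return merged
```

Taking the maximum means a label is reported as soon as any single chunk supports it strongly; averaging would instead favour labels that are supported throughout the text.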

Sentence based zeroshot model

class src.models.zeroshot.sentence.SentenceZeroshotModel(config: Config, logger: Logger, model_id: str, taxonomy: Taxonomy)[source]

Bases: ZeroshotModel

Zeroshot model that uses the sentence-based approach (predicts for each sentence separately)

classify(text: str, multi_label: bool, **kwargs) dict[str, float][source]

Abstract function that executes the text classification

Parameters:
  • text – the text to classify

  • multi_label – boolean to identify if it is a multilabel problem

  • kwargs – potential extra vars

Returns:

the classification results, as a mapping from label to score
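The sentence-based approach can be sketched as: split the text into sentences, classify each one, and combine the scores. The regular-expression splitter, the averaging step, and the names split_sentences and classify_by_sentence are assumptions for this example; the project may use a different sentence splitter or aggregation.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def classify_by_sentence(
    text: str,
    classify_sentence: Callable[[str], dict[str, float]],
) -> dict[str, float]:
    """Classify every sentence separately and average the scores per label."""
    sentences = split_sentences(text)
    totals: dict[str, float] = {}
    for sentence in sentences:
        for label, score in classify_sentence(sentence).items():
            totals[label] = totals.get(label, 0.0) + score
    # Assumes every sentence yields a score for the same set of labels.
    return {label: total / len(sentences) for label, total in totals.items()}
```

Here classify_sentence stands for any callable that returns a label-to-score mapping for a single sentence.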