Keras2Vec Documents
- class keras2vec.document.Document(doc_id, text, labels=[])
The Document class holds a document's content: its document id, labels, and text. These objects are passed into the Keras2Vec class, which processes them for training.
- Args:
- doc_id (int): The identification number for the document or collection of documents. These should range from 1 to num_docs, though this is not a hard constraint.
- labels (list of str/int): A list of labels that contextualize the document. For example, a sports article might be labeled ['news', 'sports']. NOTE: This is not fully implemented in the current version of Keras2Vec.
- text (str): The content of the document.
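The arguments above can be illustrated with a short sketch. To keep it runnable without the library installed, it uses a hypothetical stand-in class, `DocumentSketch`, that only mirrors the documented `(doc_id, text, labels)` arguments; it is not the library's implementation. With Keras2Vec installed, the documented class path suggests `from keras2vec.document import Document` would take its place. The sample texts and labels are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DocumentSketch:
    """Hypothetical stand-in mirroring the documented Document arguments."""
    doc_id: int
    text: str
    labels: List[Union[str, int]] = field(default_factory=list)


# Invented sample data: (text, labels) pairs.
raw_docs = [
    ("The home team won in overtime.", ["news", "sports"]),
    ("Markets closed higher on Friday.", ["news", "finance"]),
    ("A short untitled note.", []),
]

# Number documents from 1 to num_docs, as the doc_id description recommends.
docs = [
    DocumentSketch(doc_id=i, text=text, labels=labels)
    for i, (text, labels) in enumerate(raw_docs, start=1)
]
```

Assigning ids with `enumerate(..., start=1)` keeps them inside the recommended 1 to num_docs range without manual bookkeeping.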
- gen_windows(window_size, pad_word='')
Generate a sliding window of size window_size over the given document.
- Args:
- window_size (int): The size of the window; must be an odd number.
- pad_word (str): The word used to pad indexes that fall beyond the document; defaults to ''.
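A minimal sketch of the sliding-window idea, not the library's own `gen_windows` code. It assumes whitespace tokenization, one window centered on each word, and a list-of-lists return value; none of these details are confirmed by the documentation above. The function name `sliding_windows` is hypothetical. Under the centered-window assumption, an odd `window_size` gives each word the same number of context words on both sides.

```python
from typing import List


def sliding_windows(text: str, window_size: int, pad_word: str = "") -> List[List[str]]:
    """Return one window per word, centered on that word (illustrative only)."""
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd number")

    words = text.split()       # assumption: tokens are whitespace-separated
    half = window_size // 2    # number of context words on each side

    # Pad both ends so windows near the edges keep the full size.
    padded = [pad_word] * half + words + [pad_word] * half

    # The window starting at padded[i] is centered on words[i].
    return [padded[i:i + window_size] for i in range(len(words))]


windows = sliding_windows("the cat sat down", window_size=3, pad_word="<pad>")
# windows[0] == ["<pad>", "the", "cat"]
# windows[3] == ["sat", "down", "<pad>"]
```

Padding keeps every window the same length, which is what a fixed-size model input would typically need.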