Keras2Vec Module

class keras2vec.keras2vec.Keras2Vec(documents, embedding_size=16, seq_size=3, neg_sampling=5, workers=1)

The Keras2Vec class trains the Doc2Vec model. Given a set of documents, it learns the embedding space that best represents them.

Args:
    documents (list of Document): List of documents to vectorize

build_model(infer=False)

Build both the training and inference models for Doc2Vec.

fit(epochs, lr=0.1, verbose=0)

Train Keras2Vec on the provided documents.

Args:
    epochs (int): How many times to iterate over the training dataset
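The constructor, build_model, and fit signatures above suggest a training workflow along the following lines. This is a minimal sketch rather than the library's documented usage: the helper name train_and_embed is hypothetical, the call order (building the model before fitting) is an assumption, and documents must already be a list of Document objects, whose construction is not covered in this section.

```python
def train_and_embed(documents, epochs=10):
    """Hypothetical helper: train Keras2Vec and return the document embeddings.

    `documents` is assumed to be a list of Document objects.
    """
    # Imported lazily so this sketch can be loaded without keras2vec installed.
    from keras2vec.keras2vec import Keras2Vec

    # Keyword values below are the defaults shown in the class signature.
    model = Keras2Vec(documents, embedding_size=16, seq_size=3,
                      neg_sampling=5, workers=1)

    # Assumption: the model has to be built before it can be trained.
    model.build_model()

    # epochs is the number of passes over the training dataset.
    model.fit(epochs, lr=0.1, verbose=0)

    return model.get_doc_embeddings()
```

The epochs default of 10 is arbitrary and chosen only for illustration.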

get_doc_embedding(doc)

Get the vector/embedding for the provided doc.

Args:
    doc (object): Object used in the initial generation of the model

Returns:
    np.array: Embedding for the provided doc

get_doc_embeddings()

Get the document vectors/embeddings from the trained model.

Returns:
    np.array: Array of document embeddings indexed by encoded doc
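Once document embeddings are available, a common next step is to compare them. The sketch below ranks documents by cosine similarity. It uses plain Python lists with made-up values as stand-ins for rows of the array returned by get_doc_embeddings, and the function names cosine_similarity and most_similar are illustrative, not part of Keras2Vec.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(embeddings, query_index):
    """Return the index of the row most similar to embeddings[query_index]."""
    query = embeddings[query_index]
    best_index, best_score = None, -2.0  # cosine similarity is never below -1
    for i, vec in enumerate(embeddings):
        if i == query_index:
            continue  # skip the query document itself
        score = cosine_similarity(query, vec)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Placeholder vectors, not real model output.
example_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
```

The same functions should work on rows of a NumPy array, since they only iterate over the values.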

get_label_embedding(label)

Get the vector/embedding for the provided label.

Args:
    label (object): Object used in the initial generation of the model

Returns:
    np.array: Embedding for the provided label

get_label_embeddings()

Get the label vectors/embeddings from the trained model.

Returns:
    np.array: Array of the label embeddings

get_word_embedding(word)

Get the vector/embedding for the provided word.

Args:
    word (object): Object used in the initial generation of the model

Returns:
    np.array: Embedding for the provided word

get_word_embeddings()

Get the word vectors/embeddings from the trained model.

Returns:
    np.array: Array of word embeddings indexed by encoded word

infer_vector(infer_doc, epochs=5, lr=0.1, init_infer=True, verbose=0)

Infer a document's vector by training the model against unseen labels and text. Currently, the inferred vector is stored in an attribute rather than returned from this function.

Args:
    infer_doc (Document): Document for which we will infer a vector
    epochs (int): Number of training cycles
    lr (float): The learning rate during inference
    init_infer (bool): Whether to reinitialize the weights of the inference layer