Keras2Vec Data Generator

class keras2vec.data_generator.DataGenerator(documents, seq_size, neg_samples, batch_size=100, shuffle=True, val_gen=False)

The DataGenerator class encodes documents and generates training/testing data for a Keras2Vec instance. Currently this object is only used internally by the Keras2Vec class and is not intended for direct use.

Args:
documents (list of Document): List of documents to vectorize

build_vocabs()

Builds the vocabularies for the document IDs, labels, and text of the provided documents

create_encodings()

Builds the encodings for each of the provided data types
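
The two steps above can be illustrated with a minimal sketch. It is not the keras2vec implementation: the Doc tuple, its field names, the build_vocab and create_encoding helpers, and the choice to reserve index 0 for padding are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's Document type; field names are assumed.
Doc = namedtuple("Doc", ["doc_id", "labels", "text"])


def build_vocab(items):
    """Collect the unique items and return them in sorted order."""
    return sorted(set(items))


def create_encoding(vocab, offset=0):
    """Map each vocabulary entry to an integer index, starting at `offset`."""
    return {item: idx + offset for idx, item in enumerate(vocab)}


docs = [
    Doc(0, ["animal"], "the cat sat"),
    Doc(1, ["animal", "pet"], "the dog sat"),
]

# One vocabulary per data type: document ids, labels, and text tokens.
id_vocab = build_vocab(d.doc_id for d in docs)
label_vocab = build_vocab(label for d in docs for label in d.labels)
text_vocab = build_vocab(word for d in docs for word in d.text.split())

# One encoding per vocabulary. Reserving 0 for padding is an assumption.
id_enc = create_encoding(id_vocab)
label_enc = create_encoding(label_vocab)
text_enc = create_encoding(text_vocab, offset=1)

# Encode the text of the first document as a list of integer indexes.
encoded = [text_enc[word] for word in docs[0].text.split()]
```

Keeping a separate encoding per data type means each one can feed its own embedding layer, which is a common design for document-embedding models.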

encode_doc(doc, neg_sampling=False, num_neg_samps=3)

Encodes a document for the keras model

Args:
doc (Document): The document to encode
neg_sampling (Boolean): Whether or not to generate negative samples for the document. NOTE: Currently not implemented

on_epoch_end()

Updates indexes after each epoch
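
Keras-style data generators (for example, ones built on keras.utils.Sequence) commonly rebuild and optionally reshuffle their sample indexes in on_epoch_end(). The sketch below shows that general pattern only. BatchIndexer, its seed parameter, and its methods other than on_epoch_end are hypothetical and are not taken from keras2vec.

```python
import random


class BatchIndexer:
    """Hypothetical helper showing epoch-end index updates; not part of keras2vec."""

    def __init__(self, num_samples, batch_size=100, shuffle=True, seed=None):
        self.num_samples = num_samples
        self.batch_size = batch_size
        self.shuffle = shuffle
        self._rng = random.Random(seed)
        self.indexes = list(range(num_samples))
        self.on_epoch_end()

    def __len__(self):
        # Number of full batches available in one epoch.
        return self.num_samples // self.batch_size

    def batch_indexes(self, i):
        # Sample indexes that make up batch number i.
        start = i * self.batch_size
        return self.indexes[start:start + self.batch_size]

    def on_epoch_end(self):
        # Reset the index list, then reshuffle it if shuffling is enabled.
        self.indexes = list(range(self.num_samples))
        if self.shuffle:
            self._rng.shuffle(self.indexes)


indexer = BatchIndexer(num_samples=10, batch_size=4, seed=0)
first_batch = indexer.batch_indexes(0)
indexer.on_epoch_end()  # called once per epoch by the training loop
```

Reshuffling between epochs changes which samples are grouped into each batch, so the model does not see the same batches in the same order every epoch.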