datasets #
VTL Datasets
This module provides functions to download and load the datasets that ship with VTL. The datasets are loaded through batch-based iterators. The following datasets are available:
- MNIST: a dataset of handwritten digits.
- IMDB: a dataset of IMDB reviews for sentiment analysis.
Constants #
const mnist_test_labels_file = 't10k-labels-idx1-ubyte.gz'
const mnist_test_images_file = 't10k-images-idx3-ubyte.gz'
const mnist_train_labels_file = 'train-labels-idx1-ubyte.gz'
const mnist_train_images_file = 'train-images-idx3-ubyte.gz'
const mnist_base_url = 'https://github.com/golbin/TensorFlow-MNIST/raw/master/mnist/data/'
const imdb_base_url = 'http://ai.stanford.edu/~amaas/data/sentiment/'
const imdb_file_name = 'aclImdb_v1.tar.gz'
fn load_imdb #
fn load_imdb() !ImdbDataset
load_imdb loads the IMDB dataset and returns it as an ImdbDataset, or an error if loading fails.
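A minimal usage sketch, not taken from the module's own documentation. It assumes the module is imported as `vtl.datasets` and that `vtl.Tensor` exposes a public `shape` field. Both are assumptions to check against your installed VTL version. The call presumably downloads the archive named in `imdb_file_name`, so it needs network access.

```v
// Assumption: the module is importable under this path.
import vtl.datasets

fn main() {
	// load_imdb returns a result type (!ImdbDataset), so the error must be handled.
	imdb := datasets.load_imdb() or {
		eprintln('could not load IMDB: ${err}')
		return
	}
	// Assumption: Tensor has a public `shape` field.
	println(imdb.train_features.shape)
	println(imdb.train_labels.shape)
	println(imdb.test_features.shape)
	println(imdb.test_labels.shape)
}
```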
fn load_mnist #
fn load_mnist() !MnistDataset
load_mnist loads the MNIST dataset and returns it as a MnistDataset, or an error if loading fails.
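As a rough sketch of how this function might be called, the example below handles the error case and prints the shapes of the training tensors. The import path `vtl.datasets` and the `shape` field on `vtl.Tensor` are assumptions rather than something this page states. The files are presumably fetched from `mnist_base_url`, so network access is required.

```v
// Assumption: the module is importable under this path.
import vtl.datasets

fn main() {
	// load_mnist returns !MnistDataset; handle the error instead of propagating it.
	mnist := datasets.load_mnist() or {
		eprintln('could not load MNIST: ${err}')
		return
	}
	// Assumption: Tensor has a public `shape` field.
	println(mnist.train_features.shape)
	println(mnist.train_labels.shape)
}
```

Features and labels are both stored as `u8` tensors, so pixel values will likely need converting to a floating-point type before training.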
struct ImdbDataset #
struct ImdbDataset {
pub:
	train_features &vtl.Tensor[string] = unsafe { nil }
	train_labels   &vtl.Tensor[int]    = unsafe { nil }
	test_features  &vtl.Tensor[string] = unsafe { nil }
	test_labels    &vtl.Tensor[int]    = unsafe { nil }
}
ImdbDataset holds the IMDB sentiment analysis dataset: train and test features as string tensors, and train and test labels as int tensors.
struct MnistDataset #
struct MnistDataset {
pub:
	train_features &vtl.Tensor[u8] = unsafe { nil }
	train_labels   &vtl.Tensor[u8] = unsafe { nil }
	test_features  &vtl.Tensor[u8] = unsafe { nil }
	test_labels    &vtl.Tensor[u8] = unsafe { nil }
}
MnistDataset holds the MNIST handwritten digits dataset: train and test features and labels, all stored as u8 tensors.