UNIPEN with Dynamic Time Warping
(5323 queries, 10630 database objects)
bm_datasets/unipen/
MNIST database with shape context matching
(10000 queries, 60000 database objects)
bm_datasets/60sc/
MNIST database with shape context matching
(10000 queries, 20000 database objects)
bm_datasets/sc/
Time series with constrained Dynamic Time Warping, original dataset
(50 queries, 32768 database objects)
bm_datasets/ats/
Time series with constrained Dynamic Time Warping, scrambled dataset
(1000 queries, 31818 database objects)
bm_datasets/2ts/
ASL handshape dataset with the chamfer distance
(710 queries, 80640 database objects)
bm_datasets/hands
IMPORTANT: You do not need to download every file in the above
directories. In each directory, the files that you need to download
are:
- testtrain_distances.bin (distances from each test object (i.e.,
query object) to each database object)
- traintrain_distances.bin (distances from each database object to each
database object)
- test_labels.bin (class labels for the test objects)
- training_labels.bin (class labels for the training objects)
Each distance file has this format (a Python reading sketch follows the list):
- four 32-bit integers (ignore the first and fourth; the second is the # of
  rows in the distance matrix, the third is the # of columns).
- rows * cols 32-bit floating point numbers, one for each distance, saved
  row-by-row.
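For example, a distance matrix can be loaded as follows. This is a minimal
sketch in Python with NumPy, not part of the dataset distribution; it assumes
little-endian byte order and that the whole matrix fits in memory.

    import numpy as np

    def read_distance_matrix(path):
        # Header: four 32-bit integers; only rows (2nd) and cols (3rd) are used.
        with open(path, "rb") as f:
            header = np.fromfile(f, dtype=np.int32, count=4)
            rows, cols = int(header[1]), int(header[2])
            # Distances: rows * cols 32-bit floats, stored row-by-row.
            data = np.fromfile(f, dtype=np.float32, count=rows * cols)
        return data.reshape(rows, cols)

    # e.g. dists = read_distance_matrix("bm_datasets/sc/testtrain_distances.bin")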
Each class label file has this format (see the sketch below):
- four 32-bit integers (ignore the first and fourth; the second is the # of
  rows in the matrix, which is 1, and the third is the # of columns).
- rows * cols 32-bit floating point numbers; the i-th number is the class
  label of the i-th object.
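The label files use the same header layout, so they can be read the same way.
Another minimal sketch under the same assumptions (the labels are stored as
floats but represent integer class labels):

    import numpy as np

    def read_labels(path):
        with open(path, "rb") as f:
            header = np.fromfile(f, dtype=np.int32, count=4)
            rows, cols = int(header[1]), int(header[2])  # rows is 1
            labels = np.fromfile(f, dtype=np.float32, count=rows * cols)
        # The i-th entry is the class label of the i-th object.
        return labels.astype(np.int32)

    # e.g. test_labels = read_labels("bm_datasets/sc/test_labels.bin")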
If a file (testtrain_distances.bin or traintrain_distances.bin) was over 2GB, I had to split it into chunks under 2GB; that is why you see filenames like "split_testtrainaa", etc. In that case, you have to merge those chunks yourself to recreate the original file. The files were split using the Linux command "split".
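Since "split" names its chunks so that they sort alphabetically in the
original order, concatenating them in sorted order recreates the file. A
minimal sketch in Python; the "split_testtrain*" pattern is an assumption, so
check the actual chunk names in each directory:

    import glob

    def merge_chunks(pattern, out_path):
        # Chunks produced by "split" (...aa, ...ab, ...) sort in the right order.
        with open(out_path, "wb") as out:
            for chunk in sorted(glob.glob(pattern)):
                with open(chunk, "rb") as f:
                    out.write(f.read())

    # e.g. merge_chunks("bm_datasets/hands/split_testtrain*",
    #                   "testtrain_distances.bin")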
Please do not hesitate to e-mail me ([my last name] AT uta.edu) with any problems or questions, to let me know about results you have obtained on these datasets, or to suggest additional datasets that may be of interest.