14.2 Datasets

  1. COCO Dataset (Common Objects in Context):

    • Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). “Microsoft COCO: Common Objects in Context.” European Conference on Computer Vision (ECCV).

    • A widely used dataset of annotated everyday scenes for object detection, segmentation, and captioning, supporting image recognition and generation tasks.

  2. LibriSpeech Dataset:

    • Panayotov, V., Chen, G., Povey, D., & Khudanpur, S. (2015). “Librispeech: An ASR corpus based on public domain audio books.” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

    • Used for training and fine-tuning the Voice Agent’s speech recognition models.

  3. OpenWebText Corpus:

    • An open-source reproduction of OpenAI’s WebText dataset, used for training language models in conversational AI.

  4. ImageNet:

    • Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). “ImageNet: A Large-Scale Hierarchical Image Database.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

    • A foundational dataset for training visual recognition and generation models.
