
          The EMNIST Dataset

          What is it?

          The EMNIST dataset is a set of handwritten characters (letters and digits) derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset. Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v1.
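
          As a rough illustration of the 28x28 pixel format, the sketch below splits a flat buffer of 784 grayscale bytes into 28 rows of 28 values and renders it as text. The names `to_rows` and `render` are hypothetical helpers, and row-major storage is an assumption taken from the MNIST layout; it is worth viewing a sample to confirm the orientation of the characters.

```python
SIDE = 28  # image width and height in pixels, per the description above


def to_rows(pixels: bytes, side: int = SIDE) -> list:
    """Split a flat buffer of side*side bytes into a list of rows.

    Assumes row-major storage (an assumption borrowed from MNIST).
    """
    if len(pixels) != side * side:
        raise ValueError(f"expected {side * side} bytes, got {len(pixels)}")
    return [list(pixels[r * side:(r + 1) * side]) for r in range(side)]


def render(rows: list, threshold: int = 128) -> str:
    """Draw an image as text: '#' for bright pixels, '.' for dark ones."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in rows
    )
```

          Calling `print(render(to_rows(buffer)))` on one image's bytes gives a quick visual check without any plotting library.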


          The dataset is provided in two file formats. Both versions of the dataset contain identical information, and are provided entirely for the sake of convenience. The first dataset is provided in a Matlab format that is accessible through both Matlab and Python (using the scipy.io.loadmat function). The second version of the dataset is provided in the same binary format as the original MNIST dataset as outlined in http://yann.lecun.com/exdb/mnist/
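
          For the binary version, a minimal reader can be sketched with the standard library alone. It assumes the MNIST IDX layout: two zero bytes, a type code (0x08 for unsigned bytes), a dimension count, then one big-endian 32-bit size per dimension followed by the raw data. The function names `parse_idx` and `read_idx` are hypothetical, and the gzip handling assumes that compressed files carry a `.gz` suffix.

```python
import gzip
import struct


def parse_idx(data: bytes):
    """Parse IDX-formatted bytes into (dims, payload)."""
    # Header: two zero bytes, a type code, and the number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("expected an unsigned-byte IDX file")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    payload = data[4 + 4 * ndim:]
    expected = 1
    for size in dims:
        expected *= size
    if len(payload) != expected:
        raise ValueError("payload size does not match the header")
    return dims, payload


def read_idx(path):
    """Read an IDX file from disk, transparently handling .gz files."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        return parse_idx(handle.read())
```

          An image file should yield dimensions of the form `(count, 28, 28)` and a label file `(count,)`; image `i` then occupies bytes `i * 784` to `(i + 1) * 784` of the payload.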

          Dataset Summary

          There are six different splits provided in this dataset. A short summary of the dataset is provided below:

          • EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
          • EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
          • EMNIST Balanced: 131,600 characters. 47 balanced classes.
          • EMNIST Letters: 145,600 characters. 26 balanced classes.
          • EMNIST Digits: 280,000 characters. 10 balanced classes.
          • EMNIST MNIST: 70,000 characters. 10 balanced classes.

          The full complement of the NIST Special Database 19 is available in the ByClass and ByMerge splits. The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. The EMNIST Letters dataset merges a balanced set of the uppercase and lowercase letters into a single 26-class task. The EMNIST Digits and EMNIST MNIST datasets provide balanced handwritten digit datasets directly compatible with the original MNIST dataset.
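
          Because the balanced splits hold an equal number of samples per class, the per-class count follows from the totals in the summary. The sketch below encodes the six splits and performs that division; the `SPLITS` table and `samples_per_class` helper are illustrative names, and the counts are the overall totals listed above, not a training or test subset.

```python
# Split sizes and class counts as listed in the dataset summary.
SPLITS = {
    "ByClass":  {"samples": 814_255, "classes": 62, "balanced": False},
    "ByMerge":  {"samples": 814_255, "classes": 47, "balanced": False},
    "Balanced": {"samples": 131_600, "classes": 47, "balanced": True},
    "Letters":  {"samples": 145_600, "classes": 26, "balanced": True},
    "Digits":   {"samples": 280_000, "classes": 10, "balanced": True},
    "MNIST":    {"samples": 70_000,  "classes": 10, "balanced": True},
}


def samples_per_class(name: str) -> int:
    """Return the number of samples per class for a balanced split."""
    split = SPLITS[name]
    if not split["balanced"]:
        raise ValueError(f"{name} is unbalanced; class sizes vary")
    per_class, remainder = divmod(split["samples"], split["classes"])
    if remainder:
        raise ValueError(f"{name} does not divide evenly across classes")
    return per_class


for name, split in SPLITS.items():
    if split["balanced"]:
        print(f"EMNIST {name}: {samples_per_class(name):,} samples per class")
```

          The division works out to 2,800 samples per class for Balanced, 5,600 for Letters, 28,000 for Digits, and 7,000 for MNIST.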

          Please refer to the EMNIST paper [PDF, BIB] for further details of the dataset structure.

          How to cite

          Please cite the following paper when using or referencing the dataset:

          Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373


          Gregory Cohen, Saeed Afshar, Jonathan Tapson, and Andre van Schaik
          The MARCS Institute for Brain, Behaviour and Development
          Western Sydney University
          Penrith, Australia 2751

          Email: emnist [at] nist.gov

          Where to download?


          Created April 4, 2017, Updated March 28, 2019