---
title: dataloader
keywords: fastai
sidebar: home_sidebar
summary: "Now deprecated, please use datagenerator instead."
description: "Now deprecated, please use datagenerator instead."
nb_path: "nbs/04_dataloader.ipynb"
---
{% raw %}

get_basename[source]

get_basename(path:<dtype: 'string'>)

{% endraw %} {% raw %}

show_batch[source]

show_batch(clf, limit:int, figsize:tuple=(10, 10))

Visualize images and their labels.

https://www.tensorflow.org/tutorials/load_data/images#load_using_keraspreprocessing

Args:
    data: tf.data.Dataset containing (image, label) pairs
    limit: number of images to display
    figsize: size of the visualization

Returns:
    Displays the images and labels

{% endraw %}

CLF DLoader

DataLoader class for loading datasets for image classification tasks.

clf.load_from_folder (TODOs)

  1. __len__ method impl
  2. image augmentation impl

folder structure

/root
    /class1_folder
        /img0.jpg img1.jpg img2.jpg ....
    /class2_folder
        /img0.jpg...
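
The class-per-folder layout above can be walked with the standard library alone. The following is a minimal sketch, not the library's implementation: the helper name `list_labeled_images` and the `IMAGE_SUFFIXES` set are hypothetical, and it returns plain Python values rather than a tf.data.Dataset.

```python
import pathlib
from typing import List, Tuple

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root) -> Tuple[Tuple[str, ...], List[Tuple[str, int]]]:
    """Return class names and (image path, class index) pairs.

    Hypothetical helper: every sub-directory of `root` is treated as one class.
    """
    root = pathlib.Path(root)
    # Sorting keeps the class index of each folder stable across runs.
    class_names = tuple(sorted(p.name for p in root.iterdir() if p.is_dir()))
    samples = []
    for index, name in enumerate(class_names):
        for image_path in sorted((root / name).iterdir()):
            # Skip non-image files such as .DS_Store.
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((str(image_path), index))
    return class_names, samples
```

Calling `list_labeled_images("/root")` on the layout shown would give the two folder names as classes and one `(path, index)` pair per image. The pairs could then be fed to a tf.data pipeline that decodes each file.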
{% raw %}

class Clf[source]

Clf()

{% endraw %}

Detect

from_xml

Steps:

  1. list annotations
  2. read and parse annotations
  3. read images
  4. return images and annotations

folder structure

/root
    /image_folder
    /annotation_folder
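
The steps above can be sketched with the standard library. This is an illustrative sketch under stated assumptions, not the library's `from_xml`: it assumes Pascal VOC-style annotation files (`filename`, `object`, `name`, `bndbox` tags), which the source does not confirm, and the helper names `parse_annotation` and `load_annotations` are hypothetical. Image decoding is left to the caller, so step 3 only resolves the image path.

```python
import pathlib
import xml.etree.ElementTree as ET
from typing import Dict, List


def parse_annotation(xml_path) -> Dict:
    """Parse one annotation file, assuming Pascal VOC-style tags."""
    root = ET.parse(str(xml_path)).getroot()
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            # Box order used in this sketch: xmin, ymin, xmax, ymax.
            "box": tuple(int(float(box.findtext(tag)))
                         for tag in ("xmin", "ymin", "xmax", "ymax")),
        })
    return {"filename": root.findtext("filename"), "objects": objects}


def load_annotations(root_dir, image_dir="image_folder",
                     annotation_dir="annotation_folder") -> List[Dict]:
    """List and parse annotations, pairing each one with its image path."""
    root_dir = pathlib.Path(root_dir)
    records = []
    # Step 1: list annotations (sorted for a deterministic order).
    for xml_path in sorted((root_dir / annotation_dir).glob("*.xml")):
        # Step 2: read and parse the annotation.
        record = parse_annotation(xml_path)
        # Step 3: resolve the image path; decoding pixels is left to the caller.
        record["image_path"] = str(root_dir / image_dir / record["filename"])
        records.append(record)
    # Step 4: return images (as paths) together with their annotations.
    return records
```

Each returned record holds the image path and a list of labelled boxes, which is enough to build a tf.data pipeline on top of it.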
{% raw %}
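import pathlib
from typing import Union

import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE


class Detect(object):

    def __init__(self):
        self.CLASS_NAMES = None

    def from_xml(self, path: Union[str, pathlib.Path]):
        """Load dataset from given path.
        Args:
            path: str or pathlib.Path, path of the folder containing the dataset.
        Returns:
            tf.data.Dataset of (image, label) pairs, prefetched with tf.data.AUTOTUNE.
        """
        assert isinstance(path, (str, pathlib.Path))
        path = pathlib.Path(path)
        # remove_dsstore, _get_image_list and _process_path are defined
        # elsewhere and are not shown in this section.
        remove_dsstore(path)

        list_folders = tf.data.Dataset.list_files(str(path / '*'))
        list_images = self._get_image_list(str(path))

        # Class names are the basenames of the entries directly under path.
        self.CLASS_NAMES = tuple(get_basename(e).numpy() for e in list_folders)

        # Process the listed image paths in parallel, then prefetch.
        data = list_images.map(self._process_path, num_parallel_calls=AUTOTUNE)
        data = data.prefetch(AUTOTUNE)
        return data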
{% endraw %}