--- title: chitra keywords: fastai sidebar: home_sidebar summary: "
" description: "
" nb_path: "nbs/index.ipynb" ---
chitra (चित्र) is a Deep Learning Computer Vision library for easy data loading, model building and model interpretation with GradCAM/GradCAM++.
Highlights:
chitra.trainer
module.If you have more use cases please raise an issue with the feature you want.
Chitra dataloader
and datagenerator
modules for loading data. dataloader
is a minimal dataloader that returns tf.data.Dataset
object. datagenerator
provides flexibility to users on how they want to load and manipulate the data.
import numpy as np
import tensorflow as tf
import chitra
from chitra.dataloader import Clf, show_batch
import matplotlib.pyplot as plt
clf_dl = Clf()
data = clf_dl.from_folder(cat_dog_path, target_shape=(224, 224))
clf_dl.show_batch(8, figsize=(8,8))
for e in data.take(1):
image = e[0].numpy().astype('uint8')
label = e[1].numpy()
plt.imshow(image)
plt.show()
Dataset class provides the flexibility to load image dataset by updating components of the class.
Components of Dataset class are:
These components can be updated with custom function by the user according to their dataset structure. For example the Tiny Imagenet dataset is organized as-
train_folder/
.....folder1/
.....file.txt
.....folder2/
.....image1.jpg
.....image2.jpg
.
.
.
......imageN.jpg
The inbuilt file generator search for images on the folder1
, now we can just update the image file generator
and rest of the functionality will remain same.
Dataset also support progressive resizing of images.
from chitra.datagenerator import Dataset
from glob import glob
ds = Dataset(data_path)
# it will load the folders and NOT images
ds.filenames[:3]
def load_files(path):
return glob(f'{path}/*/images/*')
def get_label(path):
return path.split('/')[-3]
ds.update_component('get_filenames', load_files)
ds.filenames[:3]
It is the technique to sequentially resize all the images while training the CNNs on smaller to bigger image sizes. Progressive Resizing is described briefly in his terrific fastai course, “Practical Deep Learning for Coders”. A great way to use this technique is to train a model with smaller image size say 64x64, then use the weights of this model to train another model on images of size 128x128 and so on. Each larger-scale model incorporates the previous smaller-scale model layers and weights in its architecture. ~KDnuggets
image_sz_list = [(28, 28), (32, 32), (64, 64)]
ds = Dataset(data_path, image_size=image_sz_list)
ds.update_component('get_filenames', load_files)
ds.update_component('get_label', get_label)
print()
# first call to generator
for img, label in ds.generator():
print('first call to generator:', img.shape)
break
# seconds call to generator
for img, label in ds.generator():
print('seconds call to generator:', img.shape)
break
# third call to generator
for img, label in ds.generator():
print('third call to generator:', img.shape)
break
image_sz_list = [(28, 28), (32, 32), (64, 64)]
ds = Dataset(data_path, image_size=image_sz_list)
ds.update_component('get_filenames', load_files)
ds.update_component('get_label', get_label)
dl = ds.get_tf_dataset()
for e in dl.take(1):
print(e[0].shape)
for e in dl.take(1):
print(e[0].shape)
for e in dl.take(1):
print(e[0].shape)
The Trainer class inherits from tf.keras.Model
, it contains everything that is required for training.
It exposes trainer.cyclic_fit method which trains the model using Cyclic Learning rate discovered by Leslie Smith.
from chitra.trainer import Trainer, create_cnn
from chitra.datagenerator import Dataset
from PIL import Image
ds = Dataset(cat_dog_path, image_size=(224,224))
model = create_cnn('mobilenetv2', num_classes=2, name='Cat_Dog_Model')
trainer = Trainer(ds, model)
# trainer.summary()
trainer.compile2(batch_size=8,
optimizer=tf.keras.optimizers.SGD(1e-3, momentum=0.9, nesterov=True),
lr_range=(1e-6, 1e-3),
loss='binary_crossentropy',
metrics=['binary_accuracy'])
trainer.cyclic_fit(epochs=5,
batch_size=8,
lr_range=(0.00001, 0.0001),
)
It is important to understand what is going inside the model. Techniques like GradCam and Saliency Maps can visualize what the Network is learning. trainer
module has InterpretModel class which creates GradCam and GradCam++ visualization with almost no additional code.
from chitra.trainer import InterpretModel
trainer = Trainer(ds, create_cnn('mobilenetv2', num_classes=1000, keras_applications=False))
model_interpret = InterpretModel(True, trainer)
image = ds[1][0].numpy().astype('uint8')
image = Image.fromarray(image)
model_interpret(image)
print(IMAGENET_LABELS[285])
from chitra.visualization import draw_annotations
labels = np.array([label])
bbox = np.array([[30, 50, 170, 190]])
label_to_name = lambda x: 'Cat' if x==0 else 'Dog'
draw_annotations(image, ({'bboxes': bbox, 'labels':labels,}), label_to_name=label_to_name)
plt.imshow(image)
plt.show()
from chitra.utils import limit_gpu, gpu_dynamic_mem_growth
# limit the amount of GPU required for your training
limit_gpu(gpu_id=0, memory_limit=1024*2)
gpu_dynamic_mem_growth()
Contributions of any kind are welcome. Please check the Contributing Guidelines before contributing.