Concept Of CGANs: Code Implementation Using TensorFlow And Keras


Conditional Generative Adversarial Networks (CGANs) have reshaped machine learning by enabling the generation of highly realistic and contextually relevant data. Because CGANs can generate data samples with specific attributes or characteristics, they are well suited to applications such as image synthesis, data augmentation, and even generating novel music and text.

In this article, we will delve into the fundamentals of CGANs, explore their architecture, and learn how to implement CGANs code using TensorFlow and Keras, the popular machine learning libraries.

Also Read: Create Machine Learning Model with TensorFlow

Understanding The Concept of CGANs

Traditional GANs are made up of two neural networks: a generator and a discriminator. The generator aims to produce data that is indistinguishable from real data, while the discriminator tries to differentiate between real and fake data. During training, the generator and discriminator engage in a game of cat and mouse, with the generator continually improving its ability to generate realistic data.

Conditional GANs extend this concept by introducing conditional information, typically in the form of class labels, into the training process. This allows for the generation of data samples with specific attributes. For example, in the case of generating images of handwritten digits (like the MNIST dataset), a conditional label might specify which digit (0-9) the generator should create.
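
To make this concrete, here is a minimal sketch (plain NumPy, independent of the full implementation below) of how a conditional label is typically attached to the generator's input: the noise vector and the one-hot label are simply concatenated.

# Illustrative only: conditioning = concatenating noise with a one-hot label
import numpy as np

latent_dim, num_classes = 128, 10

noise = np.random.normal(size=(1, latent_dim)).astype("float32")
label = np.zeros((1, num_classes), dtype="float32")
label[0, 7] = 1.0   # ask the generator for the digit "7"

generator_input = np.concatenate([noise, label], axis=1)
print(generator_input.shape)   # (1, 138)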

Applications of cGANs

Conditional GANs (cGANs) are a type of GAN that generates new data conditioned on some additional information. This means we can use cGANs to generate data with specific properties or characteristics. Some applications of cGANs include:

  • Image-To-Image Translation

    Using cGANs, we can translate images from one domain to another, such as converting black-and-white images to color or transferring an image from one style to another.

  • Text-To-Image Synthesis

    cGANs can be used to generate images from text descriptions. For example, a cGAN could be used to generate an image of a cat from the text description "a black cat sitting on a red couch."

  • Video Generation

    Using cGANs, we can generate realistic videos. For example, a cGAN can generate a video of a person walking down the street, even if it has never seen that exact scene before.

  • Convolutional Face Generation

    With cGANs, we can generate realistic faces with specific attributes, such as gender, age, and race.

  • Medical Image Synthesis

    cGANs can synthesize medical images such as MRI and CT scans. This is useful for training medical imaging models and for generating new medical images for research purposes.

In addition to these specific applications, cGANs can be used for a wide variety of tasks that involve generating new data with specific properties.

CGANs Code Implementation Using TensorFlow and Keras

Let’s take a closer look at the Python code that demonstrates how to build and train a cGAN using TensorFlow and Keras.

Also Read: How To Train Image Captioning Model With TensorFlow

1. Import Libraries and Define Hyperparameters

Import the necessary libraries and define hyperparameters for cGAN.

# Import necessary libraries
!pip install -q git+https://github.com/tensorflow/docs

from tensorflow_docs.vis import embed
import tensorflow as tf
from tensorflow import keras
import numpy as np
import imageio

# Define hyperparameters
batch_size = 64      # number of samples per training batch
num_classes = 10     # MNIST digits 0-9
num_channels = 1     # grayscale images
latent_dim = 128     # size of the random noise vector
image_size = 28      # MNIST images are 28x28

In the above code example, we install the TensorFlow documentation utilities (tensorflow_docs) and import the required libraries. We then define hyperparameters such as batch_size, num_classes, num_channels, latent_dim, and image_size, which are used throughout the code.

2. Load and Preprocess the MNIST Dataset

In this section, we load the MNIST dataset, preprocess the images, and create a TensorFlow dataset.

# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
images = np.concatenate([x_train, x_test], axis=0)
images = images.astype("float32") / 255.0
images = np.reshape(images, [-1, 28, 28, 1])
labels = np.concatenate([y_train, y_test], axis=0)
labels = keras.utils.to_categorical(labels, 10)

# Create a TensorFlow dataset
dataset = tf.data.Dataset.from_tensor_slices((images, labels))
dataset = dataset.shuffle(buffer_size=1024).batch(batch_size)

In the given code, we load the MNIST dataset using Keras and combine the training and test sets. The images are scaled to the range [0, 1] and reshaped to include a single channel dimension. The class labels are one-hot encoded using Keras's to_categorical() function. Finally, a TensorFlow dataset is created, shuffled, and batched for training.
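
As an optional sanity check (not part of the original walkthrough), you can pull one batch from the dataset and confirm the shapes before moving on:

# Inspect one batch to confirm the expected shapes
for image_batch, label_batch in dataset.take(1):
    print(image_batch.shape)   # (64, 28, 28, 1)
    print(label_batch.shape)   # (64, 10)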

3. Define Generator and Discriminator Architectures

In this section, we define the architectures of the generator and discriminator networks. The generator should be able to produce images from the latent space, conditioned on the label information, and the discriminator should be able to distinguish between real and generated images given that same conditioning. Both architectures can be customized to suit your specific application.

# Define generator and discriminator architectures
generator_in_channels = latent_dim + num_classes
discriminator_in_channels = num_channels + num_classes

generator = keras.models.Sequential([
    keras.layers.InputLayer((generator_in_channels,)),
    
    keras.layers.Dense(7*7*generator_in_channels),
    keras.layers.LeakyReLU(alpha=0.2),
    keras.layers.Reshape((7,7,generator_in_channels)),
    
    keras.layers.Conv2DTranspose(128, (4,4), (2,2), padding='same'),
    keras.layers.LeakyReLU(alpha=0.2),
    
    keras.layers.Conv2DTranspose(128, (4,4), (2,2), padding='same'),
    keras.layers.LeakyReLU(alpha=0.2),
    
    keras.layers.Conv2D(1, (7,7), padding='same', activation='sigmoid')
], name='generator')

discriminator = keras.models.Sequential([
    keras.layers.InputLayer((28, 28, discriminator_in_channels)),
    
    keras.layers.Conv2D(64, (3,3), strides=(2,2), padding='same'),
    keras.layers.Conv2D(128, (3,3), strides=(2,2), padding='same'),
    
    keras.layers.LeakyReLU(alpha=0.2),
    
    keras.layers.GlobalMaxPooling2D(),
    
    keras.layers.Dense(1)
], name='discriminator')

generator_in_channels and discriminator_in_channels specify the number of input channels for the generator and discriminator networks, respectively: the generator receives the latent vector concatenated with the one-hot label, and the discriminator receives the image channels concatenated with image-shaped label maps. Both networks are defined as Keras Sequential models, which stack layers one after another; you can add or swap layers as needed for your specific task.
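
Before wiring the two networks together, it is worth confirming their output shapes. The following check is optional and not part of the original code, but summary() makes shape mismatches easy to spot:

# Optional: verify the architectures
generator.summary()        # final layer should output (None, 28, 28, 1)
discriminator.summary()    # final layer should output (None, 1)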

4. Define Conditional GAN (cGAN) Model

In this block, we define the cGAN model as a custom Keras model. This means we create a subclass of the tf.keras.Model class and implement the __init__(), compile(), and train_step() methods.

# Define the Conditional GAN (cGAN) model
class ConditionalGAN(keras.models.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim
        self.gen_loss_tracker = keras.metrics.Mean(name="generator_loss")
        self.disc_loss_tracker = keras.metrics.Mean(name="discriminator_loss")
    
    @property
    def metrics(self):
        return [self.gen_loss_tracker, self.disc_loss_tracker]
    
    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super().compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn
    
    def train_step(self, data):
        real_images, one_hot_labels = data      # one_hot_labels shape = (batch_size, 10)
        # preprocessing one_hot_labels so that they can be concatenated with images
        image_one_hot_labels = one_hot_labels[:, :, None, None]
        image_one_hot_labels = tf.repeat(image_one_hot_labels, repeats=[image_size * image_size])
        image_one_hot_labels = tf.reshape(
            image_one_hot_labels, (-1, image_size, image_size, num_classes)   # now labels shape = (batch-size, 28, 28, 10)
        )
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        random_vector_labels = tf.concat([random_latent_vectors, one_hot_labels], axis=1)
        generated_images = self.generator(random_vector_labels)
        fake_image_and_labels = tf.concat([generated_images, image_one_hot_labels], -1)
        real_image_and_labels = tf.concat([real_images, image_one_hot_labels], -1)
        combined_images = tf.concat(
            [fake_image_and_labels, real_image_and_labels], axis=0
        )
        # Assemble labels that mark fake images 1 and real images 0
        labels = tf.concat(
            [tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))], axis=0
        )
        
        # Train the discriminator on the combined batch of fake and real images
        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(zip(grads, self.discriminator.trainable_weights))
        
        # Sample fresh latent vectors for the generator update
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        random_vector_labels = tf.concat(
            [random_latent_vectors, one_hot_labels], axis=1
        )
        # Labels claiming the fake images are real (real = 0 in this setup)
        misleading_labels = tf.zeros((batch_size, 1))
        
        # Train the generator (the discriminator's weights are not updated here)
        with tf.GradientTape() as tape:
            fake_images = self.generator(random_vector_labels)
            fake_image_and_labels = tf.concat([fake_images, image_one_hot_labels], -1)
            predictions = self.discriminator(fake_image_and_labels)
            g_loss = self.loss_fn(misleading_labels, predictions)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))

        # Monitor loss.
        self.gen_loss_tracker.update_state(g_loss)
        self.disc_loss_tracker.update_state(d_loss)
        return {
            "g_loss": self.gen_loss_tracker.result(),
            "d_loss": self.disc_loss_tracker.result(),
        }
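
The least obvious part of train_step() is the label preprocessing. The toy example below (illustrative only, using a batch of two labels) traces the shapes produced by the repeat-and-reshape trick, which turns each one-hot label into an image-sized tensor that can be concatenated with a 28x28 image along the channel axis:

# Illustrative shape trace of the label broadcasting in train_step()
one_hot = tf.one_hot([3, 7], depth=10)        # (2, 10)
maps = one_hot[:, :, None, None]              # (2, 10, 1, 1)
maps = tf.repeat(maps, repeats=[28 * 28])     # flattened to (2 * 10 * 784,)
maps = tf.reshape(maps, (-1, 28, 28, 10))     # (2, 28, 28, 10)
print(maps.shape)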

5. Compile the cGAN Model

To compile a cGAN model with optimizers and a loss function, we can use the following code:

# Compile the cGAN model
conditional_gan = ConditionalGAN(discriminator, generator, latent_dim)
conditional_gan.compile(
    d_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    g_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    loss_fn=keras.losses.BinaryCrossentropy(from_logits=True),
)

With this code, we create an instance of the ConditionalGAN model and compile it with separate Adam optimizers for the discriminator and generator, along with a binary cross-entropy loss function (from_logits=True, since the discriminator's final Dense layer has no activation).

6. Train the cGAN

Use the given code to train the cGAN on the dataset for a specified number of epochs.

# Train the cGAN
conditional_gan.fit(dataset, epochs=50)

The cGAN is trained using the fit() method of the model. You can adjust the number of training epochs as needed.
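
GAN training can be unstable, so you may also want to snapshot the generator as training progresses. The original code does not do this; one possible approach (the callback and file name below are just an example) is a small custom Keras callback:

# Optional: save the generator's weights every 10 epochs (illustrative)
class SaveGenerator(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % 10 == 0:
            self.model.generator.save_weights(f"generator_epoch_{epoch + 1}.weights.h5")

conditional_gan.fit(dataset, epochs=50, callbacks=[SaveGenerator()])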

7. Generate and Display Fake Images

In this final block, we generate a fake image by providing random noise and a conditional label, then visualize the result using Matplotlib.

# Generate a fake image
interpolation_noise = tf.random.normal(shape=(1, latent_dim))
category_data = keras.utils.to_categorical([4], num_classes)
noise_and_labels = tf.concat([interpolation_noise, category_data], 1)
fake = conditional_gan.generator.predict(noise_and_labels)

# Display the generated image
import matplotlib.pyplot as plt
plt.imshow(fake[0, :, :, 0], cmap='gray')   # drop the channel axis for imshow
plt.axis('off')
plt.show()

As shown in the given example, the code samples random noise, builds a one-hot conditional label for the digit 4, and passes their concatenation to the generator to create a fake image. The generated image is then displayed using Matplotlib.
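
To check that the conditioning actually works, a useful follow-up (an optional extension, not in the original example) is to generate one digit per class and plot them side by side:

# Optional: generate one image for each of the 10 digit classes
noise = tf.random.normal(shape=(num_classes, latent_dim))
class_labels = keras.utils.to_categorical(range(num_classes), num_classes)
inputs = tf.concat([noise, class_labels.astype("float32")], axis=1)
generated = conditional_gan.generator.predict(inputs)

fig, axes = plt.subplots(1, num_classes, figsize=(15, 2))
for digit, ax in enumerate(axes):
    ax.imshow(generated[digit, :, :, 0], cmap='gray')
    ax.set_title(str(digit))
    ax.axis('off')
plt.show()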

How cGANs Are Used In The Real World

  • Nvidia

    Nvidia uses cGANs to generate realistic images of synthetic worlds for training self-driving cars.

  • Adobe

    Adobe uses cGANs to develop new image editing tools, such as a tool that can remove unwanted objects from photos.

  • Google

    Google uses cGANs to develop new translation tools that can translate text from one language to another while preserving the style of the original text.

  • Facebook

    Facebook uses cGANs to develop new ways to generate personalized content for users, such as personalized news feeds and photo albums.

  • Fashion

    In the fashion industry, cGANs are used to generate new fashion designs and to create realistic images of clothing for online stores.

  • Also Read: GANs For Fashion: How To Use GANs To Generate Fashion Images In Python And TensorFlow
  • Gaming

    In the gaming industry, cGANs are used to generate realistic game environments and characters.

  • Architecture

    Architects can use cGANs to generate new architectural designs and create realistic visualizations of buildings.

These are just a few real-world examples of cGANs, which are a powerful tool for generating new data across a wide variety of applications. As cGAN technology continues to develop, we can expect even more innovative and creative uses in the future.

Conclusion

Conditional GANs are a powerful extension of GANs that enable the generation of data samples with specific attributes. They find applications in image-to-image translation, style transfer, and much more. By understanding the concepts and exploring the code provided in this blog post, you can start experimenting with cGANs and apply them to various creative tasks in deep learning.

You can adapt the given code to your requirements. If you need any help related to GANs, contact CodeTrade, a leading AI and ML software development company in India. Our highly experienced AI and ML developers are happy to help you with your AI/ML project. Contact us now for a free AI and ML consultation.
