IMAGE TRANSFORMATION TASK ON THE DANSAEKHWA DATASETS
Technological advances have brought tremendous changes not only to everyday life but also to the domain of art. As technology progresses, it keeps supplying new tools and applications for the arts and design disciplines. Among these many applications, one of the trending topics is image transformation, which covers tasks such as image-to-image translation and style transfer.
Style transfer conventionally refers to the technique of re-rendering an image in the style of another image. With the introduction and development of neural stylization methods, it is now possible to render input images effectively and artistically in nearly any desired style. The technique has allowed people with no prior knowledge of art or coding to produce the artistic images they want to create. Various applications of style transfer are now available, ranging from CycleGAN, a variant of GANs (Generative Adversarial Networks), to Prisma, a popular photo-editing app.
Closely related to neural style transfer, pix2pix is a classic example of an image-to-image translation network. It uses a conditional generative adversarial network (cGAN) to learn a mapping from an input image to an output image. Because the discriminator is trained together with the generator, the network in effect also learns the loss function used to train the mapping. The same generic approach can therefore be applied to translation tasks that traditionally required separate, hand-designed loss formulations.
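To make the idea of a learned loss more concrete, the following is a minimal sketch of the objective described by Isola et al., written in plain Python for readability. The function names, the EPS constant, and the use of flat lists in place of image tensors are assumptions made for illustration and are not taken from the project's code; the L1 weight of 100 is the value reported in the paper.

```python
import math

EPS = 1e-12  # small constant that guards against log(0); an illustrative choice


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real pairs should score near 1, generated pairs near 0.

    d_real and d_fake are discriminator probabilities in [0, 1].
    """
    return mean([
        -(math.log(r + EPS) + math.log(1.0 - f + EPS))
        for r, f in zip(d_real, d_fake)
    ])


def generator_loss(d_fake, output, target, l1_weight=100.0):
    """Adversarial term (fool the discriminator) plus a weighted L1 term.

    The adversarial term uses the non-saturating form -log D(x, G(x)).
    output and target are flattened pixel values of the generated and target images.
    """
    gan_term = mean([-math.log(f + EPS) for f in d_fake])
    l1_term = mean([abs(t - o) for t, o in zip(target, output)])
    return gan_term + l1_weight * l1_term


# Toy values: an undecided discriminator and a slightly inaccurate generator.
print(discriminator_loss([0.5], [0.5]))
print(generator_loss([0.5], [0.2, 0.4], [0.3, 0.5]))
```

The adversarial term is the part that is learned, since it depends on the current state of the discriminator, while the L1 term keeps the output close to the target image pixel by pixel.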
The images below are examples of the image-to-image translation tasks possible with pix2pix:
(Retrieved from the paper Image-to-Image Translation with Conditional Adversarial Networks by Isola et al.)
This project was conducted as part of the Dansaekhwa Project, with the particular objective of reflecting on what it means to be a creator in artistic practice. The code used in the project was heavily borrowed from pix2pix-tensorflow (https://github.com/affinelayer/pix2pix-tensorflow). Among several options provided by the Dansaekhwa members, "Correspondence" (조응), the painting series by Lee Ufan, and "Untitled" (무제), the distinctive Umber Blue painting series by Yun Hyong-keun, were selected for their formal suitability, that is, the perceivable similarity among the paintings within each collection.
Examples of the target (original) images from the Untitled series by Yun Hyong-keun.
Examples of the target (original) images from the Correspondence series by Lee Ufan.
Each dataset consists of 500 pairs of 512 × 512 pixel images, of which 400 pairs were used for training and the remaining 100 for testing. In total, 1,000 input images were created by hand, one for each target image, for these particular translation tasks. The images below are examples of the pairs in each dataset.
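The 400/100 split of each dataset can be sketched as follows. This is only an illustration: the pair identifiers, the split_pairs helper, and the seeded random shuffle are hypothetical, and the write-up does not state how the pairs were actually assigned to the training and test sets.

```python
import random


def split_pairs(pair_ids, n_train, seed=0):
    """Shuffle the pair identifiers reproducibly and cut them into train and test lists."""
    ids = list(pair_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers for the 500 input/target pairs of one dataset.
pair_ids = [f"pair_{i:03d}" for i in range(500)]
train_ids, test_ids = split_pairs(pair_ids, n_train=400)

print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the split identical across runs, so the vanilla and revised models can be evaluated on the same held-out pairs.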
The major issue when implementing the vanilla pix2pix was pronounced checkerboard artifacts, one of the hallmarks of artificially synthesized images. To alleviate the artifacts, additional code was inserted into the original model. The inserted code follows Deconvolution and Checkerboard Artifacts, the article by Odena et al.
The pronounced checkerboard artifacts in the images generated by the original model. Images partially enlarged.
After inserting the resize-convolution code, which upsamples the feature maps with bicubic resampling before applying an ordinary convolution, the model stably generates more natural-looking images without the pronounced checkerboard pattern. The two collections of images below compare selected outputs generated by the vanilla pix2pix and by the revised pix2pix networks.
Comparison of generated images with (bottom) and without (top) applying bicubic interpolation.
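The effect of replacing a transposed convolution with a resize followed by a convolution can be illustrated in one dimension. The sketch below is not the code inserted into the model. It assumes a 1D signal, an all-ones kernel of size 3 with stride 2 (an uneven-overlap case of the kind discussed by Odena et al.), and cubic interpolation with the Keys kernel (a = -0.5) as a 1D stand-in for bicubic resampling. All function names are hypothetical.

```python
import math


def transposed_conv1d(x, kernel, stride):
    """Transposed convolution: every input sample stamps a scaled kernel copy into the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for k, wk in enumerate(kernel):
            out[i * stride + k] += xi * wk
    return out


def cubic_weight(d, a=-0.5):
    """Keys cubic convolution kernel evaluated at distance d."""
    d = abs(d)
    if d <= 1.0:
        return (a + 2.0) * d**3 - (a + 3.0) * d**2 + 1.0
    if d < 2.0:
        return a * d**3 - 5.0 * a * d**2 + 8.0 * a * d - 4.0 * a
    return 0.0


def cubic_upsample1d(x, scale):
    """Upsample by an integer factor using four-tap cubic interpolation with clamped edges."""
    n = len(x)
    out = []
    for i in range(n * scale):
        src = (i + 0.5) / scale - 0.5  # position of output sample i in input coordinates
        base = math.floor(src)
        value = 0.0
        for tap in range(base - 1, base + 3):
            clamped = min(max(tap, 0), n - 1)
            value += x[clamped] * cubic_weight(src - tap)
        out.append(value)
    return out


def conv1d_same(x, kernel):
    """Stride-1 convolution with edge padding, so the output length equals the input length."""
    radius = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k, wk in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += x[j] * wk
        out.append(acc)
    return out


signal = [1.0] * 6         # a perfectly flat input
kernel = [1.0, 1.0, 1.0]   # all-ones kernel, so the output reveals the overlap pattern

deconv = transposed_conv1d(signal, kernel, stride=2)
resize_conv = conv1d_same(cubic_upsample1d(signal, 2), kernel)

print(deconv)
print(resize_conv)
```

With a flat input, the transposed convolution produces an alternating pattern in its interior because neighboring kernel copies overlap on only some output positions, whereas the resize-then-convolve path treats every output position the same way and stays flat. In two dimensions the same uneven overlap shows up as a checkerboard.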