Research Activity

My research focuses on deep learning, computer vision, image processing, and artificial intelligence.


Publications

"Rethinking Gradient Weight’s Influence over Saliency Map Estimation."

Masud An Nur Fahim, Nazmus Saqib, Shafkat Khan Siam, Ho Yub Jung. Sensors (MDPI), 22(17), 6516, 2022.

Class activation map (CAM) helps to formulate saliency maps that aid in interpreting the deep neural network’s prediction. Gradient-based methods are generally faster than other branches of vision interpretability and independent of human guidance. The performance of CAM-like studies depends on the governing model’s layer response and the influences of the gradients. Typical gradient-oriented CAM studies rely on weighted aggregation for saliency map estimation by projecting the gradient maps into single-weight values, which may lead to an over-generalized saliency map. To address this issue, we use a global guidance map to rectify the weighted aggregation operation during saliency estimation, where resultant interpretations are comparatively cleaner and instance-specific. We obtain the global guidance map by performing elementwise multiplication between the feature maps and their corresponding gradient maps. To validate our study, we compare the proposed study with nine different saliency visualizers. In addition, we use seven commonly used evaluation metrics for quantitative comparison. The proposed scheme achieves significant improvement over the test images from the ImageNet, MS-COCO 14, and PASCAL VOC 2012 datasets.
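
The central operation is easy to sketch. The snippet below is a simplified illustration, not the paper's exact formulation: it assumes a PyTorch ResNet-50 backbone with hooks on its last convolutional block, and it modulates a Grad-CAM-style weighted aggregation with the elementwise feature-gradient product described above; the ReLU placements and the final combination are assumptions made for the example.

    import torch.nn.functional as F
    from torchvision import models

    # Simplified sketch: Grad-CAM-style aggregation modulated by a global
    # guidance map (elementwise product of feature maps and their gradients).
    # Backbone, target layer, and ReLU placements are assumptions.
    model = models.resnet50(weights="IMAGENET1K_V1").eval()
    features, grads = {}, {}
    target_layer = model.layer4  # last convolutional block (assumption)

    target_layer.register_forward_hook(lambda m, i, o: features.update(value=o))
    target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(value=go[0]))

    def saliency_map(image, class_idx):
        # image: preprocessed (1, 3, H, W) tensor; class_idx: target class index
        logits = model(image)
        model.zero_grad()
        logits[0, class_idx].backward()

        fmap, grad = features["value"], grads["value"]        # (1, C, h, w)
        weights = grad.mean(dim=(2, 3), keepdim=True)         # classic single-weight aggregation
        guidance = F.relu(fmap * grad)                        # global guidance map
        cam = F.relu((weights * fmap * guidance).sum(dim=1))  # guidance-rectified aggregation
        cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                            mode="bilinear", align_corners=False)
        return cam / (cam.max() + 1e-8)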



"Denoising Single Images by Feature Ensemble Revisited."

Masud An Nur Fahim, Nazmus Saqib, Shafkat Khan Siam, Ho Yub Jung. Sensors (MDPI), 22(18), 7080, 2022.

Image denoising is still a challenging issue in many computer vision subdomains. Recent studies have shown that significant improvements are possible in a supervised setting. However, a few challenges, such as spatial fidelity and cartoon-like smoothing, remain unresolved or decisively overlooked. Our study proposes a simple yet efficient architecture for the denoising problem that addresses the aforementioned issues. The proposed architecture revisits the concept of modular concatenation instead of long and deeper cascaded connections, to recover a cleaner approximation of the given image. We find that different modules can capture versatile representations, and a concatenated representation creates a richer subspace for low-level image restoration. The proposed architecture’s number of parameters remains smaller than in most of the previous networks and still achieves significant improvements over the current state-of-the-art networks.
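
The modular-concatenation idea behind the network can be sketched roughly as follows. This is an illustrative toy model, not the published architecture: the branch design (shallow dilated convolutions), the width, the number of modules, and the residual output are assumptions made for the example.

    import torch
    import torch.nn as nn

    # Toy sketch of modular concatenation: several shallow branches extract
    # different representations, which are concatenated and fused, instead of
    # one long, deep cascade. Branch design and sizes are placeholders.
    class DenoiseSketch(nn.Module):
        def __init__(self, channels=3, width=32, n_modules=4):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(
                    nn.Conv2d(channels, width, 3, padding=d, dilation=d),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(width, width, 3, padding=1),
                    nn.ReLU(inplace=True),
                )
                for d in range(1, n_modules + 1)  # varied dilation -> varied context per branch
            ])
            self.fuse = nn.Conv2d(width * n_modules, channels, 3, padding=1)

        def forward(self, noisy):
            feats = torch.cat([b(noisy) for b in self.branches], dim=1)  # concatenated feature subspace
            return noisy - self.fuse(feats)  # residual formulation (an assumption)

    # usage: clean_estimate = DenoiseSketch()(torch.randn(1, 3, 64, 64))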



Projects

Image denoising:

This project focused on self-supervised learning and ensemble methods for image denoising. I developed an aggregated multiscale self-supervised denoising model that leverages features from multiple scales to enhance image quality; my Master’s thesis was written on this topic. The thesis can be downloaded from here, and the code along with a detailed explanation of the main idea is provided here.
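
A rough sketch of the multiscale aggregation step is shown below; it is illustrative only, not the thesis implementation. Here `denoiser` stands for any image-to-image network, and the scale set, the averaging step, and the masked self-supervised objective mentioned in the comments are assumptions.

    import torch
    import torch.nn.functional as F

    # Illustrative multiscale aggregation: denoise the noisy input at several
    # scales, resize the results back, and average them. In a self-supervised
    # setting the denoiser could be trained by masking pixels and predicting
    # them from their neighbours, since clean targets are unavailable.
    def multiscale_denoise(denoiser, noisy, scales=(1.0, 0.5, 0.25)):
        h, w = noisy.shape[-2:]
        outputs = []
        for s in scales:
            x = noisy if s == 1.0 else F.interpolate(
                noisy, scale_factor=s, mode="bilinear", align_corners=False)
            y = denoiser(x)
            if y.shape[-2:] != (h, w):
                y = F.interpolate(y, size=(h, w), mode="bilinear", align_corners=False)
            outputs.append(y)
        return torch.stack(outputs).mean(dim=0)  # simple average as the aggregation
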
I also co-authored a paper on this topic, in which image denoising is performed with a supervised training method. We developed a custom network in which shallow features are extracted and ensembled to remove noise from images. The paper is titled “Denoising Single Images by Feature Ensemble Revisited.”


Explainable AI:

This project aimed to understand what a deep learning model sees when it classifies a specific class, an area of research known as explainable AI. I worked on developing a method to estimate the gradient weights of the saliency maps, which are used to visualize the regions of interest in an image. I also co-authored a paper on this topic, titled “Rethinking Gradient Weight’s Influence over Saliency Map Estimation.”


FRVT 1:1 face recognition:

This project involved developing advanced face recognition algorithms using deep learning and computer vision techniques. I was part of the team that implemented and evaluated different models and achieved state-of-the-art results on the FRVT 1:1 benchmark.