Authors:
Hiroaki Minoura; Tsubasa Hirakawa; Takayoshi Yamashita and Hironobu Fujiyoshi
Affiliation:
Chubu University, Kasugai, Aichi, Japan
Keyword(s):
Image Classification, Convolutional Neural Network, Vision Transformer.
Abstract:
Understanding the feature representations (e.g., object shape and texture) that a model captures from an image is an important clue for image classification with deep learning models, just as it is for us humans. Transformer-based architectures such as the Vision Transformer (ViT) have achieved higher accuracy than Convolutional Neural Networks (CNNs) on such tasks. As shown in prior work, ViT tends to focus more on object shape than classic CNNs when capturing feature representations. Subsequently, derivative methods of ViT, both based on self-attention and not based on it, have been proposed. In this paper, we investigate the feature representations captured by these derivative methods of ViT in an image classification task. Specifically, using publicly available ImageNet pre-trained models, we investigate i) whether the derivative methods rely on object shape or texture, using the Stylized-ImageNet (SIN) dataset; ii) classification without relying on object texture, using edge images produced by an edge detection network; and iii) the robustness of the different feature representations to common perturbations and image corruptions. Our results indicate that the networks that focused more on shape captured feature representations more accurately in almost all of the experiments.
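Below is a minimal sketch, not from the paper, of how such a probe can be run: load a publicly available ImageNet pre-trained model via the timm library and measure top-1 accuracy on a folder of probe images (stylized, edge, or corrupted variants). The model name ("vit_base_patch16_224") and the dataset path are illustrative assumptions, not values from the paper.

# Minimal sketch (not the authors' released code) of the evaluation
# protocol summarized above. Model name and data path are assumptions.
import torch
import timm
from timm.data import resolve_data_config, create_transform
from torch.utils.data import DataLoader
from torchvision import datasets

device = "cuda" if torch.cuda.is_available() else "cpu"

# Any ImageNet pre-trained classifier exposed by timm can be probed this
# way; ViT derivatives (e.g., Swin, MLP-Mixer) only need the name changed.
model = timm.create_model("vit_base_patch16_224", pretrained=True).to(device).eval()

# Use the preprocessing that matches the pre-trained weights.
preprocess = create_transform(**resolve_data_config({}, model=model))

# Hypothetical path: an ImageNet-style directory of stylized/edge/corrupted
# images. Class folders must be sorted consistently with ImageNet label order.
dataset = datasets.ImageFolder("path/to/probe_images", transform=preprocess)
loader = DataLoader(dataset, batch_size=64, num_workers=4)

correct = total = 0
with torch.no_grad():
    for images, labels in loader:
        logits = model(images.to(device))
        correct += (logits.argmax(dim=1).cpu() == labels).sum().item()
        total += labels.size(0)

print(f"top-1 accuracy: {correct / total:.3f}")

Swapping the probe directory between stylized, edge, and corrupted versions of the same images gives the per-representation comparison the abstract describes.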