Feature Extraction from an Image Using a Pre-trained PyTorch Model


Image recognition is a well-known and widely used application of machine learning. In this blog post, we will discuss how to extract features from an image using a pre-trained PyTorch model.

Pre-trained PyTorch models such as VGG and ResNet can be used to extract features from images.

Modern convolutional neural networks have millions of parameters. Training them from scratch requires a lot of labeled training data and a lot of computing power. Transfer learning shortcuts much of this by taking a piece of a model that has already been trained on a related task and reusing it in a new model.

Transfer learning for image classification rests on the intuition that a model trained on a large and general enough dataset can effectively serve as a generic model of the visual world. You can then reuse its learned feature maps instead of training a large model on a large dataset from scratch.

Feature Extraction

You can use a pre-trained model to extract meaningful features from new samples. You simply add a new classifier, which will be trained from scratch, on top of the pre-trained model, so that the feature maps it learned earlier can be reused for the new dataset.

You do not need to retrain the whole model. The base convolutional network already contains features that are generically useful for classifying images. The final, classification part of the pre-trained model, however, is specific to the original classification task, and therefore to the set of classes on which the model was trained.

In feature extraction, we start with a pre-trained model and only update the weights of the final layer, from which we derive predictions. It is called feature extraction because we use the pre-trained CNN as a fixed feature extractor and only change the output layer.

This article shows how to use a pre-trained resnet18 model from torchvision.models, trained on the much larger and more general ImageNet dataset, as an image feature extractor to build a PyTorch model that classifies five kinds of flowers.


We download the flowers dataset from Kaggle and split its examples into training and validation sets. Flowers is a dataset of flower images with five possible class labels.

import os
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from sklearn.model_selection import train_test_split

# Kaggle credentials: use the username and key from your own kaggle.json file
os.environ['KAGGLE_USERNAME'] = "brijesh123"
os.environ['KAGGLE_KEY'] = "fd625c630b11dfcdskdml34r23c278425d5d6"

# Download the dataset (notebook shell command)
!kaggle datasets download -d alxmamaev/flowers-recognition

path = "/content/flowers"
BATCH_SIZE = 32
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Random crop and flip for augmentation, then ImageNet mean/std normalization
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

Finally, keep in mind that the pre-trained ResNet expects inputs of size (224, 224), which is why the images are cropped to 224 pixels.


This dataset has five classes and is laid out with one folder per class, so we can load it with ImageFolder rather than writing our own custom dataset.

# Each subfolder of `path` is treated as one class
dataset = ImageFolder(path, transform=transform)

# Hold out 20% of the examples for validation
train_set, val_set = train_test_split(dataset, test_size=0.2, shuffle=True, random_state=43)

train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
val_loader = DataLoader(val_set, batch_size=BATCH_SIZE)
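As an alternative to scikit-learn, the split can also be done with PyTorch's own `random_split`, which splits by index so samples are still loaded and transformed lazily. This is a sketch of a swapped-in technique, not the article's method; the `toy_dataset` of random tensors is a hypothetical stand-in for the flower images.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Hypothetical stand-in for the image dataset: 100 fake images, 5 classes.
fake_images = torch.randn(100, 3, 8, 8)
fake_labels = torch.randint(0, 5, (100,))
toy_dataset = TensorDataset(fake_images, fake_labels)

# 80/20 split with a fixed seed so the split is reproducible.
n_val = int(0.2 * len(toy_dataset))
n_train = len(toy_dataset) - n_val
generator = torch.Generator().manual_seed(43)
train_subset, val_subset = random_split(toy_dataset, [n_train, n_val], generator=generator)

train_dl = DataLoader(train_subset, batch_size=32, shuffle=True)
val_dl = DataLoader(val_subset, batch_size=32)
print(len(train_subset), len(val_subset))
```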

Dataset Visualization

The flowers dataset consists of labeled flower images. Each example contains a JPEG flower image and a class label that says what type of flower it is. Let's display a few images together with their labels.

def showimages(imgs, actual_lbls, pred_lbls=None):
    fig = plt.figure(figsize=(21, 12))
    for i, img in enumerate(imgs):
        fig.add_subplot(4, 8, i + 1)
        y = int(actual_lbls[i])
        if pred_lbls is not None:
            y_pre = int(pred_lbls[i])
            title = "prediction: {0}\nlabel: {1}".format(dataset.classes[y_pre], dataset.classes[y])
        else:
            title = "Label: {0}".format(dataset.classes[y])
        plt.title(title)
        # Convert C x H x W to H x W x C and undo the normalization for display
        img = img.numpy().transpose((1, 2, 0))
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        img = std * img + mean
        img = np.clip(img, 0, 1)
        plt.axis("off")
        plt.imshow(img)
    plt.show()

inputs, classes = next(iter(train_loader))
showimages(inputs, classes)

[Figure: a batch of images from the PyTorch DataLoader with their labels]

Create the Base Model

We will build the base model from ResNet-18. It is pre-trained on ImageNet, a large research dataset of 1.4 million images and 1000 classes, with a wide variety of categories such as jackfruit and syringe.

# Load ResNet-18 with ImageNet weights (newer torchvision releases prefer the `weights` argument)
model = torchvision.models.resnet18(pretrained=True)

Freeze the Layers

In this step, we freeze the convolutional base created in the previous step and use it as a feature extractor. We then add a classifier on top of it and train only that top-level classifier.

# Freeze every parameter of the pre-trained network
for param in model.parameters():
    param.requires_grad = False

It is important to freeze the convolutional base before you train the model. Freezing a layer (by setting requires_grad = False on its parameters) prevents its weights from being updated during training.

Once a weight is marked as non-trainable, its value is no longer updated during training.
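To see the effect of freezing, here is a minimal sketch on a toy two-layer model rather than on ResNet. The names `backbone` and `head` are hypothetical stand-ins for the pre-trained base and the new classifier. After one optimizer step the frozen weights are unchanged, while the trainable ones have moved.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Linear(4, 3)   # hypothetical stand-in for the pre-trained base
head = nn.Linear(3, 2)       # new classifier trained from scratch
toy_model = nn.Sequential(backbone, head)

# Freeze the base, exactly as done for the real model.
for param in backbone.parameters():
    param.requires_grad = False

backbone_before = backbone.weight.clone()
head_before = head.weight.clone()

# Only parameters that still require gradients are handed to the optimizer.
optimizer = torch.optim.SGD([p for p in toy_model.parameters() if p.requires_grad], lr=0.1)
loss = nn.CrossEntropyLoss()(toy_model(torch.randn(8, 4)), torch.randint(0, 2, (8,)))
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone.weight, backbone_before)
head_changed = not torch.equal(head.weight, head_before)
print(backbone_unchanged, head_changed)
```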

Reshape the Final Layer

Here we freeze the weights of the whole network except for the final fully connected layer. That layer is replaced with a new one with random weights, and only this new layer is trained.

# Replace the 1000-class ImageNet head with one sized for our dataset
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, len(dataset.classes))

# Move the model to the same device as the inputs
model = model.to(device)

The final layer is reshaped so that it has the same number of outputs as the number of classes in the new dataset.

Create the Optimizer

The final step in feature extraction is to create an optimizer that updates only the desired parameters. We know that every parameter with requires_grad=True should be optimized, so we collect those parameters in a list and pass that list to the SGD constructor.

loss_fn = nn.CrossEntropyLoss()

# Collect only the parameters that are still trainable
params_to_update = []
for name, param in model.named_parameters():
    if param.requires_grad:
        params_to_update.append(param)

optimizer_ft = optim.SGD(params_to_update, lr=0.001, momentum=0.9)

Train the Model

Now let's train the model for 20 epochs, recording the loss and accuracy on the training and validation sets, and plot the results.

losses = {'train': [], 'val': []}
accuracies = {'train': [], 'val': []}
epochs = 20

for epoch in range(1, epochs + 1):
    train(model, loss_fn, train_loader, optimizer_ft, epoch)
    test(model, loss_fn, val_loader, epoch)

# Accuracy curves
plt.subplot(2, 1, 1)
plt.plot(accuracies['train'], label="Training Accuracy")
plt.plot(accuracies['val'], label="Validation Accuracy")
plt.legend(loc="lower right")
plt.ylabel('Accuracy')
# plt.ylim([min(plt.ylim()), 1])
plt.title('Training and Validation Accuracy')

# Loss curves
plt.subplot(2, 1, 2)
plt.plot(losses['train'], label="Training Loss")
plt.plot(losses['val'], label="Validation Loss")
plt.legend(loc="upper right")
plt.ylabel('Cross Entropy')
plt.ylim([0, 1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()

The train() function handles one epoch of training and test() handles one epoch of validation. As called above, they take the PyTorch model, the loss function, a data loader, the optimizer (for training only), and the current epoch number.

[Figure: training and validation accuracy and loss curves]

Image Prediction

def predict_images(model, images, actual_label):
    # Evaluation mode, with gradient tracking switched off
    model.eval()
    with torch.no_grad():
        inputs = images.to(device)
        outputs = model(inputs)
        # The index of the largest output is the predicted class
        _, preds = torch.max(outputs, 1)
        showimages(images, actual_label, preds.cpu())

images, classes = next(iter(val_loader))
predict_images(model, images, classes)

[Figure: validation images with predicted and actual labels]
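The prediction step relies on `torch.max` along the class dimension, which returns both the largest score and its index. A minimal sketch with made-up logits for two images and five classes:

```python
import torch

# Hypothetical logits for two images and five classes.
logits = torch.tensor([[0.1, 2.5, 0.3, -1.0, 0.0],
                       [3.2, 0.4, 0.1, 0.2, -0.5]])

scores, preds = torch.max(logits, 1)   # max value and its index along the class dimension
probs = torch.softmax(logits, dim=1)   # optional: convert logits to probabilities

print(preds.tolist())                  # [1, 0]
```

The indices in `preds` are what get looked up in `dataset.classes` to produce a readable label.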

When working with a small dataset, it is common practice to reuse features learned by a model trained on a larger dataset in the same domain. This is done by placing a new fully connected classifier on top of the pre-trained model. The pre-trained model is "frozen," and only the weights of the classifier are updated during training. In this case, the convolutional base extracted all of the features associated with each image, and you trained only a classifier that determines the image class from those extracted features.

You can run this code in Google Colab.

Modifying a pre-trained PyTorch model in this way turns it into a feature extractor: images go in, and numerical feature vectors come out.

Frequently Asked Questions

How do you extract features in PyTorch?

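One common approach is to load a pre-trained model, replace its final classification layer with an identity layer, and run the images through the model in evaluation mode. The output of the model is then the feature vector for each image.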

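How are features extracted from an image?

In a convolutional neural network, the features of an image are extracted by stacked convolution and pooling layers. Early layers respond to simple patterns such as edges and textures, and later layers combine them into more abstract features.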

How do you use PyTorch pre-trained models?

Most pre-trained image classification models in torchvision were trained on the ImageNet dataset, so they are tuned for recognizing images, not text. To use one, load it with its pre-trained weights and preprocess your images the same way the training images were (for example, the same resizing and normalization). You can then use the model as is, use it as a fixed feature extractor as in this article, or fine-tune it on your own data.
