Meta-FDMixup: Cross-Domain Few-Shot Learning Guided by Labeled Target Data
ACM MM 2021



A recent study finds that existing few-shot learning methods, trained on the source domain, fail to generalize to the novel target domain when a domain gap is observed. This motivates the task of Cross-Domain Few-Shot Learning (CD-FSL). In this paper, we observe that the labeled target data in CD-FSL has not been leveraged in any way to help the learning process. Thus, we advocate utilizing a few labeled target data to guide the model learning. Technically, a novel meta-FDMixup network is proposed. We tackle this problem mainly from two aspects. Firstly, to utilize the source and the newly introduced target data of two different class sets, a mixup module is re-proposed and integrated into the meta-learning mechanism. Secondly, a novel disentangle module together with a domain classifier is proposed to extract the disentangled domain-irrelevant and domain-specific features. These two modules together enable our model to narrow the domain gap, thus generalizing well to the target datasets. Additionally, a detailed feasibility and pilot study is conducted to reflect the intuitive understanding of CD-FSL under our new setting. Experimental results show the effectiveness of our new setting and the proposed method.

CD-FSL with Few-Labeled Target Examples

The standard CD-FSL setting assumes a single source dataset and several novel target datasets from different domains. Models are trained on the source data only and then transferred to the novel data. Technically, CD-FSL integrates the challenges of few-shot learning and domain generalization.

Although much effort has been made to improve single-source CD-FSL, performance on this challenging task is still limited. Thus, we advocate learning CD-FSL with a few labeled examples from the target domain. Specifically, the new setting is formulated as follows.


For a specific target dataset, we divide it into disjoint target base and target novel sets. From the target base set, num_target labeled examples per class are sampled to construct the auxiliary target training data, while the target novel set is kept as testing data. Note that:

  • the class sets of the source training, auxiliary target training, and target novel testing data are strictly disjoint, obeying the basic setting of FSL;
  • collecting a few labeled examples is always feasible in real-world applications. We recommend setting num_target to 5 and use this setting for our main experiments;
  • with a small amount of labeled target data introduced, the performance of CD-FSL models can be improved noticeably.


As noted above, to balance cost and performance, we recommend setting num_target to 5.
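
The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the released data pipeline: the function name `split_target`, the `base_classes` argument, and the toy data are hypothetical, and the choice of which classes form the target base set is taken as given.

```python
import random
from collections import defaultdict


def split_target(samples, base_classes, num_target=5, seed=0):
    """Split (item, label) pairs into auxiliary training data and novel test data.

    samples: list of (item, label) pairs from one target dataset.
    base_classes: labels treated as the target base set; every other label
        belongs to the target novel set (hypothetical argument for this sketch).
    """
    rng = random.Random(seed)
    base_set = set(base_classes)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)

    aux_train, novel_test = [], []
    for label, items in by_class.items():
        if label in base_set:
            # Keep only num_target labeled examples per target base class.
            chosen = rng.sample(items, min(num_target, len(items)))
            aux_train.extend((item, label) for item in chosen)
        else:
            # Target novel classes are held out entirely for testing.
            novel_test.extend((item, label) for item in items)
    return aux_train, novel_test


# Toy data: 4 classes with 20 examples each; classes 0 and 1 are target base.
toy = [(f"img_{c}_{i}", c) for c in range(4) for i in range(20)]
aux_train, novel_test = split_target(toy, base_classes=[0, 1], num_target=5)
```

Because the base and novel label sets never overlap, the auxiliary training classes and the testing classes stay disjoint by construction.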

Meta-FDMixup Method

Main challenges to be solved:

  1. data imbalance: the number of examples in the source training data and in the auxiliary target training data is extremely imbalanced, i.e., 600:5;
  2. domain gap: the domain shift problem still exists.

Our Meta-FDMixup contains two stages:

  1. pretraining stage: use the classical supervised classification learning task to learn a good feature extractor;
  2. meta-train stage: meta-train the meta-FDMixup model on source dataset and auxiliary target dataset with our novel meta-mixup module and feature disentangle module.


Only source data is used for pretraining. In the meta-train stage, a source episode and an auxiliary target episode are randomly sampled from the source and auxiliary target datasets, respectively.
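
As a rough illustration of the paired sampling, the sketch below draws one N-way K-shot episode from each dataset. The helper `sample_episode`, the toy dictionaries, and the episode sizes are assumptions made for this example. How a full-size episode is built from only five images per target class is not described here, so the sketch simply keeps `k_shot + n_query` within the number of available examples.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    Returns (support, query), each a list of (item, episode_label) pairs,
    where episode_label is the class index inside this episode.
    """
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw disjoint support and query items for this class.
        items = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in items[:k_shot]]
        query += [(x, episode_label) for x in items[k_shot:]]
    return support, query


rng = random.Random(0)
# Toy stand-ins: a large source set and a tiny auxiliary target set.
source = {c: [f"s{c}_{i}" for i in range(20)] for c in range(10)}
target = {c: [f"t{c}_{i}" for i in range(5)] for c in range(6)}

# One meta-training step uses one episode from each dataset.
src_support, src_query = sample_episode(source, n_way=5, k_shot=1, n_query=3, rng=rng)
tgt_support, tgt_query = sample_episode(target, n_way=5, k_shot=1, n_query=3, rng=rng)
```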

For the novel meta-mixup, we:

  1. mix up the query images of the source episode and the auxiliary target episode with ratio λ while keeping their support images unchanged;
  2. classify the mixed query against the source support set with confidence score λ and against the target support set with confidence score (1-λ).
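
The two steps above can be sketched as a single loss on one mixed query. This is a minimal sketch under several assumptions: inputs are tiny feature vectors rather than images, the classifier is a nearest-prototype stand-in rather than the network's actual few-shot classifier, and all function names are hypothetical. Pure Python lists are used to keep the example dependency-free.

```python
import math


def mix(x_src, x_tgt, lam):
    """Element-wise mixup of one source query and one target query."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_src, x_tgt)]


def log_softmax(scores):
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def prototype_scores(query, prototypes):
    """Negative squared distance to each class prototype (assumed classifier)."""
    return [-sum((q - p) ** 2 for q, p in zip(query, proto)) for proto in prototypes]


def meta_mixup_loss(x_src, y_src, x_tgt, y_tgt, src_protos, tgt_protos, lam):
    """Weight the source-episode loss by lam and the target-episode loss by 1 - lam."""
    mixed = mix(x_src, x_tgt, lam)
    # The support sets are untouched: each episode keeps its own prototypes.
    loss_src = -log_softmax(prototype_scores(mixed, src_protos))[y_src]
    loss_tgt = -log_softmax(prototype_scores(mixed, tgt_protos))[y_tgt]
    return lam * loss_src + (1.0 - lam) * loss_tgt


# Toy 2-way episodes with 2-D "features".
src_protos = [[0.0, 0.0], [4.0, 0.0]]
tgt_protos = [[0.0, 4.0], [4.0, 4.0]]
loss = meta_mixup_loss([0.2, 0.1], 0, [3.9, 4.1], 1, src_protos, tgt_protos, lam=0.7)
```

Here λ is passed in explicitly. The original mixup formulation draws it from a Beta distribution, which `random.betavariate` could supply in this sketch.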


For the novel feature disentangle module, we:

  1. design two branches to extract the domain-irrelevant and domain-specific features from the whole visual features.
  2. perform domain classification tasks, i.e., domain-specific features should be classified into their corresponding domain while domain-irrelevant features should confuse the domain classifier.
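
One way to express the two domain classification objectives is sketched below. The text above only says that domain-irrelevant features should confuse the classifier, so the specific choice here (a KL divergence to the uniform distribution) is an assumption of this example; other options such as gradient reversal would also fit the description. The function names are hypothetical, and the inputs are domain-classifier logits rather than real features.

```python
import math


def log_softmax(logits):
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def domain_specific_loss(logits, domain_label):
    """Cross-entropy: domain-specific features should reveal their domain."""
    return -log_softmax(logits)[domain_label]


def domain_irrelevant_loss(logits):
    """KL(uniform || prediction): zero only when the classifier is fully confused.

    Assumed confusion objective for this sketch, not confirmed by the text.
    """
    log_probs = log_softmax(logits)
    u = 1.0 / len(log_probs)
    return sum(u * (math.log(u) - lp) for lp in log_probs)


# Domain labels: 0 = source, 1 = target. Logits come from a domain classifier.
confident_source = [3.0, -3.0]
undecided = [0.0, 0.0]

specific = domain_specific_loss(confident_source, domain_label=0)
confused = domain_irrelevant_loss(undecided)
not_confused = domain_irrelevant_loss(confident_source)
```

Minimizing the first loss on the domain-specific branch and the second on the domain-irrelevant branch pushes the two feature sets toward opposite behavior under the same kind of classifier.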


For details of how we implement the meta-mixup and feature disentangle modules and how we pretrain and meta-train the network, please refer to our paper.


We take mini-ImageNet as source, and conduct experiments on CUB, Cars, Places, and Plantae.


We also provide visualization results. We highlight that, with the help of auxiliary target data and our method, we are able to shift the model's attention to more important areas.


This figure shows that we do disentangle the domain-irrelevant and domain-specific features. [The figure is taken from our extension work, GMeta-FDMixup (TIP).]


Related Links