Non-iid data and Continual Learning processes in Federated Learning: a long road ahead


Definition 2. Two clients $i, j \in N$ present different conditional probabilities $P(y|x)$ if there exists a significant quantity $T$ of data samples $\{x^k\}_{k=1}^{T}$ and distinct outputs $\{y_1^k, y_2^k\}_{k=1}^{T}$ such that the participants $i, j$ own the data samples $(x^k, y_1^k)$ and $(x^k, y_2^k)$ respectively, $\forall k \in \{1, \dots, T\}$.
To be precise, we need to specify the meaning of "distinct outputs": $y_1^k, y_2^k$ are distinct if $\|y_1^k - y_2^k\| > 2L$ for a certain margin of error $L$ (see Fig. 2), which may vary depending on the specific problem. In a classification problem $2L$ needs to be lower than 1, so the margin of error has to be less than $1/2$, whereas in regression problems the margin of error is fixed depending on the level of accuracy desired.
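The criterion can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that both clients' labelled samples can be inspected side by side, which would not normally hold in a federated setting where data stays on the clients. The function names, the dictionary layout, and the parameter `min_conflicts` (standing in for the "significant quantity" $T$) are hypothetical and not taken from the paper.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def count_conflicts(client_i, client_j, margin):
    """Count shared inputs whose outputs differ by more than 2 * margin.

    Each client is a dict mapping an input x (a hashable tuple) to its
    output y (a tuple of floats). This layout is an assumption of the sketch.
    """
    shared_inputs = client_i.keys() & client_j.keys()
    return sum(
        1
        for x in shared_inputs
        if dist(client_i[x], client_j[x]) > 2 * margin
    )


def have_different_conditionals(client_i, client_j, margin, min_conflicts):
    """Apply the definition, with min_conflicts playing the role of T."""
    return count_conflicts(client_i, client_j, margin) >= min_conflicts


# Toy regression data in one variable, with margin of error L = 0.1.
client_i = {(0.0,): (1.0,), (1.0,): (2.0,), (2.0,): (3.0,)}
client_j = {(0.0,): (1.05,), (1.0,): (2.5,), (2.0,): (3.9,)}

# Only the last two inputs have outputs further apart than 2L = 0.2.
print(count_conflicts(client_i, client_j, margin=0.1))
```

In this toy data the outputs for the first input differ by only 0.05, so they are not "distinct" in the sense above, while the other two pairs are. Whether two conflicts count as a significant quantity depends on the value chosen for $T$.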
Fig. 2. Regression model in one variable. The samples $(x^k, y_1^k)$ and $(x^k, y_2^k)$ each belong to a different client. If the distance between those samples is greater than $2L$, there is no output that could be closer than $L$ to both of them. Hence, at most one output would be considered correct.

Fig. 3. Classification of the different approaches that deal with the spatial heterogeneity in the behaviours of clients.

The main problem in this context is that, unlike the previous one, a single global model cannot fit all of the users' behaviours, since it cannot produce different predictions for the same input data. A model in ML is, by definition, a mapping $M : X \longrightarrow Y$ that assigns one value to each possible input [106]. Given the input $x^k$, a traditional model in a distributed setting would process it equally for all clients, thus predicting one, and only one, output. As a result, the predicted value could be $y_1^k$, $y_2^k$ or a different one, possibly intermediate, but in any case one of the clients would obtain a prediction with an error greater than $L$. In addition, to detect the existence of several behaviours in the clients, we need to study the result of their loss functions, which implies having a certain amount of labelled data.
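The claim that at least one client receives an error greater than $L$ follows from the triangle inequality. As a short sketch, write $\hat{y} = M(x^k)$ for the single prediction of the global model (the symbol $\hat{y}$ is introduced here only for illustration):

```latex
\begin{align*}
2L < \|y_1^k - y_2^k\|
  &\le \|y_1^k - \hat{y}\| + \|\hat{y} - y_2^k\| \\
  &\le 2 \max\bigl( \|y_1^k - \hat{y}\|,\ \|\hat{y} - y_2^k\| \bigr),
\end{align*}
```

so $\max\bigl( \|y_1^k - \hat{y}\|, \|\hat{y} - y_2^k\| \bigr) > L$ whatever value $\hat{y}$ takes. Even the midpoint of $y_1^k$ and $y_2^k$, which minimises the worst-case error, lies at distance $\|y_1^k - y_2^k\| / 2 > L$ from both outputs.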
To overcome this matter, it is essential to consider some kind of
model architecture that grants the possibility of variations among the
