Date of publication XXXX 00, 0000, date of current version XXXX 00, 0000


Table 1. The labeled clinic event data from the MIMIC III dataset by using the PEACE-home method



FL for clinical events classification IEEE

3.3
MACHINE LEARNING PART 
Machine learning can be applied to clinical event classification 
tasks in several ways. One common approach is to use 
supervised machine learning algorithms, such as decision 
trees, random forests, or support vector machines, to predict 
the class of a given clinical event based on a set of features or 
attributes. The algorithm is trained on a labeled dataset of past 
clinical events and their corresponding classes and then used 
to make predictions on new, unseen data. In a clinical event 
classification task, the features used as inputs to the machine 
learning algorithm could include demographic information, 
vital signs, laboratory test results, medications, and other 
relevant information. The target variable or output of the 
algorithm would be the class of the clinical event, such as 
sepsis, pneumonia, or a heart attack. Overall, the use of 
machine learning in the clinical event classification task has 
the potential to enhance the accuracy and efficiency of 
healthcare delivery by enabling the rapid and reliable 
identification of patients with specific conditions. In our study, 
we implemented several ML methods and compared them to 
obtain the best result on the clinical event classification task: 
the Random Forest classifier, XGBoost classifier, AdaBoost 
classifier, Stochastic Gradient Descent, and Bayesian Ridge 
classifier.
3.3.1
BAYESIAN RIDGE CLASSIFIER 
The Bayesian Ridge Classifier is a regularized linear 
regression algorithm that uses Bayesian inference to fit a 
linear model to the data. Regularization means that it seeks a 
balance between the fit of the model to the data and the 
simplicity of the model. The algorithm starts with a prior 
distribution on the model parameters and then updates this 
distribution based on the data using Bayes' theorem. This 
results in a posterior distribution of the parameters that 
summarizes the uncertainty about their values. Because the 
underlying model is a regression, its continuous output has to 
be mapped to a class label, for example by thresholding. The 
Bayesian Ridge Classifier has several advantages compared to 
other linear regression algorithms. For example, its 
regularization makes the fit more stable on noisy data, it can 
handle multicollinearity in the features, and it provides a 
natural way to estimate uncertainty in the model parameters 
and predictions. In a clinical event classification task, the 
Bayesian Ridge Classifier can be used to predict the class of a 
clinical event based on a set of features or attributes, such as 
demographic information, vital signs, laboratory test results, 
medications, and other relevant information. 
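The prior-to-posterior update described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: it assumes a single feature, no intercept, and fixed noise and prior precisions (`alpha` and `lam`), whereas a full Bayesian ridge model also estimates these from the data. The toy values and the 0.5 decision threshold are hypothetical.

```python
import math

def bayes_ridge_1d(xs, ys, alpha=1.0, lam=1.0):
    """Posterior over a single weight w for the model y = w * x + noise.

    Prior: w ~ N(0, 1/lam). Noise: N(0, 1/alpha).
    alpha and lam are held fixed here (an assumption made for brevity).
    Returns the posterior mean and variance of w.
    """
    precision = lam + alpha * sum(x * x for x in xs)
    mean = alpha * sum(x * y for x, y in zip(xs, ys)) / precision
    return mean, 1.0 / precision

def predict(x, w_mean, w_var, alpha=1.0):
    """Predictive mean and standard deviation for a new input x."""
    return w_mean * x, math.sqrt(1.0 / alpha + x * x * w_var)

# Toy data: one hypothetical scaled feature, binary labels encoded as 0/1.
xs = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
ys = [0, 0, 0, 1, 1, 1]

w_mean, w_var = bayes_ridge_1d(xs, ys)
score, std = predict(0.95, w_mean, w_var)
label = 1 if score >= 0.5 else 0  # thresholding is an assumed decision rule
```

The returned standard deviation is what gives the method its uncertainty estimate: it grows for inputs far from the origin and shrinks as more training data are observed.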
3.3.2 
RANDOM FOREST CLASSIFIER
Random Forest Classifier is an ensemble learning method 
that uses multiple decision trees to make predictions. It is a 
type of supervised machine-learning algorithm used for 
classification and regression problems. The basic idea behind 
Random Forest Classifier is to generate multiple decision 
trees, each of which is trained on a different subset of the data. 
The predictions made by each decision tree are combined to 
form a final prediction, which is typically more accurate than 
the predictions made by a single decision tree. In a Random 
Forest Classifier, each decision tree is generated by randomly 
selecting a subset of the features and a random sample of the 
data, and then using the selected data to train a decision tree. 
The final prediction is made by taking the majority vote or 
average of the predictions made by each decision tree. 
Random Forest Classifier is widely used in a variety of 
applications, including medical diagnosis, credit scoring, and 
marketing analysis. In a clinical event classification task, 
Random Forest Classifier can be used to predict the class of a 
clinical event based on a set of features or attributes. 
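The bagging and voting procedure described above can be sketched as follows. This is a simplified illustration rather than the classifier used in the study: each "tree" is a one-split decision stump instead of a full decision tree, and the two features and their values are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature_ids):
    """Find the single-feature threshold split with the fewest errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not right:
                continue  # threshold at the maximum value splits nothing
            l_cls, r_cls = majority(left), majority(right)
            errors = sum(y != l_cls for y in left) + sum(y != r_cls for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_cls, r_cls)
    if best is None:  # no valid split: always predict the majority class
        m = majority(labels)
        return (feature_ids[0], 0.0, m, m)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    votes = [l_cls if row[f] <= t else r_cls for f, t, l_cls, r_cls in forest]
    return majority(votes)

# Hypothetical (heart rate, temperature) readings with toy labels.
rows = [(60, 36.5), (65, 36.6), (70, 36.8), (72, 36.7),
        (110, 38.9), (115, 39.2), (120, 39.0), (125, 39.5)]
labels = ["normal"] * 4 + ["abnormal"] * 4

forest = fit_forest(rows, labels)
low = predict_forest(forest, (50, 36.0))
high = predict_forest(forest, (130, 40.0))
```

For a regression problem the vote in `predict_forest` would be replaced by an average of the individual predictions.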
3.3.3 
ADABOOST CLASSIFIER
AdaBoost (Adaptive Boosting) is a boosting algorithm that 
can be used for both binary and multi-class classification 
problems. AdaBoost is an ensemble learning method that 
combines multiple weak classifiers to form a strong classifier. 
The idea behind AdaBoost is to adjust the weights of the 
samples in the training data at each iteration in order to give 
more emphasis to the samples that are misclassified by the 
current ensemble of classifiers. In AdaBoost, a weak classifier 
is first trained on the data and used to make predictions. The 
samples that are misclassified by the weak classifier are given 
a higher weight, and a new weak classifier is trained on the re-
weighted data. This process is repeated multiple times, and the 
predictions of each weak classifier are combined to form the 
final prediction. AdaBoost is a simple and effective algorithm 
that has been used in a wide range of applications, including 
image and speech recognition, bioinformatics, and medical 
diagnosis.
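The re-weighting loop described above can be sketched for the binary case. This is a minimal illustration under stated assumptions: labels are encoded as +1 and -1, the weak classifiers are threshold stumps on a single feature, and the toy data are hypothetical. The multi-class variants are not shown.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak classifier: one threshold on one feature."""
    return polarity if x < threshold else -polarity

def fit_adaboost(xs, ys, n_rounds=3):
    """Binary AdaBoost with labels in {+1, -1}."""
    n = len(xs)
    weights = [1.0 / n] * n
    values = sorted(set(xs))
    thresholds = ([values[0] - 1]
                  + [(a + b) / 2 for a, b in zip(values, values[1:])]
                  + [values[-1] + 1])
    ensemble = []
    for _ in range(n_rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for t in thresholds:
            for p in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(x, t, p) != y)
                if best is None or err < best[0]:
                    best = (err, t, p)
        err, t, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this stump
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, p))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ensemble.append((alpha, t, p))
    return ensemble

def predict_adaboost(ensemble, x):
    """Sign of the alpha-weighted sum of the weak predictions."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can classify correctly.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]

ensemble = fit_adaboost(xs, ys)
predictions = [predict_adaboost(ensemble, x) for x in xs]
```

On this toy set every individual stump misclassifies at least two samples, while the weighted combination of three stumps reproduces all six labels.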
3.3.4 
XGBOOST CLASSIFIER
XGBoost (Extreme Gradient Boosting) is an open-source 
software library that provides an implementation of gradient 
boosting for tree-based learning algorithms. It is a scalable, 
high-performance machine learning algorithm that is widely 
used in a variety of applications, including computer vision, 
natural language processing, and recommendation systems. 
XGBoost is an optimized implementation of gradient boosting 
that is designed to handle large datasets and to perform well 
on both structured and unstructured data. It uses decision trees 
as the base learner and iteratively improves the prediction 
accuracy by adding new trees that focus on the residuals or 
errors of the previous trees. XGBoost provides several key 
features that make it well-suited for clinical event 
classification tasks. These include the ability to handle missing 
data, support for parallel and distributed computing, and the 
ability to handle high-dimensional and sparse data. 

Ruzaliev R: 
Federated Learning for Clinical Event Classification Using Vital 
Signs Data 

VOLUME XX, 2023 

In addition, 
XGBoost provides several hyperparameters that can be tuned 
to control the behavior of the algorithm and improve its 
performance.
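The core idea of adding trees that fit the residuals of the previous trees can be sketched without the library. This is not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations. The sketch assumes squared-error loss, one-split stumps as base learners, and hypothetical toy data. The names `n_trees` and `learning_rate` are illustrative stand-ins for the kind of hyperparameters mentioned above.

```python
def fit_residual_stump(xs, residuals):
    """One-split regression stump that best fits the current residuals."""
    best = None
    values = sorted(set(xs))
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x < t]
        right = [r for x, r in zip(xs, residuals) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def fit_boosting(xs, ys, n_trees=20, learning_rate=0.3):
    """Gradient boosting for squared-error loss."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(xs)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_residual_stump(xs, residuals)
        trees.append((t, lm, rm))
        # Shrink each new tree's contribution by the learning rate.
        preds = [p + learning_rate * (lm if x < t else rm)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, trees

def predict_boosting(model, x):
    base, learning_rate, trees = model
    return base + learning_rate * sum(lm if x < t else rm for t, lm, rm in trees)

def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

# Hypothetical one-feature regression data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 3.1, 3.0, 5.2, 5.0]

model = fit_boosting(xs, ys)
base_mse = mse([model[0]] * len(ys), ys)
final_mse = mse([predict_boosting(model, x) for x in xs], ys)
```

A smaller `learning_rate` makes each tree correct less of the remaining error, which usually calls for a larger `n_trees`.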
3.3.5 
STOCHASTIC GRADIENT DESCENT 
Stochastic Gradient Descent (SGD) is an optimization 
algorithm used for training machine learning models, 
especially for linear models like linear regression, logistic 
regression, and support vector machines. It is called 
"stochastic" because it uses a randomly selected subset of the 
training data, or a single training example, at each iteration to 
update the model parameters. In SGD, the model parameters 
are updated in the direction of the negative gradient of the loss 
function with respect to the model parameters. The gradient is 
estimated using the randomly selected subset of the training 
data, rather than using the entire training set. This makes SGD 
more computationally efficient than batch gradient descent, 
where the entire training set is used to calculate the gradient at 
each iteration. SGD is a popular optimization algorithm 
because it is simple to implement and can be applied to very 
large datasets, making it well-suited for training machine 
learning models on big data. However, the stochastic nature of 
the algorithm can make it more sensitive to the choice of the 
learning rate and can result in more oscillations in the 
optimization path compared to batch gradient descent. To 
mitigate these issues, SGD is often combined with techniques 
such as mini-batch learning, learning rate schedules, and 
regularization. 
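The update rule described above can be sketched for the simplest case, a linear regression with one feature. This is a minimal illustration assuming squared-error loss, one training example per update, and a constant learning rate. The noiseless toy data are hypothetical and chosen so that the true parameters are known.

```python
import random

def sgd_linear_regression(xs, ys, learning_rate=0.1, epochs=200, seed=0):
    """Fit y = w * x + b by SGD, one training example per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in random order
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b,
            # estimated from this single example only.
            w -= learning_rate * error * xs[i]
            b -= learning_rate * error
    return w, b

# Noiseless toy data generated from y = 2x + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2 * x + 1 for x in xs]

w, b = sgd_linear_regression(xs, ys)
```

Batch gradient descent would instead average the gradient over all five examples before each update. With noisy data the single-example estimate fluctuates, which is the oscillation that learning rate schedules and mini-batches are meant to dampen.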

3.4 
FEDERATED LEARNING 
Federated Learning is a distributed machine learning 
framework that enables multiple parties to train a shared 
model without sharing their raw data. Instead, the raw data 
remains on the devices of the participants and only the model 
parameters are communicated and aggregated to form the final 
model. In a federated learning structure, each participant has a 
local model that is trained on its own data. The local models 
are then used to make predictions on new data, and the 
gradients of the loss function with respect to the model 
parameters are calculated. These gradients are then 
communicated to a central server, which aggregates the 
gradients and updates the global model parameters. The 
updated model parameters are then sent back to the 