FEDERATED DEEP Q-LEARNING WITH SELF-SUPERVISED ENCODING AND RAG-BASED REWARD SHAPING FOR LASER TREATMENT RECOMMENDATION
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
(CC BY-NC 4.0).
Abstract
Background: Periodontal treatment relies mainly on scaling and root planing (SRP) and increasingly includes
laser therapy. SRP remains the primary initial treatment, but laser options such as diode and Er:YAG can
temporarily reduce inflammation and pain. The choice between laser and conventional treatment depends on
patient factors, which motivates automated decision support. We introduce a federated deep Q-learning system
that recommends laser therapy based on patient features. We incorporate self-supervised encoding (PCA) to
reduce feature dimensionality and a RAG-based reward shaping strategy to integrate domain knowledge into
training.
Methods: We trained a deep Q-network (DQN) agent at five sites with patient data, using PCA to reduce the
features to 8 components. The agent used a 32-unit MLP to make treatment decisions, with rewards based on RAG
feedback from similar cases. Training employed Federated Averaging to safeguard privacy, and performance was
assessed using accuracy, ROC AUC, Average Precision, the confusion matrix, a classification report, and
feature importance analysis.
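The Federated Averaging aggregation step named in the Methods can be sketched as follows. This is a minimal illustration, assuming each site returns its Q-network parameters as a list of NumPy arrays and that sites are weighted by local sample count. The function name `federated_average`, the layer shapes (8 inputs, 32 hidden units, 2 actions), and the site sizes are hypothetical and are not taken from the study.

```python
import numpy as np

def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of per-site parameter lists (FedAvg aggregation)."""
    total = float(sum(site_sizes))
    averaged = []
    for layer in range(len(site_weights[0])):
        # Accumulate each site's contribution to this layer, scaled by its data share.
        acc = np.zeros_like(site_weights[0][layer], dtype=float)
        for weights, n in zip(site_weights, site_sizes):
            acc += (n / total) * weights[layer]
        averaged.append(acc)
    return averaged

# Hypothetical setup: five sites, each holding a locally trained copy of a
# Q-network with 8 PCA inputs -> 32 hidden units -> 2 actions (assumed shapes).
rng = np.random.default_rng(0)
shapes = [(8, 32), (32,), (32, 2), (2,)]
site_weights = [[rng.normal(size=s) for s in shapes] for _ in range(5)]
site_sizes = [40, 55, 60, 35, 50]  # illustrative local sample counts

global_weights = federated_average(site_weights, site_sizes)
```

In a full training loop, the averaged parameters would be sent back to every site before the next round of local Q-learning updates, so that only model weights, not patient records, leave each site.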
Results: On the test set, the federated DQN achieved an accuracy of 60%. As shown in Table 1, 26 of 33
laser recommendations were correctly classified, while only 10 of 27 conventional cases were correctly
identified. The ROC curve yielded an AUC of ~0.69 (Figure 3), indicating moderate discriminative ability.
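The reported counts can be checked against the stated accuracy with a short calculation. The sketch below assumes that 33 and 27 are the numbers of true laser and true conventional cases in the test set (an interpretation of the wording above, not something the abstract states explicitly). The variable names are illustrative.

```python
# Counts as reported in the Results paragraph.
laser_correct, laser_total = 26, 33
conv_correct, conv_total = 10, 27

# Overall accuracy: all correct predictions over all test cases.
accuracy = (laser_correct + conv_correct) / (laser_total + conv_total)

# Per-class recall, under the assumption that the totals are true class counts.
laser_recall = laser_correct / laser_total
conv_recall = conv_correct / conv_total

# Balanced accuracy averages the two recalls, so it is not inflated
# by strong performance on only one class.
balanced_accuracy = (laser_recall + conv_recall) / 2

print(f"accuracy          = {accuracy:.2f}")
print(f"laser recall      = {laser_recall:.2f}")
print(f"conv. recall      = {conv_recall:.2f}")
print(f"balanced accuracy = {balanced_accuracy:.2f}")
```

Under that assumption, the counts reproduce the reported 60% accuracy (36 of 60), and the gap between the two recalls shows that the model favors the laser class.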
Conclusions: Our results demonstrate the feasibility of federated deep Q-learning for personalized periodontal
therapy recommendations. The moderate performance (AUC ~0.69) suggests that the model learns some meaningful
distinctions between treatment pathways, although conventional cases were frequently misclassified (10 of 27
correctly identified).