Grad Seminar: Creation of a Custom Language Model for Pediatric Occupational Therapy Documentation
Abstract
KidsAbility is a pediatric rehabilitation center that offers services, including occupational therapy (OT), to youth. Documentation, including writing a progress note for each treatment appointment, is essential to OT treatment but can be time-consuming and tedious. KidsAbility believes that reducing the time spent writing progress notes would increase its capacity for treatment. This thesis explores the creation of a custom large language model intended to reduce the time that clinicians spend writing progress notes by transforming point-form scratch notes from pediatric OT treatment appointments into full-form draft documentation in SOAP format for the clinicians to edit.
A dataset of thousands of historical progress notes, with personal health information redacted, was used to train the model. The training techniques explored included domain-adaptive pre-training and LoRA fine-tuning. As there were no corresponding scratch notes in the dataset, few-shot prompting with a human-in-the-loop evaluation process was used to generate matching scratch notes. The historical progress notes and the generated scratch notes were used to fine-tune Llama 2 and Llama 3 models on the desired task. The outputs of different models were evaluated and compared before the final model, a fully fine-tuned Llama 3 8B Instruct model, was selected for a pilot study at KidsAbility in which the custom model was compared against the proprietary Microsoft Co-Pilot model. Ten OTs participated in the study, using Co-Pilot and then the custom model to write their progress notes for three weeks each.
The pilot study found that training clinicians to use the custom model effectively is important for reducing the time spent writing notes. After training, the average time taken to write a note was 7.6 minutes, compared to an average of 13.8 minutes before training; both figures are based on subjective reporting. The progress notes written during the pilot study were also used in a quality assessment, in which four OTs scored the custom model notes, Co-Pilot notes, and manually written notes on multiple criteria. The results of this evaluation showed that the notes written with the custom model were of high quality, receiving the highest score for three criteria and the second-highest score for the remaining two. For all criteria, the custom model notes scored higher than the manually written notes. The collection of objective timing data to determine the impact of using the custom model, compared with using no model, was limited by the availability of clinicians.
Presenter
Rachel DiMaio, MASc candidate in Systems Design Engineering
Join in person in E5-6111