Informatik
Implementation of an interactive pattern mining framework on electronic health record datasets
(2019)
Large collections of electronic patient records contain a broad range of clinical information highly relevant for data analysis. However, they are maintained primarily for patient administration, and automated methods are required to extract valuable knowledge for predictive, preventive, personalized and participatory medicine. Sequential pattern mining is a fundamental task in data mining which can be used to find statistically relevant, non-trivial temporal dependencies of events such as disease comorbidities. The objective of this work is to use this mining technique to identify disease associations based on ICD-9-CM code data of the entire Taiwanese population obtained from Taiwan’s National Health Insurance Research Database.
This thesis reports the development and implementation of the Disease Pattern Miner, a pattern mining framework for the medical domain. The framework was designed as a web application which can be used to run several state-of-the-art sequence mining algorithms on electronic health records, collect and filter the results to reduce the number of patterns to a meaningful size, and visualize the disease associations in a specific population group as an interactive model. This may be crucial for discovering new disease associations and may offer novel insights to explain disease pathogenesis. A structured evaluation of the data and models is required before medical data scientists can use this application as a tool for further research to gain a better understanding of disease comorbidities.
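To make the idea of sequential pattern mining on diagnosis codes concrete, the following is a minimal sketch and not the Disease Pattern Miner itself. It counts ordered code pairs ("A is recorded before B") across patients and keeps those that reach a minimum support. The function name `frequent_ordered_pairs`, the toy records, and the support threshold are assumptions made purely for illustration; the algorithms used in the thesis handle longer patterns and far larger data.

```python
from collections import Counter
from itertools import combinations


def frequent_ordered_pairs(sequences, min_support):
    """Return ordered pairs (a, b), with a recorded before b, that occur
    in at least `min_support` patient sequences."""
    counts = Counter()
    for seq in sequences:
        seen = set()
        # combinations() keeps index order, so seq[i] always precedes seq[j].
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] != seq[j]:
                seen.add((seq[i], seq[j]))
        # Each patient contributes at most once per pattern.
        counts.update(seen)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy data: one chronologically ordered list of ICD-9-CM
# category codes per patient (250 diabetes mellitus, 401 essential
# hypertension, 585 chronic kidney disease, 428 heart failure).
records = [
    ["250", "401", "585"],
    ["250", "585"],
    ["401", "250", "585", "428"],
    ["401", "428"],
]

patterns = frequent_ordered_pairs(records, min_support=3)
print(patterns)  # {('250', '585'): 3}
```

Lowering `min_support` to 2 additionally returns the pairs ('401', '585') and ('401', '428'), which illustrates why result filtering is needed to keep the number of patterns manageable.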
Development and validation of a neural network for adaptive gait cycle detection from kinematic data
(2020)
(1) Background: Instrumented gait analysis is a tool for quantification of the different
aspects of the locomotor system. Gait analysis technology has substantially evolved over
the last decade and most modern systems provide real-time capability. The ability to
calculate joint angles with low delays paves the way for new applications such as real-time
movement feedback, like control of functional electrical stimulation in the rehabilitation
of individuals with gait disorders. For any kind of therapeutic application, the timely
determination of different gait phases such as stance or swing is crucial. Gait phases are
usually estimated based on heuristics of joint angles or time points of certain gait events.
Such heuristic approaches often do not work properly in people with gait disorders due to
the greater variability of their pathological gait pattern. To improve on the current state of
the art, this thesis aims to introduce a data-driven approach for real-time determination
of gait phases from kinematic variables based on long short-term memory recurrent neural
networks (LSTM RNNs).
(2) Methods: In this thesis, 56 measurements with gait data of 11 healthy subjects,
13 individuals with incomplete spinal cord injury and 10 stroke survivors with walking
speeds ranging from 0.2 m/s up to 1 m/s were used to train the networks. Each measurement
contained kinematic data from the corresponding subject walking on a treadmill for 90
seconds. Kinematic data was obtained by measuring the positions of reflective markers on
body landmarks (Helen Hayes marker set) with a sample rate of 60 Hz. For constructing a
ground truth, gait data was annotated manually by three raters. Two approaches, direct
regression of gait phases and estimation via detection of the gait events Initial Contact
and Final Contact, were implemented for evaluation of the performance of LSTM RNNs.
For comparison of performance, the frequently cited coordinate- and velocity-based event
detection approaches of Zeni et al. were used. All aspects of this thesis have been
implemented within MATLAB Version 9.6 using the Deep Learning Toolbox.
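The coordinate-based baseline can be illustrated with a short sketch. It assumes the commonly described form of the method of Zeni et al.: Initial Contact is taken at local maxima of the heel position relative to the sacrum in walking direction, and Final Contact at local minima of the toe position relative to the sacrum. The code is written in Python rather than MATLAB, uses synthetic sinusoidal marker trajectories instead of measured data, and all names (`local_extrema`, `heel_rel`, `toe_rel`) are hypothetical.

```python
import math

FS = 60                  # sample rate in Hz, as stated in the abstract
SAMPLES_PER_STRIDE = 72  # assumed stride duration of 1.2 s (synthetic)


def local_extrema(signal, find_max=True):
    """Indices of strict local maxima (or minima) of a 1-D sequence."""
    sign = 1.0 if find_max else -1.0
    return [
        i
        for i in range(1, len(signal) - 1)
        if sign * signal[i] > sign * signal[i - 1]
        and sign * signal[i] > sign * signal[i + 1]
    ]


# Synthetic anterior-posterior marker positions relative to the sacrum (m).
n = 3 * FS  # three seconds of data
heel_rel = [0.30 * math.sin(2 * math.pi * i / SAMPLES_PER_STRIDE) for i in range(n)]
toe_rel = [0.25 * math.sin(2 * math.pi * (i - 6) / SAMPLES_PER_STRIDE) for i in range(n)]

initial_contact = local_extrema(heel_rel, find_max=True)   # heel farthest forward
final_contact = local_extrema(toe_rel, find_max=False)     # toe farthest backward

print([i / FS for i in initial_contact])  # event times in seconds
print([i / FS for i in final_contact])
```

On measured data the relative trajectories are noisy, so a practical implementation would typically smooth the signal or enforce a minimum distance between detected extrema; that step is omitted here.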
(3) Results: The mean time difference between events annotated by the three raters
was −0.07 ± 20.17 ms. Correlation coefficients of inter-rater and intra-rater reliability
yielded mainly excellent or perfect results. For detection of gait events, the LSTM RNN
algorithm covered 97.05% of all events within a scope of 50 ms. The overall mean time
difference between detected events and ground truth was −11.62 ± 7.01 ms. Temporal
differences and deviations were consistently small over different walking speeds and gait
pathologies. Mean time difference to the ground truth was 13.61 ± 17.88 ms for the
coordinate-based approach of Zeni et al. and 17.18 ± 15.67 ms for the velocity-based
approach. For estimation of gait phases, the gait phase was determined as a percentage.
Mean squared error to the ground truth was 0.95 ± 0.55% for the proposed algorithm
using event detection and 1.50 ± 0.55% for regression. For the approaches of Zeni et al.,
mean squared error was 2.04 ± 1.23% for the coordinate-based approach and 2.24 ± 1.34%
for the velocity-based approach. Regarding mean absolute error to the ground truth, the
proposed algorithm achieved a mean absolute error of 1.95 ± 1.10% using event detection
and one of 7.25 ± 1.45% using regression. Mean absolute error for the coordinate-based
approach of Zeni et al. was 4.08 ± 2.51% and 4.50 ± 2.73% for the velocity-based approach.
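The event-detection figures above (share of events found within 50 ms, mean ± standard deviation of the time difference) can be computed along the following lines. This is a sketch under stated assumptions, not the evaluation code of the thesis: each annotated event is matched to its nearest detected event, the difference is taken as detected minus annotated, and the statistics are computed over matches inside the tolerance window. The function name `evaluate_events` and the example timestamps are hypothetical.

```python
from statistics import mean, stdev


def evaluate_events(detected_ms, truth_ms, tolerance_ms=50.0):
    """Match each ground-truth event to the nearest detected event and
    return (coverage, mean difference, standard deviation) in ms.

    Differences are signed: detected minus ground truth (assumed convention).
    """
    diffs = []
    for t in truth_ms:
        nearest = min(detected_ms, key=lambda d: abs(d - t))
        diffs.append(nearest - t)
    within = [d for d in diffs if abs(d) <= tolerance_ms]
    coverage = len(within) / len(truth_ms)
    return coverage, mean(within), stdev(within)


# Hypothetical timestamps in milliseconds.
truth = [1000, 2200, 3400, 4600]
detected = [990, 2185, 3395, 4700]

coverage, mean_diff, sd_diff = evaluate_events(detected, truth)
print(f"{coverage:.2%} within 50 ms, difference {mean_diff} ± {sd_diff} ms")
```

In this toy example three of four events fall inside the window (coverage 0.75), with a mean difference of −10 ms and a sample standard deviation of 5 ms. The function requires at least two matches inside the window, since `stdev` is undefined for fewer values.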
(4) Conclusion: The newly introduced LSTM RNN algorithm offers a high recognition
rate of gait events with a small delay. It outperforms several state-of-the-art
gait event detection methods while offering the possibility for real-time processing
and high generalization of trained gait patterns. Additionally, the proposed algorithm
is easy to integrate into existing applications and contains parameters that self-adapt
to individuals’ gait behavior to further improve performance. With respect to gait phase
estimation, the performance of the proposed algorithm using event detection is in line
with current wearable state-of-the-art methods. Compared with conventional methods,
performance of direct regression of gait phases is only moderate. Given the results,
LSTM RNNs demonstrate feasibility regarding event detection and are applicable for
many clinical and research applications. They may not be suitable for the estimation
of gait phases via regression. It can be assumed that LSTM RNNs would achieve a much
higher performance with a better configuration of the networks.
E-commerce turnover grows at a constant rate of about 10%. Increasing complexity
and traffic spikes underline the need for a scalable software architecture to prevent
potential technical debt, higher financial cost, longer maintenance, or reduced reliability.
Existing approaches such as the Palladio approach require a high modelling overhead,
and the importance of reducing this overhead has been identified. This master's thesis
therefore focuses on the modelling and simulation of e-commerce web application
architectures using a high-level approach to provide a faster, but possibly less accurate,
prediction of scalability.
This is done using the Design Science Research Process as a frame, a scientific
literature review to draw on the existing knowledge base, and the Conical Methodology for
artefact creation. The artefact is a graphical model which is evaluated using a simulation
developed with Python and its framework SimPy. For model creation and evaluation, a total
of twelve papers investigating the scalability of e-commerce web application architectures are
split into a test and a training group. The training group and parts of the scientific research are
used to identify the components load balancer, application server, web tier, ERP system,
legacy system and database, as well as some general characteristics that need to be considered.
The components with the most modelling variables are the application server and web
tier with a total of thirteen, while the ERP and legacy system only required five.
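The kind of scalability question such a simulation answers can be sketched with a single component. The example below is not the model from the thesis and deliberately uses only the Python standard library instead of SimPy, so that it runs without extra dependencies. It simulates one tier as a first-come, first-served pool of identical servers with exponentially distributed inter-arrival and service times, and reports throughput and mean response time. The function name `simulate_tier` and all parameter values are assumptions for illustration.

```python
import heapq
import random


def simulate_tier(n_requests, arrival_rate, service_rate, servers, seed=1):
    """Simulate a FIFO pool of identical servers.

    Returns (throughput in requests/s, mean response time in s).
    """
    rng = random.Random(seed)
    free_at = [0.0] * servers  # time at which each server becomes idle
    heapq.heapify(free_at)
    now = 0.0
    total_response = 0.0
    last_finish = 0.0
    for _ in range(n_requests):
        now += rng.expovariate(arrival_rate)      # next request arrives
        service = rng.expovariate(service_rate)   # its processing time
        start = max(now, heapq.heappop(free_at))  # wait for the earliest free server
        finish = start + service
        heapq.heappush(free_at, finish)
        total_response += finish - now
        last_finish = max(last_finish, finish)
    return n_requests / last_finish, total_response / n_requests


# Hypothetical load: 9 requests/s offered, each server handles 10 requests/s.
one_server = simulate_tier(2000, arrival_rate=9.0, service_rate=10.0, servers=1)
four_servers = simulate_tier(2000, arrival_rate=9.0, service_rate=10.0, servers=4)

print("1 server : throughput %.2f/s, mean response %.3f s" % one_server)
print("4 servers: throughput %.2f/s, mean response %.3f s" % four_servers)
```

Because both runs use the same seed, they see identical arrival and service times, so the difference in mean response time is due to the number of servers alone. A component-based model like the one described above would chain several such tiers (load balancer, web tier, application server, database) and add component-specific variables.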
The model is evaluated using a total of three papers from the test group, where an average
throughput error of 5.78% and a response time error of 46.55% or 26.46% were identified. An
additional evaluation based on two non-e-commerce architectures shows the usability of the
model for other types of architectures. Even though the average error gives the impression
that the model does not provide a good estimation, the graphical results show that the model
and its simulation can be used to provide a faster scalability prediction. The model is least
accurate in predicting the situation where the response time increases exponentially,
as this is the point where variables that account for only a small percentage, and were thus
ignored in the model, have the highest influence.
Future research could extend the model by adding or investigating additional components,
adding features ignored within this work, or applying it to other types of web application
architectures. Additionally, the low-level and the high-level approaches could be brought
together to combine the advantages of both.