Thursday, February 21, 2008

Tools to battle the complexity of biomedical software and medical information systems

Those who regularly read this blog know that one of my pet peeves is the increasing complexity of biomedical software. My belief is that complex systems are chaotic and unpredictable, and the best way to deal with software complexity is to eliminate it.

Here is a list of the basic intellectual tools that I believe can help reduce complexity.

1 Classifications. A class inherits properties in a direct lineage from a parent class. An object can only occupy a single class. Classifications are easy to understand and compute. This is the definition of classification that is used by biologists (as in the classification of all living organisms) and applies well to computer science. Classifications are related to (but different from) ontologies. Ontologies, unlike classifications, can become highly complex. Classifications always reduce the complexity of a knowledge domain.

2 Flat data files that can be extended but not re-written. A telephone book is a close example. If people never changed their names, never died, and never changed their telephone numbers, a telephone directory would be an ideal example. Data that can be sensibly organized in this kind of flat file is very simple to work with.

3 The EMR (electronic medical record). The EMR is the digital equivalent of the patient chart. In this model, all new clinical reports pertaining to a patient are inserted into the EMR object for the patient. This is a simple data model that can work well so long as one and only one record is created for each patient.

4 Small, self-contained specialized information systems. These applications are designed for a specific and narrow function (e.g. cytopathology information system). Complexity does not intervene until the specialized information system needs to interact with other systems in the hospital.

5 Fundamental algorithms. Almost all important algorithms are simple and can be explained in a few steps. From these simple algorithms, complex systems can arise.

6 Simple protocols. Very simple protocols can support incredibly complex systems. TCP/IP (the Internet protocol suite) is a simple strategy for transferring packets of information over a network of computers.

7 Elegant object-oriented programming languages, such as Ruby. Though Ruby is a simple and elegant language, it can be used to create hopelessly complex software. Programmers need extensive training in design principles that minimize complexity.

8 Specifications. Specifications are formal ways of explaining what you've done so that computers and humans can understand and replicate your work. It is important to have a standard syntax for describing data and for organizing information into meaningful statements that can be interpreted by software agents (RDF is a fine example). I distinguish specifications from standards. Informatics standards impose an idiosyncratic, specialized format on data and tend to increase the complexity of information across different data domains.

9 Unique data identifiers. Computers are good at creating and tracking unique identifiers.

10 Encryption algorithms. It is easy to make something a secret.

11 De-identified public datasets. Publicly released de-identified data simplifies research by permitting multiple projects on the same set of data. With remarkably few exceptions (zero, in my opinion), de-identified public medical datasets have not hurt patients.
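
To make a few of these tools concrete, here is a minimal Python sketch of items 1, 2 and 9: a classification in which every class has exactly one parent, and a flat file that can be extended but not re-written, with a unique identifier attached to each record. The names TaxonClass and AppendOnlyFile, the tab-delimited layout, and the use of random UUIDs are my own assumptions for illustration, not a prescribed design.

```python
import uuid


# Item 1: a classification in which each class has exactly one parent.
class TaxonClass:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # single parent; None marks the root class

    def lineage(self):
        """Return class names from this class up to the root."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


# Items 2 and 9: a flat file that is only ever extended, never re-written,
# with a unique identifier attached to every record (hypothetical layout).
class AppendOnlyFile:
    def __init__(self, path):
        self.path = path

    def append(self, text):
        """Add one line to the end of the file and return its identifier."""
        if "\n" in text or "\t" in text:
            raise ValueError("record text must fit on one tab-delimited line")
        record_id = str(uuid.uuid4())  # random identifier, assumed unique
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(f"{record_id}\t{text}\n")
        return record_id

    def records(self):
        """Return all (identifier, text) pairs in the order they were added."""
        with open(self.path, encoding="utf-8") as handle:
            return [tuple(line.rstrip("\n").split("\t", 1)) for line in handle]
```

For example, building TaxonClass("Mammalia", parent=TaxonClass("Chordata", parent=TaxonClass("Animalia"))) and calling lineage() walks the single line of descent back to the root. The file class deliberately offers no update or delete method; that omission is the whole point of the model.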

Most programmers would include UML (Unified Modeling Language) in this list. I left it out because UML seems very complex to me and it permits programmers to manage complexity (rather than reduce or eliminate complexity). I confess that I do not know much about UML, but this is my current perception.

I discuss medical software complexity at great length in my recently published book, Biomedical Informatics.

- Jules Berman

key words: medical informatics, informatics complexity, classification, ontologies, ontology, hospital information systems, laboratory information systems