Thursday, March 29, 2007

Intro to RDF biomedical specifications

Many of us in healthcare and the life sciences spend a significant portion of our professional lives preparing data that will be entered into large databases.

Sometimes databases disappoint us. For a variety of reasons, we can never seem to merge our data with data produced by colleagues who have used a different database. Often, databases within a single institution are incompatible. The solution, we are told, is data standards. If we could mutually decide what data we need to store and how the stored data should be structured, then our databases would be compatible. Our trust in the wisdom of standards has fueled thousands of standards initiatives in the healthcare field.

The problem is that compliance with standards is often very low, and standards themselves can be flawed. As technologies change, standards do not always keep pace. This often results in obsolete standards, or in standards with multiple versions and idiosyncratic implementations.

RDF (Resource Description Framework) is a formal method for describing specified data objects with paired metadata and data, and it is commonly prepared in XML syntax. In the next few blogs, I will try to show that RDF-specified data provides some of the functionality of standards. In addition, RDF specifications greatly expand our ability to understand information. In my opinion, all life science professionals can benefit from understanding the basics of RDF.
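As a rough illustration of what "paired metadata and data" attached to a specified object can look like, here is a minimal sketch in plain Python. RDF statements are triples (subject, predicate, object); the identifiers, property names and values below are hypothetical, and the helper functions are my own, not part of any RDF library.

```python
# Each statement is a triple: (specified object, metadata, data).
# All identifiers and values here are hypothetical, for illustration only.
triples = [
    ("specimen:001", "rdf:type", "TissueSpecimen"),
    ("specimen:001", "diagnosis", "adenocarcinoma"),
    ("specimen:001", "collected_on", "2007-03-29"),
    ("specimen:002", "rdf:type", "TissueSpecimen"),
]

# A colleague's statements, prepared separately (also hypothetical).
colleague_triples = [
    ("specimen:002", "rdf:type", "TissueSpecimen"),
    ("specimen:002", "diagnosis", "lymphoma"),
]


def describe(subject, statements):
    """Collect every metadata/data pair asserted about one subject."""
    return {pred: obj for subj, pred, obj in statements if subj == subject}


def merge(*sources):
    """Merge triple lists by simple union, dropping exact duplicates."""
    merged = []
    for source in sources:
        for statement in source:
            if statement not in merged:
                merged.append(statement)
    return merged


all_statements = merge(triples, colleague_triples)
print(describe("specimen:002", all_statements))
```

Because every statement carries its own subject and metadata, two collections can be combined without first agreeing on a shared table layout, which is the sense in which specified data can stand in for some of what a standard provides.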

-Jules Berman

tags: biomedical, data standards, metadata, rdf, specifications
Science is not a collection of facts. Science is what facts teach us; what we can learn about our universe, and ourselves, by deductive thinking. From observations of the night sky, made without the aid of telescopes, we can deduce that the universe is expanding, that the universe is not infinitely old, and why black holes exist. Without resorting to experimentation or mathematical analysis, we can deduce that gravity is a curvature in space-time, that the particles that compose light have no mass, that there is a theoretical limit to the number of different elements in the universe, and that the earth is billions of years old. Likewise, simple observations on animals tell us much about the migration of continents, the evolutionary relationships among classes of animals, why the nuclei of cells contain our genetic material, why certain animals are long-lived, why the gestation period of humans is 9 months, and why some diseases are rare and other diseases are common. In “Armchair Science”, the reader is confronted with 129 scientific mysteries, in cosmology, particle physics, chemistry, biology, and medicine. Beginning with simple observations, step-by-step analyses guide the reader toward solutions that are sometimes startling, and always entertaining. “Armchair Science” is written for general readers who are curious about science, and who want to sharpen their deductive skills.

Sunday, March 25, 2007

Questions to ask before creating a new standard

As any regular reader of this blog must know, I'm not a big fan of standards development organizations. In most instances where a community would like a standard, they would be better served by a specification.

However, if you're considering a new standards effort, you might want to first answer the following set of questions.

1 Is there a pre-existing standard that covers the same technology?

2 If there is a pre-existing standard, can it be enhanced or modified to provide a desired functionality?

3 How much will it cost to develop the standard?

4 How long will the standards development process take?

5 Will the intended beneficiaries of the standard pay for the standards development process?

6 Who will develop the standard? Are the selected developers competent to produce an adequate standard?

7 Are any of the developers conflicted? Do they stand to profit if the standard is developed in a specific way?

8 Do any of the developers have proprietary software or data that they may wish to include in the standard?

9 Are the expected developers committed to work through the duration of the standards development process, and are they committed to providing all of the time and energy needed to develop the standard?

10 Will there be a mechanism whereby drafts of the standard are reviewed openly by the public? Will the minutes of the working committee be made public? Will public comments be used to modify successive drafts of the standard?

11 Will the standard have dependencies on other standards? If so, are there intellectual property issues that must be resolved before development begins? Will these issues require licenses or royalty agreements from the standards developers or the standards users?

12 Once created, is the standard likely to be adopted? Is the anticipated standard easily implemented?

13 Who will be the adopters of the standard? Are the expected standard adopters included in the development process for the standard?

14 Will the standard benefit a range of users beyond the standards developers?

15 What are the hazards that the standard may produce, and who might be hurt by the standard? In particular, will any entities be disadvantaged if they cannot readily adopt the standard?

16 Is it necessary to have the standard approved by an external organization?

17 If so, who will pay for the extra costs of obtaining approval from an external standards organization?

18 Will the standard need to be continuously updated and modified? Is there a planned process for producing multiple versions of the standard?

19 Is it really important to have the standard? Is it worth the effort?

-Jules Berman, excerpted from Biomedical Informatics
My book, Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information was published in 2013 by Morgan Kaufmann.



I urge you to explore my book. Google books has prepared a generous preview of the book contents.

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, conflicts, data standards, new standards, sdo, specifications, standards committees

Saturday, March 24, 2007

Data specification is rewarded with funding opportunities

This blog is devoted to the topic of data specification (including data organization, data description, data retrieval and data sharing) in the life sciences and in medicine. You might wonder why anyone would think that data specification is sufficiently important for anyone to host a blog on the topic. Well, there are many reasons that I hope to describe in future posts. For today, just consider an RFA (request for applications) announced yesterday by the NIH.

The RFA is entitled: Genome-Wide Studies in Biorepositories with Electronic Medical Record Data.

Release Date: March 23, 2007
Letters of Intent Receipt Date(s): April 17, 2007
Application Receipt Date(s): May 17, 2007
Peer Review Date(s): July/August 2007
Council Review Date(s): August 2007
Earliest Anticipated Start Date(s): September 30, 2007
Additional Information To Be Available Date (Url Activation Date): N/A
Expiration Date: May 18, 2007

It's a very good RFA. Basically, if you have a tissue repository and the tissues are linked to a hospital EMR (Electronic Medical Record), the NHGRI (National Human Genome Research Institute) is interested in receiving a grant application from you.

The NHGRI will perform or pay for genomic studies on collected tissues [provided by awardees] that can be integrated with clinical data in the EMR.

This approach rewards institutions that have made serious efforts in biorepository science, EMR data organization and genomic testing (basically, the bread-and-butter of biomedical data specification).

Why is data specification (for biorepository data and EMR data) so important? Why can't the NHGRI just figure everything out by experiments conducted on cells in a culture dish?

Several years ago there was a lot of hype written about the imminent impact of pharmacogenomics on medical care. Everyone would get drugs specifically tailored to their own genome. Well, that was six or seven years ago, and with very few exceptions, people are prescribing drugs the old-fashioned way (one drug/dose fits all, unless there are bad side-effects or a poor response, in which case, try another drug/dose).

Before you can get much benefit from pharmacogenomics, you need to collect a lot of phenotypic (treatment, outcome, clinical, historical, physical) data on a lot of patients and match these with genotypic data. There's no substitute for clinical correlation with millions of patients. To integrate genotypic and phenotypic data, you need to have large amounts of organized and specified data.
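To make the matching step concrete, here is a minimal sketch in Python of integrating phenotypic and genotypic records. It assumes both data sets are keyed by the same patient identifier; the identifiers, field names and values are all hypothetical.

```python
# Hypothetical phenotypic (clinical) records, keyed by patient identifier.
phenotypes = {
    "P001": {"drug": "drug_A", "response": "good"},
    "P002": {"drug": "drug_A", "response": "poor"},
    "P003": {"drug": "drug_A", "response": "good"},
}

# Hypothetical genotypic records, keyed by the same kind of identifier.
genotypes = {
    "P001": {"variant_X": True},
    "P002": {"variant_X": False},
    "P004": {"variant_X": True},
}


def integrate(phenotypes, genotypes):
    """Combine records only for patients present in both data sets."""
    shared = sorted(phenotypes.keys() & genotypes.keys())
    return {pid: {**phenotypes[pid], **genotypes[pid]} for pid in shared}


combined = integrate(phenotypes, genotypes)
for pid, record in combined.items():
    print(pid, record)
```

Only patients found in both collections survive the join, and the join works at all only because both sides agree on how a patient is identified. That agreement is a small instance of data specification.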

It will take years and years before we have rich collections of well-annotated medical data sets on large numbers of patients. Smart data specification is one of the hurdles that we, as a society, must cross. Yesterday's RFA announcement is a step in the right direction.

-Jules Berman

tags: ehr, electronic medical record, emr, funding, medical research, nhgri, nih, rfa, specifications, science

Friday, March 23, 2007

Patenting the uses of the DICOM medical image standard

In a prior post, I described how the uses of standards can be patented, using DICOM as an example. DICOM (Digital Imaging and Communications in Medicine) is being actively promoted as a universal and exclusive standard for all medical images.

This blog examines one instance, in some detail, of a patent related to a common use of the DICOM standard.

U.S. Patent 6,725,231 is entitled "DICOM XML DTD/schema generator". This patent was filed March 27, 2001 by Jingkun Hu and Kwok Pun Lee and assigned to Philips Electronics. The patent was awarded on April 20, 2004.

The methods covered by a patent are enumerated in a list of patent claims. Claim 1 (of 15 claims) for this patent is:

"1. A method for mapping a DICOM specification into an XML document, comprising: mapping each entry of a DICOM table of the DICOM specification into a corresponding XML element of a plurality of XML elements, outputting each XML element of the plurality of XML elements to the XML document, in an output format that conforms to at least one of: an XML document-type-definition and an XML Schema. "

The reach of a patent is extended if it is awarded in both U.S. and international patent offices. The same application has been awarded by the World Intellectual Property Organization (WIPO) as WO/2002/077896, "DICOM XML DTD/schema generator".

Even after a patent has been awarded, it can sometimes be successfully argued, in court, that the claims are obvious or non-original and cannot be asserted against a user.

How do scientists demonstrate that an idea is original and non-obvious? They publish their work in a respected journal in their field of work. Journals are expected to reject submissions that are obvious or for which prior art (earlier publications) exist. For centuries, scientists have used publications as evidence of the validity, originality and scientific value of their work.

Jingkun Hu and Kwok Pun Lee published three original papers in the Journal of the American Medical Informatics Association (JAMIA) that are contemporary with their patent applications and that describe methods related to their patents.

Zhao L, Lee KP, Hu J.
Generating XML schemas for DICOM structured reporting templates.
J Am Med Inform Assoc. 2005 Jan-Feb;12(1):72-83.

Lee KP, Hu J.
XML Schema Representation of DICOM Structured Reporting.
J Am Med Inform Assoc. 2003 Mar-Apr;10(2):213-23.

Tirado-Ramos A, Hu J, Lee KP.
Information object definition-based unified modeling language representation of DICOM structured reporting: a case study of transcoding DICOM to XML.
J Am Med Inform Assoc. 2002 Jan-Feb;9(1):63-71.

These described steps exemplify the way that the uses of a standard were patented. As shown previously, many uses of the DICOM standard have been included in current patent applications.

Having multiple image standards provides users an "out" when one standard becomes excessively encumbered. As discussed in a prior post, there are times when using a specification, rather than a standard, facilitates the free, unencumbered exchange of annotations and image binary data.

-Jules Berman

tags: DICOM, intellectual property, medical images, patents, specifications, standards

Sunday, March 18, 2007

Searching patents related to DICOM

In yesterday's blog, I described how standards developers and standards users can search the USPTO (US Patent and Trademark Office) for patents that might encumber a data standard.

As an example, let's look at some specific patents (issued or pending) related to DICOM (Digital Imaging and Communications in Medicine).

A search for pending patents (patent applications) on the term ttl/DICOM pulls just those patent applications that have DICOM in their title. A second search (through the issued patent search engine available from the same site) would pull issued patents.

Here is the output for pending patents submitted since 2001 and containing DICOM in the title.

1 20070041647 Method for increasing the flexibility of DICOM tags management in application-specific integration
2 20060282447 Ndma db schema, dicom to relational schema translation, and xml to sql query transformation
3 20060259513 System and method to submit image requests to DICOM server
4 20060259463 System and method for the automatic generation of a query to a DICOM server
5 20060242268 Mobile radiology system with automated DICOM image transfer and PPS queue management
6 20060242148 System and method for integrating ancillary data in DICOM image files
7 20060239589 System and method for definition of DICOM header values
8 20060197968 Dicom print driver
9 20060064328 System and method for utilizing a DICOM structured report for workflow optimization
10 20060056680 3D volume construction from DICOM data
11 20050246629 Framework of validating dicom structured reporting documents using XSLT technology
12 20050237776 System and method for patient controlled communication of DICOM protected health information
13 20050031181 Method and system for analyzing bone conditions using DICOM compliant bone radiographic image
14 20040205563 Specifying DICOM semantic constraints in XML
15 20040025110 Precise UML modeling framework of the DICOM information model
16 20030149680 Methods and apparatus for streaming DICOM images through data element sources and sinks
17 20030101291 Application programming interface for provision of DICOM services
18 20020143824 DICOM to XML generator
19 20020143727 DICOM XML DTD/Schema generator
20 20020133373 Integration of radiology information into an application service provider DICOM image archive and/or web based viewer
21 20020052866 Methods and apparatus for streaming DICOM images through data element sources and sinks

If the word DICOM is in the title, it's a good bet that the patent will involve a method that uses the DICOM standard. The claims of such methods may possibly cover a user's intended uses of the standard. Had we simply done a search on the word "DICOM" without limiting the location of the search term to the title of the patent application, we would have retrieved 1144 patents from the USPTO patent application database. And these would just be those patents that are currently under review!

Notice that several of these methods seem to involve common tasks for informaticians who wish to tease out annotated data from a DICOM image and port the data and metadata into XML.

In the next blog, we'll look at one of the DICOM patents to determine the claims of the patent and the assignee of the patent.

-Jules Berman




Saturday, March 17, 2007

Searching for U.S. patents that encumber a standard

The USPTO (US Patent and Trademark Office) has a website that permits searches of issued patents (the database extends back to 1790). A separate search engine finds patent applications currently under review by the USPTO.

If you are developing a standard and you wish to avoid including patented technology in the methods included in your standard,

or

if you have developed a standard and are interested in protecting your users against trivial or non-innovative patents attached to the uses of your standard,

or

if you are a standards user and wish to avoid using standards that are encumbered by patents,

or

if you are a standards user who has chosen a standard and wish to avoid infringement on a patent that encumbers the standard,

you will want to visit the USPTO patent search site.


Here is the web page at http://appft1.uspto.gov/netahtml/PTO/search-adv.html for a search of pending applications with the word "DICOM" in the title of the application.



Here is the returned web page:



In the next blog, I will use DICOM (Digital Imaging and Communications in Medicine) as an example of a standard for which a user can conduct USPTO patent searches.

- Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, coercive standards, data standards, DICOM, embedded patents, intellectual property, patent search, sdo, specifications, standards development organizations, uspto

Friday, March 16, 2007

Data standards should not be coercive

In a prior post, I wrote about monopolistic standards: "Yet somehow, when a committee gets together to write a data standard, they often develop a very self-centered culture that tries to eliminate the 'competing' standards."

Standards committees sometimes display group behavior that can be described as antisocial or even sociopathic. They often want their standard to be the only standard used in a data domain (self-centered behavior). If there are other standards in the data domain, they sometimes use coercive methods (bullying) to force everyone to use their standard. They might also actively enlist organizations to enforce the standard on their members.

The most common coercive argument involves telling people that everyone else is using the standard, and if they don't use the standard, they'll lose business or they will be ostracized or ignored by the user community. Coercive behavior should not be tolerated in the user community.

I've personally seen coercive behavior in colleagues who are very decent people who would never dream of bullying another person. Somehow, when sociopathic behavior is developed through a committee process, people lose sense of what they are really doing. To my way of thinking, this is just another reason to favor specifications over standards, when feasible.

-Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, coercive standards, data standards, specifications, standards development organizations

Thursday, March 15, 2007

Data standards should not be monopolistic

In a prior blog, I listed 16 good practice suggestions for SDOs (Standards Development Organizations).

One suggestion was:

"Make optional standards, not required standards, so that the user community is not locked into one implementation."

This suggestion seems to defy common sense. The purpose of a standard is to provide a common process for a user community. Wouldn't a standard lose its significance if it were designed to be one of many?

First off, remember that I'm only addressing data standards (not physical standards). Data standards are special because, in many cases, you can interconvert data from one standard to another quite easily. Data standards are usually developed to facilitate data exchange and interoperability in a defined data domain. It is seldom the case that a given data standard will have universal appeal. We have dozens (if not hundreds) of image format standards. The multiplicity of standards can be useful. There are times when a GIF format is superior to a JPEG and other times when a PNG format is appropriate. Most people who work with images have robust file conversion applications that make it easy to exchange many different image formats.

Yet somehow, when a committee gets together to write a data standard, they often develop a very self-centered culture that tries to eliminate the "competing" standards.

If a data domain has one standard, then patents that encumber the uses of the standard will impact negatively on everyone. If a data domain has multiple standards, then the user community can simply switch between available standards to avoid patent prosecution. They might use one standard to accomplish a task that is exempt from patent infringement (typically the task for which the standard was designed and for which no patents apply). If/when a newly patented use of the one standard emerges, the user can avoid legal headaches by switching to another data standard not covered by the patent. It's really quite simple.

Members of data standards committees should understand that the purpose of any standards effort is to serve the user community with improved methods for exchanging data, for software interoperability and for enhanced opportunities to use data. A data standard is just an arbitrary document. It hardly even rates as a "thing" since it has no physical existence. SDOs should try to make new standards that fill a particular utility "niche" not covered by other standards in the same domain. If users gravitate to the standard in preference to other standards, that's OK. But crushing the "competition" should not be a goal for any SDO.

-Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining

Sunday, March 11, 2007

Why governments rarely create data standards

As someone who has been involved in a variety of standards initiatives, I'm amused when the suggestion is raised that the federal government create the standard. The reasoning often goes something like this: "If the federal government produced the standard, the costs of developing and maintaining the standard would be absorbed by taxpayers, and the standard would be legitimized and maybe even required by government regulation."

Well, it doesn't work like that. Aside from the show-stopping law that severely limits the U.S. government from creating data standards, there are practical reasons for the government to demur.

First, most data standards fail. They are either never finished, or they are immediately ignored by the intended user community, or they are replaced by competing standards that cover the same data domain, or they eventually become so obsolete that they are abandoned. There's very little reason for the government to become embroiled in efforts that typically fail.

Second, standards often impose significant implementation costs on the user community. As discussed in earlier blogs, standards can be encumbered by intellectual property, requiring users to pay license fees or patent royalties for the uses of the standard. Also, the standard may benefit some users and hurt others. If a standard benefits the members of the committee that created the standard at the expense of members of the user community who were excluded from the standards development process, lawsuits from allegedly injured users may result. SDOs are aware that, unless their standards are created fairly, the SDO (and its entity members) may be vulnerable to prosecution under the RICO Act.

Why would the government want to get involved in this kind of mess?

As discussed in a prior blog, there are instances when the functionality of standards can be achieved with specifications. The flexibility and freedom of specifications reduce many of the problems inherent in standards. Methods for developing specifications as an alternative to standards have been described in a draft white paper and will be the subject of future blogs.

-Jules Berman




Saturday, March 10, 2007

Protecting the basic uses of a Standard

Yesterday's post took an example from DICOM to describe how the uses of an existing standard can be patented. A standard, even if it is a free and open standard, has little value if the intended uses of the standard are encumbered by patents. This would mean, in effect, that the user community must license the standard for uses that are covered by patents (or risk infringing on one or more patents).

What can SDOs (Standards Development Organizations) do to prevent this problem? I am not a lawyer, and cannot give legal advice, but I would suggest that the following approach is sensible:

When the standard is being developed, the SDO should think about all the intended uses for the standard and publish a document (as an SDO white paper or as a journal publication) that describes, in detail, the ways that the standard can be used, supplying source code, instructions, sample implementations, user commentary, citations to relevant publications in the field, etc. This would help create prior art for the described uses of the standard. When the SDO provides public documentation for the common, expected uses of the standard, it becomes difficult for someone to come along and claim those methods in a patent.

Also, SDOs should be prepared to work with their Patent Office to explain how patent applications related to their standard may be preceded by scientific art or may provide no new or non-obvious functionality to the standard. As described in an earlier post, the USPTO recognizes that software patents are a difficult area and has a program to seek guidance from the software community.

Suppose an inventor conceives of a totally new use of an existing standard and develops a patentable process or application for this new use. How would an SDO defend the standard in this case? Well, there might not be any defense. After all, if someone really comes up with a novel use for a standard that has a real-world application, why shouldn't their intellectual property be covered by a patent? The problem for SDOs comes from patents that cover customary, expected uses of a standard. SDOs with nothing in place to protect the basic uses of the standard have not done their job very well.

-Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining

Friday, March 9, 2007

The uses of free standards can be patented

In a prior post, I described several ways in which data standards can become encumbered with intellectual property. One of these involves patenting the way that a standard is used. Even when a standard is free, there is nothing to stop an inventor from patenting uses for the standard.

DICOM (Digital Imaging and Communications in Medicine) exemplifies a standard that has a patented use. DICOM is a widely used image standard for radiologic images. Currently, there is an effort to have all medical specialties adopt DICOM as the exclusive format for all medical images.

U.S. Patent 6725231, issued Apr 20, 2004, to Jingkun Hu and Kwok Pun Lee and assigned to Koninklijke Philips Electronics N.V., has the following claim.

"1. A method for mapping a DICOM specification into an XML document, comprising: mapping each entry of a DICOM table of the DICOM specification into a corresponding XML element of a plurality of XML elements, outputting each XML element of the plurality of XML elements to the XML document, in an output format that conforms to at least one of: an XML document-type-definition and an XML Schema."

A similar patent by the same parties sits at the European Patent Office (EPO).

Informaticians will note that teasing the data elements from a data object and porting them into XML is the bread-and-butter of modern informatics. A patent claim that covers this basic use of DICOM may be highly problematic.
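To show how routine this kind of mapping is, here is a minimal sketch in Python. It assumes a small, hand-made list of tag/name/value entries standing in for a parsed image header; it is not a DICOM parser and it is not the method claimed in the patent. The function name and the XML element names are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical header entries: (group,element tag, attribute name, value).
# The values are illustrative; they are not read from a real DICOM file.
HEADER_ENTRIES = [
    ("0008,0060", "Modality", "CT"),
    ("0010,0010", "PatientName", "DOE^JANE"),
    ("0028,0010", "Rows", "512"),
]

def entries_to_xml(entries):
    """Map each (tag, name, value) entry to one XML element."""
    root = ET.Element("image_header")  # hypothetical root element name
    for tag, name, value in entries:
        element = ET.SubElement(root, "data_element", tag=tag, name=name)
        element.text = value
    return ET.tostring(root, encoding="unicode")

xml_text = entries_to_xml(HEADER_ENTRIES)
print(xml_text)
```

Each entry becomes one XML element, which is all that "porting data elements into XML" amounts to in the simplest case.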

SDOs (Standards Development Organizations) cannot stop inventors from patenting new and useful applications of their standards. However, there are easy ways for SDOs to reduce the risk of inventors patenting the common, expected uses of their standards. These will be described in a future post.

-Jules Berman tags: biomedical informatics, converting to xml, data standards, DICOM, embedded patents, european patent office, medical images, patent claims, radiology images, sdo, uspto, xml, science

Thursday, March 8, 2007

Specifications versus Standards

In a prior blog I suggested 16 ways that SDOs (Standards Development Organizations) can protect their standards from embedded patents. Suggestion 12 was "Make specifications, not standards."

This suggestion, I'm sure, is cryptic to most people. A major theme of this blog site is that specifications are different from standards and have a number of features that make them more suitable than standards for describing and exchanging many types of biomedical information.

Though informaticians often use the terms "specification" and "standard" interchangeably, a specification is just a formal way (usually employing RDF) of describing any data object. A data standard is a set of requirements, created by an SDO, that comprise a pre-determined content and format for a set of data related to a very specific kind of data object.

Features of a "specified" object:

1. Anyone can understand the composition and construction of the object

2. If the object is unique, anyone can distinguish the object from all other objects.

3. If the object falls into a known class of objects, anyone can determine, from the specification, the class of the object.
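The three features above can be sketched with subject-predicate-object triples, the basic unit of RDF. This is a minimal sketch in plain Python rather than RDF/XML, and the identifier, the property names, and the class name are all hypothetical.

```python
# Each triple is (subject, predicate, object). All names are hypothetical.
triples = [
    ("urn:example:specimen:0001", "rdf:type", "example:Tissue_specimen"),
    ("urn:example:specimen:0001", "example:diagnosis", "adenocarcinoma of colon"),
    ("urn:example:specimen:0001", "example:collection_date", "2007-03-08"),
]

def describe(subject, triples):
    """Feature 1: list every metadata/data pair asserted about the object."""
    return {p: o for s, p, o in triples if s == subject}

def class_of(subject, triples):
    """Feature 3: read the object's class from its rdf:type triple."""
    for s, p, o in triples:
        if s == subject and p == "rdf:type":
            return o
    return None

# Feature 2: the subject identifier is what distinguishes this object
# from all other objects.
subject = "urn:example:specimen:0001"
print(describe(subject, triples))
print(class_of(subject, triples))
```

Nothing here depends on a pre-determined list of permitted fields; a new descriptor is just one more triple about the same subject.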


A specification serves most of the purposes of a standard, and much more (data description, data exchange, data merging, data interoperability, semantic logic). Data specifications spare us most of the heavy baggage that comes with a standard (limited flexibility to include changing data objects, locked-in data descriptors, licensing and other intellectual property issues, competing standards for the same domain resulting in limited interoperability, bureaucratic overhead, etc.).

Readers of this blog might want to read an introduction to RDF data specifications written by myself and Dr. G. William Moore. I believe that standards are important, but that specifications are even more important. There are instances in the field of biomedical informatics where specifications could serve better than standards. This theme was developed in my book, Biomedical Informatics. I hope to provide many examples of specifications (how they are created and used) in future blogs here.

-Jules Berman

Wednesday, March 7, 2007

Don't ask the government to write your standard

In an earlier post, I wrote that it was a bad idea for user communities to ask the U.S. government to create their standards.

The government is seldom inclined to create new standards. The government expects user communities to form SDOs (Standards Development Organizations) to create their standards. NIST (National Institute of Standards and Technology) is interested in helping SDOs create standards and is not really interested in taking primary responsibility for new standards.

The government's hands-off approach towards standards is specified by law in the National Technology Transfer and Advancement Act of 1995 (NTTAA), Public Law 104-113, and is explained at some length in a NIST white paper:

Johnsen K, Pugh N. NISTIR 6778 Guidelines for NIST Staff Participating in Voluntary Standards Developing Organizations' Activities. US Department of Commerce, June 2002.

This Act directs Federal agencies to use standards developed by private standards development organizations (not government agencies) whenever feasible.

Furthermore, it may be possible for individuals to sue the government for violations of NTTAA. From an opinion published by the Center for Regulatory Effectiveness:

"Unless there is `clear and convincing' evidence that Congress intended to preclude judicial review under a statute, persons adversely affected or aggrieved are entitled to seek redress for federal agency violations of that statute under the APA (Administrative Procedure Act)."

There are some very practical reasons why the U.S. government is disinclined to write standards. This general area is discussed at some length in my book, Biomedical Informatics, and I'll be expanding on the issue in future blogs here.

- Jules Berman

Jules J. Berman, Ph.D., M.D.
tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, biomedical informatics, federal government, medical standards, NIST, NTAA, sdo, specifications, technology transfer

Tuesday, March 6, 2007

Standards developers can help the Patent Office

In an earlier post, I listed 16 ways for standards development organizations (SDOs) to reduce the likelihood that their standard will contain embedded patents.

One of those ways was for the SDO to work with the USPTO, the United States Patent and Trademark Office.

The USPTO seeks help from the community of software developers and has set up a new project, called Peer to Patent, for this purpose. SDOs should avail themselves of this opportunity to ensure that trivial or non-original applications are not awarded patents that would encumber their standard. In addition, SDOs should contact the USPTO and offer their collective expertise to patent examiners who are reviewing applications related to their standards. SDOs should monitor new patent applications (all of which are publicly available) and contact the USPTO patent examiner when they see a trivial or non-original application related to their standard.

- Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, encumbered standard, european patent office, non-original patents, patent review, sdo, standards development organizations, trivial patents, uspto

Monday, March 5, 2007

Bad ideas to stop patent farmers

Earlier, I posted a blog that described patent farming (inserting patented methods into new standards with the intention of asserting the patent against the users of the standard). Yesterday, I suggested 16 steps that SDOs can take to reduce their vulnerability to patent farmers.

SDOs sometimes pick the wrong tactics to protect themselves from patent farmers. Here are some ideas that are likely to be counterproductive.


1. (Bad idea) Try to get the U.S. government to create the standard.

2. (Bad idea) Make the standard a requirement for your user community (usually done by lobbying the government and/or user organizations).

3. (Bad idea) Encumber the standard under a pre-paid user license.

4. (Bad idea) Focus the standard for a single imagined user (e.g., radiology departments) whose needs may not fall under an existing patent claim.

5. (Bad idea) Make no special accommodations for research/testing activities that arise from or use the standard.

6. (Bad idea) Pretend there is no problem and try to marginalize people who disagree.

In a future blog, I'll explain why these ideas are bad for the SDO or the intended user community.

- Jules Berman

tags: intellectual property, ip, patent farming, patent infringement, risk, sdo, standards development organizations

Sunday, March 4, 2007

Protecting standards from embedded patents

In a prior blog, I discussed patent farming within standards. I promised another blog with suggestions for SDOs (Standards Development Organizations) that would reduce the risk that their standards are victimized by patent farming.

Here is the list:

1. Work closely with the USPTO (US Patent and Trademark Office) or the EPO (European Patent Office) to block trivial or non-original patents applied to your standard. Take advantage of the USPTO Peer to Patent project.

2. Collect and publish a list of prior art for all the methods included in your standard

3. Where no prior art exists, develop and publish your own "prior" art as open source projects

4. Do your own careful patent search to ensure that your standard does not include any previously patented methods

5. Require your members to search their company's patents to ensure that they have no patents within the standard.

6. The patent searches conducted by companies that are members of the standards committee should include all patents transferred to patent holding companies

7. Require members of the standards committee to sign agreements (co-signed by authorized representatives of their companies) that no company patents (held or transferred) or claims will apply to the standard.

8. Whenever possible, use open, or public domain, or old (> 20 years) methods within your standards.

9. Whenever possible, use "escape" methods in the standard so that users are not locked into a single method that implements the standard

10. Make optional standards, not required standards, so that the user community is not locked into one implementation

11. Make interoperable standards (that can port to-and-from related standards)

12. Make specifications, not standards (to be explained in a later blog - JB)

13. Have open [to the public] committee meetings and publish the minutes of your meetings

14. Include a "user advocate" in the standards committee

15. Publish the efforts you have made to comply with some or all of the suggestions in items 1 through 14.

16. As a user, whenever possible, use standards that were developed with most of the suggestions from this list. Remember, it is the user (not the SDO) that will pay for patents embedded within standards.

Nothing can reduce your risk to zero, but following these items can help. I will be writing future blogs that explain specific items from the list.

- Jules Berman

tags: embedded patents, european patent office, hidden patents, medical standards, patent farming, prior art, sdo, specifications, standards development organizations, trivial patents, uspto

Saturday, March 3, 2007

Patent farming in medical standards

In a prior blog, I listed some of the IP issues involving medical standards. One of the biggest problems is the inclusion of patented methods within standards. Hidden patents within standards can be asserted against anyone who implements the standard.

Bruce Perens, an open source guru, has written extensively on this problem. He uses the term "patent farming" to describe the act of planting the seed, a patented method, into the standard while the standard is being developed. Once the standard is approved and implemented by the user community, the patent farmer reaps his crop by asserting the hidden patent. Perens provides several specific examples where patent farming has benefited standards committee members to the detriment of the users of the standard.

You might think that it would be a simple matter to require the participants in a standards effort to disclose any of their owned patents that might be included in the standard. In the Rambus case described by Perens, such a disclosure agreement was in evidence, but the patent disclosure policy of the standards organization proved unenforceable in court.

Even if an SDO (Standards Development Organization) had an enforceable disclosure agreement with its members, it would seem that a committee member could sell strategic patents to a patent holding company, striking a deal with the holding company that would yield IP earnings on an asserted patent positioned within a standard. By working with a holding company, the original patent holder divests the patent and would not need to disclose the existence of the patent to the standards committee.

Patent holding companies (called patent trolls by their detractors) buy portfolios of patents to assert IP rights in strategically chosen business sectors. Recently, hospital-based technologies (including radiology imaging) have been the target of at least one major patent holding company.

In the next blog, I will discuss measures that SDOs may take to reduce their vulnerability to patent farmers. I will also discuss some defensive strategies that are often suggested but which really do not work.

This blog will also appear on Bruce Friedman's Labsoft site.

- Jules Berman

tags: big data, metadata, data preparation, data analytics, data repurposing, datamining, data mining, hidden patent, intellectual property, ip, medical standards, patent holding company, patent troll, specification, viral patent

Friday, March 2, 2007

New version of neoplasm classification available

The latest version of the Developmental Lineage Classification and Taxonomy of Neoplasms is now available at:

NEOCLXML.GZ 716,963 bytes and
NEOSELF.GZ 1,086,677 bytes

Neoclxml.gz expands to over 10 Megabytes and is an XML file.
Neoself.gz expands to over 20 Megabytes and is a flat-file.
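Because the XML file is large once expanded, it may be convenient to read it directly from the compressed download. The following is a minimal sketch in Python that assumes only that the file is gzip-compressed XML. The function name is hypothetical, and the code makes no assumption about the element names used in the classification; it simply counts whatever tags it encounters.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter

def count_tags(gz_path):
    """Stream a gzip-compressed XML file and count its element tags."""
    counts = Counter()
    with gzip.open(gz_path, "rb") as handle:
        # iterparse reads the file incrementally instead of loading it whole.
        for _event, element in ET.iterparse(handle, events=("end",)):
            counts[element.tag] += 1
            element.clear()  # drop the element's text and children to limit memory use
    return counts
```

Calling count_tags with the path to a downloaded copy of the XML file would give a quick inventory of its tags without first expanding the file on disk.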

Each file contains over 145,000 neoplasm terms grouped in >6,000 concepts, and classified according to embryonic lineage. This is, by far, the most extensive nomenclature and classification of neoplasms in existence. It is copyrighted to Jules J. Berman and distributed under a GNU document license.

Detailed information on the classification is available in my article:
Tumor classification: molecular analysis meets Aristotle

tags: cancer, medical terminology, nomenclature, open access, open source