Saturday, May 11, 2019

Index: Evolution's Clinical Guidebook

In the past few blogs, I've been discussing the recent publication of my book, Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine. The premise of this book is that modern medicine is based on an understanding of evolutionary processes. Evolution shows us the relationships between the subdisciplines of medicine that benefit directly from Precision Medicine (i.e., pathology, microbiology, clinical genetics, pharmacology, and bioinformatics). In Evolution's Clinical Guidebook, all of these diverse fields are brought together, under the subject of evolution. To illustrate, I have listed below the first few pages of the index to the book (letters A through H). Just by perusing these index terms, you can get some idea of the role played by evolution as the great unifier of modern medicine.

Partial Book Index
A 
Abiogenesis 
  catalysts, 2 
  cellular life, earliest signs of, 4
  definition, 1
  DNA, 4
  evolution, 2
  life on earth, 1
  natural selection, 2
  RNA, 4-5
Acanthodians, 215
Acarnus erithacus, 192, 192f 
Acidianus Tailed Virus, 160
Acquired disease, 20, 29-30
Actin, 17
Actinistia, 216
Actinopterygii, 215
Adaptive immune system, 214
Adaptive immunity, 161, 164
Adenocarcinoma, 125, 126-127f
Adult organisms, 94, 102
Agenesis of the Corpus Callosum (ACC), 222
Aging, 77, 216, 257-258, 261-265
  vs. diseases of old people, 257-259
  evolution of, 252-265
  gene, 257-258
Agnatha, 214
Allele, 77
Allium cepa, 66
Allium ursinum, 66
alphaA-Crystallin, 18
Alstrom syndrome, 213-214
Alternative RNA splicing, 126
Amanita phalloides, 155
Amborella trichopoda, 150
Ambulacraria, 197
Amniotes, 218
Amoebozoa, 184, 290
Amphibia, 216-218
Amphioxus, 150
Amyloid world, 30
Anatomy, 246
Ancestral classes, 175-176, 196-197
Ancestral lineage, 12
Ancestral species, 12
  eukaryotic development, steps in, 14-15, 15f
  gene families, 13-14
Ancylostoma duodenale, 292
Androgenesis, 221
Aneuploidy, 69-70, 70f
Angelman syndrome, 222
Angiogenesis, 126
Angiosperma, 29-44
Animal cells, 262
Animalia, 27f 
Animal model, human disease 
  Koch's postulates and reliance, 299-300
  nonhuman eutherians, 285
  non-vertebrate models, cancer research, 298-299
  for orthodiseases
    Caenorhabditis elegans (nematodes), 294, 296-298, 297f
    Danio rerio (zebrafish), 294, 297-298, 298f
    Drosophila melanogaster (fruit fly), 294, 297
    homologous genes, 296
    human pathologic processes, 295
    orthologous genes, 295-296
    Saccharomyces cerevisiae (yeast), 294-296, 296f
  rabbits, myxoma virus on, 300-302
  rats, 285
  specificities and idiosyncrasies
    clinical trial, 286-287
    Gram-negative organisms, 286
    infections, history of, 290-291, 293-294
    inflammatory response, 287
    lipopolysaccharide, 286
    mice, 286-288
    microorganisms, potential pathogens, 288-290
    rodent models, 287-288
    TGN1412, 286-287
Anlagen, 127
Aplastic anemia, 270
Apoikozoa, 186-187
Apomorphy, 198
Apoptosis, 53
Archaea, 26, 27f
Archaeplastida, 14-15, 28, 186
Archiplastidae, 185
Arrhythmogenic cardiomyopathies, 191
Arthropods, hepatopancreas of, 118-119
Ascaris lumbricoides, 292
Aspergillus flavus, 155
Association vs. cause, 77
Ataxia telangiectasia, 30
Atlantogenata, 229
Autism, 58
Autoantibody disease vs. autoimmune disease, 233
Autosomal dominance, 270
Azacytidine, 122

B 
Bacillus globigii, 288
Bacillus subtilis, 244
Bacteria, 26
Baraitser-Winter syndrome, 57
Bartonella species, 26
Basal cell carcinomas, 223-224, 223f
Basal layer, 255
Benign tumor, 218, 233
Bikonta, 183-184
Bilateria, 94, 193-203, 254
Bioinformatics, 78
Biological diversity, 152
Biological theory, 308
Biopsy specimen, 255f
Biosphere, 157
Biosynthetic cycle, 17
BK polyomavirus, 292
Blastocystis hominis, 289
Blastula, 191
Blastulation, 189, 190f
Blended class, 322, 325
Blood, photomicrograph of, 108f
Bloom syndrome, 262
Bone marrow, 256
Bookie, 266
Bootstrapping paradoxes, 5
  chicken and egg paradox, 6-8
  enzyme and enzyme-synthesizing machinery, 8
  general solution for, 11-12
  hardware or software, 5-6
  process of evolution and product of evolution, 9-10
  RNA and DNA, 8-9, 10f
  species and class of animals, 10-11
Borderland of Embryology and Pathology, 118-120
Boreoeutheria, 229
Bohring-Opitz syndrome, 57
Brassica oleracea, 158
BRCA, 271
BRCA1 gene, 269
BRCA2 gene, 269
Breast cancer, 260
Breeds, 248, 249f
Brugia malayi, 292
Bryophyte life cycle, 7
Bryophytes, 7
BUB1B gene, 70
Bubonic plague, 292
Bungarus caeruleus, 110f
"But-for" test, 30

C 
Caenorhabditis elegans, 65, 123, 294, 296-298, 297f
Calvin cycle, 17
Cambrian explosion, 21-25, 21-22f, 155-156, 185
  coexistence and coevolution, 25-26
    animals (class Metazoa), 28
    Archaea, 26, 27f
    Archaeplastida (plants), 28
    bacteria, 26
    fungi, 28
    single-celled eukaryotes, 28
    viruses, 26
Cambrian period, 21-22, 22f, 24
Cancer, 259-261
Cancer cells, 51
Cancer progression, 30
Carcinogen, 127
Carcinogenesis, 17, 30-31, 223
Carcinoid tumors, 211
Carcinosarcomas, of uterus, 119
Caretaker diploid organism, 7
Carotenoids, 156
Carrier, 26, 31
  asymptomatic, 27
Catarrhini, 231
Cause, 19, 24, 28
Cell types, epigenome and evolution of, 103-115
Cell-type-specific gene expression, 112
Cenancestor, 37
Cephalochordata, 198-203
Chagas disease, 28, 269-270
Chance occurrence, 9-10, 31
Channelopathy, 127
Charcot-Marie-Tooth disease, 261
CHARGE syndrome, 57
Chemical diversity, 154
Chemokine, 293, 301
Child class, 198
Chimeric Antigen Receptor for T cells (CAR-T) therapy, 157, 164
Chitin, 185
Chlamydia trachomatis, 292
Chloroplast evolution, 14-15, 17, 31
Choanoflagellatea, 186-187
Choanozoa. See Apoikozoa
Chondrichthyes, 215
Chordata, 197-203
Chordoma, 198
Choriocarcinoma, 221
Chromatin, 156
Chromosomal disorder, 271
Chromosomes, 61
  number, variations in, 66
Chronic obstructive pulmonary disease (COPD), 271
Chytrids, 186
Cichlids, 153-154, 154f
Ciliopathies, 213-214, 233
Cis-acting vs. trans-acting, 127-128
CISD2 gene, 264
Cisd2-null mice, 264
Clade, 128
Cladistics, 198
Class, 7, 32
  of animals, 10, 12, 24-25
  of cells, 6
  of metazoan organisms, 25
  of organisms, 4-5
  of paradoxes, 5
Classification, 11
  data retrieval, 176
  vs. diagnosis, 199
  flying animals, 175
  formal definition of, 175
  inferencing, 176
  mammals, Aristotle, 173-174
  vs. ontology, 177, 198
  pseudo-scientific assertion, 177
  self-correction, 177
  simplification, 175
  swimming animals, 175
  walking animals, 175
Classification system vs. identification system, 325
Class noise, 322. See also Blended class
Clinical trial, 286-287, 301
Clostridium feseri (blue bacteria), 107f
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), 251
Cnidaria, 193
Cnidarian organisms, 193, 195f
Coccidia, 28
Cockayne syndrome, 262
Codon, 8, 32
Cofactor, 156, 165
Collision tumor theory, 120
Colon cancers, 119
Combined deficiency, 128
Commensal, 32-33
Competence of classification, 176
Complex disease, 33
Composition theory, 120
Congenital anomaly, 271
Congenital chondrodystrophy, 17-18
Congenital disorder, 17-18, 33
Congenital hemangiomas, 119
Connective tissue, 128
Contig disease, 271
Contiguous gene deletion syndrome, 271
Convergence, 165
Conversion theory, 120
Copy number, 78
Corbels, 239-240
Cornelia de Lange syndrome, 112
Corpus callosum, 222
Cousin class, 167
CpG island, 128
CpG sites, 103
Cranial neural crest, 208
Craniata, 128, 148-149, 198-203, 207-212
Craniates. See Craniata
Crocodilia, 218
Crohn's disease, 273
Ctenophora, 193
Ctenophorans, 193, 194f
Cyanobacteria, 14-15, 15f, 17, 23, 33
Cyclic neutropenia, 55
Cyclostomata, 211-212
Cynodonts, 219
Cystic fibrosis transmembrane conductance regulator (CFTR), 113
Cytokine storm, 286-287
Cytopenia, 78
Cytotrophoblasts, 229


D 
Danio rerio (zebrafish), 294, 297-298, 298f
Daphnia pulex, 154
Darwin's theory, 153
Decitabine, 122
delta1-Crystallin, 18
Demodex, 291
Demospongiae, 192
De novo disease mutations, 56-58
De novo genes, 74-76
De novo mutation, 78
Dense core granules, 210
Dermal bones, 209
Dermis, 255
Dermoptera, 230
Desmosomes, 187, 188f, 189, 190f, 191
Deuterostomia, 148-149, 197
Developmental disorder, 128
Devolution, 241
Diagnosis vs. classification, 199
Diamond Blackfan anemia, 210
Diethylstilbestrol (DES), 117
Differentiation, 78
Digenic disease, 128-129
DiGeorge syndrome, 82
Dinosauria, 218
Diploid organism, 7
Dipnomorpha, 216
Dipnotetrapodomorpha, 216
Diploblasts, 193
Direct mutagen, 52
Direct transdifferentiation, 111
DNA, 73-87
DNA-DNA reassociation kinetics, 151-152
DNA methylation, 4, 33
DNA repair, 17, 33
Dollo's law, 271
Dormancy, 33
Down syndrome, 56
Driver pathway, 19, 34
Drosophila melanogaster, 101, 294, 297
Drug development, economics of, 20
Druggable driver, 34
Dysgerminomas in women, 102
Dyskeratosis congenita, 262
Dysplasia, 199


E 
Echidnas, 226
Echinodermata, 197
Ectoderm, 129
Eikenella corrodens, 289
Embryo, 99-103
  vs. fetus, 129
Embryogenesis, 161
Embryology, relationship between evolution and, 93-103
Embryonic anlagen, 102
Embryonic stem cell, 129
Endoderm, 6, 34
End-stage condition, 272
Enhancer, 78
Enigmatic pacific hagfish, 211-212, 212f
Enterocoelomata. See Deuterostomia
Epidermis, 255, 256f
Epigenome, 4, 34, 221
Epigenome disruptors, 121-122
Epigenomic methylation, variations in, 65
Epimutation, 129-130
Epipubic bones, 239
Epistasis, 34-35
Epithelial cell, 130
Epitheliozoa, 193
Epithelium, 189, 189f
Erasure, 105-106, 221
ERCC6 gene, 262
ERCC8 gene, 262
Etiology, 79
Euarchonta, 230
Euarchontoglires, 229-230
Eugenics, 247-252
Eugnathostomata, 215
Eukaryota, 15, 148-149
  Bikonta, 183-184
  eukaryotes, 179
  Excavata, 183-184
  mitochondria, 180
  nucleus, 179, 182
  Podiata, 183-184
  prokaryotic life forms, 179
  single-celled eukaryotes, 179, 180f
  Syringammina fragilissima, 179
  undulipodia, 182
  Unikonta, 183-184
Eukaryotes, 7, 67, 153-155, 161, 176, 189
Eumetazoa, 191-193
Euteleostomi, 215
Eutheria, 97-98, 226-235
Eutherians, 226-235
Evo-devo, 130
Evolutionary convergence, 166
Evolutionary frustration, principle of, 248
Evolution
  as fantasy
    bacterial pathogen, 317-319
    disease diagnosis by symptoms, 321-323
    drug development and testing, 319
    homologous genes, 316-317
    science fiction aficionados, 324
    speciation, 324
    taxonomic organisms, treatments for, 319-321
    theory of intelligent design, 324-326
Evolution deniers, 307
Evolvability, 153, 165
Exaptation, 35
Excavata, 183-184, 290
Exome sequencing, 79
Extraembryonic cells, 233
Extraembryonic tissues, 221, 233


F 
Facultative intracellular organism, 79
Fanconi anemia bone marrow failure, 257, 263
Female Anopheles mosquito, 107
Fetal period, 101
Filarial nematodes, 292
Filozoa, 186
First Law of Bioinformatics, 59
Fish, 118-119
Forme fruste, 130
Founder effect, 13, 35
FOXL2 gene, 112
Fungi, 28, 186, 189


G 
Gallertoids, 130, 187, 191-192
Gametes, 6, 35, 148-149
Gametic organism, 7
Gametophytes, 7
Gastrointestinal stromal tumors (GISTs), 19
Gastropods, 118-119
Gene(s), 269
Gene conservation, 58-61
Gene diversity, 154
Gene editing techniques, 251
Gene pool, 13, 26, 35, 58-61, 151-152, 314, 317-319, 324
Generalization, 35
Gene regulation, 35
Gene sharing, 18
Gene size, 66-67
Gene-targeted therapy, 19
Genetically engineered mouse (GEM), 233-234
Genetic fine-tuning, 124-138
Genetic heterogeneity, 130
Genetic instability, 79
Genetic mutations, 151
Genetic surplus disorder, 79
Genome, 12, 14-15, 26, 35
Genome Wide Association Study (GWAS), 272
Genomic architecture, 64-73
Genomic disorder, 79
Genomic regulation, 76-87
Genomic regulatory processes, 125-138
Genomic regulatory systems, pathologic conditions of, 121-138
Genomic structural abnormalities (GSVs), 69
Genomic structural variation, 79
Germ cell, 6, 35
Germ cell line, 35-36
Germ layers, 131
Germline, 79
Germline mutation, 80
Gestational trophoblastic disease, 234
Giant viruses, 162
Glires, 230
Globins, 13
Gnathostomata, 161, 214-215
Gorillini, 232


H 
HACEK, 272
Haeckel's theory, 96-97
Haemophilus influenzae, 244
Hair follicles, 223-224, 224f
Hamartoma, 158, 166
Haploid, 7-8, 36
Haploid organisms, 7, 36
Haplorrhini/haplorhini, 231
Haplotype, 80
HAS2 gene, 248
Hemichordata, 197
Hepatitis B, 292
Hepatocyte, 100
Hepatoid adenomas, 119
Hepatoma, 131
Hereditary nonpolyposis colorectal cancer syndrome, 131
Heritability, 131
Heterokonts, 289, 301
Hirschsprung disease, 209
Histone, 65, 199
Histone disruptors, mild effects of, 122
Histopathology, 131
Histozoa, 193
Hodgkin lymphoma, 273
Holometabolism, 156, 166
Holomycota, 186
Holozoa, 186
Holt-Oram syndrome, 123
Homeobox, 36
Hominidae, 231-232
Homininae, 232
Hominini, 232
Hominoidea, 231
Homo, 232
Homo erectus, 232
Homolog, 12, 18, 24, 36
Homologous genes, 316-317
Homologous recombination, during meiosis, 62
Homoplasy, 166
Homo sapiens, 147-149, 176, 232-235
Homozygosity, 80
Hookworms, 292
Horizontal gene transfer, 80
Horse, gestation period of, 241f
Horseshoe crabs, 149, 149f
Host, 155-157, 159-161, 163, 166
HOX gene diseases, mild clinical course of, 123
HOX genes, 24
Human(s), 232-235
Human diseases, 115-118
Human embryo, dorsum of, 97f
Human embryology, 246
Human embryonic stem cells, 100f
Human gene pool, 55-56
Human kidney, 101
Human phylogenetic lineage, 177-179
Hutchinson-Gilford progeria syndrome, 263
Hydatidiform mole, 221, 234
Hydractinia carnea, 253
Hydrops-ectopic calcification-"moth-eaten" (HEM), 17-18
Hylobatidae, 231
Hyperplasia, 273
Hypoxanthine-guanine phosphoribosyltransferase (HGPRT), 60-61

Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine is available from Amazon or from the publisher's website. If you are fortunate enough to have full institutional access to ScienceDirect, you can download chapters at no cost.

Jules Berman



key words: evolution, precision medicine, genetics, rare disease, clinical genetics, bioinformatics, evo-devo, Jules J. Berman, Ph.D., M.D.

Friday, May 10, 2019

Contents: Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine

In yesterday's blog, I announced the publication of my book, Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine. The premise of this book is that modern medicine is based on an understanding of evolutionary processes. Basically, without evolution, the fledgling field of precision medicine would wither and die, and we would lose our opportunity to prevent, diagnose, and treat the diseases that account for the bulk of morbidity and mortality in humans and in animals.

This book is available from Amazon or from the publisher's website. If you are fortunate enough to have full institutional access to ScienceDirect, you can download chapters at no cost. Here is the Table of Contents.

Contents: Evolution’s Clinical Guidebook: Translating Ancient Genes into Precision Medicine

1. Evolution, From the Beginning 1

Section 1.1 In the Beginning 1
Section 1.2 Bootstrapping Paradoxes 5
Section 1.3 Our Genes, for the Most Part, Come From Ancestral Species 12
Section 1.4 How do Metabolic Pathways Evolve? 15
Section 1.5 Cambrian Explosion 21
Section 1.6 After the Cambrian: Coexistence and Coevolution 25
Glossary 29
References 44

2. Shaking Up the Genome 51
Section 2.1 Mutation Burden 51
Section 2.2 Gene Pools and Gene Conservation 58
Section 2.3 Recombination and Other Genetic Tricks 61
Section 2.4 Genomic Architecture: An Evolutionary Free-for-All 64
Section 2.5 Rummaging Through the DNA Junkyard 73
Glossary 77
References 87

3. Evolution and Embryonic Development 93
Section 3.1 The Tight Relationship Between Evolution and Embryology 93
Section 3.2 The Epigenome and the Evolution of Cell Types 103
Section 3.3 An Embryonic Detour for Human Diseases 115
Section 3.4 The Borderland of Embryology and Cancer 118
Section 3.5 Pathologic Conditions of the Genomic Regulatory Systems 121
Glossary 125
References 138

4. Speciation 145
Section 4.1 A Species is a Biological Entity 145
Section 4.2 The Biological Process of Speciation 147
Section 4.3 The Diversity of Living Organisms 152
Section 4.4 The Species Paradox 157
Section 4.5 Viruses and the Meaning of Life 159
Glossary 164
References 168

5. Phylogeny: Eukaryotes to Chordates 173
Section 5.1 On Classification 173
Section 5.2 The Complete Human Phylogenetic Lineage 177
Section 5.3 Eukaryotes to Obazoans 179
Section 5.4 Opisthokonts to Parahoxozoa 185
Section 5.5 Bilaterians to Chordates 193
Glossary 198
References 203

6. Phylogeny: Craniates to Humans 207
Section 6.1 Class Craniata and the Ascent of the Neural Crest 207
Section 6.2 Vertebrates to Synapsids 212
Section 6.3 Mammals to Therians 220
Section 6.4 Eutherians to Humans 226
Glossary 233
References 235

7. Trapped by Evolution 239
Section 7.1 Spandrels, Pendentives, Corbels, and Squinches 239
Section 7.2 Evolving Backwards 240
Section 7.3 Eugenics: Proceed With Caution 247
Section 7.4 The Evolution of Aging, and the Diseases Thereof 252
Section 7.5 Why Good People Get Bad Diseases 265
Glossary 270
References 277

8. Animal Models of Human Disease: Opportunities and Limitations 285
Section 8.1 The Animal Model Problem, in a Nutshell 285
Section 8.2 Specificities and Idiosyncrasies 286
Section 8.3 New Animal Options 294
Section 8.4 The Proper Study of Mankind 300
Glossary 301
References 302

9. Medical Proof of Evolution 307
Section 9.1 What Does Proof Mean, in the Biological Sciences? 307
Section 9.2 The Differences Between Designed Organisms and Evolved Organisms 309
Section 9.3 What if Evolution Were Just a Foolish Fantasy 316
Glossary 325
References 326
Index 329


Jules Berman



key words: evolution, precision medicine, genetics, rare disease, clinical genetics, bioinformatics, evo-devo, Jules J. Berman, Ph.D., M.D.

Thursday, May 9, 2019

Just Published: Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine


This month, Academic Press has published my book, Evolution’s Clinical Guidebook: Translating Ancient Genes Into Precision Medicine. The premise of this book is that modern medicine is based, in one way or another, on an understanding of evolutionary processes. If evolution were a fabrication, then we would not be able to make any sense of the genomic data that is pouring out of research laboratories. We would not be able to design rational, cost-effective screening protocols to test the effectiveness of new drugs. We would not be able to identify the human sub-populations that will benefit from gene-targeted therapies. We would not be able to find the cause of rare diseases, and we would not be able to apply such knowledge to the treatment of common diseases. Without evolution, we would not understand how cancer develops, or how we might intervene in the process. Basically, without evolution, the fledgling field of precision medicine would wither and die, and we would lose our opportunity to prevent, diagnose, and treat the diseases that account for the bulk of morbidity and mortality in humans and in animals. This book demonstrates, through hundreds of examples, that modern medicine is built on the theory of evolution.

This book is available from Amazon or from the publisher's website. If you are fortunate enough to have full institutional access to ScienceDirect, you can download chapters at no cost.

Jules Berman



key words: evolution, precision medicine, genetics, rare disease, clinical genetics, bioinformatics, evo-devo, Jules J. Berman, Ph.D., M.D.

Saturday, August 4, 2018

Second Edition of Principles and Practice of Big Data now on ScienceDirect

The second edition of my book Principles and Practice of Big Data has just been released and is available for purchase at many sites, including Amazon.

For those of you fortunate enough to have access to ScienceDirect, you can download chapters of my book at:

https://www.sciencedirect.com/science/book/9780128156094



TABLE OF CONTENTS

  Author's Preface to Second Edition 

  Author's Preface to First Edition 

  Chapter 1. Introduction
    Section 1.  Definition of Big Data
    Section 2.  Big Data Versus small data
    Section 3.  Whence Comest Big Data?
    Section 4.  The Most Common Purpose of Big Data is to Produce small data
    Section 5.  Big Data Sits at the Center of the Research Universe
    Section 6.  Case Study: From the Press: Big Claims for Big Data

  Chapter 2. Providing Structure to Unstructured Data
    Section 1.  Nearly all Data is Unstructured and Unusable in its Raw Form
    Section 2.  Term Extraction
    Section 3.  Autocoding
    Section 4.  Concordances
    Section 5.  Indexing
    Section 6.  Machine Translation
    Section 7.  Case Study: Sorted Lists (Why and Why Not)
    Section 8.  Case Study: Doublet Lists 
    Section 9.  Case Study: Ngram Lists 
    Section 10.  Case Study: Proximity Searches Using Only a Concordance  
    Section 11.  Case Study (Advanced): Burrows Wheeler Transform (BWT) 

  Chapter 3. Identification, Deidentification, and Reidentification
    Section 1.  What are Identifiers?
    Section 2.  Difference Between an Identifier and an Identifier System
    Section 3.  Generating Identifiers
    Section 4.  Really Bad Identifier Methods
    Section 5.  Registered Unique Object Identifiers
    Section 6.  Deidentification
    Section 7.  Reidentification
    Section 8.  Case Study: Data Scrubbing
    Section 9.  Case Study: Identifiers in Image Headers
    Section 10.  Case Study: Hospital Registration
    Section 11.  Case Study: One-Way Hashes

  Chapter 4. Metadata, Semantics, and Triples
    Section 1.  Metadata
    Section 2.  eXtensible Markup Language
    Section 3.  Namespaces
    Section 4.  Semantics and Triples
    Section 5.  Case Study: Syntax for Triples 
    Section 6.  Case Study: RDF Schema
    Section 7.  Case Study: RDF Parsers and the Fungibility of Triples
    Section 8.  Case Study: Dublin Core 

  Chapter 5. Classifications and Ontologies
    Section 1.  It's All About Object Relationships 
    Section 2.  The Difference Between Object Relationships and Object Similarities
    Section 3.  Classifications, the Simplest of Ontologies
    Section 4.  Ontologies, Classes with Multiple Parents
    Section 5.  Choosing a Class Model
    Section 6.  Paradoxes
    Section 7.  Class Blending
    Section 8.  Common Pitfalls in Ontology Development
    Section 9.  Case Study: An Upper Level Ontology 
    Section 10.  Case Study: Visualizing Class Relationships 
    Section 11.  Case Study: Bringing Order from Chaos with the Classification of Living Organisms

  Chapter 6. Introspection
    Section 1.  Knowledge of Self
    Section 2.  Data Objects
    Section 3.  How Big Data Uses Introspection 
    Section 4.  Case Study: Timestamping Data 
    Section 5.  Case Study: A Visit to the TripleStore 

  Chapter 7. Data Integration and Software Interoperability
    Section 1.  Another Big Problem for Big Data
    Section 2.  The Standard for Standards
    Section 3.  Standard Trajectories
    Section 4.  Specifications and Standards
    Section 5.  Versioning
    Section 6.  Compliance Issues
    Section 7.  Interfaces to Big Data Resources
    Section 8.  Case Study: Standardizing the Chocolate Teapot

  Chapter 8. Immutability and Immortality
    Section 1.  The Importance of Data that Cannot Change  
    Section 2.  Immutability and Identifiers
    Section 3.  Persistent Data Objects
    Section 4.  Coping with the Data that Data Creates
    Section 5.  Reconciling Identifiers Across Institutions
    Section 6.  Case Study: The Trusted Timestamp
    Section 7.  Case Study: Blockchains and Distributed Ledgers
    Section 8.  Case Study: Zero-Knowledge Reconciliation   

  Chapter 9. Assessing the Adequacy of a Big Data Resource
    Section 1.  Looking at the Data 
    Section 2.  The Minimal Necessary Properties of Big Data 
    Section 3.  Case Study: Utilities for Viewing and Manipulating Very Large Files
    Section 4.  Case Study: Flattened Data 
    Section 5.  Case Study: Data that Comes with Conditions 

  Chapter 10. Measurement
    Section 1.  Accuracy and Precision
    Section 2.  Data Range
    Section 3.  Counting
    Section 4.  Normalizing and Transforming Your Data
    Section 5.  Reducing Your Data
    Section 6.  Understanding Your Control
    Section 7.  Practical Significance of Measurements
    Section 8.  Case Study: Gene Counting
    Section 9.  Case Study: The Significance of Narrow Data Ranges
    Section 10.  Case Study (Advanced): Fast Fourier Transform
    Section 11.  Case Study (Advanced): Principal Component Analysis

  Chapter 11. Indispensable Tips for Fast and Simple Big Data Analysis
    Section 1.  Speed and Scalability
    Section 2.  Fast Operations, Suitable for Big Data, that Every Computer Supports
    Section 3.  Fast Correlation Methods
    Section 4.  Clustering 
    Section 5.  Methods for Data Persistence (Without Using a Database)
    Section 6.  Back-of-Envelope Computations for Big Data
    Section 7.  Fast Data Retrieval for Lists of any Size 
    Section 8.  Case Study: One-Pass Mean and Standard Deviation
    Section 9.  Case Study: Climbing a Classification
    Section 10.  Pre-computing lookup lists: Google's PageRank
    Section 11.  Case Study: A Database Example 
    Section 12.  NoSQL and other Non-Relational Big Data Databases

  Chapter 12. Finding the Clues in Large Collections of Data
    Section 1.  Denominators 
    Section 2.  Frequency Distributions
    Section 3.  Multimodality
    Section 4.  Outliers and Anomalies
    Section 5.  Case Study: Discarding the Noisiest Frequencies in a Data Signal
    Section 6.  Case Study: Predicting User Preferences
    Section 7.  Case Study: Multimodality in Legacy Data
    Section 8.  Case Study: Big and Small Black Holes

  Chapter 13. Using Random Numbers to Knock Your Big Data Analytic Problems Down to Size
    Section 1.  The Remarkable Utility of (Pseudo)Random Numbers 
    Section 2.  Resampling and Permutating 
    Section 3.  Case Study: Sample Size and Power Estimates
    Section 4.  Monte Carlo Simulations
    Section 5.  Case Study: Monty Hall Problem: Solving What We Cannot Grasp
    Section 6.  Case Study: Frequency of Unlikely String of Occurrences 
    Section 7.  Case Study: The Infamous Birthday Problem
    Section 8.  Case Study: A Bayesian Analysis of Insurance Costs 

  Chapter 14. Special Considerations in Big Data Analysis
    Section 1.  Theory in Search of Data 
    Section 2.  Data in Search of Theory
    Section 3.  Overfitting
    Section 4.  Bigness Bias
    Section 5.  Too Much Data
    Section 6.  Fixing Data
    Section 7.  Data Subsets in Big Data: Neither Additive nor Transitive
    Section 8.  Additional Big Data Pitfalls
    Section 9.  Case Study: Curse of Dimensionality

  Chapter 15. Big Data Failures and How to Avoid (Some of) Them
    Section 1.  Failure is Common
    Section 2.  Failed Standards
    Section 3.  Blaming Complexity
    Section 4.  Perils of Redundancy
    Section 5.  Save Time and Money; Don’t Protect Data that Does not Need Protection
    Section 6.  An Approach to Big Data that May Work For You
    Section 7.  After Failure
    Section 8.  Case Study: Cancer Biomedical Informatics Grid, a Bridge too Far
    Section 9.  Case Study: The Gaussian Copula Function

  Chapter 16. Legalities
    Section 1.  Responsibility for the Accuracy and Legitimacy of Data
    Section 2.  Rights to Create, Use, and Share the Resource
    Section 3.  Copyright and Patent Infringements Incurred by Using Standards
    Section 4.  Protections for Individuals
    Section 5.  Consent
    Section 6.  Unconsented Data
    Section 7.  Good Policies are a Good Policy
    Section 8.  Case Study: The "Inconclusive" Data Analysis
    Section 9.  Case Study: The Havasupai Story
    Section 10.  Case Study: Double-edged Sword of the U.S. Data Quality Act 

  Chapter 17. Data Sharing 
    Section 1.  What Is Data Sharing, and Why Don't We Do More of It?
    Section 2.  Common Complaints
    Section 3.  Case Study: Life on Mars
    Section 4.  Case Study: Who Shares Their Data 
    Section 5.  Case Study: National Patient Identifier

  Chapter 18. Data Reanalysis: Much More Important than Analysis
    Section 1.  First Analysis (Nearly) Always Wrong 
    Section 2.  Why Reanalysis is More Important than Analysis
    Section 3.  Case Study: Reanalysis of Old JADE Collider Data 
    Section 4.  Case Study: Vindication Through Reanalysis 
    Section 5.  Case Study: Finding New Planets from Old Data 

  Chapter 19. Repurposing Big Data
    Section 1.  What is Data Repurposing? 
    Section 2.  Dark Data, Abandoned Data, and Legacy Data 
    Section 3.  Case Study: From Postal Code to Demographic Keystone 
    Section 4.  Case Study: Fingerprints and Data-driven Forensics
    Section 5.  Scientific Inferencing from a Database of Genetic Sequences
    Section 6.  Case Study: Linking global warming to high-intensity hurricanes
    Section 7.  Case Study: Inferring climate trends with geologic data
    Section 8.  Case Study: Old tidal data, and the iceberg that sank the Titanic
    Section 9.  Case Study: Lunar Orbiter Image Recovery Project
    Section 10.  Case Study: The Cornucopia of the Natural Sciences

  Chapter 20. Societal Issues
    Section 1.  How Big Data Is Perceived by the Public
    Section 2.  Reducing Costs and Increasing Productivity with Big Data
    Section 3.  Public Mistrust
    Section 4.  Saving Us from Ourselves 
    Section 5.  Who is Big Data?
    Section 6.  Hubris and Hyperbole
    Section 7.  Case Study: The Citizen Scientists
    Section 8.  Case Study: 1984, by George Orwell

  




- Jules Berman

Wednesday, May 9, 2018

Read Precision Medicine and the Reinvention of Human Disease on ScienceDirect

It is regrettable that many of my textbooks are unaffordable to the majority of the potential market. For example, Precision Medicine and the Reinvention of Human Disease sells on Amazon for $125. This book contains nearly a quarter-million words, and it must have cost the publisher a lot of money to print and distribute, but I certainly wish it could have been sold at a lower price.

As a remedy, for some of you, this book is being marketed by Elsevier (the owner of the Academic Press imprint under which it was published) through ScienceDirect, a subscription online book catalog bought by university libraries. This means that if you have online access to a university library that has paid for a ScienceDirect subscription, you may have free access to my book.

Precision Medicine and the Reinvention of Human Disease was published January 30, 2018, and it is possible that your university library may have a ScienceDirect subscription that does not yet access my book. After speaking today with my editor, it's my impression that ScienceDirect access for libraries is something akin to cable channel access for homes. You can add access to specific books or you can add access to bundles of books that cover areas of interest. If you have access to ScienceDirect, but your university doesn't yet have access to my book(s), please talk to your librarian and ask if he/she will add my Elsevier publications to their ScienceDirect subscription.

There is an excellent preview of Precision Medicine and the Reinvention of Human Disease at the Google books site.

- Jules Berman

key words: precision medicine, ScienceDirect, library acquisitions, book subscriptions, jules j berman Ph.D. M.D.

Thursday, February 15, 2018

Inscrutable Genes

  • "In most cases, the molecular consequences of disease, or trait-associated variants for human physiology, are not understood." from: Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature 2009;461:747–53.

The 1960s was a wonderful decade for the field of molecular genetics. Hundreds of inherited metabolic diseases were being studied. Most of these diseases could be characterized by a simple inherited mutation in a disease-causing gene. Back then, we thought we understood genetic diseases. Here’s how it all might have worked, if life were simple: one mutation → one gene → one protein → one disease. This lovely genetic parable, from a bygone generation, seldom applies in the era of Precision Medicine. The purpose of this section is to explain some of the complexities of modern genetics and to lay out the job of the Precision Medicine scientist who must dissect the pathways that lead from gene to disease.

In Precision Medicine and the Reinvention of Human Disease, two of the most confounding aspects of modern disease genetics are discussed: that a single disease may result from one of many distinct molecular defects; and that a single gene may produce many different diseases. These two countervailing phenomena tell us something very important about disease development: different pathways may converge on the same disease, and any single gene may perturb a biological system (i.e., a living organism) in different ways. Some of that discussion is excerpted here.

There are numerous examples wherein mutations in one gene may result in more than one disease [2]. In some cases, each of the diseases caused by the altered gene is fundamentally similar (e.g., spherocytosis and elliptocytosis, caused by mutations in the alpha-spectrin gene; Usher syndrome type IIIA and retinitis pigmentosa-61 caused by mutations in the CLRN1 gene). In other cases, diseases caused by the same gene may have no obvious relation to one another. For example, the APOE gene encodes apolipoprotein E, which is involved in the synthesis of lipoproteins. One common allele of the APOE locus, e4, increases the risk of Alzheimer disease and of heart disease, two disorders with no obvious clinical similarities [3,4].

Let’s look at a few other examples where mutations in a single gene play causal roles in the development of diverse diseases. For example, different mutations of the same gene, desmoplakin, cause the following diseases [2]:

  • Arrhythmogenic right ventricular dysplasia 8

  • Dilated cardiomyopathy with woolly hair and keratoderma

  • Lethal acantholytic epidermolysis bullosa

  • Keratosis palmoplantaris striata II

  • Skin fragility-woolly hair syndrome

How is it possible that errors in the gene coding for desmoplakin, a constituent protein found in intercellular junctions, could account for such apparently unrelated diseases as arrhythmogenic right ventricular dysplasia and lethal acantholytic epidermolysis bullosa? It happens that specialized desmosomes in cardiac cells (i.e., intercalated discs) tightly couple myocytes so that they can function as a coordinated group. Desmosomes are also required to attach skin epidermal cells to one another and to the underlying basement membrane. In the case of desmoplakin mutations, it is relatively easy to see the pathogenetic relationship among these diseases.

In other sets of diseases that result from an error in one specific gene, the pathogenetic relationship may not be so easily discerned. Some cases of Charcot-Marie-Tooth axonal neuropathy, lipodystrophy, Emery-Dreifuss muscular dystrophy, and premature aging syndromes are all caused by mutations in the LMNA (Lamin A/C) gene. Stickler syndrome type III, Fibrochondrogenesis-2, and a form of nonsyndromic hearing loss are all caused by mutations in the COL11A2 gene. In these cases, how can variations in a single gene cause many different diseases?

Let’s look at just a few of the possibilities:

  • One gene can control the synthesis of more than one protein [6].

  • A single protein may have multiple functions. For example, lamin A/C, a protein of the nuclear lamina, has several biological roles: controlling nuclear shape, influencing transcription, and organizing heterochromatin. Mutations in the LMNA gene cause more than 10 different clinical syndromes, including neuromuscular and cardiac disorders, premature aging disorders, and lipodystrophy. Likewise, the polyfunctional TP53 gene has been linked to 11 clinically distinguishable cancer-related disorders [7].

  • A single protein with a single function may have different biological effects based on the cell type in which the protein is expressed, the stage of development in which the protein is expressed, and the cellular milieu (e.g., concentrations of substrate or protein inhibitors) for a given cell type, at a particular moment in time.

  • Diseases develop through a sequence of biological events occurring over time. A mutation may exert a different biological effect based on where and when, in the sequence of pathogenetic events, it is expressed.

more to follow

- Jules Berman

key words: precision medicine, genetics, multi-step, pathogenesis, genetic heterogeneity, jules j berman Ph.D. M.D.

Wednesday, February 14, 2018

Infections Develop Via a Sequence of Biological Steps

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the seventh:

By dissecting the biological steps involved in the pathogenesis of infectious disease, it is possible to develop new treatments, other than antibiotics, that will be effective against a range of related organisms.

Nature, by interfering with the different steps in the development of infectious diseases, has a variety of protective mechanisms against organisms. For example, to defend against malaria, nature has preserved various mutations that render red cells unsuitable hosts for malarial guests. The hemoglobin variants HbS (sickle cell trait), HbC, and HbE increase the likelihood that an infected red cell will lyse. Likewise, but for obscure reasons, regulatory defects in hemoglobin synthesis, as seen in thalassemia, may also confer some protection against malaria. In addition, variations in SLC4A1, a structural protein of erythrocytes, that cause ovalocytosis, and polymorphisms of the glucose-6-phosphate dehydrogenase gene [57], both seem to protect against malaria.

Some individuals are resistant to malaria because they lack the Duffy protein that Plasmodium vivax requires to bind and enter erythrocytes [58]. Knowing this, researchers are now studying the Duffy-binding protein in the malaria parasite as a potential drug or vaccine target [58]. More generally, drugs known as entry inhibitors are being developed based on knowledge that the attachment and entry of organisms may depend upon specific cooperative pathways, in host and invader cells, that can be targeted by drugs. We know that there are many steps in the infection process that could be blocked by small changes in proteins that are unrelated to the immune process. For example, for an infectious agent to invade and flourish in an organism, it must gain entry into the tissues of the body, evading physical and chemical defenses along its way. It must find a place in which it can receive nourishment appropriate to its species and avoid any toxins that may be produced by its host. It must be able to grow as a collection of organisms, and this typically means that the host must permit some degree of invasion through its own tissues. These are just a few of the nonimmunological hurdles that invasive organisms must jump over, if they are to infect an organism. Every step in the pathogenesis of infectious disease provides another therapeutic opportunity. As we learn more about the pathways of development of infectious diseases that have become increasingly resistant to antibiotics, we will come to rely on Precision Medicine to prevent, diagnose, and treat infections.

- Jules Berman

key words: precision medicine, infectious disease, biological steps, pathogenesis, jules j berman Ph.D., M.D.

Tuesday, February 13, 2018

Non-immunologic Causes of Increased Susceptibility to Disease

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the sixth:

Cellular defects that have no direct connection to immunity may increase susceptibility to infectious organisms.

If we want to understand why certain individuals are susceptible to infections and other individuals are not, we must understand that immune deficiencies cannot account for all infections. Infectious diseases, just like any other disease, develop in steps, and it stands to reason that there must be many different pathways through which those steps can be enhanced or blocked. Theory aside, what is the actual evidence that susceptibility to infectious diseases arises through deficiencies unrelated to the immune system?

  • Time and again, we encounter serious infections from organisms thought to be nonpathogenic, occurring in immunocompetent individuals [48–51].

  • Not everyone with an immune deficit will succumb to an infectious disease, implying that these individuals are protected by resistance mechanisms other than immunity.

  • We know of various genetic conditions that increase our susceptibility to infectious diseases, and some of these genetic flaws have nothing to do with the adaptive (i.e., antibody-forming) immune system. For example, children with sickle cell disease or congenital asplenia will have a heightened susceptibility to invasive pneumococcal diseases [52]. Otherwise-normal children with IRAK4 or NEMO gene mutations will also have a high risk of invasive pneumococcal disease [52]. The IRAK4 and NEMO genes code for proteins involved in the phagocytosis of bacteria by splenic macrophages. Likewise, in mice, natural resistance to infection is influenced by the Bcg gene, which affects the early phagocytosis and destruction of intracellular organisms by macrophages [53]. As a final example, both humans and zebrafish that have mutations that reduce the synthesis of a proinflammatory leukotriene have heightened susceptibility to Mycobacterium tuberculosis [54]. It is easy to find examples of nonimmunologic mechanisms for susceptibility to infections [55,56].

- Jules Berman

key words: precision medicine, immune system, susceptibility to disease, non-immunologic, jules j berman Ph.D., M.D.

Monday, February 12, 2018

Infection without Disease (from Precision Medicine and the Reinvention of Human Disease)

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the fifth:

Normal defenses can block every infectious disease. Hence, every infectious disease results from a failure of our normal defenses, immunologic and otherwise.

For any given infectious agent, no matter how virulent it may seem, there are always individuals who can resist infection. Moreover, as a generalization, the majority of individuals who are infected with a pathogenic microorganism will never develop any clinically significant disease [42].

As one example, Naegleria fowleri is often found in warm freshwater. Swimmers in contaminated waters may develop an infection that spreads from the nasal sinuses to the central nervous system, to produce an encephalitis that is fatal in 97% of cases [43]. Despite the hazard posed by Naegleria, health authorities do not generally test freshwater sources to determine the presence of the organism. Do not expect to find warning signs posted at swimming holes announcing that the water is contaminated by an organism that produces a disease that has a nearly 100% fatality rate. It is simply assumed that anyone who spends any time around freshwater will eventually be exposed to Naegleria. As it happens, although many thousands of individuals are exposed each year to Naegleria in the United States, only a few cases of Naegleria encephalitis occur in this country. In fact, since Naegleria was recognized as a cause of encephalitis, in 1965, fewer than 150 cases have been reported [44]. Most of the reported cases have occurred in children and adolescents and are associated with recreational water activities [45,46]. The children who develop Naeglerian encephalitis, though exhibiting no signs of immune deficiency, are nonetheless susceptible to Naegleria. What makes these children different from all the other children and adults who were exposed to the same organisms?

Neisseria meningitidis, a cause of bacterial meningitis, can be cultured from nasal swabs sampled from the general population. If N. meningitidis were a primary pathogen, then why doesn’t it cause disease in the vast majority of infected individuals? If N. meningitidis were an opportunistic infection, then why does it typically cause disease in healthy college-age individuals (not immunocompromised individuals)? If this organism is neither a primary pathogen nor an opportunistic pathogen, then what kind of pathogen is it? More importantly, why is N. meningitidis a potentially fatal pathogen in some individuals and a harmless commensal in others [47]?

Organisms that were formerly thought to be purely pathogenic are now known to frequently live quietly within infected humans, without causing symptoms of disease. For example, parasites such as the agents that cause Chagas disease, leishmaniases, and toxoplasmosis are commonly found living in apparently normal individuals. Viruses, including the agents that cause herpes simplex infections and infections by hepatitis viruses B and C, can be found in healthy individuals. Mycobacterium tuberculosis can infect an individual, produce a limited pathologic reaction in the lung, and remain in the body in a quiescent state for the life of the individual. In fact, it has been estimated that about one out of three individuals, worldwide, is infected with Mycobacterium tuberculosis, and will never suffer any consequences. Luckily, asymptomatic carriers of tuberculosis, in whom there is no active pulmonary disease, are noninfective. Staphylococcus aureus, a bacterial pathogen that is known to produce abscesses, invade through tissues, and release toxins, is also known to circulate in the blood, without causing symptoms, in a sizeable portion of the human population [40].

We now know that potentially virulent organisms are normally tamed within our bodies. Hence, the root cause of every clinical infection is a deficiency in the defenses of particular subpopulations of individuals.

- Jules Berman

key words: precision medicine, commensals, symbiotes, symbiotic, host organisms, latent infection, jules j berman Ph.D. M.D.

Sunday, February 11, 2018

Cellwise, We Are Mostly Inhuman

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the fourth:

Most of the cells residing in human bodies are nonhuman

There are about 10 times as many nonhuman cells living in our bodies as there are human cells [40]. The human intestines alone contain 40,000 different species of bacteria [9]. These 40,000 species contain about 9 million different genes. Compare that with the paltry 23,000 genes in the human genome, and we quickly see that we Homo sapiens contribute very little to the genetic diversity of the human body’s ecosystem.

- Jules Berman

key words: precision medicine, commensals, symbiotes, symbiotic, host organisms, jules j berman Ph.D. M.D.

Saturday, February 10, 2018

Genome-Specific Responses to Infection

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the third:

A good portion of the genes in humans (perhaps 10%) are involved in responses to infectious organisms.

It has been estimated that over 1000 human genes are involved in inflammation pathways [37]. Several studies have shown that following an inflammatory challenge, or the introduction of a pathogen, more than a hundred genes are activated [38–40]. The activated genes include some of the same genes that have been associated with autoimmune diseases, suggesting that these disease-associated genes are conserved because they have a beneficial role, protecting us from invading pathogens [39]. The genetic profile of genes activated by inflammation is very similar from human to human, but quite dissimilar from the profile of genes activated by inflammation in the mouse [41]. This would suggest that species develop their own genome-wide responses to agents that cause inflammation (e.g., invading organisms).

- Jules Berman

key words: precision medicine, evolution, virus, viral, jules j berman Ph.D. M.D.

Friday, February 9, 2018

Vertebrate Evolution Driven by DNA from Infectious Organisms

A prior post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the second:

Some of the key steps in the development of vertebrate animals, and mammals in particular, have come from DNA acquired from infectious organisms.

The human genome has preserved its viral ballast, at some cost. At every cell division, energy is expended to replicate the genome, and the larger the genome, the more energy must be expended. Why do we spend a large portion of the energy required to replicate our genome, on inactive sequences, of viral origin? Why doesn’t our genome simply eject the extra DNA, a biological process that is commonplace in the evolution of obligate intracellular parasitic organisms? Maybe it's because we use viral genes to our own advantage.

Two evolutionary leaps, benefiting the ancestral classes of humans, and owed to the acquisition of viral genes, include the attainment of adaptive immunity and the development of the mammalian placenta. Let’s take a moment to see how these innovations came about.

Adaptive immunity evolved at about the same time that jawed vertebrates first appeared on earth. The crucial gene responsible for the great leap to adaptive immunity, the recombination activating gene (RAG), was stolen from a retrovirus. To understand the pivotal evolutionary role of RAG, we need to review a bit of high school biology. The adaptive immune system responds to the specific chemical properties of foreign antigens, such as those that appear on viruses and other infectious agents. Adaptive immunity is a system wherein somatic T cells and B cells are produced, each with a unique and characteristic immunoglobulin (in the case of B cells) or T-cell receptor (in the case of T cells). Through a complex presentation and selection system, a foreign antigen elicits the replication of a B cell whose unique immunoglobulin molecule (i.e., its antibody) matches the antigen. Secretion of matching antibodies leads to the production of antigen-antibody complexes that may deactivate and clear circulating antigens, or may lead to the destruction of the organism that carries the antigen (e.g., virus or bacteria).

To produce the many unique B and T cells, each with a uniquely rearranged segment of DNA that encodes specific immunoglobulins or T-cell receptors, recombination and hypermutation must take place within a specific gene region. This process yields on the order of a billion unique somatic genes, and requires the participation of recombination activating genes (RAGs). The acquisition of a recombination activating gene is presumed to be the key evolutionary event that led to the development of the adaptive immune system present in all jawed vertebrates (gnathostomes). Before the appearance of the jawed vertebrates, this sort of recombination was genetically unavailable to animals. Our genes simply were not equal to the task. Retroviruses, however, are specialists at cutting, moving, and mutating DNA. Is it any wonder that the startling evolutionary leap to adaptive immunity was acquired from retrotransposons? Thus, we owe our most important defense against infections to genetic material retrieved from the vast trove of retrovirally derived DNA carried in our genome [33]. As one might expect, inherited mutations in RAG genes are the root causes of several immune deficiency syndromes [34,35].

Many millions of years later, vertebrates acquired another gene that did much to enable the evolution of all mammals. Members of Class Mammalia are distinguished by the development of the placenta, an organ that grows within the uterine cavity (i.e., the endometrium). After birth, the placenta must detach from the uterus. You can imagine the delicate balancing act between attaching firmly to the wall of the uterus and detaching cleanly from the wall of the uterus. During placental development, large, flat cells called cytotrophoblasts form the interface between placenta and uterus. To create the thin membrane that borders the lining of the uterus and that borders the blood received from the uterus in the spaces between the placental villi, the cytotrophoblasts must somehow fuse into a syncytium (i.e., multinucleate collections of cells that have fused together by dissolving their individual cytoplasmic membranes).

There is one task at which all animals excel: maintaining a clear separation between one cell and another. In point of fact, the most distinctive difference between animal cells and all other cells of eukaryotic origin happens to be the presence of cell junctions, whose purpose is to bind cells to one another without fusing cells. This being the case, you can see that the normal direction of animal evolution would preclude the appearance of a gene intended to form a huge syncytium of placental cells. Whereas animal cells are failures at fusion, viruses are champions. One of the most often-deployed methods by which viruses invade cells is through fusion at the cytoplasmic membrane. It happens that retroviral envelope genes, preserved in the human genome, do a very good job at fusing membranes. Animals captured a retroviral fusogenic envelope gene and inserted it into one of the first syncytin molecules involved in the development of the placenta. Apparently, this acquisition worked out so well for mammals that later-evolving mammalian classes made their own retrovirus gene acquisitions to obtain additional syncytins, thus refining the placenta for their own subclasses [23,36].

- Jules Berman

key words: precision medicine, evolution, virus, viral, jules j berman Ph.D. M.D.

Thursday, February 8, 2018

RELIC DNA IN THE HUMAN GENOME

Yesterday's post listed 7 assertions regarding the role of infectious organisms on the human genome. In the next few blogs we'll look at each assertion, in excerpts from Precision Medicine and the Reinvention of Human Disease. Here's the first:

A significant portion of the human genome consists of relic DNA derived from ancient invasive organisms.

About 8% of our genome is derived from sequences with similarity to known infectious retroviruses, and these longer sequences can usually be recognized by their contained subsequences (e.g., gag, pol, and env genes) and long terminal repeats. The viral sequences in our genomes are the remnants of ancient retroviral infections, and the occasional nonretroviral infection, that were branded into DNA, and subsequently amplified [21–23]. Because much of the endogenous retroviral load in the human genome is due to amplification, and subsequent mutation, it is hard to determine the number of retroviral species that established their niche in the human gene pool, but studies of these viral remains would suggest that we contain species from several dozen families of retroviruses, with an undetermined number of contributions from individual family members [24]. Based on comparisons of the viruses present in different species of primates, it would appear that the most recent acquisition of an endogenous retrovirus occurred in humans between 100,000 and 1 million years ago [25]. Most of the retroviral sequences in our genomes are inactivated due to an accumulation of degenerative mutations collected over the eons, indicating that there has been little or no selective pressure to conserve retroviruses in their pristine sequences.

- Jules Berman

key words: precision medicine, human genome, evolution, infectious diseases, jules j berman, Ph.D., M.D.

Wednesday, February 7, 2018

Infections have made their mark on the Human Genome

In the context of Precision Medicine, infections draw our attention because they have played an important role in the evolution of the eukaryotic genome. Over the next few blog posts, we will explore the following:

  • A significant portion of the human genome consists of relic DNA derived from ancient invasive organisms.
  • Some of the key steps in the development of vertebrate animals, and mammals in particular, have come from DNA acquired from infectious organisms.
  • A good portion of the genes in humans (perhaps 10%) are involved in responses to infectious organisms.
  • Most of the cells in the human body (at least 90%) consist of infectious organisms and commensals that have adapted to life within human hosts. [Glossary Commensal]
  • Normal defenses can block every infectious disease. Hence, every infectious disease results from a failure of our normal defenses, immunologic and otherwise.
  • Cellular defects that have no direct connection to immunity may increase susceptibility to infectious organisms.
  • By dissecting the biological steps involved in the pathogenesis of infectious disease, it is possible to develop new treatments, other than antibiotics, that will be effective against a range of related organisms.
Over the next few blogs, we'll do our best to justify each of these (as yet) unproven assertions.

- Jules Berman

key words: precision medicine, infections, evolution, resistance to infection, jules j berman Ph.D., M.D.

Tuesday, February 6, 2018

Precision Medicine and Public Health (from Precision Medicine and the Reinvention of Human Disease)

Excerpted from Precision Medicine and the Reinvention of Human Disease

Despite having the most advanced healthcare technology on the planet, life expectancy in the United States is not particularly high. Citizens from most of the European countries and the highly industrialized Asian countries enjoy longer life expectancies than the United States. According to the World Health Organization, the United States ranks 31st among nations, trailing behind Greece, Chile, and Costa Rica, and barely edging out Cuba [42]. Similar rankings are reported by the US Central Intelligence Agency [43]. These findings lead us to infer that access to advanced technologies, such as those offered by Precision Medicine, will not extend lifespan significantly.

Every healthcare professional knows that most of the deaths occurring in this country can be attributed to personal lifestyle choices: smoking, drinking, drug abuse, and over-eating. Lifestyle diseases account for the majority of deaths in the United States and in other western countries, these being: heart disease, diabetes, obesity, and cancer. Population-based trials that seek to improve the ways in which individuals live, by introducing a daily exercise routine, healthy diet, and cigarette abstinence, have yielded huge benefits, in terms of extending average lifespans [44]. At the front end of the human life cycle, it has been demonstrated that infant mortalities can be markedly reduced with simple measures, focusing on improved maternal education [45]. It has been credibly argued that clean water, clean air, clean housing, clean food, and clean living yield greater societal benefits than clean operating rooms [46,47]. If this be the case, should we be investing heavily in Precision Medicine, when simple, low-tech public health measures are likely to provide a greater return on investment, in terms of overall morbidity and mortality? In a certain sense, public health is the opposite of personalized medicine. Whereas personalized medicine involves finding the best possible treatment for individuals, based on their uniqueness, public health involves finding ways of treating whole populations based on their collective sameness. Let’s not dwell on these somewhat contrived philosophic points. Precision Medicine, as viewed in this book, is a new way of understanding human diseases. As such, Precision Medicine provides opportunities to advance both personalized medicine and public health.

Precision Medicine tells us that we should think of diseases as developmental processes, with each step in the process representing an opportunity for intervention. Perhaps the most important function of Precision Medicine will be to give society the opportunity to institute public health measures aimed at blocking the pathogenesis of human diseases. Here are just a few examples:

– Population screening for early stages of common diseases.

The successful reduction in deaths from cervical cancer demonstrates the effectiveness of screening for early stages of disease. Cervical cancer is a type of squamous cell carcinoma that develops at the junction between the ectocervix (the squamous lined epithelium) and the endocervix (the glandular lined epithelium) in the os of the uterine cervix of women. Before the introduction of cervical precancer treatment, cervical carcinoma was one of the leading causes of cancer deaths in women worldwide. Today, in many countries that have not deployed precancer treatment, cervical cancer remains the leading cause of cancer deaths in women [48–50]. In the United States, a 70% drop in cervical cancer deaths followed the adoption of routine Pap smear screening [51–53]. No effort aimed at treating invasive cancers has provided an equivalent reduction in the number of cancer deaths. [Glossary Age-adjusted incidence, Pap smear] Today, we know that cervical carcinogenesis begins with a localized infection by one of several strains of human papillomavirus, transmitted during sexual intercourse by an infected male partner. In the late 1940s (and really up until the early 1980s), the viral etiology of cervical cancer was unknown. We did know that squamous cells sampled from the uterine os had highly characteristic morphologic appearances that preceded the development of invasive cancer. Thanks largely to the persistence of Dr. Papanicolaou and his coworkers, a standard screening test, known as the Pap smear, was developed to detect cervical precancers. If precancerous changes were found in a smear, a gynecologist could remove a superficial portion of the affected epithelium, and this would, in the vast majority of cases, stop the cancer from ever developing.

Morphologic and epidemiologic observations on Pap smears provided clues that eventually led to the identification of several strains of human papillomavirus as the major causes of cervical cancer. Today, a vaccine protective against carcinogenic strains of human papillomavirus is available [54].

As discussed in Precision Medicine and the Reinvention of Human Disease, Section 7.5, “What Is Precision Diagnosis?” new biomarkers are being developed for the early stages of disease, often preceding the development of any clinical symptoms. In general, diseases are easiest to treat in early stages, before they have had the chance to do any harm to organs. For example, precancers can often be effectively treated by excision, or, in some cases, by withdrawal of the agents that would otherwise lead to the progression of the precancer to the cancerous stage (e.g., cessation of hormonal replacement therapy to block breast cancer, cessation of smoking to block lung cancer, treatment of Helicobacter pylori infection to block MALToma).

We can hope that future advances in the field of Precision Medicine will identify the intermediate stages of development for common diseases. With this information, public health measures aimed at detecting and blocking diseases, in an early stage of development, can be deployed.

– The aggressive prevention and treatment for the most common patterns of diseases that lead to death

As discussed in Precision Medicine and the Reinvention of Human Disease, Section 2.3, “Cause of Death,” a well-composed death certificate contains a thoughtful sequence of medical conditions that develop over time, and that ultimately lead to the death of the patient. This data, if properly recorded and aggregated into a mortality database, should provide the most frequently occurring chains of events that account for human deaths. A public health effort aimed at breaking the early steps of these processes has the potential of extending the life expectancy of the population.
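As a rough illustration of the aggregation step described above, here is a minimal Python sketch. It assumes each death certificate has already been reduced to an ordered tuple of conditions, from underlying cause to immediate cause. The sample records, the condition names, and the function names are all hypothetical, invented only to show the counting; they are not real mortality data or any standard coding scheme.

```python
from collections import Counter

# Hypothetical toy records (NOT real data): each tuple is the ordered chain of
# conditions on one death certificate, from underlying to immediate cause.
certificates = [
    ("diabetes", "chronic kidney disease", "sepsis"),
    ("smoking", "COPD", "pneumonia"),
    ("diabetes", "chronic kidney disease", "sepsis"),
    ("hypertension", "stroke"),
    ("smoking", "COPD", "respiratory failure"),
]


def most_common_chains(records, top=3):
    """Count identical cause-of-death chains and return the most frequent."""
    return Counter(records).most_common(top)


def most_common_first_steps(records, top=3):
    """Count the earliest recorded condition in each non-empty chain."""
    return Counter(chain[0] for chain in records if chain).most_common(top)


if __name__ == "__main__":
    for chain, count in most_common_chains(certificates):
        print(count, " -> ".join(chain))
    for condition, count in most_common_first_steps(certificates):
        print(count, condition)
```

Counting the first step separately from the whole chain is a design choice made for this sketch: the earliest conditions are the ones a public health effort would most want to interrupt, and they may be shared by chains that diverge later.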

– Aggressive screening for carriers of infectious diseases

As discussed in Section 6.2, “Our Genome Is a Book Titled ‘The History of Human Infections,’” organisms that were formerly thought to be purely pathogenic are now known to frequently live quietly within infected humans, without causing symptoms of disease. These include the organisms that cause Chagas disease, leishmaniases, toxoplasmosis, and tuberculosis; viruses such as herpesviruses and hepatitis B and C viruses; and bacterial organisms, some of which circulate in the blood without causing disease under normal circumstances.

Sensitive diagnostic techniques, including genome sequencing of DNA in blood, may provide us with the opportunity to perform population screening for organisms that are opportunistic pathogens, or that produce long-term damage to carriers, or that are transmissible from carriers.

– Finding targets for vaccines that confer effectiveness against more than one target organism

Thanks in no small part to Precision Medicine, we are learning that organisms play a role in many diseases that were once thought to have no infectious component. In particular, it is now widely accepted that infections contribute to at least one-fifth of all cancers occurring in humans. Examples of cancer-causing organisms are:

– Epstein-Barr virus (B-cell lymphomas, Burkitt lymphoma, nasopharyngeal cancer, Hodgkin disease, and T-cell lymphomas)
– Hepatitis B virus (hepatocellular carcinoma)
– Human papillomavirus types 5, 8, 14, 17, 20, and 47 (skin cancer)
– Human papillomavirus types 16, 18, 31, 33, 35, 39, 45, 52, 56, 58 (cervical cancer, anogenital cancer)
– Human papillomavirus types 6 and 11 (verrucous carcinoma)
– Human papillomavirus types 16, 18, 33, 57, 73 (cancers of oral cavity, tongue, larynx, nasal cavity, and esophagus)
– Merkel cell polyomavirus (MCPyV) (Merkel cell carcinoma)
– HTLV-1 (adult T-cell leukemia)
– Human herpesvirus 8 (Kaposi sarcoma)
– Hepatitis C virus (hepatocellular carcinoma and low-grade lymphomas)
– JC, BK, and SV40-like polyomaviruses (brain tumors, pancreatic islet tumors, and mesotheliomas)
– Human endogenous retrovirus HERV-K (seminomas and germ cell tumors)
– Schistosomiasis (squamous cell carcinoma of the bladder)
– Opisthorchis viverrini and Clonorchis sinensis, flatworms (flukes) found in Southeast Asia (cholangiocarcinoma)
– Helicobacter pylori (gastric MALToma, or mucosa-associated lymphoid tissue lymphoma) [55]

Carcinogenic viruses profoundly influence the number of cancer deaths worldwide. These include hepatitis B virus (associated with an increased incidence of hepatocellular carcinoma) and human papillomavirus (which causes cervical cancer). Liver cancer is the third leading cause of cancer deaths worldwide, accounting for 611,000 deaths in 2000 [50]. The importance of vaccine development for infections that contribute to chronic diseases and cancers cannot be overstated. As we learn more about the biological steps involved in the infection process, there is hope that vaccines and preventive drugs will be developed that target different types of organisms, based on shared properties of infection, invasion, immunologic resistance, persistence, or phylogeny, as discussed in Precision Medicine and the Reinvention of Human Disease, Section 4.4, “Pathway-Directed Treatments for Convergent Diseases” [56–60].

- Jules Berman

key words: public health, prevention, precision medicine, cancer, cancer vaccines, jules j berman, Ph.D., M.D.