Tuesday, 22 December 2020

Myxovirus Resistance Protein A (MxA) & Antibodies to Nuclear Matrix Protein-2 (NXP-2) in Dermatomyositis Sine Dermatitis

There has long been debate about the most sensitive and specific marker distinguishing Dermatomyositis (DM) from Polymyositis (PM) or Inclusion Body Myositis (IBM), and about whether the anti-synthetase syndromes (ASS) should be included under DM.

Where subjects present with a skin rash and muscle weakness plus one of the 5 DM-specific autoantibodies, i.e. Mi-2, NXP-2, MDA-5, TIF-1gamma or SAE, the diagnosis of DM is straightforward. Given the difficulty in obtaining muscle biopsies within the NHS, particularly in DGHs, most Rheumatologists would settle for a diagnosis of DM in this scenario, especially if the EMG too is characteristic of a myopathy.

But what if DM presents without a skin rash, as it can in approximately 8% of subjects? The distinction between DM on one hand and PM, IBM and ASS on the other is more than academic, firstly because of the higher association of cancer with the former, and secondly because of increasing reports of efficacy of JAK inhibitors for DM but not the other conditions.

In this situation, two tests can be useful. The first requires a muscle biopsy, and the second is included in the DM specific autoantibody panel.

First, the staining of the sarcoplasm of muscle for Myxovirus Resistance Protein A (MxA) is highly sensitive (76%) and 100% specific for DM. This is a Type I interferon induced protein and therefore an interferon signature. It is not seen in PM, IBM or ASS. Unlike the latter three, DM is an interferonopathy (which is why it responds to JAK inhibitors).

While other interferon signatures such as RIG-I and ISG-15 are also quite specific for DM, as indeed are the muscle biopsy findings of perifascicular atrophy (PFA) and deposition of membrane attack complex on capillaries, none of these is as sensitive as MxA. (While ASS also displays PFA on biopsy, the characteristic necrosis and regeneration of perifascicular fibres set it apart from the mainly degenerative perifascicular fibres seen in DM.)

The other useful marker of Dermatomyositis sine (without) dermatitis is the DM-specific antibody NXP-2. This is seen in fully 86% of subjects who have DM without skin involvement at presentation, but in only 28% of subjects who have DM with rash. Therefore, a subject presenting with myositis but no rash and a positive NXP-2 should be treated as having DM rather than PM. In a minority of such subjects, a typical rash may appear after many months or even years.

DM sine dermatitis should not be confused with amyopathic DM, which is characterised by rash and lung involvement without muscle involvement. Most such subjects are positive for MDA-5 and have a severe lung phenotype.

References

1. https://jamanetwork.com/journals/jamaneurology/fullarticle/2764337

2. https://pubmed.ncbi.nlm.nih.gov/30267437/

Sunday, 20 December 2020

The New VUI-202012/01 COVID-19 Variant Found in the United Kingdom

The UK has just tightened its COVID tiers based on a fast-spreading variant of the virus picked up by the COVID-19 Genomics Consortium. So far, 1,108 cases with this variant have been described. Apparently it is 70% more infectious than the prevalent D614G strain. Several European countries have imposed summary bans on flights originating in the UK as a result.

It is fair to say that, as an RNA virus, COVID-19 mutates continuously. A WHO analysis found that the mutation rate for COVID-19 is 1.12 x 10^-3 mutations per site per year, quite similar to the range of 0.80 x 10^-3 to 2.38 x 10^-3 per site per year for the original SARS virus in 2003.

To put this into some context, the mutation rate in human beings is 1.2 x 10^-8 per site per generation, which, across a diploid genome of some 6 x 10^9 bases, translates to around 72 new mutations in a newborn. Mutagenesis is therefore an inevitable vicissitude of the genetic code.

You may be aware that the current circulating clade- D614G- originated in China in January 2020 and replaced the existing clade within 3 months. It is likely that VUI-202012/01 will become the dominant strain if allowed to spread.

Concern about emergence of new strains is reflected in the Danish government's recent decision to cull 17 million minks because they harboured a variant of the virus that was apparently not well neutralised by existing human antibodies.

The new variant VUI-202012/01 carries 17 new mutations, the most important of which is N501Y in the spike protein (meaning that tyrosine has replaced asparagine at position 501 of the spike protein).

Looking at the DNA code, asparagine has 2 codons- AAC and AAT. Tyrosine also has two- TAC and TAT. The new strain is therefore due to an A to T transversion at the first position of the putative codon.

When a purine is replaced by a purine, or a pyrimidine by a pyrimidine, it is called a transition. Crossovers between purine and pyrimidine are called transversions. In general, mutations caused by transitions outnumber those due to transversions manyfold. The commonest mutations are C to T transitions, as cytosine bases are prone to methylation at the 5 position of the ring, and methylated cytosines are then spontaneously deaminated to thymine. Thus, the WHO database for COVID-19 in February 2020 showed 1,670 C to T transitions compared with only 128 A to T transversions.
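
The distinction is mechanical enough to express in a few lines of code. A toy Python sketch (the purine/pyrimidine classes are standard chemistry; the function and example are purely illustrative):

```python
# Classify a point mutation as a transition or a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify(ref: str, alt: str) -> str:
    """Transition: purine -> purine or pyrimidine -> pyrimidine.
    Transversion: any crossover between the two classes."""
    if ref in PURINES and alt in PURINES:
        return "transition"
    if ref in PYRIMIDINES and alt in PYRIMIDINES:
        return "transition"
    return "transversion"

# N501Y: an asparagine codon (AAT) becomes a tyrosine codon (TAT),
# i.e. A -> T at the first codon position.
print(classify("A", "T"))  # transversion
print(classify("C", "T"))  # transition (the commonest class of mutation)
```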

Thus, it is fair to say that the N501Y mutation in the VUI-202012/01 has persisted because it offers a survival advantage- i.e. infectivity, much as the G614 variant was more infective than the D614 variant. However, it must be said that there is no evidence that it is more dangerous. If anything, the D614G carried a lower mortality than the original strain, although this may have been due to better established treatment protocols.

Nor is there any evidence to suggest that the current vaccines will be less effective against the new strain. The current vaccines were in fact formulated against the original COVID strain and are just as effective against the D614G strain. This is because protective antibodies target several epitopes, and unless there is a significant conformational change in the shape of a protein, the vaccine-generated antibodies will still neutralise the virus.

I do think that this particular variant will take over despite the restrictions- just as D614G did. I also think there need be no undue cause for concern. With rapidly executed vaccination programmes, it should be possible to control the virus by spring.


Thursday, 17 December 2020

Getting The Most Out of Azathioprine

Azathioprine is widely used in Rheumatology for conditions such as vasculitis and Lupus, and by some in Rheumatoid Arthritis. It certainly has a place in gastroenterology for the management of ulcerative colitis and Crohn's disease, where it is more effective than 5-ASA for maintenance of remission.

However, the use of Azathioprine is beset with problems. Around 10% of subjects have hypersensitivity reactions to this drug, comprising fever, nausea and diarrhoea, fatigue, malaise and myalgia, mandating rapid discontinuation. A further 25-30% have side effects such as hepatotoxicity in the form of transaminitis or myelotoxicity manifesting as neutropenia. As a result, fully 40% of subjects that start azathioprine do not continue with the drug.

Observance of some simple principles can obviate these difficulties. The first of these is widely practised: measuring TPMT levels before starting azathioprine. In the general population, 89% of subjects are homozygous for high metabolism of azathioprine (high TPMT), 11% are heterozygous and 0.3% have low levels of TPMT.

It is common practice not to use azathioprine in subjects with lowish TPMT- these are the heterozygotes. This is a missed opportunity, as these are the very subjects likely to respond best to the drug. Thus, people with TPMT levels >25 U/ml need no dose adjustment (below 25 U/ml, halve the dose and monitor more frequently). Conversely, levels above 65 U/ml, although reassuring on the face of it, are likely to be associated with treatment failure.
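
To make those cut-offs concrete, here is a minimal sketch of the decision logic in Python, using only the thresholds quoted above (25 and 65 U/ml); it is an illustration of the reasoning, not a clinical algorithm:

```python
def tpmt_advice(tpmt_u_per_ml: float) -> str:
    """Map a TPMT activity level to the approach described in the text.
    The 25 and 65 U/ml thresholds are those quoted above; subjects with
    near-absent TPMT (homozygous deficient, ~0.3%) should avoid thiopurines."""
    if tpmt_u_per_ml < 25:
        return "Halve the dose and monitor more frequently (likely heterozygote)"
    if tpmt_u_per_ml <= 65:
        return "Standard dosing with routine monitoring"
    return "Very high TPMT: standard dosing, but anticipate possible treatment failure"

print(tpmt_advice(20))  # heterozygote range
print(tpmt_advice(80))  # rapid metaboliser
```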

Azathioprine is a prodrug of 6-MP. Most of the ingested azathioprine is non-enzymatically cleaved to 6-MP in the liver. Yet there is a marked unfamiliarity with 6-MP amongst Rheumatologists, and perhaps to a lesser extent among Gastroenterologists. Several studies show that of subjects with hypersensitivity to azathioprine, nearly 70% tolerate a switch to 6-MP. The dose of 6-MP is half that of azathioprine (1-1.5 mg/kg body weight, rather than 2-2.5 mg/kg). Subjects with flu-like reactions, nausea, emesis, myalgias and arthralgias on azathioprine are likely to be able to switch successfully to 6-MP. OTOH, those with hepatotoxicity and pancreatitis are likely to have the same problems with 6-MP.

The converse does not apply. Thus, subjects who are intolerant to 6-MP should not be switched to azathioprine.

While most practitioners are aware of the importance of measuring TPMT before commencing thiopurines (azathioprine or 6-MP), there is less awareness of the considerable usefulness of measuring thiopurine metabolites. Two metabolites are measured by most labs- 6-Thioguanine nucleotide (6-TGN) and 6-Methyl mercaptopurine (6-MMPN). The metabolite 6-MMPN is produced by a competing methylation pathway (please see diagram). It is the blood level of 6-TGN that indicates efficacy of azathioprine. Within a blood 6-TGN range of 235-450 pmol/8 x 10^8 RBC (send whole blood in an EDTA tube, just like TPMT), azathioprine is likely to be effective. Below 235, efficacy is likely to be low. This could be due to 2 reasons- non-compliance, or the fact that not enough azathioprine is being metabolised to the active metabolite 6-TGN. In the latter case, blood 6-MMPN levels will be high (reference range 0-5700).

The combination of low 6-TGN and high 6-MMPN levels presages treatment failure and hepatotoxicity, and is described as azathioprine resistance. Unlike with non-compliance (where 6-TGN is low and 6-MMPN is normal), increasing the dose of azathioprine where resistance exists is only likely to lead to hepatotoxicity without increasing efficacy.
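
Purely as an illustration, that interpretive logic can be sketched as follows, using the ranges quoted above (235-450 for 6-TGN, an upper limit of 5700 for 6-MMPN); the function and its wording are my own:

```python
def interpret_metabolites(tgn: float, mmpn: float) -> str:
    """Interpret thiopurine metabolites (both in pmol/8 x 10^8 RBC),
    using the reference ranges quoted in the text."""
    if tgn < 235 and mmpn > 5700:
        return "Thiopurine resistance: dose escalation risks hepatotoxicity"
    if tgn < 235:
        return "Subtherapeutic 6-TGN: consider non-compliance or underdosing"
    if tgn <= 450:
        return "Therapeutic 6-TGN: continue current dose"
    return "High 6-TGN: consider dose reduction (watch for myelotoxicity)"

print(interpret_metabolites(tgn=150, mmpn=9000))  # the resistance pattern
```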

Keep in mind though that unlike hepatotoxicity, isolated neutropenia is a surrogate marker for the effectiveness of azathioprine. Here, the 6-TGN levels are likely to be high, and may require a reduction in dosage, rather than discontinuation, as with transaminitis. (If hepatotoxicity and neutropenia occur together, discontinue the drug).

There are anecdotal reports that in subjects with high 6-MMPN levels, splitting the dose of azathioprine is likely to improve efficacy and reduce toxicity, while lowering the level of 6-MMPN and maintaining that of 6-TGN. However, this is based on observations from a single study.

It is fair to say that we could be using azathioprine a lot more effectively than we currently do.




Figure. Thiopurine metabolic pathway. The metabolic pathway for AZA and 6-MP is shown in the diagram. AZA: Azathioprine; 6-MP: 6-mercaptopurine; 6-TU: Thiouric acid; 6-MMP: 6-methylmercaptopurine; 6-MMPR: Methyl-mercaptopurine ribonucleotide; TXMP: 6-thioxanthosine monophosphate; 6-TGN: Thioguanine nucleotide; 6-TG: Thioguanine; 6-TGDP: 6-thioguanine diphosphate; 6-TGTP: 6-thioguanine triphosphate; XO: Xanthine oxidase; TPMT: Thiopurine methyltransferase; HPRT: Hypoxanthine phosphoribosyl transferase.


Reference:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3208360/


Sunday, 13 December 2020

The Purinergic Pathway in Inflammation, Heart Disease & Cancer

So you drink a cup of coffee because you are tired, and it relieves your fatigue and headache. How does it do that?

Or for that matter, how do methotrexate and sulfasalazine relieve inflammation in rheumatoid arthritis or inflammatory bowel disease? How do clopidogrel and dipyridamole work? And what are the most important cellular markers of effective adoptive cell transfer therapy?

The answer to all these questions lies in the all-important purinergic system. This is the dance between the nucleotides ATP and ADP on one hand and adenosine on the other. They often have diametrically opposite effects in health and disease.

ATP is of course the currency of energy in eukaryotic cells. However, here we are referring to extracellular ATP, acting in a paracrine fashion on specific receptors. Such extracellular ATP may be released from necrotic cells, leave apoptotic cells through pannexin channels, or exit inflammatory cells such as neutrophils through connexin channels- connexin 37 & 44, specifically. It can also be packaged in vesicles and released from cells through exocytosis.

Whatever the origin of ATP, or of the closely related nucleotide ADP formed from it, they act through two groups of cell surface receptors. The first, called P2Y receptors, are metabotropic, i.e. G-protein coupled receptors that signal through intracellular second messengers. The second, called P2X receptors, are ionotropic, i.e. ligand gated ion channels that, on binding the nucleotide, allow an inward flux of calcium (mostly) or sodium, or an outward flux of potassium.

On the other side of the spectrum, sit P1 receptors, which ligate extracellular adenosine. These are also GPCRs. There are 4 types, adenosine receptor A1 (also called ADORA1), A2A(ADORA2A), A2B(ADORA2B) and A3(ADORA3). Of these, A2A and A2B are coupled to Gs and lead to increased levels of cellular cAMP, resulting in profound immunosuppression. A1 & A3 inhibit the formation of cAMP through Gi/o and are therefore generally immune promoting. While A1, A2A and A3 are high affinity adenosine receptors, A2B has low affinity, and is only stimulated under pathologic conditions such as high prevailing levels of adenosine in a hypoxic tumour microenvironment.

How is adenosine formed extracellularly? It is principally formed by the action of 2 sequential cell surface enzymes called CD39 and CD73. CD39 is an ectonucleoside triphosphate diphosphohydrolase, which converts ATP and ADP to AMP. CD73 is an ecto-5'-nucleotidase, which converts AMP to adenosine.

Adenosine can sometimes be generated from ATP and AMP by other enzymes, notably alkaline phosphatase, which has been described as a "promiscuous" enzyme.

Extracellular adenosine is short lived and pushed inside the cell through a couple of channels called Equilibrative Nucleoside Transporters 1&2 , also called ENT1 & ENT2.

Extracellular ATP is pro-inflammatory. It activates the NLRP3 inflammasome in neutrophils and monocytes. The resulting caspase-1 cleaves pro-IL-1 and pro-IL-18 into their active forms.

Adenosine, on the other hand is anti-inflammatory. It reduces inflammation through the A2A receptors present on neutrophils and lymphocytes, by increasing levels of cAMP. Remember, A2A is a high affinity receptor for adenosine.

Methotrexate and sulfasalazine owe their anti-inflammatory effect at least partly to the fact that they stimulate CD73, which increases the formation of extracellular adenosine from AMP.

What of the A1, A2B and A3 receptors for adenosine? These have all been exploited pharmacologically. The heart blocking effect of adenosine in terminating SVT is exerted through the A1 receptor, while its vasodilating effect in cardiac stress testing is due to its action on A2B receptor on vascular endothelial cells. Stimulation of A3 receptors in non-pigmented cells in the anterior chamber of the eye leads to increased production of aqueous humour, and can be useful for treating sicca symptoms.

Dipyridamole inhibits the ENT1 & 2 channels, thus leading to increased levels of  extracellular adenosine. Hence its use in pharmacological cardiac stress testing.

As adenosine is formed from ATP and ADP, their levels vary inversely with each other. Thus, in inflammatory bowel disease, tissue hypoxia leads to increased production of HIF, which, acting as a transcription factor, stabilises the promoters for CD73 and A2B. A similar transcription factor called Sp1 binds to and stabilises the promoter for CD39. The net result is an increase in extracellular adenosine and a reduction in ATP. This leads to reduction in inflammation, both due to a fall in extracellular ATP levels and stimulation of A2A receptors on neutrophils and lymphocytes by adenosine. The latter also stimulates A2B receptors and maintains epithelial integrity, presumably by promoting healing through augmented blood flow.

The purinergic system, in particular ADP, plays an important role in platelet function. Thus, ADP stimulates P2Y1 receptors on platelets and, through the G-protein Gq, activates phospholipase C. The downstream effect of this is a change in the shape of platelets through the actin cytoskeleton. Similarly, ADP activates the P2Y12 receptor, which, through the intermediation of the Gi G-protein, switches off adenylyl cyclase, decreases cAMP and activates the GPIIb/IIIa receptor, thus facilitating the binding of platelets to fibrinogen and resulting in platelet aggregation.

Clopidogrel is a P2Y12 inhibitor. It is a prodrug and needs to be activated in the liver. This activation step can be impaired by certain mutations (notably in CYP2C19), reducing the efficacy of clopidogrel in affected subjects.

Stimulation of A2A and A2B receptors leads to platelet inhibition, explaining the efficacy of dipyridamole in prevention of ischaemic stroke.

While extracellular adenosine is regarded as a safety signal in ischaemia and reperfusion, where it reduces inflammation and tissue damage, it can have the opposite effect in cancer. In general, while its immunosuppressive effect on T-lymphocytes reduces autoimmunity, it can be a hindrance in fighting cancerous cells. In a recent paper in Nature, the investigators found that the subset of cancer sufferers who had the highest benefit from adoptive T cell therapy had a higher proportion of CD8+CD39-CD69- T-cells in the infusate. It is possible that increased expression of CD39 on T-cells leads to production of extracellular adenosine, immunosuppression and T-cell exhaustion, although this is yet to be confirmed. In general, adenosine is found in higher quantities in hypoxic tumour microenvironments, although this may be consequence rather than cause of tumour survival.

Finally, to the salutary effects of that cup of coffee. It is thought that caffeine reduces fatigue by inhibiting cerebral adenosine A2A receptors.

Wednesday, 25 November 2020

The Science Behind DNA Analysis of Archaeological Human Remains

DNA, even when well preserved, will not last more than a few hundred thousand years. Therefore, any description of DNA sequences in million-year-old dinosaur remains is due to contamination with modern ambient DNA, which is widespread in labs and elsewhere. PCR will amplify any DNA, modern or prehistoric.

There are however certain clues to the age of DNA remains. The first is fragmentation. Ancient DNA fragments are between 50 and 500 bases long due to degradation. In fact most are no longer than 100 nucleotides.

Second, and most reliable, is the observation that at both ends of the DNA molecule (5' and 3'), cytosine (C) is deaminated to uracil (U). This occurs after death, due to bacterial cytosine deaminase. The latter enzyme is not present in mammals (and is not to be confused with AID, or activation induced cytidine deaminase, which is present during development and plays a major role in the affinity maturation and somatic hypermutation of B cells, in effect giving B cells their antigen specificity).

Thus if archaeological remains have a high proportion of uracil bases in their DNA- remember, this base is normally only present in RNA- it indicates that the specimen is thousands of years old. Archaeologists use an enzyme called uracil-DNA glycosylase (UDG) to break the bond between the deoxyribose sugar and uracil and so remove the uracil bases.

Not doing so leads to a curious phenomenon. The PCR used to amplify the ancient DNA reads the uracil as thymine (T), since thymine, not uracil, is the native DNA base. Hence, on the complementary (anti-sense) strand, it puts an adenine in place of the guanine (which had paired with the original cytosine). The uracil on the sense strand is now replaced with thymine to pair with adenine. The original C-G pair has therefore been replaced by an A-T pair. Thus there is miscoding caused by the PCR process, manifesting as "transition" of C to T and G to A. (In contrast, when a purine is replaced by a pyrimidine base, or vice versa, it is described as "transversion".)
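
A small Python sketch makes the miscoding concrete. It simulates a polymerase that copies uracil as if it were thymine (the sequences are invented for illustration):

```python
# Post-mortem C -> U deamination surfaces in the PCR product as C -> T on
# the damaged strand and G -> A on the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}

def pcr_product(template):
    """Return (sense, antisense) of the amplified product; U is read as T."""
    sense = template.replace("U", "T")
    antisense = "".join(COMPLEMENT[b] for b in reversed(template))
    return sense, antisense

original = "GATCC"             # the sequence in life
damaged = original[:-1] + "U"  # terminal C deaminated to U after death

print(pcr_product(original))   # ('GATCC', 'GGATC')
print(pcr_product(damaged))    # ('GATCT', 'AGATC'): apparent C->T and G->A
```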

There can be blocking lesions caused by the formation of non-physiological chemical bonds between say DNA & protein, which can make it difficult for PCR to read through the ancient DNA. This is difficult to overcome.

Archaeologists have databases comprising thousands of single nucleotide polymorphisms (SNPs) mapped from certain lineages such as Armenian farmers or Iranian invaders which they then compare with the putative SNPs from the sequenced DNA sample to see if there is a match.

Here, mitochondrial DNA, which is of course maternally derived (the sperm does not contribute mitochondria), is more useful than nuclear DNA in lineage tracing. Although mitochondrial DNA shares nuclear DNA's predisposition to degradation with time elapsed since death, there are far more copies of a given DNA sequence in mitochondria than in chromosomes. While the nucleus has only two copies of a given allele (on the 2 chromosomes), each mitochondrion has 10-100 copies, and a cell may carry a mixture of mutant and wild-type copies- a phenomenon known as heteroplasmy, which explains why mitochondrial mutations cause such variable phenotypes, unlike mutations in chromosomes. Mitochondrial DNA is therefore favoured by archaeologists for lineage tracing.

Are there any other ways to determine where a long dead person originated, apart from DNA sequencing? As it happens, there is.

The isotopes of Strontium deposited in bone and teeth (dentine and enamel) during development are a reflection of the area where the person grew up. There are 4 isotopes of Strontium- Sr84, Sr86, Sr87 and Sr88. Sr87 is radiogenic: it is formed by beta decay of Rubidium 87. The rocks and water, and by extension the remains of a person or animal growing up in a certain area, will always display the specific Sr87:Sr86 ratio native to that area. When that person migrates to a different geographical area, his or her teeth, which are often well preserved thousands of years after death, will still carry the signature Sr87:Sr86 ratio of the area where he or she grew up, as the signature is established in the developing teeth and bones and does not change after death.

It is thus possible to say that a woman whose remains were found in the Harappan valley in fact grew up in Iran.

 

Saturday, 14 November 2020

Paradoxical Effect of Casein Kinase Inhibition in Deletion 5q Myelodysplastic Syndrome

Casein Kinase 1 (CK1) is common to the Wnt-beta catenin pathway and the Hedgehog pathway. Along with other kinases, it acts as an inhibitor in both pathways, reducing the transcription of target genes in response to Wnt and Hedgehog proteins respectively.

Myelodysplastic syndrome (MDS) due to deletion of 5q (5q del) has a unique phenotype, which can be explained by the differential action of CK1 on the Wnt-beta catenin pathway.

Subjects with 5q del have macrocytic anaemia and mild thrombocytosis with dysplastic, hypolobated megakaryocytes. MDS in 5q del is uniquely responsive to immunomodulators such as lenalidomide, although the anaemia does relapse after 2-3 years due to new mutations in other genes such as RUNX1.

Deletion of 5q leads to haploinsufficiency of the CSNK1A1 (CK1A1) gene located at 5q32. Lenalidomide inhibits the product of the remaining CK1A1 allele and reverses the phenotype of MDS. This is admittedly non-intuitive, as there seems to be an abrogation of the dose-response effect here: if haploinsufficiency leads to a phenotype, how can suppressing the remaining normal allele reverse that phenotype (rather than worsen it)?

This apparent paradox is explained by the effect of CK1 on the wnt-beta catenin pathway. Haploinsufficiency of CK1 removes some of the inhibition exercised on the wnt-beta catenin pathway. The latter leads to enhanced survival and proliferation of the neoplastic clone, causing MDS.

However, genetic knockdown in mice, or pharmacological inhibition of the remaining CK1A1 allele by lenalidomide in 5q del MDS sufferers, leads to complete disinhibition of the Wnt-beta catenin pathway. While this initially stimulates the haemopoietic progenitors of the neoplastic clone, continued stimulation soon leads to stem cell exhaustion and death of the progenitor cells, now bereft of any CK1 activity.


Wednesday, 11 November 2020

Reversal of The Earth's Magnetic Axis

The magnetic axes of celestial objects flip periodically, including complete reversals (by 180 degrees). For example, the sun's magnetic axis reverses direction every 11 years. However, the Earth's geomagnetic axis flips much less often. The last time it reversed completely was 780,000 years ago, a phenomenon called the Matuyama-Brunhes reversal after the two scientists who described it. Since then, the Earth's magnetic field has tried to flip on 10 occasions, but on each occasion it has reverted to its current axis.

It's fair to say therefore that such reversals happen very infrequently for the Earth. When it does happen though, it affects the polarity of magnetic material in lava flows, sea beds etc which can be detected in rocks and fossils. 

The Curie temperature is the temperature above which magnetic substances (which include all minerals containing iron, nickel and cobalt) lose their magnetic properties. It lies between 580 and 680 degrees Celsius for the oxides of iron, Fe2O3 and Fe3O4. Conversely, when cooled below this temperature, such materials regain their magnetism. Thus, igneous rocks have inherent magnetism dating back to the time when they were formed from cooling lava flows millions of years ago.

The reversal of Earth's magnetic axis is not instantaneous- it occurs slowly, over thousands of years. During this period, magnetic material in cooling magma will take up the reversed polarity of the Earth's field, and the rocks formed from it will reflect this reversed polarity forever. Geologists can therefore analyse such rocks, or fossils which have enclosed magnetic material in sea beds, and make a fair guess as to their age. This is one way of dating fossils or rocks.

Monday, 9 November 2020

How Significant are Pfizer-Biontech's COVID-19 Vaccine Results?

Pfizer has just announced that their COVID-19 mRNA vaccine has proven to be "90% effective" in preliminary phase 3 results. But what does this mean?

Let's assume that Pfizer randomised the vaccine and placebo arms in a 1:1 ratio. This is not always the case- it's quite usual for trials to use a 2:1 ratio for randomisation. However, 1:1 is the most intuitive scenario, so let's use it.

From figures given, we understand that 94 subjects have developed COVID-19 across both arms of the trial so far.

With around 44,000 participants, there would be 22,000 subjects in each arm. Using Pfizer's own figure of 90% efficacy, it follows that 9 subjects in the vaccine arm and 85 subjects in the placebo arm developed COVID during the study.

((x/22000) / ((94-x)/22000) = 0.10, which gives x ≈ 9.)

Plugging this into a 2x2 table:

                               Vaccine       Placebo

Disease                      9                   85

No Disease             21,991         21,915


Now apply the chi-squared test, at the conventional 5% significance level (95% confidence).

You thus get a p value of <0.00001.
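
For anyone wishing to reproduce the arithmetic, here is a minimal sketch in Python with scipy, under the same assumptions (1:1 randomisation, 22,000 per arm):

```python
from scipy.stats import chi2_contingency

# 2x2 table from above: rows = disease/no disease, columns = vaccine/placebo
table = [[9, 85],
         [21991, 21915]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")  # p is far below 0.00001

# Vaccine efficacy = 1 - (attack rate on vaccine / attack rate on placebo)
efficacy = 1 - (9 / 22000) / (85 / 22000)
print(f"efficacy ~ {efficacy:.0%}")  # roughly 89%, i.e. the quoted "90%"
```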

It's very likely that the results are not due to chance, i.e. they show genuine protection against the virus.

However, there is a caveat. The Pfizer mRNA vaccine needs to be stored at -80 degrees Celsius, requires a cold chain, and must be used quickly after thawing. That might be a difficult ask, particularly in countries such as India & Brazil.

Sunday, 8 November 2020

Relative Nutritional Availability of Nitrogen & Carbon Shapes Evolution in Oceans


The nitrogen and carbon contents of the DNA bases are inverse to each other. While the pyrimidine base cytosine contains one more nitrogen (N) than thymine, the latter contains one more carbon (C) than cytosine. The purines adenine and guanine contain equal numbers of each, so at the base-pair level, a G:C pair carries one more nitrogen, and one fewer carbon, than an A:T pair.

G:C pairs are therefore nitrogen rich, while A:T pairs are carbon rich.

Evolution of life in the oceans reflects the relative availability of N & C. Microbes living in shallower ocean waters have access to abundant carbon due to its fixation by photosynthetic organisms dwelling at the ocean surface; however, these microbes are relatively nitrogen poor. The reverse applies to bacteria and archaea living at depth, where photosynthesis is minimal and carbon is therefore scarce, but where heterotrophic bacteria have access to greater amounts of nitrogen from decaying plant and animal matter at the bottom.

This differential availability of nitrogen and carbon is reflected in the size and content of microbial genomes in oceans. Thus, microbes living in surface waters have smaller genomes due to the relative scarcity of nitrogen. Furthermore, surface microbes have a low GC content, compared with depth dwelling microbes, but are relatively enriched in AT, the carbon rich-nitrogen poor bases.

This of course has an effect on the absolute amount of protein synthesised by microbes. Exons are over-represented in GC rich areas of the genome. More exons mean more protein translated, and therefore more nitrogen used.

Nutritional constraints caused by the availability of nitrogen and carbon are reflected even more vividly in the proteins produced by ocean dwelling microbes, and indeed by all life forms in general. Redundancy of the genetic code means that there are multiple codons for most amino acids, and the choice of a favoured codon also reflects nutritional pressures. Thus, a point mutation in the favoured codon will almost always give rise to an amino acid with a similar nitrogen or carbon content to the original, rather than one with a higher nitrogen or carbon content. The non-favoured codons are not "used" by the mRNA (dictated by the genome from which the mRNA is transcribed), as point mutations there could give rise to more "expensive" amino acids, higher in either nitrogen or carbon content than the original. The genetic code is thus parsimonious in its nitrogen and carbon usage.

To illustrate, the amino acid threonine can be encoded by 4 triplet codes on DNA- ACC, ACA, ACG and ACT. A C to G transversion (where a purine is replaced by a pyrimidine or vice versa- in contrast to a transition, where purine replaces purine or pyrimidine replaces pyrimidine) at the second position of the triplet code will give rise to serine for ACT and ACC, but will produce arginine if the transversion occurs in ACG or ACA. Arginine is higher in both nitrogen and carbon content than threonine, while serine and threonine differ only in the position of an oxygen atom. In a ground-breaking paper published in Science, Shenhav & Zeevi found that in 187 oceanic microbial species, ACT was far more likely to be favoured than ACA, while ACC was similarly preferred to ACG. This demonstrates that the genetic code has evolved to favour lower usage of nitrogen and carbon, as these two elements are likely to be in nutritional deficit in the environment. The same constraint was not found for oxygen, which is abundant.
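
The threonine example is easy to verify mechanically with the relevant slice of the standard genetic code; a short Python check:

```python
# Second-position C -> G in each threonine codon, translated with a
# hand-copied slice of the standard genetic code.
CODON = {"ACC": "Thr", "ACA": "Thr", "ACG": "Thr", "ACT": "Thr",
         "AGC": "Ser", "AGT": "Ser", "AGA": "Arg", "AGG": "Arg"}

for codon in ("ACT", "ACC", "ACA", "ACG"):
    mutant = codon[0] + "G" + codon[2]  # C -> G at position 2
    print(f"{codon} ({CODON[codon]}) -> {mutant} ({CODON[mutant]})")

# ACT/ACC mutate to serine (similar N and C content to threonine);
# ACA/ACG mutate to arginine (richer in both N and C), and these are
# the codons found to be disfavoured.
```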

Once again, due to the inverse relationship between nitrogen and carbon availability, mutations that lead to lower N usage are inversely related to those that lead to lower C usage. 

References:

1. L Shenhav, D Zeevi. Science 370, 683 (2020).

2. JJ Grzymski, AM Dussaq. ISME J 6, 71 (2012).

Sunday, 25 October 2020

Codon De-optimisation, Beta Turns, etc

Why would anyone wish to design a live, attenuated vaccine for COVID-19? Such vaccines cannot be administered to pregnant women or immunosuppressed subjects, but they do have one major advantage: they are the only form of vaccine that can be given intranasally. Given by this route, they stimulate the production of secretory IgA antibodies, the only isotype that protects the upper respiratory tract against COVID-19. (The lower respiratory tract is protected by circulating IgG.)

Unlike other types of vaccines therefore, live attenuated vaccines would be expected to stop not just illness from COVID-19 in the vaccinated subject, but also transmission of the virus to contacts.

But how is the live virus attenuated? There are various means of doing so- growing it at a lower temperature, or in a non-human cell line- but the method used by 3 live-attenuated vaccine candidates during the current pandemic is a technique called codon de-optimisation.

The genetic code is redundant. That is to say, a given amino acid can be encoded by more than one codon. Yet, amongst these multiple codons, there is one that is favoured above all others- a phenomenon called codon bias. 

Several live vaccines have attenuated the causative virus by reverse genetics (targeting the putative DNA triplet bases which code for a certain amino acid in the peptide chain), replacing the normally favoured codon in the viral DNA or RNA with a less favoured codon. In some cases, this goes a bit further and replaces a favoured codon pair with a less favoured codon pair. This is called codon de-optimisation.
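
In outline, the recoding is a simple synonymous substitution over the reading frame. A toy Python sketch- the two-entry table is loosely based on human codon-usage preferences and is illustrative only; real de-optimisation uses genome-wide codon and codon-pair usage tables:

```python
# Swap favoured codons for rare synonyms; the encoded protein is unchanged.
RARE_SYNONYM = {
    "CTG": "CTA",  # leucine: favoured codon -> rare codon (human usage)
    "GAG": "GAA",  # glutamate: likewise, illustrative
}

def deoptimise(seq: str) -> str:
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return "".join(RARE_SYNONYM.get(c, c) for c in codons)

print(deoptimise("CTGGAGCTG"))  # CTAGAACTA: still Leu-Glu-Leu
```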

While in theory, the replaced codon codes for exactly the same amino acid, in practice, this disrupts the tertiary structure of the peptide chain and leads to a dysfunctional viral protein being translated in the host cell. It is thought that the tRNA carrying the "non-favoured" anti-codon somehow interferes with the translation machinery.

Interestingly, the introduction of non-favoured codon pairs during codon de-optimisation invariably introduces more CpG dinucleotides (nnCpGnn), which are under-represented in favoured codon pairs. It has, however, been shown that this excess of CpG dinucleotides is not mechanistically responsible for the disruption of translation, which is thought to be induced by the unfavourable "fit" of the tRNA carrying the unfavoured anticodon, as described above.

The corollary is that codon optimisation (i.e. using the favoured codon) can improve the yield of useful proteins produced for medical use in expression systems such as E. coli.

A related concept is the stabilisation of a recombinant viral protein vaccine by introducing two proline residues around a beta turn in the peptide molecule. A beta turn is a portion of the peptide chain where there is a sudden change in direction, say from an alpha helix to a beta pleated sheet, or between two alpha helices. The artificially introduced proline residues at the beta turn stabilise the whole protein molecule and prevent misfolding.

This technique has been used in COVID-19 by Novavax for their recombinant Spike protein vaccine. It has also been used for the mRNA vaccines produced by Moderna and Pfizer-Biontech. The mRNA molecule in these vaccines is destined to be translated into the full length S-protein inside the cells of the vaccinated person, with the difference from the wild type S-protein being the 2 stabilising proline residues.

Saturday, 17 October 2020

Herd Immunity for COVID19- Who Do You Vaccinate?

So you have access to several hundred million doses of a new vaccine for COVID, but there are billions of people who need to be vaccinated. Who do you vaccinate? What is the quickest way to (a) protect the most vulnerable, and (b) achieve herd immunity?

Let's deal with the logistics of achieving herd immunity first. For a given R0 (R-naught or Reproduction number), the proportion of population (say P) that needs to be immune in order to achieve herd immunity is given by: 

P= 1-1/R0.

Since the R0 for COVID19 is 3, the proportion of population that needs to be immune to achieve herd immunity is 2/3 or around 67%.
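
The arithmetic is easy to tabulate; a two-line function in Python:

```python
def herd_immunity_threshold(r0: float) -> float:
    """P = 1 - 1/R0: the fraction that must be immune for herd immunity."""
    return 1 - 1 / r0

for r0 in (1.5, 3.0, 6.0):
    print(f"R0 = {r0}: immunise {herd_immunity_threshold(r0):.0%}")
# R0 = 3 gives 67%, the figure used above.
```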

Yet it is likely that for a large country the size of, say, India- a country of some 1.3 billion- you'd struggle to lay your hands on nearly 900 million doses first up.

Or even if you did, which 2/3rds would you choose to vaccinate?

And here you are on the horns of a dilemma, almost game-theoryesque in its nature. Would you vaccinate the oldest third (and therefore epidemiologically the most vulnerable), the middle aged tertile, or the children?

Think before you answer!

Keep in mind that no matter which vaccine is used, the likelihood of elderly subjects developing immunity as a result is much lower than for younger subjects. It's just that the ageing immune system responds poorly to almost any antigen- be it natural or vaccine-carried- a phenomenon called immune senescence.

OTOH, children respond brilliantly to vaccines. National immunisation schedules are premised on this phenomenon. What's more, they then stop spreading the putative infective agent to the rest of the population, thus reducing its overall transmission to the older, vulnerable subjects.

To illustrate, when conjugate vaccines for Pneumococcus (PCV7, followed by PCV10 and PCV13) were introduced for childhood vaccination in 17 European countries, the occurrence of invasive Pneumococcal infections over the next 5 years (2011-15) fell by over 75% in people over the age of 65- the most susceptible population- for the strains included in the vaccines, but increased for the strains not included. Overall disease burden among the elderly was at least moderately lower, simply due to childhood vaccination.

It follows therefore that if decision makers opt to vaccinate the older third of the population with the first few hundred million doses of COVID vaccine, control of the pandemic would be by no means assured. In fact, failure is guaranteed, as this ageing group will respond poorly to the vaccine and will continue to be vulnerable to unfettered transmission from the young.

OTOH, vaccinating the group most likely to transmit the virus- the young- is likely to achieve the holy grail of herd immunity more quickly, because this younger population is more likely to respond to the vaccine and to stop spreading the virus to the elderly.


Thursday, 1 October 2020

Carbon Nanoparticles To Treat Atherosclerotic Plaques

Andre Geim & Konstantin Novoselov, two scientists from the University of Manchester, won the Nobel Prize for Physics in 2010 for their work on graphene, a carbon nanomaterial. Nanomedicine is now a reality. But before we understand Nanomedicine, we must understand the unique properties of nanoparticles themselves.

The most important characteristic of nanoparticles that facilitates medical usage is their very high surface area to volume ratio. If you cut a 1 cm cube into 10^21 cubes with 1 nm sides, that keeps the total volume unchanged, but increases the surface area by a factor of 10 million. This enormous increase allows hollow nanotubes or nanospheres to be used as "Trojan horses", loaded with drugs, antibodies, therapeutic small molecules, etc to great effect. 
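
The cube arithmetic is easy to verify; a short Python sketch:

```python
# Dice a 1 cm cube into 1 nm cubes and compare total surface areas.
side_cm = 1.0
nm_in_cm = 1e-7                           # 1 nm = 1e-7 cm

n_cubes = (side_cm / nm_in_cm) ** 3       # 1e21 cubes
area_before = 6 * side_cm ** 2            # 6 cm^2
area_after = n_cubes * 6 * nm_in_cm ** 2  # 6e7 cm^2

print(f"{n_cubes:.0e} cubes, surface area x {area_after / area_before:.0e}")
# 1e+21 cubes, surface area x 1e+07: a ten-million-fold increase
```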

The other unique property of nanoparticles is that they impose constraints on the direction of electron spin. Each electron behaves like a tiny bar magnet, with a surrounding magnetic field that corresponds to its spin in an applied field. In iron oxide macroparticles (>20 nm in diameter), for example, electron spins point in both directions, neutralising each other's magnetic effect. OTOH, in iron oxide nanoparticles (<20 nm), all the spins align in the same direction, so each tiny magnet has an additive effect, generating a much bigger magnetic field. This can be exploited in MRI scanners, for example. This particular property is mentioned here for interest and has no relevance to the example I am about to discuss below.

In January 2020, a team of scientists from Stanford published a study in Nature Nanotechnology that demonstrated that the build up of atherosclerotic plaques in the aorta of genetically engineered mice could be abolished by the use of nanotechnology. These mice had had both their Apo E alleles deleted, thus making them very prone to atherosclerosis. To understand how the scientists stopped plaque build up in these mice, we just need to understand a tiny bit of molecular biology.

Normal cells in the body carry a marker called CD47 on their surface that stops them from being "eaten" (phagocytosed) by macrophages. In apoptotic cells, this surface marker disappears, which is recognised by macrophages as an "eat-me" signal, allowing them to hoover up dead or dying cells- a process called efferocytosis.

The way CD47 prevents its bearer cell from being eaten is by binding to a receptor called Signal Regulatory Protein alpha (SIRPα) on macrophages. When SIRPα on macrophages is ligated, it activates a downstream enzyme called SHP-1. The latter is a phosphatase (it removes phosphate groups) and belongs to a class of enzymes that, in general, act as inactivating enzymes. In this case, it inactivates a type of myosin in the cytoskeleton of the macrophage, thus stopping it from eating the CD47-bearing cell.

What the Stanford team did was to load carbon nanotubes with two things: firstly, an inhibitor of SHP-1, which would scupper the CD47-SIRPα pathway and thus abrogate the "don't-eat-me" signal; and secondly, a fluorescent dye that would make it easy to track the involved cells by flow cytometry.

But the scientists still had one problem. In previous animal experiments, where investigators had targeted CD47 on plaque cells from atherosclerotic areas with a specific monoclonal antibody to CD47, the antibody had killed lots of "innocent bystanders" such as red blood cells in the spleen, which also carry CD47. You see, macrophages carry something called Fc receptors, which bind to the Fc portion of antibodies and destroys anything that the antibody itself is attached to (in this case, the red cells). This led to quite troublesome anaemia in these original experimental animals.

This is where the genius of carbon nanotubes was exploited by the Stanford team. Because of the tiny size of these nanotubes (called single walled nanotubes or SWNT), they are taken up by 99% of inflammatory monocytes. It is these activated monocytes which recognise the hallmark inflammation in atherosclerotic plaques, enter them and are converted into active macrophages. By contrast, the SWNT are taken up by <3% of other immune cells. The upshot is that normal healthy cells carrying CD47 are largely spared.

Furthermore, the scientists coated the nanotubes with a substance called PEG (polyethylene glycol). PEG is the same stuff that will be familiar to doctors as a powerful purgative used in bowel preparation, and the same stuff that is attached to drugs such as beta-interferon (in the treatment of multiple sclerosis) or Certolizumab (in the treatment of rheumatoid arthritis) to prolong their action. PEG is hydrophilic and therefore allows intravenous injection into blood.

The results were good. Atherosclerosis was prevented in these experimental mice despite their genetic vulnerability.

The concept extends way beyond the heart. Many cancerous cells try to evade the immune system by expressing CD47 on their surface. They can be similarly targeted if a way of selecting them out can be found (perhaps through hypoxic metabolism, as they display the Warburg phenomenon?). 

It is worthwhile ending by mentioning that the inflammatory nature of atherosclerotic plaques has been previously targeted in trials of an interleukin-1 (IL-1) antagonist (these are also used to treat severe and refractory gout where other agents have failed). Unfortunately, the limiting side effect was serious infections, as IL-1 is a vital cytokine for the innate immune system. This is where the selectivity of carbon nanotubes was highlighted through the study.

Reference:

1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7254969/pdf/nihms-1546057.pdf
2. https://www.nejm.org/doi/pdf/10.1056/NEJMra0912273?articleTools=true



Wednesday, 22 July 2020

Racemisation of L-Aspartate is a Function of Ageing

L-Aspartate is the predominant enantiomer of aspartic acid in the human body. Gradually, over time, L-aspartate undergoes racemisation to D-aspartate at body temperature. (Racemisation is the conversion of an optically active isomer into an optically inactive mixture of the two enantiomers.) This has all-round implications.

Consider this problem. Enlargement of the aorta, emphysema, laxity of skin and bladder dysfunction are all age related problems. The factor common to these areas- the aorta, lungs, skin and bladder- is a high proportion of elastin. Elastin is a thousand-fold more stretchable than collagen, so it is quite intuitive that it should be present in organs and blood vessels that need to stretch. However, it is also true that a high level of elastin predisposes these tissues to ageing faster than tissues with low elastin content. It turns out that elastin has a far lower turnover than fibrillar matrix proteins such as collagen. This is nicely illustrated by the rising proportion of D-aspartate in elastin in tissues such as the aorta with age. The proportion of D-aspartate in collagen remains constant at around 3% in the young and the elderly aorta, but the proportion of D-aspartate in elastin rises from 3% to 13% between the two extremes of age, illustrating that senescent elastin fibres are not replaced, while the turnover of collagen remains relatively invariant. Thus, D-aspartate tends to accumulate in the longest lived elastin fibres. The gradual erosion of elastin content with age translates into a dilating aorta, emphysema, etc.

This principle is utilised in determining the age of a deceased person when only remnants of tissue are available. Since teeth often outlast the rest of the body, the D-aspartate content of dentine is used for this purpose, but equally, another tissue such as epiglottis or skin could be used. One need simply compare the D-aspartate content of the whole tissue with that of the contained elastin fibres. The older the person, the higher the proportion of D-aspartate in elastin.
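
As an illustration of the arithmetic (not a forensic protocol), a first-order racemisation model can be sketched in Python. The rate constant below is invented, chosen only so that the output roughly matches the 3%-to-13% lifetime span quoted above; real work calibrates it per tissue and temperature history:

```python
import math

def racemisation_age(d_over_l: float, d_over_l_at_birth: float = 0.03,
                     k_per_year: float = 1.0e-3) -> float:
    """First-order racemisation: ln((1 + D/L) / (1 - D/L)) rises linearly
    with age at rate 2k. k here is illustrative, not a calibrated value."""
    f = lambda r: math.log((1 + r) / (1 - r))
    return (f(d_over_l) - f(d_over_l_at_birth)) / (2 * k_per_year)

print(f"{racemisation_age(0.13):.0f} years")  # D/L of 0.13 -> extreme old age
```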

Accuracy of Smartphone Apps To Measure Pulse Oximetry


Some smartphones have a pulse oximeter function. Samsung, for example, has an app called Digidoc, which uses the camera and the flash built into the phone to give an arterial oxygen saturation. To understand how this works, it is perhaps important first to understand how standard pulse oximetry functions, so that we may compare the two approaches.

The standard pulse oximeter is based on the principle that oxygenated haemoglobin (oxyHb) and deoxygenated haemoglobin (deoxyHb) absorb various wavelengths of light differently. Thus, as the graph below illustrates, oxyHb absorbs more infrared light than deoxyHb, while deoxyHb absorbs red light far better than oxyHb. (Mnemonic: SeXy DARLing- At SiX hundred wavelength, Deoxygenated haemoglobin Absorbs Red Light)


The pulse oximeter has one diode emitting red light (wavelength 660 nm) and one emitting infrared light (wavelength 940 nm). These are passed in turn through the finger inserted into the probe, and a detector below the finger (with its contained artery) measures the amount of light coming through. It also measures the amount of ambient light passing through the finger and subtracts it from the total light traversing the finger.

The oximeter cleverly ignores the "noise" from absorbance contributed by tissues such as skin, muscle etc (because these too absorb light) by only measuring light absorbance when it is pulsatile, i.e. generated by arterial pulsation. It thus ignores the blood flow in veins, which is of course non-pulsatile.

It is thus important to look at the graph (called a plethysmograph) of oxygen saturation generated by the oximeter or app. It should look something like this. If the graph looks like the tracing on the top, you are OK. If it looks like the one at the bottom, don't trust it. The plethysmograph is as important as your oxygen saturation reading. If the graph looks unfamiliar, the reading is not accurate. (Mnemonic used by anaesthetists: SpO2- See pleth before O2)


Thus it is intuitive that depending on the amount of oxygen present in arterial blood, the relative amount of oxyHb versus deoxyHb will vary. With this, will vary the ratio of cumulative red light absorbed versus the cumulative infrared light absorbed. Thus, at 100% arterial oxygen saturation, the only contribution will be from oxyHb, and the red:infrared absorbance ratio will be that of oxyHb. Conversely, at 0% oxygen saturation, the red:infrared absorbance ratio will be that for deoxyHb. At various intermediate saturations, the ratio will be between the 2 extremes, and the computer can easily tell what proportion of Hb is oxygenated from a given figure. This sort of analysis is called "ratio of ratios".
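
As an illustration of the idea (and emphatically not any manufacturer's calibration), here is a Python sketch using a commonly quoted empirical approximation, SpO2 ≈ 110 - 25R:

```python
def spo2_from_ratio(ac_red: float, dc_red: float,
                    ac_ir: float, dc_ir: float) -> float:
    """Ratio of ratios: R = (AC/DC at 660 nm) / (AC/DC at 940 nm).
    The linear map 110 - 25R is a textbook approximation, not a
    device calibration curve."""
    r = (ac_red / dc_red) / (ac_ir / dc_ir)
    return max(0.0, min(100.0, 110 - 25 * r))

print(spo2_from_ratio(0.02, 1.0, 0.05, 1.0))  # R = 0.4 -> 100.0 (% SpO2)
```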

Pulse oximeter readings were standardised by measuring oxygen saturations in healthy volunteers given fixed amounts of oxygen to breathe in a previously titrated oxygen/air mixture. However, oxygen saturations below 75% were not tested for reasons of safety, and below this level mathematical extrapolation is used to calculate oxygen saturation.

Now to the Digidoc app used by Samsung. This is based on white light (from the flash), which is of course a mixture of all visible wavelengths. The camera measures the residual light coming through the finger. Instead of using the ratio of ratios method, the app relies on a "neural network" created from 38 normal volunteers, who were asked to breathe room air for 30 minutes and then to hold their breath for as long as they could. Reference values were thus obtained for absorbance of white light at all levels of oxygen saturation between 85% and 100%. Oxygen saturation rarely falls below 85% with breath holding.

It is important to state that the app is not accurate below 85% oxygen saturation due to the above reason. It should therefore not be used for detecting hypoxia, as in subjects with COVID-19. The purpose of the app is to establish baselines for a given person pursuing exercise and sports and compare their oxygen sats before, during and after exercise. Unfortunately, it is widely used by people with lung disease such as COPD, which it is not intended for.

There are 2 separate studies, one in adults and one in children, which compared the Samsung oximetry app to a standard pulse oximeter. The first study also compared the readings to arterial blood gas measurements. Both studies found that the app is reliable and accurate. The app also measures pulse rate accurately. However, using a probe that plugs into the smartphone (available to purchase) increases accuracy slightly.

IMO, this is a useful app for daily exercise and for sportsmen, just like the estimated VO2max on Garmin watches. However, it was never intended for medical usage.

References:
1. https://www.howequipmentworks.com/pulse_oximeter/#:~:text=The%20pulse%20oximeter%20works%20out,of%20infrared%20light%20absorbed%20changes.
2. https://s2.smu.edu/~eclarson/pubs/2018pulseox.pdf
3. https://pubmed.ncbi.nlm.nih.gov/29215972/
4. https://pubmed.ncbi.nlm.nih.gov/30904343/

Friday, 12 June 2020

What's The Diagnosis?



52-year-old man with one acute and one "chronic" finding. Leads V1-V6 are actually leads V1R through V6R (right chest leads). What's going on?

From Wave-Maven (ecg.bidmc.harvard.edu)

Tuesday, 9 June 2020

South Asian Health in Perspective

Why do highly educated, prosperous middle aged Indian men or women need to worry about health? They now have access to some of the best hospitals, highly qualified physicians, and can literally buy healthcare like a commodity. Surely, there is no cause for concern?

Sadly, there is. Many of the conditions I am going to discuss here are not diseases of poverty, but maladies associated with plenty.

In the two decades between 1990 and 2010, the proportion of deaths worldwide from non-communicable diseases (NCD) rose from 57% to 65%. Fully 80% of these occur in countries like India. And 90% of those deaths affect people below the age of 60.

The 4 biggest killers among NCD are CV disease, Cancer, COPD and Diabetes. Three of those are particularly relevant for Indians.

Of these, the one that towers above the rest is cardiovascular disease- read as a higher risk of MI and stroke. It's sobering to realise that the Indian subcontinent accounts for 25% of the world population but 60% of patients with heart disease.

A recent study (2010-2014), charmingly called the "MASALA study" (Mediators of Atherosclerosis in South Asians Living in America), showed that this cohort has a 4-fold higher risk of ASCVD than the general population. Furthermore, they develop this a decade earlier than others- often even before they have reached their 50s.

But it doesn't end there. They are more likely to require CABG to treat their IHD, and are likely to have poorer outcomes at CABG.

So what's driving this? The fact that Indians have one of the highest rates of Type 2 Diabetes Mellitus (T2DM) does not help. Worldwide, 1 in 11 people suffer from T2DM, i.e. 9%. The Indian government's own figures puts the prevalence at 11.8%. However, the prevalence is almost double in urban areas vis-a-vis rural settings, so that figure is an underestimate for most Indians reading this article.

The worrying thing is that many Indians who develop T2DM or IHD would not be classified as overweight or obese by international consensus definitions. Internationally, overweight is defined as BMI >25 kg/m2, while obesity is BMI >30. It has now been suggested by the AHA that we lower these thresholds to 23 and 27.5 for people of South Asian origin. Why is that?

The problem appears to be the distribution of body fat in Indians. We tend to have less of our body weight contributed by muscle and more by fat. Moreover, this fat is distributed around our abdominal organs- so-called visceral fat- and around our heart, where it contributes to an inflammatory phenotype in the blood vessels. Furthermore, Indians have almost half the brown fat (brown adipose tissue) of Caucasians, which is thermogenic and helps burn off calories. Our basal metabolic rate is lower than that of other ethnicities. It appears that we accumulate fat- literally in the wrong places- and struggle to burn it off. Thus T2DM and IHD both occur at lower BMI in Indians.

South Asians also have a higher prevalence of hypertension (not more than Blacks), high triglycerides, lower HDL, more LDL cholesterol, and a higher total cholesterol to HDL ratios, again, all at a lower body weight than other groups.

Diet clearly plays a part. The reliance on refined breads such as naan, the high usage of trans fatty acids through Vanaspati, and sources of saturated fat such as ghee and butter don't help. Adoption of Western diets, both in India and among those living abroad, with a preference for high fat dairy, pizzas, potatoes and red meat, makes for the worst of both worlds. Those who do best have a bicultural diet, adopting the best from both systems: a predominantly vegetarian diet with a high consumption of green leafy salads, whole grains, fruits, nuts and seeds, and chicken and fish.

Vegetarianism by itself, however, does not help, as it is often accompanied by fried snacks, sweetened beverages and high fat dairy.

High intensity statins should be prescribed for secondary prevention. Those with T2DM, aged 40-75 and LDL>70 mg/dl should have moderate intensity statins. With other risk factors in addition to T2DM, such as family history of ASCVD, LDL>160, metabolic syndrome, CKD, pre-eclampsia, RA or HIV and South Asian origin, high intensity statins should be used. This means that all Indians aged 40-75 who have T2DM should take high intensity statins.

Primary prevention is based on the estimation of 10-year ASCVD risk through an online calculator, with risk stratified into 3 categories. (<7.5%, 7.5-20%, and >20%). CT calcium scores may be used where doubt exists, such as the intermediate group.

Indians have a unique vulnerability to haemorrhagic stroke. This may be associated with a higher prevalence of hypertension and high salt intake, but there are other, as yet undefined factors. Unfortunately, this means that people who need anticoagulation, specifically those with AF and a CHADS2 or CHA2DS2-VASc score >1, are often not treated with anticoagulants, despite the availability of DOACs. Instead, aspirin is overprescribed for this group.

Undiagnosed CKD, related to ASCVD, or indeed a contributor to ASCVD (it cuts both ways), is a further problem. Indians are vulnerable to heat stress nephropathy, found in hot, dry areas. This was first described in Central America, and therefore began under the rubric of Meso-American nephropathy. It then appeared in Sri Lanka and in Andhra Pradesh, particularly in the Nellore district. It predominates in rural areas and is therefore more common in farmers, with >60% prevalence in some villages. It is thought to be related to excessive sweating coupled with inadequate hydration, with contributions from rhabdomyolysis. Temperatures in many parts of India have been rising steadily, and it is not unusual to find a wet bulb temperature of >35 degrees Celsius on a few days of the year, a threshold that equates with intolerable heat.

Next to the heart, perhaps the most threatened organ in Indians is the liver. There are 4 prime insults that lead to this risk: alcohol, obesity, Hepatitis B and Hepatitis C.

Alcohol use is common in India across all sections of society, and is far more common among men. Unfortunately, the Asian liver is also more susceptible to the ravages of alcohol, with increased production of acetaldehyde. Thus, studies in the UK show that Asian men, particularly Sikh men, have a higher risk of cirrhosis than White men drinking equivalent amounts of alcohol. Asian women drink much less often, but are exquisitely vulnerable to alcohol induced liver disease.

The chronic viral hepatitides B & C are silent killers in India, as they are in the rest of the world. It is estimated that India has 57 million cases of Chronic Hepatitis B, more than a fifth of the worldwide burden of 257 million cases. The prevalence of Hepatitis C carrier status in the Indian populace is 1-2%, equating to a total caseload of 13-26 million out of a worldwide denominator of 140 million. As fully 70% (range 55-85%) of Hepatitis C infections lead to carrier status, most subjects are unaware that they have Hepatitis C. The commonest genotype by far in India is Genotype 3, unfortunately also the strain that leads most commonly to liver scarring and hepatocellular cancer. Unlike in the West, chronic Hepatitis C in India is mostly acquired by vertical transmission rather than through IVDU or MSM contact.

Those born between 1945 and 1965 have a higher risk of chronic Hepatitis C due to a birth cohort effect.

NAFLD is, as expected, common in India, given the high prevalence of correlates such as T2DM, hypertriglyceridaemia, and high BMI. There is very little awareness of this condition, and it is particularly damaging in conjunction with chronic viral hepatitis or alcohol.

It would only be appropriate to end with a word on the health of Indian women. They suffer from the lack of a national breast screening programme (contrast that with mammography every 3 years from age 50 in the UK), the lack of a cervical screening programme (cervical smears from age 25 in the UK), and a general lack of awareness of post menopausal osteoporosis and the role of HRT. There is also very little awareness of the dangers of ovarian cancer and its late presentation.

Fully 800 million Indians are classed as anaemic. Fifty-two percent of non-pregnant women of reproductive age are anaemic. Iron deficiency anaemia (IDA) is the major aetiology, but other contributors such as poor diet, parasitic infections and haemoglobinopathies also apply. Dietary iron intake is very low. IDA in turn leads to a higher risk of preterm labour, low birth weight and higher infant mortality. Infants are themselves at risk of developing IDA after 4 months of age.

However, apart from well known symptoms such as fatigue, IDA has a sinister side to it. The microcytic red cells that circulate in the blood of such patients are much less deformable, raise blood viscosity, and are more likely to clog up capillaries. IDA also increases factor VIII levels and favours platelet aggregation, increasing the risk of both arterial and venous clots.

One large population based study involving over 200,000 patients showed that subjects with IDA were almost 50% more likely to suffer an ischaemic stroke (http://www.ncbi.nlm.nih.gov/pubmed/24349404). There is a well established association between IDA and cerebral venous thrombosis. The pro-coagulant effect of IDA is magnified in subjects who are otherwise predisposed to thrombotic events, such as those with Congenital Cyanotic Heart Disease and Hereditary Haemorrhagic Telangiectasia. In the former condition, severe polycythaemia is a physiological response to chronic deoxygenation, with haemoglobin levels often above 200 g/l. Despite this, venesection is avoided unless symptoms of hyperviscosity such as headache, myalgia, or blurred vision are intolerable and the haematocrit is more than 65%, as the risk of causing IDA is unacceptable.

IDA is also the commonest secondary cause of Restless Legs Syndrome (RLS). This is a disorder in which subjects have unpleasant feelings in their legs in bed or at rest, relieved only by moving the legs. The condition affects sleep, and many subjects are chronically sleep deprived. While most cases are primary, a significant proportion of sufferers are iron deficient and respond to iron repletion.

The association of IDA with Restless Legs Syndrome led people to investigate its role in Parkinson's Disease. Reason? RLS responds well to anti-Parkinsonian drugs such as ropinirole. On such chance observations rests progress in Medicine. Sure enough, researchers established an association: IDA is more common in Parkinson's Disease, although the association is nowhere near as strong as that between IDA and RLS. However, the challenge in Parkinson's Disease (PD) is to recognise it when it is not fully established... or even predict it years before the classical motor symptoms of rigidity, bradykinesia and tremor develop. It is now well recognised by neurologists that a constellation of non-motor symptoms- specifically anosmia, constipation, and in particular a strange phenomenon called Rapid Eye Movement Sleep Behaviour Disorder (REM Sleep Behaviour Disorder)- develops in patients destined to suffer from PD up to 5-10 years before the motor symptoms set in. The last of these, REM Sleep Behaviour Disorder, is a fascinating example of the corruption of normal physiological processes and deserves an explanation.

Normally, during REM sleep, which principally occurs in the second half of the night, dreams occur frequently, accompanied by a physiological paralysis of skeletal muscles. In REM sleep behaviour disorder, this paralysis is lost, so that the subject "acts out" his dreams (the disorder is much more common among men) with motor movements such as flailing arms and legs. Such movements can often be quite violent, and in some cases spousal injury has occurred. It has been estimated that REM sleep behaviour disorder affects up to half of all patients with PD. It is even more common in other α-synucleinopathies such as Multiple System Atrophy and Lewy Body Dementia, affecting around 80% of patients. Thus, strictly speaking, non-motor symptoms are not at all unusual in PD and Parkinson's plus disorders. They are, however, less appreciated.


Monday, 8 June 2020

COVID-19- Lessons From Four Hundred Years Ago

If you are newly infected with COVID-19 (or any RNA virus, for that matter), you are likely to do worse if you contracted the virus from a close relative than if you picked it up from an unrelated person.

So what support is there for this somewhat outlandish theory?

You will have to look back a long time- around 400 years- to find a plausible explanation for this. In the 16th century, settlers from Europe increasingly explored the New World (a term that refers to the Americas and Australia, but mainly the former). Over the next century or so, denizens of the New World, mainly American Indians living in South and Central America, died off. Approximately 56 million people had died by 1650, and the population shrank by 90%. Even as late as the 1960s and 70s, some communities in the Amazon basin suffered death rates of around 75%.

So what killed them?

A key piece of evidence comes from the work of Garenne and Aaby in 1990, who found that a child contracting measles from a family member faced twice the risk of death compared with a child picking up the infection from an unrelated person. While it is tempting to attribute this simply to a higher dose of virus received from a family member, studies with an attenuated measles virus have discounted this, showing that the infecting dose has no bearing on outcome.

In a fascinating paper published in Science in 1992, Francis L Black, an epidemiologist in New Haven, CT, contended that the American Indians were mainly killed off by RNA viruses. These viruses have poor proofreading in their RNA polymerase, which leads to numerous mutations during new viral RNA production, as discussed previously in this forum. They start mutating even while the infection is progressing within an individual host. The host counters this by presenting the neo-antigens in the grooves of MHC Class I molecules on antigen-presenting cells (APCs) to CD8 T-lymphocytes. Cellular immunity results, eventually clearing the virus. When the virus infects a new host, the process starts all over again.

However, the virus adapts during this process as well. Like all RNA viruses, it has clever ways of stymieing the immune system by hiding antigenic epitopes inside the cell, rather than on the surface, preventing antigen presentation by APCs or mimicking host antigens. When the virus passes from one host to another, if the hosts are related and therefore share MHC phenotypes, the virus, being pre-adapted, finds it easier to multiply, as it has already found a way of bypassing the adaptive immune system. It does not have this advantage if it attacks a person unrelated to the original host.

And here, the genetic homogeneity of those original American Indians became their downfall. Just to illustrate, there are 3 classical Class I HLA loci: A, B, and C. Most epidemiologic data are available on the first two. Serological studies have identified 40 A & B alleles in 1342 sub-Saharan Africans, 37 in 1069 Europeans, and 34 in 4061 East Asians, but only 10 among 1944 South Americans, 14 in 12,243 Polynesians, and 10 in 5499 Papua New Guineans. All A & B sequences in the New World also occur in the Old World.

The more alleles in a population, the lower the frequency of each individual allele, and the less likely it is that the virus will encounter the same allele in its next host. If an allele has a population frequency q, there is of the order of a q^2 chance that both the transmitting and the receiving host carry it. Black worked out that there was a 32% chance that a virus passing between 2 South Americans would not find a new MHC phenotype at either the A or B locus of the new host, but only a 0.5% chance when it passed between 2 Africans.
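
To make the arithmetic concrete, here is a toy Python sketch- a simplification of the idea, not Black's exact method- of the chance that one allele drawn at random from each of two hosts is identical, using hypothetical, equally frequent alleles:

    def matching_chance(freqs):
        """Chance that one allele drawn at random from each of two
        hosts is identical, given population allele frequencies."""
        assert abs(sum(freqs) - 1.0) < 1e-9
        return sum(q * q for q in freqs)

    # Hypothetical, equally frequent alleles, for illustration only:
    south_american = [1 / 10] * 10   # ~10 A & B alleles reported
    african = [1 / 40] * 40          # ~40 A & B alleles reported

    print(matching_chance(south_american))  # ~0.10
    print(matching_chance(african))         # ~0.025

The fewer the alleles, the higher the chance of a coincidental match, exactly as the serological data above would predict.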

Thus, the less polymorphic the MHC alleles of the host group, the more dangerous the virus, as evidenced by the demise of those New World denizens, who lived in cloistered communities with little intermarriage.

The same principles apply to this pandemic, albeit in a different context. COVID-19 has a very high infectivity rate, with R >3. If one happens to pick it up from a close relative, one will fare worse, due to the shared MHC alleles and thus viral pre-adaptation, than if the infection came from an unrelated source.

A 400 year old tragedy may thus have lessons for the current pandemic.

References
1. Black FL. Why Did They Die? Science 1992;258:1739-40.
2. Garenne M, Aaby P. J Infect Dis 1990;161:1088.

Sunday, 31 May 2020

Using Genomic Sequencing to Contain COVID-19

If you haven't noticed already, the game has moved on, subtly but surely. Countries are moving on from lockdowns, even as cases are at best stable or slowly declining in some (UK, USA, Italy, Spain) and surging in others (Brazil, India, Russia). There is now a tacit acceptance that the price of continued and total lockdown is too steep, and that nations may have to accept some new cases so that society and the economy as a whole can reopen.

It is not a question of if but when: cases will surge in many places. The pandemic hasn't peaked in the last 3 nations, and is hardly under control in the first group. Some countries have rolled out a "track and trace" mobile app, which uses Bluetooth signals to establish whether the phone in question has been in the vicinity of one owned by a COVID sufferer. If your phone data show you are at risk, a "track and tracer" (25,000 strong in the UK) will contact you and ask you to self-isolate for 14 days. If you develop the disease, you self-isolate for 7 days.

However, there is another way- something that makes track and trace much, much more effective. This is the power of genomic screening of COVID-19 strains. RNA viruses undergo many more mutations than DNA viruses. One reason for this is the spontaneous deamination of cytosine to uracil. When this happens in a DNA virus, it is quickly detected, and the base is corrected back to cytosine, as uracil is not normally present in DNA. This cannot happen in RNA viruses, where uracil is not a "foreign base" and will therefore not be "corrected". C to U mutations therefore accumulate.

In fact, RNA viruses have over 10 times the mutation rate of DNA viruses. This does not appreciably increase or decrease the pathogenicity of the virus, but it does mean that within a community or a nation, there might be several "strains" (with differing genome sequences) of the virus in circulation. As this usually affects only a single nucleotide rather than blocks of sequence, it is called "intra-host single nucleotide variation", or simply iSNV. And this presents an opportunity for those seeking to track the spread of the virus.

Consider this. With a limited number of cases, you have the power to sequence the genome of the virus in every documented case within a matter of hours. Each cluster of cases will have a "signature" viral genome, because the index case's viral strain carries its own unique mutations. Thus, once the pandemic is stable and reasonably contained, scientists have it within their power to look at the viral genome of a new case and, from a database of existing patients, pinpoint exactly from whom the infection was acquired. Self-isolation, instead of being a non-selective and disruptive tool, can then be applied selectively and in a limited fashion, to maximum effect. The rest of the population can get on with their lives.
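
As a toy illustration of the principle (emphatically not any consortium's actual pipeline), one could match a new case's viral genome to the nearest previously sequenced case by counting nucleotide differences; the sequences and cluster names below are entirely hypothetical:

    def hamming(a: str, b: str) -> int:
        """Number of positions at which two aligned sequences differ."""
        assert len(a) == len(b)
        return sum(x != y for x, y in zip(a, b))

    # Hypothetical, pre-aligned genome fragments from known clusters:
    database = {
        "cluster_A_index": "ACGUUGACCGUA",
        "cluster_B_index": "ACGUUGACCGCA",
        "cluster_C_index": "AAGUUGACCGCA",
    }
    new_case = "AAGUUGACCGCG"

    # The closest match (fewest single nucleotide differences) suggests
    # the cluster from which the new infection was acquired:
    closest = min(database, key=lambda k: hamming(database[k], new_case))
    print(closest)  # cluster_C_index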

Some examples of common viral mutations might make this easier to understand. During the Zaire Ebola epidemic, it became clear that a disproportionately high number of mutations in the sequenced viral genomes were thymine to cytosine (T>C). (Thymine does not appear in RNA; by convention, viral RNA sequences are reported in the DNA alphabet of the complementary cDNA.) It turns out that the preponderance of T>C in virus infected cells is due to the action of an interferon inducible enzyme called "Adenosine Deaminase Acting on RNA 1", or simply ADAR1. (Interferons, as you know, are produced by the host in response to viral infections.)

ADAR1 deaminates adenosine to inosine. Inosine is not one of the standard bases, and is read as guanosine by the cell. Since the complementary base of guanine is cytosine, positions that would have read as thymine opposite the original adenine now read as cytosine. Hence T>C.
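
A small Python sketch may make the strand logic clearer. The sequence is hypothetical, and for simplicity every adenosine is shown as edited (in reality ADAR1 edits only a subset):

    COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

    def dna_complement(rna: str) -> str:
        """DNA-alphabet complement of an RNA sequence (direction
        ignored, purely for illustration)."""
        return "".join(COMPLEMENT[b] for b in rna)

    rna = "GAUACA"                  # hypothetical viral fragment
    edited = rna.replace("A", "G")  # A -> inosine, read as G

    print(dna_complement(rna))      # CTATGT
    print(dna_complement(edited))   # CCACGC: every position opposite an
                                    # edited A now reads C instead of T,
                                    # i.e. T>C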

This is not a pipe dream. Scientists in NZ, Australia and the UK hold extensive databases of the viral genomes circulating in their respective nations. While the database is virtually 100% complete in NZ and Australia, the UK has data on 20% of viral genomes in circulation, given its very large number of cases. But they are getting there.

This, IMO, presents the only realistic way of opening up society while continuing to promptly identify and isolate cases and their contacts.

Friday, 15 May 2020

Are Camelids the Key to Beating COVID-19?

The normal antibody (immunoglobulin) has two light chains and two heavy chains. Each chain, be it light or heavy, has a variable and a constant fragment. The variable fragment binds the putative antigen, while the constant fragment forms the Fc region, which engages Fc receptors on cells such as NK cells and neutrophils. The constant portion also binds and activates complement through the classical pathway.

In 1984, Raymond Hamers at the VUB university in Brussels, while analysing the blood of dromedary camels for antibody responses to a Trypanosomal species (the dromedary is the Arabian camel, with a single hump, as opposed to the double humped Bactrian camel of the plains of Central Asia), found to his surprise that the camel antibodies did not look like their human counterparts at all. They lacked the light chain altogether, and contained only the heavy chain, comprising variable and constant fragments. Quite appropriately, these antibodies were named "camelids".

The 1990s saw the establishment of phage display libraries, which allow the manufacture of virtually any antibody in bacteria by inserting the relevant sequence into bacteriophages, which then infect the bacteria and use the bacterial machinery to make the protein whose sequence has been inserted into the phage. This technique is responsible for producing most monoclonal antibodies these days, having moved on from the days when antibody producing B-cells were immortalised by fusing them with mouse myeloma cells- the hybridoma technique.

It is now possible to produce through such phage display techniques not just whole immunoglobulin molecules, but parts thereof: the Fab fragment (commercially marketed as Certolizumab), the single chain variable fragment, the camelid (commercial application Caplacizumab, used to treat acquired thrombotic thrombocytopenic purpura), or an isolated variable heavy chain fragment, called VHH.

It is the VHH, or variable heavy chain single domain antibody, that now offers promise for the treatment of COVID-19. While vaccines can take years, and canonical (standard) monoclonal antibodies around 6 months, to prepare, VHHs can be prepared very quickly- within weeks- and are therefore ideally suited to dealing with a pandemic. Please see the linked paper in Cell below:

https://www.cell.com/cell-host-microbe/fulltext/S1931-3128(20)30250-X

VHH has several advantages over other techniques. It weighs around 15 kDa, around a tenth of the full immunoglobulin molecule. It can reach antigenic epitopes that the full immunoglobulin molecule cannot, such as hidden epitopes- a fact that is relevant with COVID-19. And it does not have the foreign antigenicity of full camelids, reducing the risk that it would be rendered ineffective by the human immune system.


Tuesday, 12 May 2020

Is Re-infection By COVID-19 Possible?

Once you have had COVID19, can you be reinfected with the same virus?

This article in JAMA may provide some reassurance, although it's understandably based on very little data.

https://jamanetwork.com/journals/jama/fullarticle/2766097

If you wanted a summary, it's halfway down the article, in this line here:

To date, no human reinfections with SARS-CoV-2 have been confirmed.

However, as with everything COVID, answers may not be forthcoming for a very long time. Hence, you look at the literature to see if there are predictors of long term immunity after infections, and a couple of facts begin to emerge.

In general, viruses (excluding influenza, which is prone to antigenic shift and drift) tend to cause long lasting immunity: a 26-year study of subjects following infection by 6 viruses- Vaccinia (the virus used for the smallpox vaccine), measles, mumps, rubella, Varicella zoster (chicken pox), and EBV- found the half-life for antibody levels to be between 50 and 200 years. This may explain why there has never been a sustained second epidemic by a non-influenza virus, e.g. SARS, MERS, Marburg, Zika, Lassa Fever etc (don't misinterpret this as meaning that these viruses cannot infect naive subjects).

https://www.nejm.org/doi/full/10.1056/NEJMoa066092

The same does not apply to antibody responses to bacterial protein antigens. For example, the half-life for antibodies to Tetanus and Diphtheria is only 11 years- around a fifth of that for the virus with the shortest-lived immunity.
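
Assuming simple exponential decay- an assumption, but a reasonable first approximation over these time scales- the difference is easy to quantify:

    def fraction_remaining(t_years: float, half_life_years: float) -> float:
        """Fraction of the original antibody titre left after t years,
        assuming simple exponential decay."""
        return 0.5 ** (t_years / half_life_years)

    print(fraction_remaining(25, 50))  # ~0.71: a viral antibody at the
                                       # shortest quoted half-life
    print(fraction_remaining(25, 11))  # ~0.21: tetanus/diphtheria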

Thirdly, women tend to have longer lasting protection than men.

https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-018-0568-8

Fourth, if you have low antibody levels to start with- for example in a condition called Common Variable Immunodeficiency- a predominantly IgM response to the infective agent suggests that the response will not be long lived. This may sound strange, but is intuitive, as IgM is the first antibody isotype produced in any humoral (antibody mediated) response. It is then gradually followed by an IgG response. If IgG levels- particularly IgG1; there are 4 subclasses of IgG: IgG1, 2, 3 and 4- do not increase appreciably, the response will not be long lived. In general, therefore, a high IgM level with a low IgG level, particularly of the IgG1 component, is a worrisome feature.

https://www.jacionline.org/article/S0091-6749(18)30560-8/fulltext

And finally, some people use memory B cells as a surrogate marker for long lasting immunity. This is not a correct assumption. Long lasting humoral immunity can be either memory B cell dependent or memory B cell independent, and each exists independently of the other.

If you wanted a straight yes or no answer to the question as to whether infection with COVID-19 is likely to lead to long lasting immunity- I would say, based on the available evidence, the answer is "Yes".

Sunday, 10 May 2020

Is COVID-19 Transmitted Sexually?

Recently, concerns have been raised by reports that COVID-19 could be transmitted sexually, based on its presence in semen by RT-PCR. However, caution must be exercised in drawing any conclusions about sexual transmissibility from this finding.

It's instructive that the original SARS virus uses the same ACE2 receptor as COVID-19, and in the nearly 20 years since it first affected humans, not a single case of sexual transmission (based on epidemiologic or molecular studies) has been described for SARS.

The question we need to ask ourselves, therefore, is whether the presence of a virus in semen equates with sexual transmission. Fortunately, in Medicine, when you think of a query, somebody else has usually considered it before and tried to answer it. The paper below from the United Kingdom- "Breadth of Viruses in Human Semen"- did just that in 2017, in the wake of the Zika virus outbreak. Apart from SARS, it documented 26 other viruses that appear in human semen. Of these, fewer than half have any known sexual transmission. The list includes viruses you would never think of as being passed on sexually, such as chicken pox, mumps and Chikungunya.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5652425/

If one is observant, one will notice some glaring omissions in that list. Where's HPV, for example, that well known scourge of teenagers, the purveyor of genital warts, vulval and cervical cancer? Well, it's transmitted sexually, just not through semen. Presence in semen therefore does not equate with sexual transmission. Nor does the absence of a virus in semen provide reassurance that it is not transmitted sexually.

NEJM even ran an editorial on this in 2018- "Virus in Semen and the Risk of Sexual Transmission"- and I quote: "Contrary to prevalent belief, the detection of viral genomes in semen tends to be more common among viruses that are typically not sexually transmitted, such as certain adenoviruses, bunyaviruses, flaviviruses, hepadnaviruses, herpesviruses, paramyxoviruses, and retroviruses".

https://www.nejm.org/doi/full/10.1056/NEJMe1803212

There is another important issue: the COVID-19 virus particles found in semen may not be replication competent. RT-PCR only signifies the presence of the bit of viral RNA that the PCR primers bind to. It will do so regardless of whether the virus has a capsid or not- i.e. is "live" or "dead".

Friday, 8 May 2020

Why Do Subjects of Afro-Caribbean Ancestry Have a Higher Mortality from COVID-19 than Caucasians?

The figures are startling. The mortality rate from COVID-19 among Black Americans is 2.6 times that of Caucasians.

https://www.apmresearchlab.org/covid/deaths-by-race

In the UK, the figures are even more stark. Blacks are more than 4 times as likely to die from COVID as Whites.

https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/coronavirusrelateddeathsbyethnicgroupenglandandwales/2march2020to10april2020

Although in general, Asians and Hispanics do worse than Caucasians as well, the differences are far less pronounced.

What accounts for this difference? There are various possibilities including genetic differences, acquired co-morbidities (diseases) or perhaps the way those co-morbidities are managed by physicians. The last one interests me the most.

Mendelian randomisation is nature's way of demonstrating differences in outcome due to a putative risk factor. For example, subjects with familial hypercholesterolaemia, the commonest autosomal dominant condition in the population, have a higher risk of heart disease and stroke than those without, because of their high LDL cholesterol. So far, no such genetic signals have emerged in COVID.

However, differences could be acquired. Black subjects have a higher prevalence of hypertension and obesity than other races. Both of these have emerged as significant risk factors for COVID related mortality.

While the above is undoubtedly true, I believe (I haven't seen this in the medical press yet) that there is another factor- how hypertension is managed in Black subjects. For unknown reasons, Black people with hypertension respond poorly to a class of drugs called ACE inhibitors (yes, it's the same ACE you read about in the context of COVID receptors). In fact, there is evidence to suggest that Black subjects have a higher risk of death from MI (heart attack), stroke and heart failure when treated with ACE inhibitors than when not.

https://www.thecardiologyadvisor.com/home/topics/hypertension/ace-inhibitors-may-not-be-as-effective-in-black-patients/

As a result, ACE inhibitors are used far less often to treat hypertension in Blacks than in other races, and herein, I believe, lies the rub. I have cautioned here in the past against discontinuing ACE inhibitors in hypertensive subjects during the pandemic, as this is likely to lead to harm, an inference that was later confirmed by NEJM.

https://www.nejm.org/doi/full/10.1056/NEJMsr2005760 (free to access)

This is a case of unintended iatrogenic (physician induced) randomisation. If you are obese, and hypertensive, you are more likely to die from COVID 19. However, if you are obese, hypertensive and not taking ACE inhibitors, as in the majority of Black subjects, that risk is far higher.

Wednesday, 29 April 2020

The Non-coding RNAs of Eucaryotes


Messenger RNA, tRNA and rRNA are not the only RNA species present in cells. There are several other non-coding RNAs (they do not code for protein) in the cell: small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), heterogeneous nuclear RNA (hnRNA), micro RNA (miRNA), small interfering RNA (siRNA), telomerase RNA and signal recognition particle RNA (SRP RNA). It is important to appreciate that noncoding RNAs are also transcribed from DNA- but unlike mRNA, they are not translated into protein. That is to say, not all genes lead to protein as the end product; for some, the end product is RNA.

Some noncoding RNAs are produced from introns (see below), which might seem counterintuitive, as the generally held view is that introns do not code for anything.

The eucaryotic pre-mRNA is capable of being alternatively spliced into different mRNAs, and can therefore produce different proteins from the same DNA code. This happens in the "spliceosome". Unusually, the splicing reaction is not catalysed by proteins, but by non-coding RNA- the small nuclear RNAs (snRNA). A similar situation is seen during protein synthesis, where rRNA- which forms 2/3 of the ribosome, the other third being protein- catalyses the peptide bond formation required to lengthen the polypeptide chain. Such catalytic RNAs, which act like enzymes, are called ribozymes. It is thought that they hark back to very early evolution, when RNA, rather than protein, was the main catalyst in living cells.

There are 5 types of spliceosomal snRNA: U1, U2, U4, U5 and U6 (there is no U3 snRNA; U3 is a snoRNA). These associate with proteins, and together the complex is called a snRNP. Around 90% of multiexonic mRNAs in humans are subject to alternative splicing. The most common form of alternative splicing in humans is exon skipping, followed by intron retention.

Bacteria have a single RNA polymerase, and do not have alternative splicing. There are 3 RNA Polymerases in eucaryotes: I, II, and III. RNA Polymerase I transcribes most rRNA- 18S, 28S, and 5.8S. RNA Polymerase II transcribes all protein coding RNA, i.e. the messenger RNA. RNA Polymerase III transcribes tRNA, and also 5S rRNA. The S in rRNA refers to the rate of sedimentation in an ultracentrifuge; the larger the S value, the larger the rRNA.

Messenger RNA forms only 3-5% of the total RNA in the cell. Fully 80% of cellular RNA is rRNA. Human beings contain some 200 rRNA genes per haploid genome, which are together responsible for making 10 million copies of each type of rRNA (28S, 5.8S, 18S and 5S) to constitute 10 million ribosomes.

The 200 rRNA coding genes are present on just 5 chromosomes: 13, 14, 15, 21 and 22. On all 5, these genes are located right at the tip (end) of the short arm. The 28S, 18S and 5.8S rRNAs are all made initially as part of a larger 45S pre-rRNA, which is then cleaved. 5S rRNA is made separately. All but the 18S rRNA contribute to the large 60S ribosomal subunit; the smaller 40S subunit contains the 18S rRNA.

Many small nucleolar RNAs are encoded within the introns of other genes, mainly those encoding ribosomal proteins. They are synthesised by RNA Polymerase II and processed from the excised intron sequences.

The nucleolus is the cellular site for the manufacture and processing of all noncoding RNA, and additionally carries the genes for tRNA. The 5 pairs of chromosomes mentioned above each contribute a portion of the nucleolus, and these portions then fuse to form one large nucleolus.

Antibodies to RNA Polymerase III in Systemic Sclerosis are associated with a higher risk of cancer.


Monday, 20 April 2020

Insights From Human Genome Sequencing

It's two decades since the human genome was sequenced. What it revealed has changed our understanding of the human genome and allowed us to construct a phylogenetic tree of how we got here.

The human genome, like most other mammalian genomes, comprises 3.2 billion base pairs. There are around 25,000 known genes. Most mammals have a similar number of DNA base pairs; the chicken has around a billion, while the Japanese pufferfish, Takifugu rubripes, is an outlier with only 400 million base pairs.

Only 5% of human DNA is transcribed, i.e. read into mRNA, and only 1.5% is translated into protein from exons. Vast tracts of DNA therefore have no apparent function. It is in this context that the pufferfish's remarkable genomic efficiency must be viewed, as it appears to have rid itself of most of its "junk" DNA through evolution.

The need to preserve and keep hold of essential DNA leads to the remarkable similarities seen between the genomes of diverse organisms such as yeast, a worm (Caenorhabditis elegans), a model plant (Arabidopsis thaliana) and mammals.

However, the "excess (noncoded) DNA" is not without its uses. It provides a window on millions of years of evolution. By comparing the genomic sequences of chicken and Homo sapiens, for example, one can tell that these two organisms diverged from their common ancestor 300 million years ago. A similar exercise tells you that we humans diverged from our nearest anthropoid relative, the Chimpanzee a mere 7 million years ago. We are slightly more distantly related to the Gorilla, and even more distantly to the Orang utan.

So what accounts for the remarkable redundancy of the human (and mammalian) genome? Most of it can be explained by "repeats", which account for over 50% of the >3 billion base pairs. There are several types of repeat, namely transposons, simple sequence repeats and segmental duplications. We'll discuss each in turn.

By sheer volume alone, the most abundant of these repeats are the transposons, forming fully 44% of the human genome. Barbara McClintock was awarded the Nobel Prize in 1983 for her discovery of transposons, popularly known as "jumping genes". Transposons are parasitic DNA, which far predate the human species itself. There is evidence that their origins stretch back more than 300 million years.

There are 4 types of transposons: LINEs, SINEs, LTR retrotransposons, and DNA transposons. Of these, LINE is the most abundant and arguably the most successful, as it still accounts for roughly 1 in 250 mutations in the human genome. Very few SINEs remain active, and the LTR retrotransposons and DNA transposons have died out for all practical purposes, with the solitary exception of HERV-K among the former.

LINEs, or Long Interspersed Nuclear Elements, constitute 21% of the human genome. Such DNA is "autonomous", meaning it codes for its own proteins needed for propagation. A LINE contains 2 open reading frames (the equivalent of exons), including a reverse transcriptase. Unlike in retroviruses and their related LTR retrotransposons, reverse transcription takes place in the nucleus, where an endonuclease makes a single stranded nick in the human DNA to insert the retrotranscribed LINE DNA. This process starts at the 3' end and proceeds towards the 5' end; however, it is often incomplete, i.e. in many cases it doesn't reach the 5' end. LINE derived RNA has a poly-A tail at the 3' untranslated region, which in eucaryotes protects the mRNA from degradation once transcribed. (The tail has exactly the opposite function in procaryotes.)

Misplaced insertion of LINE elements has been associated with diverse human diseases such as dementia and cancers. Perhaps the most interesting example, described by Kazazian, was the insertion of a LINE transposon from chromosome 22 into the middle of the Factor VIII gene on a woman's X chromosome, a fact uncovered after her son was born with Haemophilia A despite an absence of family history.

SINEs, or Short Interspersed Elements, are non-autonomous, unlike LINEs. They do not code for protein, and in fact depend on LINEs for the proteins needed for transposition. As such, they are more vulnerable to mutational loss. For example, when LINE2 died out roughly 50 million years ago, so did the associated SINEs.

While most SINEs are no longer functional and therefore cannot propagate, two- Alu and SVA- remain functional.

Interestingly, SINEs & LINEs are located in different parts of the genome. While SINEs favour GC rich areas, LINEs are located in more AT rich areas. GC rich areas have a higher gene density, while AT rich areas are "gene deserts", i.e. they are dominated by non-coding, apparently nonfunctional DNA. Some authorities think that SINE elements have a symbiotic relationship with the host DNA, reducing the likelihood of harmful mutations.

The LTR (Long Terminal Repeat) retrotransposons are thought to be the predecessors of ancient retroviruses. Like the latter, they have LTRs at both ends and reproduce by reverse transcription from RNA in the cytoplasm (not the nucleus, unlike LINEs). As such, they code for gag and pol proteins, just like retroviruses. LTR retrotransposons have all but died out in the human genome; the only remaining one, HERV-K, has no known function.

Similarly, DNA transposons- which contain inverted repeats at either end, and whose description in maize led to McClintock's Nobel Prize nearly 40 years later- are no longer functional in the human genome. They remain functional in bacteria, though, where they are responsible for horizontal transmission of antibiotic resistance. As they cannot spread horizontally between human beings, they have become nonfunctional in us.

The human genome is remarkably repeat rich with interspersed transposons, in comparison with the yeast or invertebrates. Furthermore, the transposons in the human genome are ancient, compared with their counterparts in these other organisms. Again, a direct comparison of these repeats between human beings and mouse shows that human repeats are much older. It seems therefore that Homo sapiens has kept hold of ancient repeats in comparison with other organisms including fellow mammals despite the fact that most of these repeats serve no discernible function. Wish we were all as efficient as the puffer fish!

Not all chromosomes in the human cell are equally ancient. The Y chromosome, for example, is a relatively "young" chromosome, with rapid turnover of repeats, i.e. the repeats on the Y chromosome are phylogenetically millions of years younger than those on other chromosomes.

Simple sequence repeats are 2 or 3 base repeats, such as AT and ATG, which are polymorphic. This latter property- polymorphism- particularly in (CA)n, has been useful in establishing identity, paternity testing, etc. When n is 1-13, these repeats are called microsatellites; when n is 14 or more, minisatellites. For some reason, (CA)n polymorphisms are infrequent on the X chromosome, i.e. most X chromosomes have roughly equal numbers of CA repeats.

Segmental duplications involve duplications of 1-250 kb. For some reason, they tend to favour pericentromeric and telomeric regions. They can be intrachromosomal or interchromosomal. When intrachromosomal, they are called Low Copy Repeats (LCRs). Intrachromosomal segmental duplications lead to deletion or duplication during crossover, and thus to contiguous gene syndromes such as CMT 1A (due to duplication of PMP22). Similarly, they can lead to microdeletion syndromes such as DiGeorge and velocardiofacial syndrome and Williams-Beuren syndrome.

LCRs are ubiquitous and can lead to problems with accurate genetic mapping with short reads, leading to gaps in the mapped genome.

Interchromosomal segmental duplication can lead to the spread of a disease causing sequence to other chromosomes. The most notable example is the duplication of the adrenoleukodystrophy locus from Xq28 to the pericentromeric regions of chromosomes 2, 10, 16 and 22. Many inter- and intrachromosomal duplications involve the X chromosome.

During meiosis, crossover occurs. Two structural observations are relevant here. First, crossovers tend to affect the short arms of chromosomes far more than the long arms. Secondly, meiotic crossover is less common close to centromeres, and increases in the terminal 20-35 Mb of the chromosome.

The unit for measuring the "closeness", or linkage, of loci is the centiMorgan (cM). The closer two loci are, the lower the likelihood of crossover between them at meiosis. When two loci are separated by 1 cM, there is a 1% chance that they will be separated by crossover at meiosis. The local rate of crossover is expressed as cM/Mb. The most crossover prone part of the human genome resides in the short arms of chromosomes X & Y: two genes located in Xp or Yp have an almost 100% chance of being separated at crossover.
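
In code, the rule of thumb is trivial but worth stating; this linear sketch only holds for small map distances (recombination fractions cannot in fact exceed 50%):

    def separation_chance(map_distance_cM: float) -> float:
        """Approximate chance that two loci are separated by crossover
        at meiosis; linear rule, valid for small distances only."""
        return map_distance_cM / 100.0

    print(separation_chance(1))   # 0.01
    print(separation_chance(20))  # 0.20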

Common though the repeats above are, there are certain portions of the genome which they leave alone, as if almost sacred. These regions have very few repeats. In mammals, 4 such regions are the homeoboxes HoxA, HoxB, HoxC, and HoxD. The homeoboxes are responsible for embryonic development along the antero-posterior axis, and it is thought that, ontogenically, mammals will not tolerate any disruption of this function by the interposition of repeats. The same does not apply to reptiles, which have many repeats in their Hox regions and display a remarkable variety of species, perhaps due to the variation caused by these repeats during embryonic development. The remarkable speciation found in Anolis lizards is a good example of this phenomenon.

Not all parts of the human genome are equally rich in GC or AT. In fact, GC pairs constitute only 41% of the human genome, with AT pairs making up the other 59%. It is thought that over millions of years there is a steady mutational erosion of GC, gradually replaced by AT. This is of some importance, as GC pairings are remarkably over-represented in gene rich regions, i.e. they appear in areas of high gene density. This is not to be confused with the density seen on Giemsa staining- the G bands. GC rich areas correspond to lighter G bands, while AT rich areas have denser G bands, i.e. the exact opposite of gene density.

What makes GC rich areas more gene dense? This is almost completely attributable to much shorter intron lengths in GC rich areas. Exon length and exon number are relatively invariant between GC rich and AT rich areas.

CpG dinucleotides consist of cytosine bound to guanine through a phosphodiester bond in the 5'-3' direction from C to G (that is to say, CpG is not the same as GpC). Going by the relative frequencies of the cytosine and guanine bases- 21% each- the frequency of CpG in the human genome should be 0.21 x 0.21, or around 4%. In actuality, it is only a fifth of this.
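
The expected figure is a one-line calculation from the base frequencies just quoted:

    freq_C = 0.21                # cytosine frequency in the human genome
    freq_G = 0.21                # guanine frequency

    expected = freq_C * freq_G   # chance of a C immediately followed by G
    print(expected)              # ~0.044, i.e. ~4% expected
    print(expected / 5)          # ~0.9%: roughly what is actually observed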

This remarkable finding is explained by the fact that a large proportion of cytosine bases in H. sapiens are methylated, and methylated cytosine spontaneously mutates to thymine. Unmethylated cytosine bases also mutate spontaneously- to uracil- but uracil, being foreign to DNA, is quickly corrected back to cytosine.

There may be an element of self preservation in the fact that humans methylate their CpG dinucleotides. The opposite applies to bacteria and viruses, which have hypomethylated DNA. When bacteria or viruses invade the human cell, TLR9 detects their unmethylated CpG DNA and activates the innate immune system. For viruses, this can lead to increased production of Type I interferons by plasmacytoid dendritic cells.

CpG islands in human beings are over-represented in promoter regions at the transcription start (5') end of genes, and it is thought that they play a vital part in the function of these promoters. As expected, CpG islands occur in gene dense areas, just like GC base pairs.

Again, human chromosomes differ in their content of CpG islands. The average across all human chromosomes is 5-15 CpG islands per Mb. The Y chromosome is relatively bereft, with only 2.9 per Mb, while Chromosome 19 is an extreme outlier with 43 per Mb.

Since there are 4 bases in RNA, the number of triplet codons that can be made from them is 4^3, or 64. As there are only 20 amino acids (21 if you include selenocysteine), there is redundancy here. The redundancy is also reflected in the number of anticodons on tRNA- only 46- due to the fact that the 1st RNA base of an anticodon, which corresponds to the 3rd base of the codon, often shows "wobble". For 2-codon boxes (where the 3rd base of the codon can be one of 2 choices), this is seen when the 3rd base is either C or U; the corresponding 1st base of the anticodon can then be either G or A. Asparagine, for example, has codons AAU & AAC. The cognate tRNA anticodon for asparagine could thus be either GUU or AUU. In reality, there are 33 genes coding for GUU and only one for AUU, reflecting both redundancy and codon preference.
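
A short Python sketch illustrates both the 64-codon count and the wobble rule for asparagine (the pairing rule below is deliberately simplified for illustration):

    from itertools import product

    # 4 bases, 3 positions: 4**3 = 64 possible codons
    codons = ["".join(c) for c in product("ACGU", repeat=3)]
    assert len(codons) == 64

    def wobble_match(anticodon: str, codon: str) -> bool:
        """Codon-anticodon pairing, allowing G:U wobble; the anticodon
        is antiparallel, so its 1st base pairs the codon's 3rd."""
        pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U")}
        return all((a, c) in pairs for a, c in zip(anticodon, reversed(codon)))

    print(wobble_match("GUU", "AAC"))  # True: Watson-Crick G:C
    print(wobble_match("GUU", "AAU"))  # True: wobble G:U, so one
                                       # anticodon serves both codons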

In practice, when A is present as the first base of an anticodon, it is almost always post-transcriptionally deaminated to inosine. Thus AUU, in reality, becomes IUU.