SARS-CoV-2: Structure, Proteins and Pathogenesis of the New Coronavirus

Who I am
Louise Hay


For over a year now, the world's attention has been focused on COVID-19, the respiratory infection caused by the new coronavirus SARS-CoV-2, and on the pandemic that resulted from it.

The current epidemiological data indicate that SARS-CoV-2 is present in over 200 countries; approximately 113 million people worldwide have fallen ill with COVID-19 (February 2021) and, of these, about 2.5 million have died.

SARS-CoV-2 is a virus that mainly affects the respiratory tract, causing symptoms such as cough, cold, fever and, in the most severe cases, breathing difficulties; sometimes, however, it can also induce systemic inflammation, causing sepsis, heart failure and multi-organ dysfunction.

SARS-CoV-2 infection is particularly dangerous for people over 60, for those with chronic diseases (e.g. diabetes, coronary heart disease) and for people being treated with drugs that suppress the immune system (e.g. chemotherapy, immunosuppressants).

This article aims to analyze the structure, genome and proteins of SARS-CoV-2, and to provide fundamental information related to the pathogenesis of the virus.

For further information: SARS-CoV-2: How to Recognize the First Symptoms and What to Do

Structure of SARS-CoV-2

How SARS-CoV-2 Is Made: Structure, Genome and Proteins

A betacoronavirus, SARS-CoV-2 is a positive-sense single-stranded RNA virus with a pericapsid (the envelope).

The pericapsid is a sort of envelope placed around the capsid of some viruses; it is made up of phospholipids and glycoproteins.

SARS-CoV-2 has a genome of 29,881 nucleotide bases, which codes for 9,860 amino acids.
This genome breaks down into genes for structural proteins and genes for non-structural proteins.
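
As a rough check on these figures, the arithmetic can be sketched in Python. Each amino acid is specified by a three-base codon, so 9,860 amino acids account for most, but not all, of the 29,881 bases. The names below are illustrative only, and the sketch ignores stop codons, untranslated regions and overlapping reading frames, so it is an approximation rather than a precise genome annotation.

```python
GENOME_LENGTH_BASES = 29_881   # genome size quoted in the text
ENCODED_AMINO_ACIDS = 9_860    # amino acid count quoted in the text
BASES_PER_CODON = 3            # one amino acid is specified by three bases


def coding_bases(amino_acids: int) -> int:
    """Bases needed to encode the given number of amino acids (stop codons ignored)."""
    return amino_acids * BASES_PER_CODON


def coding_fraction(amino_acids: int, genome_length: int) -> float:
    """Approximate share of the genome taken up by protein-coding bases."""
    return coding_bases(amino_acids) / genome_length


if __name__ == "__main__":
    used = coding_bases(ENCODED_AMINO_ACIDS)
    share = coding_fraction(ENCODED_AMINO_ACIDS, GENOME_LENGTH_BASES)
    print(f"{used} of {GENOME_LENGTH_BASES} bases are coding ({share:.1%})")
```

Under these simplifying assumptions, nearly the whole genome is devoted to coding for proteins.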

The genes for structural proteins code for the spike protein (abbreviated to S), the pericapsid protein (abbreviated to E, from envelope), the membrane protein (abbreviated to M) and the nucleocapsid protein (abbreviated to N).
As the name suggests, structural proteins combine to form the structure of SARS-CoV-2.

The genes for non-structural proteins, on the other hand, encode proteins such as the 3-chymotrypsin-like protease, the papain-like protease and the RNA-dependent RNA polymerase, whose function is to regulate and direct the replication and assembly of the virus.

Below is a description of the individual structural proteins, with a focus on protein S, and of the non-structural proteins.

To learn more: Coronavirus: What are they?

Spike Protein

Structure of the SARS-CoV-2 Spike Protein

The spike protein (or protein S) of SARS-CoV-2 (and of all known coronaviruses) covers the external surface of the virus, forming the characteristic protuberances that give the New Coronavirus the appearance of a crown (from which the term "Coronavirus" derives).

The spike protein weighs 180-200 kDa (kilodaltons) and is made up of 1,273 amino acids.
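
For a sense of scale, the mass of the bare amino acid chain can be estimated from its length. The sketch below assumes a rule-of-thumb average of about 110 Da per amino acid residue; that value and the function name are assumptions for illustration, not figures from this article. The estimate comes out below the 180-200 kDa quoted above, which is consistent with the spike being a glycoprotein that carries attached sugar chains in addition to its amino acids.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # assumed rule-of-thumb average mass per residue, in daltons
SPIKE_LENGTH_AA = 1_273        # spike length quoted in the text


def estimate_chain_mass_kda(n_residues: int,
                            avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough mass of an unmodified amino acid chain, in kilodaltons."""
    return n_residues * avg_residue_mass_da / 1000


if __name__ == "__main__":
    mass = estimate_chain_mass_kda(SPIKE_LENGTH_AA)
    print(f"Estimated mass of the bare chain: about {mass:.0f} kDa")
```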

The spike is made up of two major amino acid components, called the S1 subunit (residues 14-685) and the S2 subunit (residues 686-1,273). More precisely:

  • The S1 subunit hosts an amino acid sequence known as the RBD ("Receptor Binding Domain"), which is essential for binding the virus to the cells of the host (i.e. the human being).
  • The S2 subunit, on the other hand, contains the amino acid sequences (fusion peptide, HR1, HR2, transmembrane domain and cytoplasmic domain) whose ultimate function is to promote the fusion of the virus with host cells and its entry into them.
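
The residue ranges given above can be expressed as a simple lookup. This is a minimal sketch, assuming the S1 and S2 boundaries quoted in the text; the function name and the label used for residues 1-13 are hypothetical.

```python
SPIKE_LENGTH = 1_273
S1_RESIDUES = range(14, 686)     # residues 14-685, as given in the text
S2_RESIDUES = range(686, 1_274)  # residues 686-1,273, as given in the text


def spike_subunit(position: int) -> str:
    """Return the subunit that contains a 1-based residue position of the spike."""
    if not 1 <= position <= SPIKE_LENGTH:
        raise ValueError(f"position must be between 1 and {SPIKE_LENGTH}")
    if position in S1_RESIDUES:
        return "S1"
    if position in S2_RESIDUES:
        return "S2"
    # Residues 1-13 fall before the S1 range quoted in the text.
    return "outside S1/S2"


if __name__ == "__main__":
    for pos in (10, 14, 685, 686, 1_273):
        print(pos, spike_subunit(pos))
```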

In its native state (i.e. when the virus is not infecting anyone), the spike protein exists as an inactive precursor. When the virus encounters a cell it can infect, however, the spike switches to an active form: proteases of the target cell trigger the activation (so it is the host itself that activates it) by cleaving the spike into the S1 and S2 subunits.

How SARS-CoV-2 Spike Protein Works

The way the SARS-CoV-2 spike protein works is complex; this article aims to simplify it as much as possible so that any reader can follow it.

The spike protein is essential for initiating the host infection process; in other words, it is the weapon that the New Coronavirus uses to cause the infection known as COVID-19.

The spike-driven infection process can be divided into two stages:

  • Binding to the host cell. This is the phase in which the virus attaches and binds to the cells of the organism that it will then infect.
  • Fusion of the viral membrane with the membrane of the host cell. This is the phase that allows the virus to enter the cells of the attacked organism and release its genome there.

Binding to host cells

The spike protein binds to host cells through the RBD sequence of the S1 subunit.

Scientific studies have observed that the RBD sequence binds to host cells by means of an interaction with the ACE2 receptor placed on the surface of the plasma membrane of the cells themselves.

ACE2 is an enzyme homologous to ACE (angiotensin-converting enzyme); among other things, it converts angiotensin I into angiotensin 1-9.
In humans, ACE2 is found mainly on the plasma membrane of cells in organs such as the lungs, intestines, heart and kidneys.

Once the S1 subunit has bound to ACE2, the S protein begins to change conformation; this event serves to favor the fusion phase and the entry of the virus into the host cell.

Binding to ACE2 and the resulting conformational change are two fundamental aspects for developing vaccines against SARS-CoV-2 and for understanding the mechanisms of antigenicity and the immune response mounted by the host.

However, there is a problem to consider: mutations in the S1 subunit, and in the RBD sequence in particular, could change the way the conformational change unfolds; this, in turn, could affect the antigenic characteristics of the virus and the efficacy of vaccines (to learn more about the topic, we recommend reading the article dedicated to the variants of SARS-CoV-2).

Host Cell Fusion

The spike protein fuses the virus to the host cell through the amino acid sequences of the S2 subunit.

Fusion follows from the conformational change in protein S induced by the binding of the RBD to the host ACE2 receptor: the change in spike conformation brings the viral membrane closer to the plasma membrane of the host cell until the two membranes interact and fuse and, finally, the infecting virus is taken up into the cell.

Once the viral genome is inside the host cell, the virus begins its replication and the infection process can be considered complete.

For further information: Spike Protein Mutations: SARS-CoV-2 Variants

Pericapsid (envelope) protein

Also known as E proteins, the SARS-CoV-2 pericapsid proteins contribute to the formation of the pericapsid.

They constitute a group of very small proteins, consisting only of 75-109 amino acids.

Despite their small size, E proteins have an extremely significant functional role: they support the assembly and release of virions.

In microbiology, a virion is the mature viral particle, with its nucleic acid (DNA or RNA) enclosed in a protein shell called the capsid.

Studies have shown that SARS-CoV-2 protein E is a viroporin which, once in the host cell, localizes to the membranes of the Golgi apparatus and the endoplasmic reticulum, where it facilitates the assembly and release of virions.

A viroporin is a viral protein that acts as a membrane channel within the host's cells.

The SARS-CoV-2 protein E is very similar to that of SARS-CoV, while it has some differences from that of MERS-CoV.

Membrane protein

Membrane proteins (or M proteins) are the most abundant structural proteins in SARS-CoV-2.

They have an amino acid sequence of approximately 220 residues.

SARS-CoV-2 protein M has several functions:

  • Defines the shape of the pericapsid;
  • Interacting with proteins E, N and S, it organizes the assembly of virions.

Research has shown that without the M protein, but with all the other structural proteins available, SARS-CoV-2 is unable to assemble new virions within the host; this means that M proteins play a key role in the aforementioned process.

More specifically, the evidence suggests that:

  • The interaction between protein M and protein S ensures that the latter is incorporated into the new virions;
  • The interaction between protein M and protein N stabilizes the nucleocapsid (i.e. the RNA-N protein complex) and promotes the final assembly of the virions;
  • Together with protein E, protein M contributes to the formation of the pericapsid.

Nucleocapsid protein

Protein N, or nucleocapsid protein, is the only SARS-CoV-2 protein capable of binding to the viral genome.
Not surprisingly, this property gives it a key role in packaging the viral RNA inside the new virions.

The viral RNA - N protein complex constitutes the so-called nucleocapsid.

As mentioned above, the action of the N protein is mediated by the M protein: the interaction between these two proteins stabilizes the nucleocapsid and promotes the final assembly of the virions.

It should be noted that studies on the N protein showed that the latter is also involved in the transcription and replication of viral RNA.
Following this discovery, experts began to consider protein N as a possible target for new drugs specific to SARS-CoV-2.

Protein N is highly conserved among coronaviruses: for example, that of SARS-CoV-2 has an amino acid sequence that is about 90% identical to that of SARS-CoV.
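
A percentage like the 90% quoted above comes from comparing two aligned sequences position by position. The following minimal sketch shows the idea; it assumes the two sequences are already aligned and of equal length, and the example strings are made-up toy sequences, not real N protein data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy sequences (hypothetical): they differ only at the last position.
    toy_a = "ACDEFGHIKL"
    toy_b = "ACDEFGHIKV"
    print(f"{percent_identity(toy_a, toy_b):.0f}% identical")
```

Real comparisons first align the sequences with dedicated tools, which also have to deal with insertions and deletions; this sketch leaves that step out.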

Non-Structural Proteins

Origin and Function of SARS-CoV-2 Non-Structural Proteins

The topic of the non-structural proteins (abbreviated to "nsp") of SARS-CoV-2 is somewhat complex, so it is simplified here to make it easier to understand.

First, SARS-CoV-2 has 16 non-structural proteins in total.

They derive from two large proteins, called polyprotein 1a (pp1a) and polyprotein 1ab (pp1ab), which in turn are encoded, respectively, by the viral genes known as replicase 1a and replicase 1ab.

The formation of the non-structural proteins from the two polyproteins involves two specific viral enzymes, called proteases, which the virus produces early; these proteases "cut" the polyproteins at specific points, giving rise to the individual non-structural proteins.

The polyprotein strategy (from which smaller proteins are derived) is very common among viruses.

It is worth pointing out that, before the cutting process, the proteins still contained in the polyproteins are inactive and non-functional; they become functional only after the proteases intervene and release them from the larger amino acid chains.
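
The cutting of a polyprotein into individual proteins can be pictured as splitting a long string at fixed positions. The sketch below is a toy model only: the sequence and the cut positions are hypothetical and do not correspond to the real cleavage sites of pp1a or pp1ab.

```python
from typing import List


def cleave(polyprotein: str, cut_after: List[int]) -> List[str]:
    """Split a polyprotein after each listed 1-based residue position."""
    fragments = []
    start = 0
    for position in sorted(set(cut_after)):
        if not 0 < position < len(polyprotein):
            raise ValueError(f"cut position {position} is outside the chain")
        fragments.append(polyprotein[start:position])
        start = position
    fragments.append(polyprotein[start:])  # whatever remains after the last cut
    return fragments


if __name__ == "__main__":
    toy_polyprotein = "AAAAQBBBBBQCCC"   # hypothetical chain holding three proteins
    toy_cut_sites = [5, 11]              # hypothetical: cut after residues 5 and 11
    print(cleave(toy_polyprotein, toy_cut_sites))
```

Joining the fragments back together reproduces the original chain, which reflects the point made above: the proteases add nothing, they only release proteins that were already present in the polyprotein.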

The main function of the SARS-CoV-2 non-structural proteins is to carry out the transcription and replication of viral RNA.

However, it should be noted that these proteins are also involved in viral pathogenesis.

SARS-CoV-2 proteases

Two non-structural proteins that are fundamental for SARS-CoV-2 are, without a doubt, the proteases that "cut" the polyproteins and so form the proteins needed for the transcription and replication of viral RNA.

These proteases are known as the 3-chymotrypsin-like protease (abbreviated to 3CLpro) and the papain-like protease (abbreviated to PLpro).

Considering that the proteins to which they give rise then serve to spread the infection in the host, the proteases in question represent an interesting pharmacological target.

RNA-dependent RNA polymerase

The RNA-dependent RNA polymerase is the non-structural protein of SARS-CoV-2 that is essential for replicating the viral genome destined for the new virions.

This non-structural protein is also an attractive pharmacological target.

Viral pathogenesis

How SARS-CoV-2 causes COVID-19 infection

SARS-CoV-2 begins the infectious process when, through the spike protein, it manages to invade the host's cells.

As described in the section on protein S, the interaction between the spike RBD sequence and the ACE2 receptor on the plasma membrane of the host's respiratory tract cells allows the virus to enter the host organism.

Upon entry, SARS-CoV-2 "hijacks" the host's ribosomes and exploits them to translate its RNA genome into the proteins needed to replicate that genetic material and to assemble new virions.
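
The work done by the ribosome, reading RNA and producing a protein, can be sketched as reading the RNA three bases (one codon) at a time and looking up the matching amino acid. The codon table below is a deliberately tiny subset of the standard genetic code, and the RNA string is a made-up example, not a fragment of the viral genome.

```python
# Partial codon table (subset of the standard genetic code); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine, the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",
    "UAG": "*",
    "UGA": "*",
}


def translate(rna: str) -> str:
    """Translate an RNA string codon by codon until a stop codon or the end of the string."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons missing from this subset
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    toy_rna = "AUGUUUGGCAAAUAA"  # hypothetical message: start, three codons, stop
    print(translate(toy_rna))
```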

Based on the above, a key role in the transcription and replication of viral RNA belongs to non-structural proteins.

With the transcription and replication of the viral genome, SARS-CoV-2 begins to spread in the host, initiating the actual infectious disease.

In this phase, the virus acts on the host organism both through cytocidal activity (i.e. it kills cells) and through immune-mediated mechanisms.

Regarding cytocidal activity, the evidence suggests that SARS-CoV-2 induces apoptosis (programmed cell death) and cell lysis; more specifically, the virus has been found to induce the formation of syncytia among infected cells and to cause the Golgi apparatus to rupture after replication.

As for the immune-mediated mechanisms, research has shown that SARS-CoV-2 engages both the innate immune system and the adaptive immune system (antibodies and T lymphocytes).

Why is SARS-CoV-2 more infectious than the SARS Coronavirus?

SARS-CoV, the coronavirus responsible for SARS, also invades host cells by exploiting the interaction between RBD and the ACE2 receptor present on respiratory tract cells.

However, there is an important difference between this interaction and the one established by SARS-CoV-2: the RBD sequence of the coronavirus responsible for COVID-19 has a much higher affinity for ACE2 and binds to it much more efficiently, which makes the virus much more effective at invading host cells.

Scientific studies have shown that this difference in interaction is due to a different amino acid composition in the RBD of SARS-CoV and the RBD of SARS-CoV-2; in particular, two amino acid regions show important differences.

This difference in affinity explains several aspects:

  • The reason why SARS-CoV-2 has a higher R0 than SARS-CoV;
  • The reason that drugs and vaccines that targeted the SARS-CoV RBD sequence and appeared to be effective are not suitable against SARS-CoV-2.
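
To illustrate why a higher R0 matters: R0 is the average number of people that one infected person goes on to infect in a fully susceptible population, so the number of new cases multiplies by R0 with each generation of transmission. The sketch below uses purely hypothetical R0 values (they are not estimates for either virus) and ignores immunity, interventions and every other real-world factor.

```python
def new_cases_in_generation(r0: float, generation: int) -> float:
    """New cases in a given transmission generation, starting from one case at generation 0."""
    if generation < 0:
        raise ValueError("generation must be non-negative")
    return r0 ** generation


def cumulative_cases(r0: float, generations: int) -> float:
    """Total cases from generation 0 up to and including the given generation."""
    return sum(new_cases_in_generation(r0, g) for g in range(generations + 1))


if __name__ == "__main__":
    for hypothetical_r0 in (2.0, 3.0):  # illustrative values only
        total = cumulative_cases(hypothetical_r0, 5)
        print(f"R0 = {hypothetical_r0}: {total:.0f} cases after 5 generations")
```

Even a modest difference in R0 compounds quickly, which is why the number of cases diverges so sharply after only a few generations.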
For further information: New Coronavirus and SARS: Analogies and Differences

Inflammatory response

SARS-CoV-2 and Cytokine Release

Upon entry of SARS-CoV-2, the immune system of the infected host organism is activated.

At this point, components of the immune system (e.g. T lymphocytes) reach the site of infection and attack the virus.

In most people, the above process is successful, clearing the virus from the body and allowing the individual to recover.

In a certain percentage of cases, however, the infectious disease becomes more serious and SARS-CoV-2 stimulates an aberrant immune response.

In these situations, the virus has been found to cause an overproduction of pro-inflammatory cytokines (e.g. interleukin-1, interleukin-2, interleukin-6) that accumulate in the lungs to the point of damaging the lung parenchyma.

Pro-inflammatory cytokines arise from the activity of certain cells of the immune system.
Under normal conditions, they serve to regulate the immune response, inflammation and hematopoiesis.

Furthermore, clinical data and other research have shown that the overproduction of pro-inflammatory cytokines seen in severe SARS-CoV-2 infection can extend to other organs (e.g. the heart), causing them to malfunction, and can affect coagulation, inducing the formation of thrombi.

When SARS-CoV-2 triggers an extensive overproduction of pro-inflammatory cytokines, experts refer to the phenomenon as "cytokine storm syndrome".

