In silico structure-based discovery of a SARS-CoV-2 main protease inhibitor

The Coronavirus Disease 2019 (COVID-19) pandemic caused by the novel lineage B betacoronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has resulted in significant mortality, morbidity, and socioeconomic disruptions worldwide. Effective antivirals are urgently needed for COVID-19. The main protease (Mpro) of SARS-CoV-2 is an attractive antiviral target because of its essential role in the cleavage of the viral polypeptide. In this study, we performed an in silico structure-based screening of a large chemical library to identify potential SARS-CoV-2 Mpro inhibitors. Among 8,820 compounds in the library, our screening identified trichostatin A, a histone deacetylase inhibitor and an antifungal compound, as an inhibitor of SARS-CoV-2 Mpro activity and replication. The half maximal effective concentration of trichostatin A against SARS-CoV-2 replication was 1.5 to 2.7µM, which was markedly below its 50% effective cytotoxic concentration (75.7µM) and peak serum concentration (132µM). Further drug compound optimization to develop more stable analogues with longer half-lives should be performed. This structure-based drug discovery platform should facilitate the identification of additional enzyme inhibitors of SARS-CoV-2.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the Coronavirus Disease 2019 (COVID-19) pandemic, is a novel lineage B betacoronavirus first discovered in Wuhan, China, in late 2019 [1]. SARS-CoV-2 is highly transmissible and rapidly disseminated worldwide, causing more than 102 million cases of COVID-19, including over 2.2 million deaths, as of 2 February 2021 [2][3][4]. While the overall case-fatality rate of COVID-19 is about 2%, the infection is especially severe in the elderly and those with underlying diseases [4]. In the past year, a number of potential antiviral treatments for COVID-19 have been evaluated in clinical trials. Examples include monotherapy and/or combinatorial regimens of remdesivir, interferon-β1b, lopinavir-ritonavir, and hydroxychloroquine [5][6][7][8]. However, their effects on disease outcomes are restricted to selected groups of patients, and the interim results of the WHO Solidarity Trial suggested that these treatments might have little or no effect on hospitalized COVID-19 patients in terms of overall mortality, ventilation requirement, and duration of hospital stay [9]. Therefore, the discovery of additional effective antivirals for COVID-19 is urgently needed.
De novo development of new antiviral agents for emerging viral infections usually takes years and inevitably lags behind the rapid evolution of the epidemics [10]. To find immediately available treatment options for COVID-19, repurposing studies of existing drug compounds have been conducted [11]. The major limitation of cell-based screening of antivirals is that it is highly laborious. An alternative strategy is to exploit in silico structure-based screening of chemical libraries, which has the advantages of being fast and providing mechanistic insights related to the target viral protein structure [12].
Similar to other betacoronaviruses, including SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV), the genome of SARS-CoV-2 is arranged in the order of 5'-replicase [open reading frame (ORF) 1a/b]-structural proteins [Spike (S)-Envelope (E)-Membrane (M)-Nucleocapsid (N)]-3' [13,14]. The ORF1a/b encodes a number of viral enzymes with important roles in the viral replication cycle, including the main (Mpro) or chymotrypsin-like cysteine (3CLpro) protease, papain-like cysteine protease (PLpro), RNA-dependent RNA polymerase (RdRp), and helicase, which are potentially druggable targets [10]. The SARS-CoV-2 Mpro plays an important role in viral replication by processing polyproteins that are translated from viral RNA [13]. The SARS-CoV-2 Mpro cleaves various non-structural proteins (nsp4 to nsp16), including the RdRp (nsp12) and helicase (nsp13). Because of its essential role in viral replication, the SARS-CoV-2 Mpro represents one of the most attractive antiviral drug targets [15,16]. A number of crystal structures of the SARS-CoV-2 Mpro with or without bound inhibitors have been recently reported [17][18][19]. In this study, we established an in silico screening platform based on these crystal structures to identify potential SARS-CoV-2 Mpro inhibitors from a chemical library consisting of >8,800 compounds.

Molecular docking
CovalentDock was used for covalent virtual screening of DrugBank compounds against SARS-CoV-2 Mpro [20,21]. Compounds with covalently bondable chemical groups (Michael acceptor and β-lactam family) were recognized with the scripts provided in the program package. The relevant parts of the ligand structure were altered, i.e., by opening the β-lactam ring or the active C=C bond. A "dummy" atom was then artificially attached to temporarily occupy the empty valence for covalent linkage with the receptor. The altered ligand structure was optimized with the Amber GAFF force field during a short minimization [22]. The crystal structure of Mpro (code: 6LU7) was retrieved from the Protein Data Bank (PDB) [23]. The charge/protonation state of the protease was assigned with the H++ server [24]. The binding pocket on the protein surface was defined according to the native ligand pose. The Sγ atom of the nucleophilic Cys145 in Mpro was assigned as the covalent linkage acceptor. Hbind was used to detect intermolecular hydrogen bonds and to calculate the SLIDE affinity score and direct hydrophobic contacts [25,26]. The 3D intermolecular interaction plot was generated with PyMOL.

Main protease purification and enzymatic assay
Genes encoding the SARS-CoV-2 Mpro (residues 3264-3569) were cloned into the expression vector pETH. The recombinant proteins were expressed in Escherichia coli BL21(DE3) cells and purified using the Ni2+-loaded HiTrap Chelating System (GE Healthcare) according to the manufacturer's instructions. The purity of each protein was assessed by 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The concentration of each protein was determined using the Bicinchoninic Acid Protein Assay Kit (Sigma-Aldrich). The Mpro enzyme inhibition assays were performed in triplicate in Greiner Bio-One 96-well black microplates using the peptide substrate Dabcyl-KTSAVLQ SGFRKM-E(Edans)-NH2 (GL Biochem) as previously described [27]. The assay was performed in a buffer composed of 20mM Tris base, 100mM NaCl, 1mM EDTA, and 1mM DTT at pH 7.3, with 50μM fluorescent substrate and 10μM Mpro in a final assay volume of 100μl. The compound was incubated with Mpro for 30min before addition of the substrate and fluorescence detection at an excitation wavelength of 355nm and an emission wavelength of 460nm. GC376 was included as a positive control inhibitor as previously reported [28].

Virus strain and titration
A clinical isolate of SARS-CoV-2, HKU001a (GenBank accession number MT230904), obtained from a COVID-19 patient was used in this study [29]. The virus was amplified by three additional passages in VeroE6 cells (American Type Culture Collection, ATCC) in DMEM supplemented with 1% fetal bovine serum (FBS, Gibco™, Life Technologies Corporation, Massachusetts, USA), 100units/ml penicillin, and 100μg/ml streptomycin to make working stocks of the virus (2×10⁵ 50% tissue culture infectious dose (TCID50)/ml) as previously described [29]. For virus titration, aliquots of SARS-CoV-2 were applied to confluent VeroE6 cells in 96-well plates for the TCID50 assay as previously described [30]. Briefly, serial 10-fold dilutions of the virus were inoculated onto a VeroE6 cell monolayer in quadruplicate and cultured in penicillin/streptomycin-supplemented DMEM and 1% FBS. The plates were observed for cytopathic effects for 4 days. Viral titer was calculated with the Reed and Münch endpoint method. One TCID50 was interpreted as the amount of virus that causes cytopathic effects in 50% of inoculated wells. All experiments with live SARS-CoV-2 were conducted in the Biosafety Level 3 facility of The University of Hong Kong [31][32][33].
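The endpoint calculation can be illustrated with a minimal Python sketch of the Reed and Münch method. This is not the authors' script. The function name, the well counts, and the dilution range are hypothetical and chosen only to match the quadruplicate, 10-fold dilution layout described above.

```python
def reed_muench_log10_endpoint(log10_dilutions, infected_wells, wells_per_dilution):
    """Return log10 of the dilution at which 50% of wells would show CPE.

    log10_dilutions must run from most concentrated to most dilute,
    e.g. [-1, -2, -3, ...] for a serial 10-fold dilution series.
    """
    uninfected_wells = [wells_per_dilution - k for k in infected_wells]
    n = len(infected_wells)
    # A well infected at a given dilution is assumed to be infected at every
    # more concentrated dilution, so each step accumulates the infected wells
    # from itself and from all more dilute steps.
    cum_infected = [sum(infected_wells[i:]) for i in range(n)]
    # Uninfected wells accumulate in the opposite direction.
    cum_uninfected = [sum(uninfected_wells[: i + 1]) for i in range(n)]
    percent = [100.0 * a / (a + b) for a, b in zip(cum_infected, cum_uninfected)]
    for i in range(n - 1):
        if percent[i] >= 50.0 > percent[i + 1]:
            proportionate_distance = (percent[i] - 50.0) / (percent[i] - percent[i + 1])
            step = log10_dilutions[i + 1] - log10_dilutions[i]
            return log10_dilutions[i] + proportionate_distance * step
    raise ValueError("the dilution series does not bracket the 50% endpoint")


# Hypothetical plate: quadruplicate wells, 10^-1 to 10^-6 dilutions.
endpoint = reed_muench_log10_endpoint([-1, -2, -3, -4, -5, -6], [4, 4, 4, 3, 1, 0], 4)
titer_per_inoculum = 10 ** (-endpoint)  # TCID50 per inoculated volume
```

Dividing the resulting titer by the inoculum volume in millilitres would give TCID50/ml; that volume is not stated in the text, so it is left out of the sketch.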

Cell lines and drug compounds
VeroE6 and Caco-2 cell lines were obtained from ATCC as we previously described [29,34,35]. Trichostatin A was purchased from MedChemExpress (New Jersey, USA).

Cytotoxicity assay
The 50% effective cytotoxic concentration (CC50) of trichostatin A in Caco-2 cells was determined by the CellTiter-Glo® luminescent cell viability assay (Promega) as we previously described with slight modifications [11,36,37,38]. Briefly, Caco-2 cells (4×10⁴ cells/well) were incubated with different concentrations of trichostatin A for 48h, followed by the addition of 40µl/well of CellTiter-Glo® substrate and detection of luminescence after another 15 min. The CC50 was calculated using Sigma plot (SPSS) in an Excel add-in ED50V10.

Plaque reduction assay
A plaque reduction assay was performed to determine the half maximal effective concentration (EC50) as we previously described with slight modifications [41,42]. Briefly, VeroE6 cells were seeded at 4×10⁵ cells/well in 12-well tissue culture plates on the day before the assay was performed. After 24h of incubation, 50 plaque-forming units (PFU) of SARS-CoV-2 were added to the cell monolayer with or without the addition of drug compounds. The plates were further incubated for 1h at 37°C in 5% CO2 before removal of unbound viral particles by aspiration of the media and washing once with DMEM. Monolayers were then overlaid with media containing 1% low melting agarose (Cambrex Corporation, New Jersey, USA) in DMEM and appropriate concentrations of trichostatin A, inverted, and incubated as above for another 72h. The wells were then fixed with 10% formaldehyde (BDH, Merck, Darmstadt, Germany) overnight. After removal of the agarose plugs, the monolayers were stained with 0.7% crystal violet (BDH, Merck) and the plaques were counted. The percentage of plaque inhibition relative to the control wells (i.e. without the addition of compound) was determined for each drug compound concentration. The EC50 was calculated using Sigma plot (SPSS) in an Excel add-in ED50V10. The plaque reduction assay experiments were performed in triplicate and repeated twice for confirmation.
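As a rough illustration of how plaque counts translate into an EC50, the Python sketch below computes the percentage of plaque inhibition and then estimates the EC50 by log-linear interpolation between the two concentrations that bracket 50% inhibition. This simple interpolation is a stand-in chosen for clarity and is not the curve-fitting procedure of the software named above. All plaque counts, concentrations, and function names are hypothetical.

```python
import math


def percent_inhibition(plaques_treated, plaques_control):
    """Plaque inhibition relative to the untreated control wells, in percent."""
    return 100.0 * (1.0 - plaques_treated / plaques_control)


def ec50_log_interpolation(concentrations_um, inhibition_pct):
    """Estimate EC50 by interpolating on log10(concentration) around 50%."""
    pairs = sorted(zip(concentrations_um, inhibition_pct))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pairs, pairs[1:]):
        if i_lo < 50.0 <= i_hi:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ec50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("50% inhibition is not bracketed by the tested concentrations")


# Hypothetical mean plaque counts (not data from this study).
control_plaques = 50
doses_um = [0.625, 1.25, 2.5, 5.0, 10.0]
treated_plaques = [45, 30, 15, 0, 0]

inhibition = [percent_inhibition(p, control_plaques) for p in treated_plaques]
ec50_um = ec50_log_interpolation(doses_um, inhibition)
```

A four-parameter logistic fit would normally be preferred when enough dose points are available; the interpolation only shows the direction of the calculation.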

Immunofluorescence staining
Antigen expression in SARS-CoV-2-infected cells was detected with an in-house rabbit antiserum against SARS-CoV-2 nucleocapsid (N) protein as we previously described [43][44][45]. Cell nuclei were labelled with the DAPI nucleic acid stain from Thermo Fisher Scientific (Waltham, MA, USA). The Alexa Fluor secondary antibody was obtained from Thermo Fisher Scientific. Mounting was performed with the ProLong Diamond Antifade mountant from Thermo Fisher Scientific. Images were acquired and processed as we previously described [46].

Time-of-drug-addition assay
Time-of-drug-addition assay was performed for trichostatin A as previously described with slight modifications [47]. Briefly, VeroE6 cells were seeded in 24-well plates (2×10⁵ cells/well). The cells were inoculated with SARS-CoV-2 (MOI = 0.1) and then incubated for 1h for virus internalization. The viral inoculum was then removed and the cells were washed twice with PBS. At 0hpi (i.e. after PBS wash) and 3hpi, 10µM trichostatin A was added to the infected cells, followed by incubation at 37°C in 5% CO2 until 9hpi. For the "pre-incubation" time-point, 10µM trichostatin A was added to pre-treat cells at 2h before virus infection and then removed, followed by drug-free medium incubation with the cells until 9hpi. For the "co-infection" time-point, 10µM of trichostatin A was added together with the virus inoculation, followed by drug removal after 1h and incubation of the cells until 9hpi. At 9hpi, the cell culture supernatant of each time-point experiment was collected for viral load measurement using qRT-PCR. Dimethyl sulfoxide (0.5%) was included as a negative control for each group.

Covalent virtual screening of SARS-CoV-2 Mpro inhibitors
To screen for potential covalent inhibitors of SARS-CoV-2 Mpro, DrugBank release version 5.1.6, which contains 8,820 compounds with 3D structures available for docking, was used for covalent virtual screening. Compounds with electrophilic chemical groups were first automatically recognized and prepared with CovalentDock (Figure 1). As a result, 177 compounds were selected for covalent docking screening. To eliminate pose scoring biases, the SLIDE scoring function was also used for consensus scoring. A total of 75 drug compounds with CovalentDock score > -12 and SLIDE score > -7 were excluded, leaving 102 drug compounds for further analysis. After manual inspection and consideration of the hydrogen bond potential, shape complementarity of the binding pose, ligand efficiency, and hydrophobic contacts, 5 purchasable compounds, namely, canertinib, fexaramine, PD-168393, piperine, and trichostatin A, were selected as potential SARS-CoV-2 Mpro inhibitors for downstream experimental validation (Table S1).
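The consensus-scoring cut-off described above can be sketched in a few lines of Python. The sketch reads the exclusion rule literally (a compound is dropped only when both scores are above their thresholds), which is an assumption about the intended logic. The compound names and scores are hypothetical, and the class and function names are illustrative rather than part of CovalentDock or SLIDE.

```python
from dataclasses import dataclass

COVALENTDOCK_CUTOFF = -12.0  # threshold quoted in the text
SLIDE_CUTOFF = -7.0          # threshold quoted in the text


@dataclass(frozen=True)
class DockingResult:
    name: str
    covalentdock_score: float
    slide_score: float


def is_excluded(result):
    """Literal reading of the rule: exclude when both scores are above the cut-offs."""
    return (result.covalentdock_score > COVALENTDOCK_CUTOFF
            and result.slide_score > SLIDE_CUTOFF)


def consensus_filter(results):
    """Keep every compound that is not excluded by the consensus rule."""
    return [r for r in results if not is_excluded(r)]


# Hypothetical docking output, for illustration only.
docked = [
    DockingResult("compound_A", -13.4, -8.1),  # passes both thresholds
    DockingResult("compound_B", -10.2, -6.5),  # fails both, excluded
    DockingResult("compound_C", -12.8, -6.9),  # fails SLIDE only, kept
]
kept = consensus_filter(docked)
```

If the intended rule were to exclude a compound that fails either threshold, replacing `and` with `or` in `is_excluded` would implement that stricter variant.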

Trichostatin A inhibits SARS-CoV-2 Mpro activity in vitro
To validate the SARS-CoV-2 Mpro inhibition of the selected drug compounds, we applied the EDANS-Dabcyl system for detection of Mpro cleavage activity and included GC376, an anti-feline coronavirus drug compound with proven inhibitory activity against SARS-CoV-2 Mpro, as a positive control [28,48]. As expected, GC376 exhibited potent SARS-CoV-2 Mpro inhibition with a half maximal inhibitory concentration (IC50) of 0.098±0.008µM (Figure 2A). Among the 5 selected drug compounds, only trichostatin A showed reduction of Mpro activity in a dose-dependent manner (IC50 = 37.97±3.68µM) (Figure 2B). Therefore, we further evaluated the cytotoxicity and antiviral activity of trichostatin A with additional antiviral assays.

Trichostatin A inhibits SARS-CoV-2 replication in vitro
Trichostatin A is known as a histone deacetylase inhibitor and an antifungal antibiotic [49]. The cytotoxicity and antiviral activity of trichostatin A were evaluated in Caco-2 cells as previously described [50]. The CC50 of trichostatin A in Caco-2 cells was 75.7±5.2μM after 48h of incubation (Figure 3A). In the viral load reduction assay, trichostatin A (50μM) treatment reduced viral RNA load in Caco-2 cell cultures (Figure 3B). Intracellularly, dose-dependent reduction of SARS-CoV-2 N protein production was detected in the cell lysates of trichostatin A-treated groups (Figure 3C). Moreover, immunofluorescence staining demonstrated marked suppression of SARS-CoV-2 N protein expression upon trichostatin A treatment (Figure 3D). To fully document the antiviral potency of trichostatin A, we further validated its anti-SARS-CoV-2 activity using a plaque reduction assay. As shown in Figure 3D, 5μM and 10μM of trichostatin A completely inhibited plaque formation of SARS-CoV-2, resulting in an EC50 of 1.5±0.3μM (Figure 3E). Overall, we demonstrated that trichostatin A potently inhibited viral RNA load, antigen expression, and infectious particle formation of SARS-CoV-2 in vitro at non-cytotoxic concentrations.
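The margin between cytotoxic and antiviral concentrations is often summarized as a selectivity index (CC50 divided by EC50). The short Python sketch below applies this ratio to the values reported in this article. The index itself is not reported by the study, and because the CC50 was measured in Caco-2 cells while the plaque reduction EC50 was measured in VeroE6 cells, the result should be read only as a rough indication.

```python
def selectivity_index(cc50_um, ec50_um):
    """Ratio of the cytotoxic to the antiviral concentration (same units)."""
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    return cc50_um / ec50_um


CC50_UM = 75.7               # Caco-2 cells, 48h (reported above)
EC50_RANGE_UM = (1.5, 2.7)   # range of EC50 values given in the abstract

si_upper = selectivity_index(CC50_UM, EC50_RANGE_UM[0])
si_lower = selectivity_index(CC50_UM, EC50_RANGE_UM[1])
```

With these inputs the index falls between roughly 28 and 50, depending on which EC50 value is used.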

Trichostatin A interrupts the post-entry events of the SARS-CoV-2 replication cycle
To investigate the phase of the SARS-CoV-2 replication cycle interrupted by trichostatin A, we performed a time-of-drug-addition assay by exposing the virus-infected cells to the drug at different time-points during the viral replication cycle, followed by measurement of virus titers at 9 hours post-inoculation (hpi). VeroE6 cells were infected with SARS-CoV-2 at an MOI of 0.1 (Figure 3F). No significant inhibitory activity was observed when trichostatin A was added at the virus adsorption stage (0 to 1 hpi, termed "co-infection"). SARS-CoV-2 attachment was not affected when VeroE6 cells were pre-incubated with trichostatin A (−2 to 0hpi). In contrast, an approximately 60% drop in progeny virions was detected when the drug was added after virus adsorption (0 hpi), whereas the inhibitory effect became marginal when trichostatin A was added at 3 hpi. Progeny virus could be detected as early as 9 hpi, indicating completion of a single virus life cycle [51]. Our time-of-drug-addition result suggested that trichostatin A interfered with the post-entry events of the SARS-CoV-2 replication cycle, which was compatible with the hypothesized role of trichostatin A as a SARS-CoV-2 Mpro inhibitor. The result also indicated that the proteolytic process executed by SARS-CoV-2 Mpro might occur within 3h after virus internalization.

Potential binding mode of trichostatin A to the catalytic site of the SARS-CoV-2 Mpro
Trichostatin A was predicted to bind to the catalytic site of the SARS-CoV-2 Mpro with good shape complementarity (Figure 4A). Inspection of the hydrogen bonding potential showed that trichostatin A forms hydrogen bonds with the LEU-141 backbone and the HIS-163 sidechain, which further stabilize the binding pose and improve the specificity (Figure 4B). Meanwhile, the ene-carbon of trichostatin A forms a covalent S-C linkage with the CYS-145 thiol (Figures 4B and 4C), which is a typical thiol Michael addition reaction (also known as thia-Michael addition). Thus, we reasoned that the complementary non-covalent interaction and the much stronger covalent linkage enabled trichostatin A to act as a potent SARS-CoV-2 Mpro inhibitor.

Discussion
Protease inhibitors have been successfully used to treat viral infections clinically, including human immunodeficiency virus (HIV) and hepatitis C virus infections. We have also previously shown that novobiocin and bromocriptine might be repurposed as Zika virus NS2B-NS3 protease inhibitors [12,51]. For SARS, MERS, and COVID-19, we and others have demonstrated that the HIV protease inhibitor lopinavir was effective in vitro and/or in vivo [7,10,40,52,53,54,55]. In this study, we utilized in silico structure-based screening to identify potential SARS-CoV-2 Mpro inhibitors from a large chemical library consisting of nearly 9,000 compounds. Among the primary hit compounds, we further validated the inhibitory effect of trichostatin A on SARS-CoV-2 Mpro activity using an enzyme inhibition assay. The antiviral activity of trichostatin A against SARS-CoV-2 was evident in the drug's ability to significantly reduce the viral RNA load, viral antigen expression, and infectious virus particle formation. Consistent with the expected role of Mpro, our time-of-drug-addition assay showed that trichostatin A interrupted the post-entry events of the SARS-CoV-2 replication cycle. Molecular docking analysis predicted that trichostatin A was able to bind to the catalytic site of the SARS-CoV-2 Mpro with good shape complementarity.
Trichostatin A was originally reported as a fungistatic antibiotic obtained from a culture broth of Streptomyces platensis [56]. Subsequently, its potent inhibitory effect on histone deacetylase (HDAC) activity was identified [56]. Trichostatin A chelates zinc ions in the active site of HDAC, which prevents histone unpacking and makes DNA less available for transcription. Trichostatin A selectively inhibits class I and II mammalian HDAC families of enzymes, but not class III HDACs [57]. It is rapidly and extensively metabolized in mice following intraperitoneal administration, which is evidenced by its maximal serum concentration (Cmax) of 40 µg/ml (equivalent to 132µM) being achievable within 5 min of drug administration [58]. Despite this rapid metabolism, the major metabolite of trichostatin A, N-monomethyl trichostatin A amide, still exhibits HDAC inhibitory activity. The EC50 of trichostatin A against SARS-CoV-2 as determined by plaque reduction assay is around 1.5μM, which is below the Cmax. Nevertheless, given the very short plasma half-life of trichostatin A (<10 minutes at 80mg/kg) [59], further drug compound optimization to develop more stable analogues with longer half-lives should be performed. Alternatively, the synergistic effects between trichostatin A and other anti-SARS-CoV-2 drug compounds such as remdesivir, interferons, lopinavir, and ribavirin should be evaluated to identify potential combinatorial regimens in which trichostatin A may be used to enhance the effects of these clinically approved antivirals with longer half-lives.
Other drug compounds that have been reported to exhibit both inhibitory activity against the SARS-CoV-2 Mpro and reduced virus-induced cytopathic effects include the α-ketoamide boceprevir (antiviral EC50 = 1.31μM), the peptide-aldehyde calpain inhibitor II (antiviral EC50 = 2.07μM), and the sulfonate-featured peptide GC376 (antiviral EC50 = 3.37μM) [60]. Additionally, other drug compounds such as manidipine (anti-Mpro IC50 = 4.8μM), lercanidipine (anti-Mpro IC50 = 16.2μM), and bedaquiline (anti-Mpro IC50 = 18.7μM) have also demonstrated inhibitory activity against the SARS-CoV-2 Mpro, but their effects against SARS-CoV-2 replication remain to be examined [61]. Notably, boceprevir is an FDA-approved treatment for hepatitis C virus infection. Boceprevir (oral 800 mg three times daily) achieves a Cmax of 1,723 ng/mL (equivalent to 3.3 µM), which is above its in vitro anti-SARS-CoV-2 EC50 (1.31μM) [62]. GC376 is an investigational drug with in vivo efficacy for treating cats with feline infectious peritonitis caused by feline coronavirus. GC376 has a favourable Cmax that is >100-fold of the in vitro EC50 against feline coronavirus and an elimination half-life (T1/2) of 3-5 hours [63]. The in vivo efficacy of GC376 against mice infected with murine hepatitis virus or murine norovirus has also been reported [64].
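The mass-to-molar conversions quoted in the preceding paragraphs (40 µg/ml to 132µM for trichostatin A, and 1,723 ng/mL to 3.3 µM for boceprevir) can be checked with a short Python sketch. The molar masses used here (302.37 g/mol for trichostatin A and 519.68 g/mol for boceprevir) are standard values supplied for this check and are not stated in the article; the function name is illustrative.

```python
def ug_per_ml_to_um(conc_ug_per_ml, molar_mass_g_per_mol):
    """Convert a mass concentration in ug/ml to a molar concentration in uM.

    ug/ml is the same as mg/l; dividing by g/mol gives mmol/l,
    and multiplying by 1000 gives umol/l (uM).
    """
    return conc_ug_per_ml / molar_mass_g_per_mol * 1000.0


# Molar masses are assumptions of this sketch, not values from the article.
TRICHOSTATIN_A_G_PER_MOL = 302.37
BOCEPREVIR_G_PER_MOL = 519.68

tsa_cmax_um = ug_per_ml_to_um(40.0, TRICHOSTATIN_A_G_PER_MOL)
boceprevir_cmax_um = ug_per_ml_to_um(1723 / 1000.0, BOCEPREVIR_G_PER_MOL)  # ng/ml to ug/ml
```

Both results round to the micromolar values given in the text, which suggests the quoted conversions are internally consistent.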
The identification and validation of trichostatin A as a potent anti-SARS-CoV-2 drug compound has demonstrated the capability of our structure-based screening platform and in vitro enzyme inhibition and antiviral assays to discover SARS-CoV-2 Mpro inhibitors from a large chemical library. A similar approach could be adopted to screen additional libraries to find potential inhibitors of other key viral enzymes of SARS-CoV-2, including RdRp, helicase, and papain-like protease, to expand the treatment options for COVID-19.