An in silico approach for structural and functional analysis of Heavy Metal Associated (HMA) proteins in Brassica oleracea

Heavy metal ATPases (HMAs) are among the most important proteins involved in heavy metal accumulation. Brassica oleracea has five HMA homologues (HMA1-5), whose 3D structures were predicted and validated in this study with several bioinformatics tools. Phylogenetic and multiple sequence alignment analyses showed a close relationship between HMA2 and HMA4, and the same two domains were identified in all five HMA proteins: the E1-E2 ATPase domain and the haloacid dehalogenase (HAD) domain. Four proteins (HMA2-5) were predicted to localize to the plasma membrane, while HMA1 is predicted to localize to the plastid. Interactome analysis revealed strong interactions of all five HMA proteins with many metal ion binding proteins and chaperones. Among these, a strong interaction is observed between all HMA proteins and ATX1, while HMA1, HMA2 and HMA4 were found to interact strongly with FP3 (farnesylated protein 3) and FP6 (farnesylated protein 6). Docking site predictions and electrostatic potentials between HMA2/HMA4 and the interactome proteins are presented and discussed.


Brassicaceae family
Brassicaceae is a medium-sized and economically important family of flowering plants, informally known as the mustards, mustard flowers, the crucifers, or the cabbage family. The name is derived from the included genus Brassica. Brassica is one of the major crops worldwide, with Brassica oleracea as the main species consumed in Europe and the USA. Brassica vegetables have many characteristics beneficial to health, such as reducing the risk of age-related chronic illness, degenerative diseases and several types of cancer. They contain many vitamins essential for health, such as vitamins A, C, E, K and B6, as well as carotenoids (such as α- and β-carotene and zeaxanthin), anthocyanins, folate, soluble sugars and phenolic compounds, which are known to be the major antioxidants of Brassica crops [1].
Notably, all parts of Brassica plants are used as food, including roots, stems, leaves, flowers, buds and seeds. The genus comprises many species that differ in phenotype. Like all species in the genus, Brassica oleracea is very rich in vitamins and other nutrients. It has been bred into a wide range of cultivars, including cabbage, broccoli, cauliflower, Brussels sprouts, collards and kale, some of which are hardly recognizable as members of the same genus, let alone species [2]. Brassica vegetables are highly regarded for their nutritional value. They provide high amounts of vitamin C and soluble fiber and contain compounds with reported anticancer properties: 3,3'-diindolylmethane, sulforaphane and selenium [3]. Furthermore, Brassica vegetables are rich in indole-3-carbinol, a chemical which boosts DNA repair in cells in vitro and appears to block the growth of cancer cells in vitro [4]. They are also a good source of carotenoids, with broccoli having especially high levels [5], and of goitrogens, some of which suppress thyroid function [6].

Brassica oleracea genetic characterization
A recent study using AFLP markers evaluated the genetic diversity of kale landraces across Europe and compared it with the diversity in wild populations of Brassica oleracea. In total, 17 accessions were collected from around Europe, including Bosnia and Herzegovina, Croatia and Turkey. In Bosnia and Herzegovina, 47 individuals were analyzed from Rivine, Dubrave and the city of Stolac. Interestingly, among a total of 93 polymorphic markers scored, a unique allele was found in only one accession, the one from Bosnia and Herzegovina. In addition, the AFLP analyses of genetic diversity in leafy kale (Brassica oleracea L. convar. acephala) landraces showed 58% polymorphic loci for Herzegovina, compared with 69% for Croatia and 76% for Turkey. Accessions from Bosnia and Herzegovina, Croatia, Portugal and Turkey contain many individuals with a mixed genotype, sharing parts of their genome with other accessions due to common ancestry or gene flow [7].

Brassicaceae family as heavy metal accumulators
Humans contaminate soils and aqueous streams with large quantities of toxic metals through, for example, gas exhausts, energy and fuel production, intensive agriculture and sludge dumping. A number of studies from developing countries have reported heavy metal contamination in wastewater and wastewater-irrigated soils [8]. Heavy metals are harmful to humans and other life forms, as they can cause cancer, blindness, loss of organ function, severe illness and death. Some plants of the Brassicaceae family can accumulate high amounts of toxic metals without visible symptoms while at the same time being important food crops, which can lead to contamination of our food chain and has to be taken into account in any phytoremediation process [9]. In general, plants require at least 14 mineral elements for their nutrition. These include the macronutrients nitrogen (N), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg) and sulphur (S) and the micronutrients boron (B), chlorine (Cl), iron (Fe), manganese (Mn), copper (Cu), zinc (Zn), nickel (Ni) and molybdenum (Mo). Crop production is often limited by low bioavailability of essential mineral elements and/or the presence of excessive concentrations of potentially toxic metals, such as Fe, Mn, Cu, Cr, Cd, Pb, Zn and Al, in the soil solution. High concentrations of heavy metals in the soil can inhibit plant growth and reduce crop yields, which can severely affect sustainable development [10].
Some Brassicaceae species have already been shown to be effective heavy metal accumulators. For example, A. halleri is one of the best model organisms for the study of plant adaptation to extreme metallic conditions, since it is considered a Zn and Cd hyperaccumulator [11]. A. halleri copes with excess metal ions and their toxicity through effective metal uptake, increased xylem loading and increased detoxification in shoot tissues. In recent years, several types of transporters involved in these processes have been identified in Zn and Cd hyperaccumulators, specifically in A. halleri [12]. The most investigated proteins that transport toxic metals are the heavy metal ATPases, which are located within the membrane complexes of the plant cell. As their name implies, they produce or utilize energy in the form of ATP. There are three different types of these transporters: P-type, V-type and F0-F1-type. The type most commonly found in plants is the P-type. Proteins of this type usually transport the essential metal ions Cu2+, Zn2+, Mn2+, Fe2+ and Co2+; they do not produce energy but use it to pump these metals. V-type ATPases also consume energy, whereas F0-F1 ATPases produce energy instead of using it. The function of these proteins is to regulate the concentration of these metals in all plant tissues [10]. Their proper functioning is highly important, since high levels of essential and some non-essential metals can be very toxic for the plant [9]. For example, expression of the HMA1 gene from Atriplex canescens (AcHMA1) significantly increased the ability of yeast cells to adapt to and recover from exposure to excess iron. AcHMA1 expression also provided salt, alkaline, osmotic and oxidant stress tolerance in yeast cells. These results suggest that the HMA1 gene encodes a membrane-localized metal tolerance protein that mediates the detoxification of iron in eukaryotes and may be involved in the response to abiotic stress [13].
HMA2 is known for maintaining plant metal homeostasis by transporting Zn and Cd ions [14]. HMA2 and HMA4 have been shown to drive metal efflux out of the cell in A. thaliana [15] and to promote xylem loading of metal in N. caerulescens [16]. HMA4 is responsible for zinc hyperaccumulation in A. halleri, as shown by an RNA interference approach that down-regulated its expression. Additionally, transfer of the HMA4 gene to A. thaliana enabled zinc partitioning into xylem vessels and up-regulated specific genes characteristic of zinc hyperaccumulators [17]. This example shows the importance of regulatory gene expression and gene copy number expansion for the trait of metal hyperaccumulation. Furthermore, AtHMA4 has been shown to be responsible for the reduction of Cd uptake/accumulation [18]. In contrast, HMA3 is localized at the tonoplast, enabling vacuolar metal influx and therefore cellular sequestration [19]. Quantitative trait locus (QTL) analysis on chromosome 1 in Arabidopsis thaliana revealed a QTL that regulates Cu translocation capacity and involves Cu transport via HMA5 [20]. Furthermore, in Arabidopsis, the heavy metal P-type ATPase HMA5 has been shown to interact with metallochaperones and to function in copper detoxification of roots [21]. Some HMA genes are highly expressed in A. halleri, suggesting their importance in the hyperaccumulation process. Among the HMA genes, HMA2 and HMA4 are among the most important, with HMA4 playing a role in effective root-to-shoot Zn/Cd translocation [12] and HMA3 playing a role in Zn detoxification [22]. Additionally, HMA2 and HMA4 have both been demonstrated to be plasma membrane proteins, and both appear to function in Cd transport within the plant. Whole-plant analysis shows that HMA2 mutant plants accumulate more Zn and Cd than wild-type plants, although they do not appear to have an increased sensitivity to either metal. HMA4 mutant plants accumulate more Zn and Cd in the roots but less in leaves [22].

Retrieving HMA sequences and Multiple Sequence Alignment
The sequences of the heavy metal ATPase (HMA) proteins were obtained from the National Center for Biotechnology Information (NCBI) Protein Database [23]. The accession numbers of the sequences are listed in Table 1. Multiple sequence alignment (MSA) was performed with the Clustal Omega software on the website of the European Bioinformatics Institute (EBI), using default options [24]. Clustal Omega is a multiple sequence alignment program that uses seeded guide trees and HMM profile-profile techniques to generate alignments of three or more sequences. MSA is an invaluable bioinformatics tool used to measure the similarity between sequences, examine patterns of conservation and variability, and derive evolutionary relationships [25].
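To illustrate how similarity can be read off an alignment, the following minimal sketch computes the percent identity between two rows of an MSA. It is not the calculation performed by Clustal Omega; the function name and the toy sequences are hypothetical, and the convention chosen here (columns containing a gap in either sequence are ignored) is only one of several definitions of percent identity in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns with a gap ('-') in either row are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap column: not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy example (not real HMA sequences): 5 gap-free columns, 4 identical.
print(percent_identity("MKT-AL", "MKSQAL"))
```

Applied to every pair of rows in an alignment, such a function yields an identity matrix of the kind commonly reported alongside an MSA.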

Phylogenetic tree construction
In order to infer the evolutionary relationship between the HMA proteins, a phylogenetic tree was constructed using ClustalW2 phylogeny, a web service for phylogenetic analysis of molecular sequences [26].The service was run on default settings, and the steps that it performed to construct the phylogenetic tree involved MSA, alignment organization and construction, and visualization of the phylogenetic tree using different integrated tools.

3D structure prediction and validation
The structures of the HMA proteins were predicted with the Phyre2 protein homology modeling server [27]. Phyre2 is a web-based service for protein structure prediction that is free for non-commercial use and is one of the most popular methods for protein structure prediction. Cited over 1000 times, it is able to generate reliable protein models. Phyre2 has been designed and funded by the Biotechnology and Biological Sciences Research Council (BBSRC) of the United Kingdom. The widely cited molecular visualization tool PyMOL (version 1.31 edu; The PyMOL Molecular Graphics System) was used for structure visualization and representation; it provides viewing, customizing and exporting of the visualized molecules.
The validation and stereochemical analysis of the predicted structures was performed with several tools. The first was QMEAN6, available as the structure assessment tool on the ExPASy server. QMEAN6 is a scoring function that is a linear combination of six terms: a torsion angle potential over three consecutive amino acids, two distance-dependent interaction potentials, a solvation potential, and two terms describing the agreement between predicted and calculated secondary structure and solvent accessibility of the model [28]. Also provided is the QMEAN6 Z-score, which compares the estimated score with the scores of high-resolution reference structures solved experimentally by X-ray crystallography. Better predictions have higher QMEAN6 scores (between 0 and 1), while low-quality models are expected to have strongly negative Z-scores.
Next, we used Verify3D, which assesses protein structures using three-dimensional profiles by analyzing the compatibility of a 3D model with its own amino acid sequence (1D). The scores range from -1 (bad score) to +1 (good score) [29]. The stereochemical quality of the protein models was assessed with the PROCHECK software [30]. PROCHECK compares the geometry of the residues in the predicted model with stereochemical values from well-refined known structures. It produces Ramachandran plots, which provide information about the dihedral angles φ and ψ of the amino acid residues in the protein structure.
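The dihedral angles shown in a Ramachandran plot can be computed directly from backbone atom coordinates: φ is defined by the atoms C(i-1), N(i), CA(i), C(i), and ψ by N(i), CA(i), C(i), N(i+1). The sketch below is a generic dihedral calculation under these definitions, not PROCHECK's own code; the function name and the test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180 to 180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("central bond has zero length")
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: the first and last points lie on the same
# side of the central bond (cis arrangement).
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

For a real model, the four points would be the coordinates of the backbone atoms listed above, read from the predicted structure file for each residue.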

Localization of proteins
For the subcellular localization, we used the recently developed tool PSI (Plant Subcellular localization integrative predictor) which uses the group voting strategy and machine learning to combine the results of 11 independent subcellular localization tools: cello, mPloc, Predotar, mitoProt, MultiLoc, TargetP, WolfPSORT, subcellPredict, iPsort, Yloc and PTS1 [31].

Domain search and interaction prediction
The identification of domains in the five HMA proteins was performed with the online tool SMART (Simple Modular Architecture Research Tool), hosted on the website of the European Molecular Biology Laboratory (EMBL). The tool is able to detect more than 500 domain families from chromatin-associated, extracellular and signaling proteins [32].
The interactome of the HMA proteins was determined using STRING (Search Tool for the Retrieval of Interacting Genes/Proteins), which was used to search for interologs of the five proteins. The STRING database currently contains known and predicted interactions for 9 643 763 proteins from 2031 organisms. The predicted protein interactions are classified as physical or functional associations. In essence, the program determines binary interactions of each individual query protein with predicted partner proteins. Each interaction is further assigned a confidence score that reflects the quality and number of experimental techniques used to detect it [33].
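As an illustration of how such confidence scores are typically used, the sketch below assigns a score between 0 and 1 to the confidence bands offered on the STRING web interface (low 0.15, medium 0.4, high 0.7, highest 0.9) and filters a list of partners accordingly. The function names and the partner scores are hypothetical placeholders, not values obtained in this study.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score in [0, 1] to a named confidence band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below cutoff"


def strong_partners(interactions, minimum=0.7):
    """Keep partners whose score reaches the chosen minimum."""
    return [name for name, score in interactions if score >= minimum]


# Hypothetical scores for illustration only.
example = [("partner_A", 0.93), ("partner_B", 0.71), ("partner_C", 0.42)]
for name, score in example:
    print(name, confidence_band(score))
print(strong_partners(example))
```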

Docking sites prediction
Docking site prediction was carried out with the ClusPro 2.0 online software, an automated docking tool from the Structural Bioinformatics Lab of Boston University. ClusPro samples 70 000 rotations of the ligand protein, from which the 1000 rotations with the lowest score are chosen [34].
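The selection step described above can be sketched as follows. This is only a schematic of the filtering, assuming that each rotation has already been given a score; the random scores stand in for the real docking energy function, which is not reproduced here, and all names are hypothetical.

```python
import heapq
import random


def keep_lowest(scored_rotations, n_keep=1000):
    """Return the n_keep (rotation_id, score) pairs with the lowest score,
    sorted from lowest to highest score."""
    return heapq.nsmallest(n_keep, scored_rotations, key=lambda pair: pair[1])


# Placeholder scores for 70 000 rotations (random numbers, not energies
# from a docking run).
rng = random.Random(0)
rotations = [(rotation_id, rng.gauss(0.0, 1.0)) for rotation_id in range(70_000)]

kept = keep_lowest(rotations)
print(len(kept))
```

Using a heap-based selection avoids sorting all 70 000 entries when only the best 1000 are needed.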

MSA and Phylogenetic tree construction
Multiple sequence alignment was performed with the Clustal Omega online tool. The results of the sequence alignment are given in Table 2. Using the same tool, we constructed the phylogenetic tree that shows the evolutionary relationship between the aligned HMA proteins from Brassica oleracea.

Figure 1. Phylogenetic tree (cladogram) of HMA proteins in Brassica oleracea
In Figure 1 we observe two sister groups, one comprising the HMA2 and HMA4 proteins and the other the HMA1 and HMA5 proteins, each group having a common ancestor. HMA3, as a lone taxon, shares a common ancestor with HMA1 and HMA5 but is more distant. However, it shows more homology with HMA2 and HMA4, as confirmed by Table 2. In addition, a phylogenetic tree was constructed combining the five B. oleracea HMA proteins with HMA proteins from A. thaliana and B. rapa (see Figure 2). In this figure, it is clearly visible that the A. thaliana HMA proteins share common ancestors with all other taxa analyzed in the phylogenetic tree.

Protein localization
PSI assigns the proteins to 10 possible locations, each with a score from 0 to 1, where a higher score implies higher confidence in the presence of the protein in a particular subcellular compartment. The results of the protein localization prediction are shown in the table below:

Predicted and verified 3D structure models
Determining the structure of a protein is vital for a full understanding of its function, interactions and possible ligands, conserved domains and their homologues, and for many other purposes. However, experimental determination of the 3D structure is a demanding and time-consuming process, so bioinformatics tools are used to predict the structures of proteins of interest. The 3D structures of the HMA proteins predicted by the Phyre2 tool are shown in Figure 3. The SMART analysis revealed that all HMA proteins have the same domains but at different positions. The E1-E2 ATPase domain is a transmembrane domain, essentially a membrane-bound enzyme complex/ion transporter that uses ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy of a proton gradient and using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP [35].

Interactome of HMA proteins
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) was used to search for interologs of the five proteins. The STRING database currently contains known and predicted interactions for 9.6 million proteins from 2031 organisms [38]. The predicted protein interactions are classified as physical or functional associations. Before sequences in FASTA format are submitted to STRING, the organism to search against must be specified. STRING does not offer Brassica oleracea as an organism, so we used Brassica rapa, whose HMA proteins share more than 90% homology with those of Brassica oleracea. The interactome analysis revealed strong interactions of the HMA1, HMA2 and HMA4 proteins with FP3 (farnesylated protein 3) and FP6 (farnesylated protein 6), whereas all HMA proteins show strong interactions with the ATX1 (copper metallochaperone) protein and other related copper and ion binding proteins (see supplement Table 1 and Figure 7).

Figure 5: STRING interactome of Brassica rapa HMA1, HMA2, HMA4 and HMA5 protein family (original figure shown)
The interactome of the Brassica rapa HMA3 protein is not shown in figure 7 because this protein is absent from the STRING server. To analyze the interactome of HMA3, we used the homolog from Arabidopsis thaliana (AT4G30120) as a template, owing to its high similarity of 90% with the B. oleracea HMA3 protein. This analysis confirmed that HMA3 from Arabidopsis thaliana is also predicted to interact with FP3 and FP6, like the homologous heavy metal accumulator proteins from Brassica rapa. Furthermore, the results show strong interactions with other metal-transporting proteins, such as copper chaperone (CCH)-related proteins, heavy metal associated isoprenylated plant protein 27 (HIPP27) and several heavy metal transport/detoxification domain-containing proteins (see supplement table 2).

Docking sites prediction results
Figures 6, 7 and 8 show the results of the docking site prediction of HMA2 and HMA4 with ATX1, FP3 and FP6, respectively. The protein structures modeled with ClusPro were checked with the same verification tools as the HMA (1-5) proteins (Table 10). In addition, we introduced a further verification tool, DFire. This tool estimates the non-bonded atomic interactions in a model and thus provides an energy estimate; the lower the DFire energy score, the closer the model is considered to be to the native conformation (supplement table 3) [39].

Discussion
Brassica oleracea is known as a metal hyperaccumulator and, as such, can play an important role in environmental applications. Among these, phytoremediation is the most interesting and has attracted considerable attention from researchers in the last decade. However, besides phytoremediation and its positive effects, metal-accumulating plants are directly or indirectly responsible for much of the dietary uptake of toxic heavy metals by humans and animals. Vegetables such as cabbage (Brassica juncea, Brassica oleracea) cultivated in wastewater-irrigated soils take up heavy metals in quantities large enough to pose potential health risks to consumers [8].
Metal accumulation and translocation potential varies from plant to plant and from metal to metal [8]; it is therefore important to investigate both potentials in plants considered to be metal accumulators, which in our work was B. oleracea.
Analysis of proteins responsible for metal accumulation and transport is of great importance to understand how those plants perform their functions in hyperaccumulation of metals.The HMA proteins (1-5) analyzed in our work are already known to play important roles in heavy metal accumulation processes.
Besides their primary function as metal accumulators, it is important to investigate other processes in which HMAs may be involved. To investigate such processes, we looked for potential interactions with proteins not previously reported in the literature to interact with the analyzed HMA proteins.
In this work we have confirmed that the HMA2 and HMA4 proteins share the highest homology within the HMA protein family [40], with 71.7% similarity. The phylogenetic tree analysis of the HMA proteins additionally confirmed the similarity between these two proteins: HMA2 and HMA4 share the same ancestor and are separated from the other groups in the tree. These results suggest that, owing to their close evolutionary relationship, they play important biochemical roles by performing the same or similar functions within the cell. As reviewed by Hussain and colleagues [41], HMA2 and HMA4 play an important role in Zn transport and homeostasis in A. thaliana. By mutating the HMA4 and HMA2 genes, they observed a significant decrease in Zn accumulation. Furthermore, they observed that only the hma2-hma4 double mutant, and neither of the single mutants, exhibited an obvious nutritional deficiency in soil, suggesting that HMA2 and HMA4 have a level of functional redundancy. This is consistent with sequence comparisons showing that HMA4 is the most closely related to HMA2, as confirmed in this study.
For further phylogenetic investigation of HMA homologues from the Brassicaceae family, a Clustal Omega cladogram was constructed (Figure 5).
The 3D structures of proteins enable additional functional studies, domain analysis, molecular interaction studies, estimation of structural similarity between proteins, etc. In this study, we used the Phyre2 tool, a protein homology modeling server, to create models of the target proteins. Phyre2 builds profiles that capture the tendency for mutation of each amino acid in a sequence and are unique for each protein. They are created for a set of known 3D structures as well as for the user sequence, and are then scanned to find a match [42]. The modeled structures were then verified with three validation methods: QMEAN6, PROCHECK and Verify3D. The verification results for all five HMA proteins of B. oleracea showed sufficient quality for further analysis. According to our results, the Verify3D score is 0.70 for HMA1, 0.68 for HMA2, 0.78 for HMA3 and 0.72 for HMA4, with the highest score, 0.80, observed for HMA5. The Ramachandran plot analysis revealed that all structural regions lie in the accepted range, with more than 90% of residues in favored regions. The QMEAN6 score is 0.546 for HMA1, 0.516 for HMA2, 0.523 for HMA3 and 0.554 for HMA4, with the lowest score, 0.444, observed for HMA5 (see Table 5). All models have negative Z-scores (about -2 on average), an intermediate value for structural validation. Models of low quality are expected to have strongly negative QMEAN Z-scores, less than -3.5 [43]. The obtained Z-scores are in line with scores obtained for high-resolution experimental structures of similar size solved by X-ray crystallography. Therefore, the Phyre2-generated models appeared acceptable for protein and metal docking site prediction.
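The validation scores quoted above can be collected and compared in a few lines. The sketch below only restates the reported Verify3D and QMEAN6 values and the Z-score cutoff of -3.5 mentioned in the text; the function names are hypothetical, and the per-model Z-scores are not reproduced because only their approximate average is given.

```python
# Scores as reported in the text for the five B. oleracea HMA models.
verify3d = {"HMA1": 0.70, "HMA2": 0.68, "HMA3": 0.78, "HMA4": 0.72, "HMA5": 0.80}
qmean6 = {"HMA1": 0.546, "HMA2": 0.516, "HMA3": 0.523, "HMA4": 0.554, "HMA5": 0.444}


def extremes(scores):
    """Return the names of the best- and worst-scoring models."""
    best = max(scores, key=scores.get)
    worst = min(scores, key=scores.get)
    return best, worst


def is_low_quality(z_score, cutoff=-3.5):
    """Flag a model whose QMEAN Z-score falls below the cutoff."""
    return z_score < cutoff


print("Verify3D best/worst:", extremes(verify3d))
print("QMEAN6 best/worst:", extremes(qmean6))
print("Z-score of -2 flagged as low quality:", is_low_quality(-2.0))
```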
We have shown that all proteins share two domains, starting at different residues. The domains identified are the P-ATPase (E1-E2 ATPase) domain, found in membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane [44], and the haloacid dehalogenase (HAD) superfamily domain, which is involved in a variety of cellular processes ranging from amino acid biosynthesis to detoxification [37]. The interactome analysis revealed strong interactions of the HMA1, HMA2 and HMA4 proteins with FP3 (farnesylated protein 3) and FP6 (farnesylated protein 6), whereas all HMA proteins show strong interactions with the ATX1 (copper metallochaperone) protein and other related copper and ion binding proteins. To analyze the interactome of the HMA3 protein, we used the homolog from A. thaliana (AT4G30120) as a template, owing to its high similarity of 90% with the B. oleracea HMA3 protein. The HMA3 protein from A. thaliana is also predicted to interact with FP3 and FP6, like the homologous heavy metal accumulator proteins from B. rapa. Furthermore, the results show strong interactions with other metal-transporting proteins, such as copper chaperone (CCH)-related proteins, heavy metal associated isoprenylated plant protein 27 (HIPP27) and several heavy metal transport/detoxification domain-containing proteins (supplement table 1).
FP6 (also known as HIPP26) is characterized by a heavy metal binding domain (HMA) and an additional isoprenylation motif at the C-terminus. This family of HIPPs comprises at least 44 proteins in A. thaliana, with the HMA domain being responsible for heavy metal binding, metal transport and metal homeostasis. The motif is modified through the process of isoprenylation [45].
Isoprenylation, of which farnesylation is one form, is a posttranslational protein modification that involves the addition of a C-terminal hydrophobic anchor important for the interaction of the protein with membranes or other proteins [46].
In a study conducted by Barth and colleagues [45], it is confirmed that HIPP26 exhibits a nuclear localization signal (NLS), thus being localized in the nucleus.In their work, they also concluded that for the exact spatial localization of HIPP26 within the nucleus, the isoprenylation seems to be important, which probably by its hydrophobic nature determines the correct spatial arrangement of this protein within the nucleus.
Furthermore, their study confirmed that HIPP26 strongly interacts with ATHB29, a zinc finger homeodomain (ZF-HD) transcription factor that is induced by drought, high salinity and abscisic acid and thus plays a role in regulating the stress response of plants [45]. In addition, Gao and colleagues [47] showed that FP6 in A. thaliana (AtFP6), upon interaction with the plasma membrane acyl-CoA-binding protein 2 (ACBP2), mediates cadmium Cd(II) tolerance.
Given the strong interaction with FP6, the data presented here support the involvement of the HMA protein family in Cd(II) transport and tolerance, since all three HMAs (HMA1, HMA2 and HMA4) are cadmium/zinc-transporting ATPases. In addition, we suggest that these three HMA proteins may be important in stress-induced tolerance, as is the case for the FP6 protein [45]. FP3 from A. thaliana, if soluble and isoprenylated, is capable of reversibly binding a copper-chelate matrix in tobacco BY2 cell homogenates, suggesting a ubiquitous role for these proteins in diverse plants [48]. In this study, we confirm the interaction of all HMA proteins with CCH (copper chaperone) or CCH-related proteins. CCH has been shown to functionally complement atx1 mutants, but ATFP3 gene expression is not regulated in the same manner as CCH gene expression [49]. The ATX1 (copper metallochaperone) protein shows strong interactions with the HMA1, HMA2, HMA4 and HMA5 proteins.
ATX1 is a copper metallochaperone; such proteins assist copper in reaching vital destinations without inflicting damage or becoming trapped in adventitious binding sites [50]. ATX1 has been shown to bind Cu(I) in the cytoplasm and deliver it to a copper transporter in the membrane of post-Golgi vesicles. In the vesicle, the copper is inserted into a multicopper oxidase essential for high-affinity iron uptake, so ATX1 can be involved in both copper transport and defense against oxidative stress [49]. ATX1 is also proposed to be involved in Cu homeostasis through its Cu-binding activity and its interaction with the Cu transporter heavy metal-transporting P-type ATPase 5 (HMA5), suggesting a regulatory role for the plant-specific domain of the CCH Cu chaperone and, therefore, a role for HMA5 in Cu compartmentalization and detoxification [21].
In a more recent study, Lung et al. [51] confirmed that overexpression of ATX1 enhances Cu tolerance, which implies the potential use of ATX1 for phytoremediation of Cu-contaminated soil. In the same study, they linked HMA5 with ATX1, proposing that ATX1 delivers Cu to HMA5 for Cu detoxification in roots and translocation to shoots.
In the docking analysis, the HMA proteins were considered to be a ligand (according to ClusPro default settings, the ligand is the structure that gets rotated to fit into the receptor).Specific docking sites presented are ATX1 with HMA2 and HMA4.In this research we have verified our structures where all predicted models of docking show sufficient quality (see supplement table 3).
For further analysis of the predicted docking structures, the electrostatic potential between the HMA proteins and their docking partners was calculated with DeepView. This tool shows clouds of negative and positive electrostatic potential at the predicted docking site, from which we could conclude that at least part of every docking interaction is due to electrostatic forces. Supplement figure 1 shows the clear separation of charges between HMA2/HMA4 and ATX1 at similar docking regions.
The predicted docking region lies in the N-terminus, in agreement with the literature, where the N-terminal domains of HMA2 and HMA4 are essential for function in planta, while the C-terminal domain, although not essential for function, may contain a signal important for the subcellular localization of the protein (supplement figure 2) [52].
The predicted docking sites of FP3 and FP6 on HMA2 and HMA4 lie in a similar region, usually at the N-terminus, which supports the good quality of the structures modeled with ClusPro. The resulting docking site models for HMA4 with FP3 and FP6 are visualized in supplement figure 3.
The electrostatic potentials calculated with DeepView for all structures support the predicted docking sites. It has been shown that the electrostatic potentials at the interfaces of interacting molecules are anti-correlated. This means that at the interface there is a good chance of finding a patch of positive electrostatic potential on the surface of one molecule positioned next to a negative patch on the surface of the adjacent molecule, and vice versa [53]. Furthermore, the low DFire energy scores obtained for all models indicate good docking models: DFire estimates the non-bonded atomic interactions in a model, and the lower the energy, the closer the model is expected to be to the native conformation [39].
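Anti-correlation of interface potentials can be quantified with a Pearson correlation coefficient computed over matched points on the two surfaces; a value close to -1 indicates strong electrostatic complementarity. The sketch below is a generic illustration of this idea, not the DeepView calculation, and the potential values are hypothetical placeholders rather than values from our models.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical potentials sampled at matched interface points on the
# two docking partners (arbitrary units, for illustration only).
potential_a = [1.2, 0.8, -0.5, -1.1, 0.3]
potential_b = [-1.0, -0.9, 0.4, 1.3, -0.2]

r = pearson(potential_a, potential_b)
print("anti-correlated" if r < 0 else "not anti-correlated")
```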

Conclusion
Brassicaceae family plants are known to accumulate high amounts of toxic metals alongside the macronutrients nitrogen (N), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg) and sulphur (S) and the micronutrients boron (B), chlorine (Cl), iron (Fe), manganese (Mn), copper (Cu), zinc (Zn), nickel (Ni) and molybdenum (Mo). One member of the family, Brassica oleracea, shows great potential as a hyperaccumulator of the heavy metals Zn and Cd. In this research we focused on further investigating the roles of the HMA (1-5) proteins in Brassica oleracea by predicting their structures and interactomes. To obtain the most accurate result, this aim was supported by several common bioinformatics techniques. The B. oleracea HMA proteins (1-5) were subjected to multiple sequence alignment with HMA proteins from B. rapa, B. napus and A. thaliana in order to obtain information about conserved regions among these proteins and to assess their phylogenetic relationship. This in turn enabled analysis of their 3D structures and of their interactomes, in order to confirm the current functional roles of each protein and possibly discover new interactome partners unknown to the literature, leading to new annotations of the functional roles of the B. oleracea HMA proteins (1-5).
Through bioinformatics analysis we identified and structurally predicted 5 homologues of HMA proteins in Brassica rapa, mostly similar to those of Brassica oleracea. Since they are similar but not identical in structure and separate into two groups in the phylogenetic analysis, further inference about the functions and localization of the homologues was required. For that purpose, localization tools were used to predict subcellular locations, and the trend of differences between the homologues continued. Lastly, the interactome analysis showed similar functions and associations with many crucial processes of metal ion transport, required for the maintenance of cellular integrity and stability. The results obtained in this study lead us to the conclusion that the cellular functions of the 5 homologues are very similar, where HMA2 and HMA4 are directly involved in Zn/Cd transport, whereas HMA5 functions as a metallochaperone and in copper detoxification, as confirmed within this study. In addition, the interactome analysis revealed strong interaction of the HMA2 and HMA4 proteins with FP3 (farnesylated protein 3) and FP6 (farnesylated protein 6). A study conducted by Dykema et al. [48] showed that FP3 from A. thaliana functions as a ubiquitous protein in diverse plants. FP6 (also known as HIPP26) is characterized by a heavy metal binding domain (HMA) and an additional isoprenylation motif at the C terminus. HIPP26 has been shown to interact strongly with ATHB29, a zinc finger homeodomain transcription factor (ZF-HD protein) that is induced by drought, high salinity and abscisic acid, and thus plays a role in regulating the stress response of plants. Furthermore, FP6 has been shown to interact with acyl-CoA-binding protein 2 (ACBP2), a protein that mediates cadmium Cd(II) tolerance, indicating the possibility that the HMA2 and HMA4 proteins may share the above-mentioned cellular functions.
Experimental determination of the 3D structures of the homologues, as well as further testing in terms of interactome and co-localization analysis, is needed to fully understand the role of the HMA homologues in metal transport. In particular, the docking sites and binding domains need to be researched further, preferably in vivo, in order to understand the mechanism by which these proteins bind Zn and Cd ions and how they act as partners of other proteins. This study confirmed the known functional roles of HMA proteins, especially of the HMA2 and HMA4 proteins, which are known to be involved in Zn and Cd hyperaccumulation, and extended their potential cellular roles through detailed 3D structure and interactome analysis.
Furthermore, this indicated the possibility of an evolutionary change of A. thaliana HMAs into the B. rapa and B. oleracea HMA proteins. The HMA proteins from B. rapa and B. oleracea share a common ancestor and are sister taxa for all HMA proteins except HMA3. The HMA3 proteins from B. rapa and A. thaliana share a common ancestor, whereas that of B. oleracea has evolved separately.

Figure 6: Docking site prediction of ATX1 with HMA2 and HMA4 from Brassica rapa
Furthermore, HMA2 and HMA4 are shown to interact strongly with the ATX1 protein, which until now was only known to interact with HMA5. This may indicate a specific involvement of the HMA2 and HMA4 proteins in Cu(I) binding and delivery to the post-Golgi vesicle, with a strong possibility of Cu compartmentalization and detoxification, as shown for HMA5.

Table 1: Accession numbers of HMA proteins (columns: HMA proteins, Arabidopsis thaliana, Brassica oleracea, Brassica napus, Brassica rapa)
After the prediction, the 3D structures underwent validation by several structure assessment tools. The results are shown in table 4.
The domains in the five HMA proteins from Brassica oleracea were identified by the SMART software. The results are presented in table 5.