PMID: PMC5795011-1-4

 


Title : Identification and label-free quantification of serum N-glycoproteome

Abstract :
  1. The proposed procedure was validated by analyzing the N-linked sialylated glycoproteome of serum samples from healthy individuals (n = 12) and patients diagnosed with prostate cancer (n = 12)
  2. Tryptic peptides from serum samples were desalted using zwitterionic chromatography-hydrophilic interaction liquid chromatography solid phase extraction (ZIC-HILIC SPE) to enrich glycopeptides, followed by enrichment of sialylated glycopeptides with TiO2 beads (Fig. 4)
  3. The N-linked sialylated glycopeptides were analyzed by LC-MS using HCD with stepped NCE and the acquired MS2 spectra were submitted to the Mascot search engine for automated identification and relative quantification using Mascot Distiller (Fig. 4)
  4. The result of the described strategy for large scale automated glycopeptide analysis of LC-MS datasets is demonstrated using serum alpha-1-acid glycoprotein 1 (A1AG1) as an example
  5. Considering zero missed cleavages, NxT/S/C motifs and a peptide length of 6–30 amino acids, A1AG1 potentially contained two N-glycosylation sites in the custom glycoprotein database (QDQCIYNTTYLNVQR, ENGTISR)
  6. A1AG1 was identified with a protein score of 1559 and 27% sequence coverage by the database search of 24 LC-MS runs
  7. Most of the sialylated N-glycans were identified on the sequence QDQCIYNTTYLNVQR
  8. Mascot annotated nine different mono-, di-, tri- and tetra-sialylated N-glycan structures on this glycosylation site (Fig. 5)
  9. Almost all the intense peaks in the MS2 spectra of mono-sialylated bi- (Fig. 5A), tri- (Fig. 5B) and tetra-antennary (Fig. 5C) glycopeptides were annotated by Mascot, confirming the presence of these glycan structures
  10. The peptide sequence was confirmed by annotation of y5, y6 and y8 ions
  11. MS2 spectra shown in Fig. 5D–F were annotated to di-sialylated bi-, tri- and tetra-antennary glycopeptides, confirmed by a complete series of y ions representing both peptide and glycan cleavages
  12. The same was found for tri- and tetra-sialylated glycan structures on the same peptide sequence (Fig. 5G–I)
  13. Nine different glycan structures with varying degrees of complexity and sialylation on the same glycosylation site, together with near-complete information about both the peptide and glycan parts, demonstrated the capability of the current approach for large scale automated glycopeptide analysis
  14. Some of the above sialylated glycopeptides were also identified with attached fucose residues
  15. As mentioned above, fucose was considered as a variable modification during the database search
  16. Though it is not possible to pinpoint the exact location of fucose residues, it can be easily determined whether the fucose is attached to the core HexNAc residue or to the HexNAc residues after the trimannosyl core glycan structure
  17. For example, the top scoring matches of tri-sialylated tri-antennary and di-sialylated tetra-antennary glycopeptides indicated a fucose residue after the core structure
  18. The absence of a peak at +146 Da following the peptide + HexNAc peak clearly indicated that the fucose residue is not attached to the core HexNAc (Supplementary Fig. 4A,B)
  19. As opposed to the above examples, Mascot annotated the fucose residue to the core HexNAc of a di-sialylated bi-antennary glycopeptide of alpha-2-macroglobulin
  20. The presence of a peak at +146 Da following the peptide + HexNAc peak clearly indicated that the fucose is attached to the core structure (Supplementary Fig. 4C)
  21. Therefore, it must be considered that the fucose is attached either to the core HexNAc or to HexNAc residues following the core glycan structure when determining the position of fucose residues in Mascot output
  22. In addition to fucose, other modifications such as sulfation and phosphorylation of HexNAc or Hex could also be considered as variable modifications if this is of interest
  23. However, using more variable modifications increases the search space and thus the uncertainty in some assignments
  24. Using this approach, a total of 257 glycoproteins were identified from the 24 serum samples (Supplementary Table 3)
  25. Within these 257 glycoproteins, a total of 970 unique glycosylation sites and 3447 non-redundant glycopeptide variants were identified (Supplementary Tables 4 and 5)
  26. Of these 3447 glycopeptide variants, the most abundant are the di-sialylated bi-antennary glycans with no (377), one (291) and two fucose residues (169)
  27. The next major glycopeptide variants included the di-sialylated tri-antennary and mono-sialylated di-antennary glycopeptide variants, with and without fucose residues (Supplementary Table 5)
  28. The specific enrichment for di-sialylated bi-antennary glycans might indicate the abundance of these glycans in the serum proteins
  29. However, an effect of the enrichment protocol cannot be ruled out
  30. Label-free quantification of the glycopeptides (aggressive vs. indolent prostate cancer) was performed using the replicate quantitation protocol of Mascot Distiller
  31. The median protein ratios revealed no significant changes between aggressive and indolent samples and most of the protein ratios were within the range of 1.0 ± 0.5 (Supplementary Table 3)
  32. To detect any quantitative differences at the glycosylation level, the glycopeptides were segmented based on the glycan structures irrespective of the protein origin and the corresponding ratios were plotted as violin plots
  33. Figure 6 displays the glycopeptide ratios of the three most abundant glycan structures; most of them have peptide ratios close to 1.0, indicating no significant changes between the analyzed indolent and aggressive cancer samples
  34. Glycopeptide ratios of various other glycan structures which were identified in more than 10 different peptide sequences are presented in Supplementary Fig. 5
  35. Most of these structures also had peptide ratios around 1.0, with very few being higher or lower
  36. For example, the median glycopeptide ratio of the tri-sialylated tri-antennary glycopeptides is near 1.0 based on 77 values, whereas the mono- (73 values) and di-fucosylated (19 values) versions have a median peptide ratio slightly above 1.0
  37. The tri-sialylated tetra-antennary glycopeptides had two different populations at median peptide ratios of 1.0 and 1.5, whereas the fucosylated version had a median peptide ratio slightly above 1.0 (Supplementary Fig. 5)
  38. In summary, the presented data show the ease and feasibility of the proposed workflow for automated glycopeptide identification and quantification
  39. In addition to the database used in obtaining the above presented results, the LC-MS data sets of the 24 serum samples were also searched against differentially sized custom glycoprotein databases created from (i) all known plasma/serum proteins from PeptideAtlas build 2010 (2421 glycoproteins), (ii) all deamidated proteins identified following PNGaseF treatment of glycopeptides from the same 24 serum samples (280 glycoproteins) and (iii) Swiss-Prot annotated human proteome (14120 glycoproteins)
  40. Irrespective of the databases, 68 glycoproteins were consistently identified in all four different databases (Supplementary Fig. 6)
  41. There is a good level of agreement between the three plasma protein related databases because 121 glycoproteins were consistently identified
  42. The deamidated proteins (280 proteins) identified after the PNGaseF treatment potentially represent well the detectable glycoproteins present in the 24 serum samples
  43. Comparing the glycoprotein databases created from deamidated proteins identified following PNGaseF treatment and plasma glycoproteins reported in PeptideAtlas, out of the 257 glycoproteins identified, 163 were found to be common, representing 63% overlap (Supplementary Fig. 6)
  44. This result clearly indicates the authenticity of the glycoproteins identified by the workflow presented in this study
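
Illustration for sentence 5: the site prediction described there (zero missed cleavages, NxT/S/C motif, peptide length 6–30) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' database-building code. The trypsin rule (cleave after K/R, but not before P) and the exclusion of proline at the x position are conventional assumptions that the abstract does not state, and the toy sequence below is hypothetical: it places the two A1AG1 peptides named in sentence 5 between invented flanking residues. Sequons that span a cleavage boundary are ignored for simplicity.

```python
import re

# N-x-S/T/C sequon. Excluding proline at x is the conventional rule and an
# assumption here (the source only says "NxT/S/C"). The lookahead lets
# overlapping sequons (e.g. "NNTS") each be reported.
SEQUON = re.compile(r"(?=N[^P][STC])")


def tryptic_peptides(protein):
    """Split after K or R unless the next residue is P (zero missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def candidate_glycopeptides(protein, min_len=6, max_len=30):
    """Return (peptide, [1-based positions of sequon N]) for peptides in the length window."""
    hits = []
    for peptide in tryptic_peptides(protein):
        if not (min_len <= len(peptide) <= max_len):
            continue
        sites = [m.start() + 1 for m in SEQUON.finditer(peptide)]
        if sites:
            hits.append((peptide, sites))
    return hits


# Hypothetical toy sequence: invented flanks around the two peptides from sentence 5.
toy = "AAAAAK" + "QDQCIYNTTYLNVQR" + "ENGTISR" + "GGGGK"

for peptide, sites in candidate_glycopeptides(toy):
    print(peptide, sites)
```

On the toy sequence this reports one sequon in each of the two named peptides (N7 in QDQCIYNTTYLNVQR, N2 in ENGTISR); the second asparagine in QDQCIYNTTYLNVQR is followed by VQ and so is not a candidate site.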
Output (sent_index, trigger, protein, sugar, site):
  • 11. both, , -, -, peptide
  • 11. glycopeptides, , -, -, glycopeptides
  • 118. N-glycosylation, , -, -, -
  • 13. glycopeptide, , -, -, glycopeptide
  • 13. glycosylation, , -, -, site
  • 13. sialylation, , -, -, site
  • 14. glycopeptides, , -, -, glycopeptides
  • 14. sialylated, , -, -, glycopeptides
  • 16. attached, , -, the fucose, residue
  • 16. attached, , -, the fucose, residues
  • 17. di-sialylated, , -, -, glycopeptides
  • 17. glycopeptides, , -, -, glycopeptides
  • 17. tri-sialylated, , -, -, glycopeptides
  • 19. di-sialylated, , -, -, glycopeptide
  • 19. glycopeptide, , -, the core HexNAc, glycopeptide
  • 19. glycopeptide, , alpha-2-macroglobulin, -, glycopeptide
  • 19. glycopeptide, , alpha-2-macroglobulin, the core HexNAc, glycopeptide
  • 2. glycopeptides, , -, -, glycopeptides
  • 2. sialylated, , -, -, glycopeptides
  • 21. attached, , -, the fucose, residues
  • 24. glycoproteins, , glycoproteins, -, -
  • 25. glycopeptide, , -, -, glycopeptide
  • 25. glycoproteins, , glycoproteins, -, -
  • 25. glycosylation, , -, -, sites
  • 26. glycopeptide, , -, -, glycopeptide
  • 27. glycopeptide, , -, -, glycopeptide
  • 27. mono-sialylated, , variant, -, -
  • 3. glycopeptides, , -, -, glycopeptides
  • 3. sialylated, , -, -, glycopeptides
  • 30. glycopeptides, , -, -, glycopeptides
  • 32. glycopeptides, , -, -, glycopeptides
  • 33. glycopeptide, , -, -, glycopeptide
  • 36. glycopeptide, , -, -, glycopeptide
  • 36. glycopeptides, , -, -, glycopeptides
  • 36. tri-sialylated, , -, -, glycopeptides
  • 37. glycopeptides, , -, -, glycopeptides
  • 37. tri-sialylated, , -, -, glycopeptides
  • 38. glycopeptide, , -, -, glycopeptide
  • 39. glycopeptides, , glycoproteins, -, glycopeptides
  • 39. glycoprotein, , glycoprotein, -, -
  • 39. glycoproteins, , glycoproteins, -, -
  • 4. glycopeptide, , -, -, glycopeptide
  • 4. glycoprotein, , A1AG1, -, -
  • 4. glycoprotein, , alpha-1-acid glycoprotein 1, -, -
  • 40. glycoproteins, , glycoproteins, -, -
  • 41. glycoproteins, , glycoproteins, -, -
  • 42. glycoproteins, , glycoproteins, -, -
  • 43. glycoprotein, , glycoprotein, -, -
  • 43. glycoproteins, , glycoproteins, -, -
  • 44. glycoproteins, , glycoproteins, -, -
  • 5. N-glycosylation, , -, -, sites
  • 5. glycoprotein, , glycoprotein, -, -
  • 71. glycopeptide, , -, -, -
  • 71. glycosylation, , -, -, -
  • 8. glycosylation, , -, -, site
  • 9. glycopeptides, , -, -, glycopeptides
Output(Part-Of) (sent_index, protein, site):
  • 19. alpha-2-macroglobulin, glycopeptide
  • 5. database, sites
*Output_Site_Fusion* (sent_index, protein, sugar, site):

 

 
