PMID: PMC4184449-1-1

 

    Legend: Gene, Sites

Title : Glycopeptide Identification Based on Accurate Intact Mass

Abstract :
  1. We compared the usefulness of LC–MS data acquired using HILIC-C18-MS versus C18-MS for profiling glycoprotein proteolytic peptides
  2. For this purpose, the neutral mass values extracted from the data sets were searched against a set of theoretical glycopeptide masses, with a 10 ppm mass-error tolerance
  3. The theoretical masses consisted of the protein proteolytic peptides with up to two missed cleavage sites
  4. Masses for those peptides containing an N-glycosylation sequon were calculated as a set of glycosylation variants using N-glycosylation compositions ranging from core N-glycan structures to penta-antennary complex-type N-glycans containing N-acetylneuraminic acid and high-mannose N-glycans (shown in Supporting Information Section S-3)
  5. The GlycReSoft program was used to match and score deconvoluted masses from an LC–MS experiment against this list of theoretical compound accurate masses
  6. Because the complexity in glycopeptide data increases with the number of glycosylation sites, intact mass assignments suffer from ambiguous matches and false positives that cannot be verified in the absence of tandem MS. We acquired LC–MS profiling data for three glycoproteins with a range of complexities
  7. Transferrin is known to have two N-linked glycosylation sites with complex N-glycans, and the glycan heterogeneity for this protein is fairly limited, with [5,4,0,2,0] [Hex, HexNAc, dHex, NeuAc, NeuGc] contributing to over 90% of the glycan distribution
  8. Thus, this glycoprotein presents low complexity, which makes the search space small and minimizes the chances for false positives and ambiguous assignments
  9. AGP, by contrast, has 5 known N-linked glycosylation sites with multiple genetic variants and a more diverse set of complex glycans that make this a relatively more complex glycoprotein
  10. Recombinant hemagglutinin from A/USSR/90/1977 (H1N1), referred to herein as HA, was the most complex of the three glycoproteins analyzed, with 9 consensus glycosylation sites
  11. In addition, theoretical tryptic cleavage of HA produced 2 peptides that presented more than 1 putative glycosylation site on a single peptide
  12. Thus, expected ambiguity and false positives scaled with increasing number of glycosylation sites in transferrin, AGP, and HA
  13. The number of theoretical glycopeptide compositions used for each glycoprotein is given in Table 1
  14. We observed a significantly higher number of glycopeptide matches using HILIC-C18 LC–MS data over C18 LC–MS data, for all glycoproteins tested
  15. Table 1 shows the number of glycopeptides and peptides identified based on intact glycopeptide masses
  16. Even with the limited possibility of false matches when searching solely based on theoretical masses, it was evident that HILIC-C18 enriched glycopeptides efficiently
  17. Data were acquired as analytical triplicates, and peptide/glycopeptide compositions found in all three replicates were considered matches
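
The intact-mass search in sentences 2 to 5 and the replicate rule in sentence 17 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the GlycReSoft implementation: the function names, the peptide mass, and the observed masses are hypothetical, and scoring is omitted. The monosaccharide residue masses are standard monoisotopic values, and the composition order follows the [Hex, HexNAc, dHex, NeuAc, NeuGc] convention used in sentence 7.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Standard monoisotopic residue masses (Da), ordered as
# [Hex, HexNAc, dHex, NeuAc, NeuGc] to match the composition notation.
RESIDUE_MASSES = (162.05282, 203.07937, 146.05791, 291.09542, 307.09033)


def glycan_residue_mass(composition: Sequence[int]) -> float:
    """Sum of residue masses for a glycan composition attached to a peptide."""
    return sum(n * m for n, m in zip(composition, RESIDUE_MASSES))


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_masses(
    observed: Sequence[float],
    theoretical: Dict[str, float],
    tol_ppm: float = 10.0,
) -> List[Tuple[float, str, float]]:
    """Return (observed mass, composition label, ppm error) for each hit.

    An observed mass can match several labels; such ambiguous matches are
    all reported, since intact mass alone cannot resolve them.
    """
    hits = []
    for obs in observed:
        for label, theo in theoretical.items():
            err = ppm_error(obs, theo)
            if abs(err) <= tol_ppm:
                hits.append((obs, label, err))
    return hits


def consensus(replicates: Sequence[Set[str]]) -> Set[str]:
    """Keep only compositions found in every replicate."""
    return set.intersection(*replicates)


# Hypothetical peptide neutral mass (Da), chosen only for illustration.
PEPTIDE_MASS = 1000.0
theoretical = {
    "pepA+[5,4,0,2,0]": PEPTIDE_MASS + glycan_residue_mass((5, 4, 0, 2, 0)),
}

# Hypothetical deconvoluted neutral masses from three replicate runs.
runs = [[3204.780, 3204.900], [3204.775], [3204.770, 1500.000]]
per_run = [
    {label for _, label, _ in match_masses(run, theoretical)} for run in runs
]
accepted = consensus(per_run)
```

With these illustrative inputs, the first mass in each run falls within 10 ppm of the theoretical value, so the composition is retained by the all-three-replicates rule; the mass 3204.900 lies roughly 40 ppm away and is rejected.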
Output (sent_index, trigger, protein, sugar, site):
  • 1. glycoprotein, , glycoprotein, -, -
  • 10. glycoproteins, , glycoproteins, -, -
  • 10. glycosylation, , -, -, sites
  • 11. putativeglycosylation, , -, -, site
  • 13. glycoprotein, , glycoprotein, -, -
  • 14. glycopeptide, , -, -, glycopeptide
  • 14. glycoproteins, , glycoproteins, -, -
  • 15. glycopeptides, , -, -, glycopeptides
  • 16. glycopeptides, , -, -, glycopeptides
  • 17. peptide/glycopeptide, , -, -, peptide/glycopeptide
  • 4. N-glycosylation, , -, -, sequon
  • 6. glycopeptide, , -, -, glycopeptide
  • 6. glycoproteins, , glycoproteins, -, -
  • 6. glycosylation, , -, -, sites
  • 8. glycoprotein, , glycoprotein, -, -
  • 9. glycoprotein, , glycoprotein, -, -
  • 9. glycosylation, , -, -, sites
Output(Part-Of) (sent_index, protein, site):
  • 1. glycoprotein, proteolyticpeptides
  • 3. -, sites
*Output_Site_Fusion* (sent_index, protein, sugar, site):

 

 

Protein NCBI ID SENTENCE INDEX