Title : Glycopeptide Identification Based on Accurate Intact Mass
Abstract :
- We compared the usefulness of LC–MS data acquired using HILIC-C18-MS versus C18-MS for profiling glycoprotein proteolytic peptides
- For this purpose, the neutral mass values extracted from the data sets were searched against a set of theoretical glycopeptide masses, with a 10 ppm mass-error tolerance
- The theoretical masses consisted of the protein proteolytic peptides with up to two missed cleavage sites
- Masses for those peptides containing an N-glycosylation sequon were calculated as a set of glycosylation variants using N-glycosylation compositions ranging from core N-glycan structures to penta-antennary complex-type N-glycans containing N-acetylneuraminic acid and high-mannose N-glycans (shown in Supporting Information Section S-3)
- The GlycReSoft program was used to match and score deconvoluted masses from an LC–MS experiment against this list of theoretical compound accurate masses
- Because the complexity in glycopeptide data increases with the number of glycosylation sites, intact mass assignments suffer from ambiguous matches and false positives that cannot be verified in the absence of tandem MS. We acquired LC–MS profiling data for three glycoproteins with a range of complexities
- Transferrin is known to have two N-linked glycosylation sites with complex N-glycans, and the glycan heterogeneity for this protein is fairly limited, with [5,4,0,2,0] [Hex, HexNAc, dHex, NeuAc, NeuGc] contributing to over 90% of the glycan distribution
- Thus, this glycoprotein presents low complexity, which makes the search space small and minimizes the chances for false positives and ambiguous assignments
- AGP, by contrast, has 5 known N-linked glycosylation sites with multiple genetic variants and a more diverse set of complex glycans that make this a relatively more complex glycoprotein
- Recombinant hemagglutinin from A/USSR/90/1977 (H1N1), referred to herein as HA, was the most complex of the three glycoproteins analyzed, with 9 consensus glycosylation sites
- In addition, theoretical tryptic cleavage of HA produced 2 peptides that presented more than 1 putative glycosylation site on a single peptide
- Thus, expected ambiguity and false positives scaled with increasing number of glycosylation sites in transferrin, AGP, and HA
- The number of theoretical glycopeptide compositions used for each glycoprotein is given in Table 1
- We observed a significantly higher number of glycopeptide matches using HILIC-C18 LC–MS data over C18 LC–MS data, for all glycoproteins tested
- Table 1 shows the number of glycopeptides and peptides identified based on intact glycopeptide masses
- Even with the limited possibility of false matches when searching solely based on theoretical masses, it was evident that HILIC-C18 enriched glycopeptides efficiently
- Data were acquired as analytical triplicates, and peptide/glycopeptide compositions found in all three replicates were considered matches
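
The intact-mass search described in the abstract can be illustrated with a minimal Python sketch. It is not the GlycReSoft implementation and performs no scoring. The function names are hypothetical, the residue masses are standard monoisotopic values rather than numbers taken from the paper, and the sequon rule (N-X-S/T with X not proline) is the usual consensus definition, assumed here. The sketch shows how a composition written as [Hex, HexNAc, dHex, NeuAc, NeuGc] maps to a mass increment, how a 10 ppm tolerance is applied, and how matches are restricted to compositions found in all replicates.

```python
import re

# Standard monoisotopic residue masses in Da (assumed values, not from the paper).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "dHex": 146.057909,
    "NeuAc": 291.095417,
    "NeuGc": 307.090331,
}
ORDER = ("Hex", "HexNAc", "dHex", "NeuAc", "NeuGc")


def glycan_residue_mass(composition):
    """Mass added to a peptide by a glycan given as [Hex, HexNAc, dHex, NeuAc, NeuGc]."""
    return sum(n * RESIDUE_MASS[name] for name, n in zip(ORDER, composition))


def find_sequons(peptide):
    """0-based positions of N in N-X-S/T motifs where X is not proline."""
    return [m.start() for m in re.finditer(r"N(?=[^P][ST])", peptide)]


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_masses(observed, theoretical, tol_ppm=10.0):
    """Return (observed mass, label, ppm error) for every theoretical mass within tolerance.

    `theoretical` maps a hypothetical label to a neutral mass. One observed
    mass may match several labels, which is the ambiguity the abstract describes.
    """
    hits = []
    for obs in observed:
        for label, mass in theoretical.items():
            err = ppm_error(obs, mass)
            if abs(err) <= tol_ppm:
                hits.append((obs, label, err))
    return hits


def consensus(replicate_labels):
    """Labels present in every replicate (all three, for triplicate data)."""
    return set.intersection(*(set(r) for r in replicate_labels))
```

For example, with a made-up peptide mass of 1000.0 Da, `match_masses([3204.78], {"pep+[5,4,0,2,0]": 1000.0 + glycan_residue_mass([5, 4, 0, 2, 0])})` returns one hit at roughly 2.4 ppm, while an observed mass of 3204.85 falls outside the 10 ppm window.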
Output (sent_index, trigger,
protein,
sugar,
site):
- 1. glycoprotein, , glycoprotein, -, -
- 10. glycoproteins, , glycoproteins, -, -
- 10. glycosylation, , -, -, sites
- 11. glycosylation, , -, -, site
- 13. glycoprotein, , glycoprotein, -, -
- 14. glycopeptide, , -, -, glycopeptide
- 14. glycoproteins, , glycoproteins, -, -
- 15. glycopeptides, , -, -, glycopeptides
- 16. glycopeptides, , -, -, glycopeptides
- 17. peptide/glycopeptide, , -, -, peptide/glycopeptide
- 4. N-glycosylation, , -, -, sequon
- 6. glycopeptide, , -, -, glycopeptide
- 6. glycoproteins, , glycoproteins, -, -
- 6. glycosylation, , -, -, sites
- 8. glycoprotein, , glycoprotein, -, -
- 9. glycoprotein, , glycoprotein, -, -
- 9. glycosylation, , -, -, sites
Output(Part-Of) (sent_index,
protein,
site):
- 1. glycoprotein, proteolytic peptides
- 3. -, sites
*Output_Site_Fusion* (sent_index,
protein,
sugar,
site):