PMID: 23829323

Title: Exploring site-specific N-glycosylation microheterogeneity of haptoglobin using glycopeptide CID tandem mass spectra and glycan database search

Abstract:
  1. Glycosylation is a common protein modification with a significant role in many vital cellular processes and human diseases, making the characterization of protein-attached glycan structures important for understanding cell biology and disease processes
  2. Direct analysis of protein N-glycosylation by tandem mass spectrometry of glycopeptides promises site-specific elucidation of N-glycan microheterogeneity, something that detached N-glycan and deglycosylated peptide analyses cannot provide
  3. However, successful implementation of direct N-glycopeptide analysis by tandem mass spectrometry remains a challenge
  4. In this work, we consider algorithmic techniques for the analysis of LC-MS/MS data acquired from glycopeptide-enriched fractions of enzymatic digests of purified proteins
  5. We implement a computational strategy that takes advantage of the properties of CID fragmentation spectra of N-glycopeptides, matching the MS/MS spectra to peptide-glycan pairs from protein sequences and glycan structure databases
  6. Significantly, we also propose a novel false discovery rate estimation technique to estimate and manage the number of false identifications
  7. We use a human glycoprotein standard, haptoglobin, digested with trypsin and GluC, enriched for glycopeptides using HILIC chromatography, and analyzed by LC-MS/MS to demonstrate our algorithmic strategy and evaluate its performance
  8. Our software, GlycoPeptideSearch (GPS), assigned glycopeptide identifications to 246 of the spectra at a false discovery rate of 5.58%, identifying 42 distinct haptoglobin peptide-glycan pairs at each of the four haptoglobin N-linked glycosylation sites
  9. We further demonstrate the effectiveness of this approach by analyzing plasma-derived haptoglobin, identifying 136 N-linked glycopeptide spectra at a false discovery rate of 0.4%, representing 15 distinct glycopeptides on at least three of the four N-linked glycosylation sites
  10. The software, GlycoPeptideSearch, is available for download from http://edwardslab.bmcb.georgetown.edu/GPS
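
Sentence 5 describes matching spectra to peptide-glycan pairs drawn from protein sequences and glycan databases. The Python sketch below illustrates only the precursor-mass part of such a pairing step and is not the GlycoPeptideSearch implementation, whose details the abstract does not give. The function names, the 10 ppm tolerance, the peptide names and the peptide masses are illustrative assumptions; the monosaccharide residue masses are standard monoisotopic values.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}

# Hypothetical peptide masses (Da); placeholders, not real haptoglobin peptides.
PEPTIDES = {"PEP_A": 1000.0, "PEP_B": 1500.0}

# Glycan compositions as monosaccharide counts (assumed example entries).
GLYCANS = {
    "HexNAc2Hex5": {"HexNAc": 2, "Hex": 5},
    "HexNAc4Hex5NeuAc2": {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
}


def glycan_residue_mass(composition):
    """Sum the residue masses of a glycan composition."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def candidate_pairs(precursor_mass, peptides, glycans, tol_ppm=10.0):
    """Return (peptide, glycan) names whose combined mass matches the precursor.

    The glycopeptide mass is taken as the peptide mass plus the glycan
    residue masses; tol_ppm is an assumed mass tolerance.
    """
    hits = []
    for (pep, pep_mass), (gly, comp) in product(peptides.items(), glycans.items()):
        total = pep_mass + glycan_residue_mass(comp)
        if abs(total - precursor_mass) / total * 1e6 <= tol_ppm:
            hits.append((pep, gly))
    return hits


if __name__ == "__main__":
    observed = PEPTIDES["PEP_A"] + glycan_residue_mass(GLYCANS["HexNAc2Hex5"])
    print(candidate_pairs(observed, PEPTIDES, GLYCANS))
```

A real search would go on to score each candidate pair against the CID fragment ions and estimate a false discovery rate, as sentences 5 and 6 describe; neither step is attempted here.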
Output (sent_index, trigger, protein, sugar, site):
  • 0. N-glycosylation, , haptoglobin, -, -
  • 0. glycopeptide, , -, -, glycopeptide
  • 0. microheterogeneity, , haptoglobin, -, -
  • 2. glycopeptides, , -, -, glycopeptides
  • 3. N-glycopeptide, , -, -, N-glycopeptide
  • 5. N-glycopeptides, , -, -, N-glycopeptides
  • 5. sequences, , -, peptide-glycan pairs, sequences
  • 7. glycopeptides, , -, -, glycopeptides
  • 7. glycoprotein, , glycoprotein, -, -
  • 7. glycoprotein, , haptoglobin, -, -
  • 8. glycopeptide, , -, -, glycopeptide
  • 8. glycosylation, , -, -, sites
  • 9. glycopeptide, , -, -, glycopeptide
  • 9. glycopeptides, , -, -, glycopeptides
  • 9. glycosylation, , -, -, sites
Output(Part-Of) (sent_index, protein, site):
  • 8. haptoglobin N-linked, sites
*Output_Site_Fusion* (sent_index, protein, sugar, site):

Protein               NCBI ID  Sentence index
haptoglobin           3240     0,7,9
haptoglobin N-linked  3240     8
GluC                  57733    7