Title : Exploring site-specific N-glycosylation microheterogeneity of haptoglobin using glycopeptide CID tandem mass spectra and glycan database search
Abstract :
- Glycosylation is a common protein modification with a significant role in many vital cellular processes and human diseases, making the characterization of protein-attached glycan structures important for understanding cell biology and disease processes
- Direct analysis of protein N-glycosylation by tandem mass spectrometry of glycopeptides promises site-specific elucidation of N-glycan microheterogeneity, something that detached N-glycan and deglycosylated peptide analyses cannot provide
- However, successful implementation of direct N-glycopeptide analysis by tandem mass spectrometry remains a challenge
- In this work, we consider algorithmic techniques for the analysis of LC-MS/MS data acquired from glycopeptide-enriched fractions of enzymatic digests of purified proteins
- We implement a computational strategy that takes advantage of the properties of CID fragmentation spectra of N-glycopeptides, matching the MS/MS spectra to peptide-glycan pairs from protein sequences and glycan structure databases
- Significantly, we also propose a novel false discovery rate estimation technique to estimate and manage the number of false identifications
- We use a human glycoprotein standard, haptoglobin, digested with trypsin and GluC, enriched for glycopeptides using HILIC chromatography, and analyzed by LC-MS/MS to demonstrate our algorithmic strategy and evaluate its performance
- Our software, GlycoPeptideSearch (GPS), assigned glycopeptide identifications to 246 of the spectra at a false discovery rate of 5.58%, identifying 42 distinct haptoglobin peptide-glycan pairs at each of the four haptoglobin N-linked glycosylation sites
- We further demonstrate the effectiveness of this approach by analyzing plasma-derived haptoglobin, identifying 136 N-linked glycopeptide spectra at a false discovery rate of 0.4%, representing 15 distinct glycopeptides on at least three of the four N-linked glycosylation sites
- The software, GlycoPeptideSearch, is available for download from http://edwardslab.bmcb.georgetown.edu/GPS
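The idea of matching spectra to peptide-glycan pairs can be illustrated with a minimal sketch of a precursor-mass filter. This is not the GlycoPeptideSearch implementation; the function name `match_pairs`, the ppm tolerance, and the toy masses are all hypothetical, and the sketch assumes glycan masses are supplied as residue masses, so that peptide and glycan masses simply add.

```python
from itertools import product


def match_pairs(precursor_mass, peptides, glycans, tol_ppm=10.0):
    """Return (peptide, glycan, ppm_error) tuples whose summed mass
    lies within tol_ppm of the observed neutral precursor mass.

    peptides and glycans map a name to a mass in Da. Glycan masses are
    assumed to be residue masses (water loss already accounted for).
    """
    hits = []
    for (pep, pep_mass), (gly, gly_mass) in product(peptides.items(), glycans.items()):
        theoretical = pep_mass + gly_mass
        ppm_error = abs(precursor_mass - theoretical) / theoretical * 1e6
        if ppm_error <= tol_ppm:
            hits.append((pep, gly, ppm_error))
    # Best (smallest mass error) candidates first.
    return sorted(hits, key=lambda hit: hit[2])


# Toy, made-up masses purely for illustration.
toy_peptides = {"PEP_A": 1000.0, "PEP_B": 1500.0}
toy_glycans = {"GLY_X": 2000.0, "GLY_Y": 1200.0}
candidates = match_pairs(3000.0, toy_peptides, toy_glycans)
```

In a real search, each surviving candidate pair would still have to be scored against the CID fragment ions, and the resulting identifications filtered at a chosen false discovery rate.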
Output (sent_index, trigger, protein, sugar, site):
- 0. N-glycosylation, , haptoglobin, -, -
- 0. glycopeptide, , -, -, glycopeptide
- 0. microheterogeneity, , haptoglobin, -, -
- 2. glycopeptides, , -, -, glycopeptides
- 3. N-glycopeptide, , -, -, N-glycopeptide
- 5. N-glycopeptides, , -, -, N-glycopeptides
- 5. sequences, , -, peptide-glycan pairs, sequences
- 7. glycopeptides, , -, -, glycopeptides
- 7. glycoprotein, , glycoprotein, -, -
- 7. glycoprotein, , haptoglobin, -, -
- 8. glycopeptide, , -, -, glycopeptide
- 8. glycosylation, , -, -, sites
- 9. glycopeptide, , -, -, glycopeptide
- 9. glycopeptides, , -, -, glycopeptides
- 9. glycosylation, , -, -, sites
Output(Part-Of) (sent_index, protein, site):
- 8. haptoglobin N-linked, sites
*Output_Site_Fusion* (sent_index, protein, sugar, site):