PMCID: PMC4739677

 

    Legend: Gene, Sites, Sugar

Section : Central to this study is the explorative, nontargeted analysis of O-glycosylated blood plasma glycoproteins

Content :
  1. To this end, a glycoproteomics approach was applied that covers the identification of the peptide moiety, the localization of the O-glycosylation sites, and the characterization of the corresponding O-glycans
  2. HILIC-enriched glycopeptides derived from a broad-specific proteolytic digest of human blood plasma proteins were analyzed by reversed-phase liquid chromatography combined with multistage mass spectrometry (CID-MS2/-MS3 and ETD-MS2 fragmentation)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Kininogen-1

Content :
  1. The human KNG1 gene codes for two splicing variants of kininogen, namely low-molecular and high-molecular weight kininogen
  2. The latter is involved in blood coagulation and the assembly of the kallikrein-kinin system and was identified in the present study by six O-glycopeptides
  3. Currently, nine O-glycosylation sites/regions are described in the literature for kininogen-1, presumably all decorated with mucin-type core 1 or possibly core 8 O-glycans (30, 60–62)
  4. Experimental glycoproteomic evidence on the macro- and microheterogeneity of kininogen-1 is still missing, though
  5. In the present study, four kininogen-1 O-glycosylation sites, including one novel site (Ser604), could be pinpointed and described with respect to the composition of the attached O-glycans (Table I)
  6. The identified O-glycopeptide 600FNPIDFPD610 (m/z 734.26, 3+) carries a disialylated T-antigen and harbors three potential O-glycosylation sites
  7. ETD analysis implies the occupancy of Ser604, because of the presence of a signal at m/z 1638.30 (1+), corresponding to a c6 ion (supplemental Fig
  8. S6)
  9. Also of note, in previous studies the use of trypsin did not allow occupied O-glycosylation sites to be pinpointed in the region aa119–161 (30, 61)
  10. Proteinase K, however, generated two distinct O-glycopeptides (132EGPVVA138, m/z 664.70, 2+; 146VHPIQ152, m/z 719.23, 2+) that allowed pinpointing the site Thr137 and the region Ser150/Thr151
  11. For the latter, unfortunately, the quality of the ETD spectrum did not allow the exact site to be localized
  12. The peptide 132EGPVVA138 (m/z 664.70, 2+) could not be identified correctly by MASCOT database search because of missing fragment ions
  13. However, the peptide could be identified via manual de novo annotation, supported by mass tag ([283.0 Da]VVTA) assisted de novo sequencing using the tool MS-Homology (http://prospector.ucsf.edu/prospector) (supplemental Fig
  14. S5)
  15. The peptide identity was further verified by the identification of the glycosylated peptide 132EGPVVAQ138 (m/z 874.28, 2+) in a subsequent HILIC fraction (supplemental Fig
  16. S7)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 7. Kininogen-1, -, Ser604
Section : Immunoglobulin J Chain

Content :
  1. The immunoglobulin J chain (joining chain) participates in the effective di-/polymerization of either IgA or IgM and is essential for the secretion of these immunoglobulins into the mucosa
  2. In the literature, the J chain was reported to be N-glycosylated at Asn49 (60, 63, 64); however, O-glycosylation has hitherto not been described for the molecule
  3. Interestingly, two O-glycopeptides detected in HILIC fractions #13 and #14 might correspond to the J chain and suggest O-glycosylation at Thr97 (95DPEV99; m/z 608.72, 2+ and m/z 608.71, 2+) (supplemental Figs
  4. S3 and S5)
  5. This potentially new O-glycosylation site is in close vicinity to a cysteine (Cys91) that can form a disulfide bridge to IgM molecules
  6. Hence, one might speculate that an occupied O-glycosylation site in this region might function in the establishment/preservation of this inter-molecular bond
  7. However, the number of present fragment ions in the corresponding CID-MS3 spectra did not allow an unambiguous identification of the peptide, as evidenced by several potential peptide hits being equally scored by the search engine
  8. Manual annotation of the fragment spectra, though, suggests the identification of immunoglobulin J chain; nevertheless, this identification requires further validation
  9. Both identified O-glycopeptides were found to be decorated with monosialylated T-antigens
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 2. Immunoglobulin J Chain, -, Asn49 (60, 63, 64)
  • 3. Immunoglobulin J Chain, -, Thr97
Section : Inter-α-trypsin Inhibitor Heavy Chain H4

Content :
  1. For the protease inhibitor inter-alpha-trypsin inhibitor heavy chain H4, two O-glycosylation sites/regions, Ser640 and Thr722/723, have been described in the literature (58, 65)
  2. In agreement with recent findings by Chandler et al., Ser640 was found to be O-glycosylated
  3. The O-glycopeptide 639AFPR644 (m/z 660.72, 2+) harbors two potential O-glycosylation sites, and the occupied site could be clearly inferred from the ETD spectra by the presence of a signal at m/z 490.22 (1+), corresponding to a (z+1)4 ion (supplemental Fig
  4. S5)
  5. In contrast to Chandler et al., but in agreement with Halim et al., ETD data of the O-glycopeptide 722QTPAPIQAPS733 (m/z 623.27, 3+) suggested the occupancy of the sites Thr722/723 (58, 65) (supplemental Fig
  6. S5)
  7. Unfortunately, neither of the two potential O-glycosylation sites could be clearly ruled out by the detected fragment ions
  8. Both sites/regions Ser640 and Thr722/723 were decorated with a monosialylated T-antigen
  9. This contrasts with previous findings by Chandler et al., who also observed a disialylated T-antigen on Ser640
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 2. inter-alpha-trypsin inhibitor heavy chain H4, -, Ser640
Section : Inter-α-trypsin Inhibitor Heavy Chain H2

Content :
  1. For the H2 heavy chain of the inter-alpha-trypsin inhibitor, a C-terminal cluster of mono- and disialylated mucin-type core 1 O-glycans (Thr666, Ser673, Thr675 and Thr691) has been described in the literature (58, 60, 66, 67)
  2. These previously reported O-glycosylated sites, except for Thr666, could be confirmed by the present study, albeit solely with monosialylated T-antigens
  3. ETD spectra of the O-glycopeptide 689ESPPPHV696 (m/z 507.15, 3+ / 760.24, 2+) enabled a clear identification of the occupied O-glycosylation site Thr691
  4. This finding is supported, in particular, by a signal detected in the spectrum of the doubly charged species at m/z 1287.41 (1+), which corresponds to a (z+1)6 ion (supplemental Fig
  5. S6)
  6. Remarkably, the CID-MS2 spectrum of the O-glycopeptide 669WANPPPV677 (m/z 760.92, 3+) revealed that both O-glycosylation sites, Ser673 and Thr675, are occupied by a monosialylated T-antigen (supplemental Fig
  7. S7)
  8. Moreover, the spectrum features signals indicating the presence of hexose rearrangement products, that is, the transfer of an additional hexose either to the glycan or the peptide moiety, as described earlier (68, 69)
  9. The occurrence of these artifacts necessitates the careful interpretation of CID glycopeptide fragment spectra
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 1. -, -, Thr666, Ser673, Thr675 and Thr691
  • 3. -, -, site Thr666
  • 6. -, -, sites, Ser673 and Thr675
Section : τ-Tubulin Kinase 2

Content :
  1. The τ-tubulin kinase 2 (TTBK2) phosphorylates τ and tubulin, preferably in the nervous system
  2. Aberrant TTBK2 activity was linked to the progression of Alzheimer's disease (70, 71)
  3. The protein resides primarily in the cytosol; however, Böhm et al. could also detect TTBK2 in a secreted form in human tears (72, 73)
  4. Hitherto, no glycosylation of this protein has been described
  5. CID-MS3 as well as ETD spectra of the O-glycopeptide 814KDHSATEPL823 +HexNAc1Hex1NeuAc1 (m/z 877.82, 2+), though, suggest the O-glycosylation of Thr820
  6. ETD fragment ions at m/z 485.22, 557.43, and 1098.63 (all 1+), corresponding to c4, (c+1)5, and z4 ions, allowed discerning the exact glycosylation site
  7. As TTBK2 is involved in ciliogenesis (74, 75), a process which requires the vesicle transport from the Golgi to the basal bodies and cilia, we speculate that TTBK2 might become O-glycosylated during this process
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 5. TTBK2, -, Thr820
Section : Fibrinogen α and β Chain

Content :
  1. The blood clotting protein fibrinogen is known to be N-glycosylated on the β- and γ-chains
  2. Interestingly, a recent study by Zauner et al. could also show O-glycosylated sites and regions, seven in total, within the molecule (51)
  3. In the present study O-glycosylation of the fibrinogen alpha region aa524–528 could be confirmed; pinpointing the exact O-glycosylation site was not possible, though (supplemental Fig
  4. S5, 524GKFPG531, m/z 725.78, 2+)
  5. Nevertheless, O-glycosylation within the fibrinogen beta region aa58–67 could be confirmed and pinpointed
  6. Here, the presence of ETD fragment ions at m/z 931.54, 1300.54, and 1915.50 (all 1+), corresponding to (z+1)9, c6, and c12 ions (supplemental Fig
  7. S6; 54EEAPLRPAPPPIS67, m/z 706.27, 3+), indicates O-glycosylation at the site Ser58
  8. This contrasts with recent findings by Bai et al., who reported the site Ser67 to be O-glycosylated, but not the site Ser58 (76)
  9. In agreement with previous findings, both fibrinogen O-glycopeptides detected in the present study (524GKFPG531, m/z 725.78, 2+; 54EEAPLRPAPPPIS67, m/z 706.27, 3+) were found to be decorated with monosialylated T-antigens
  10. Interestingly, the peptide 54EEAPLRPAPPPIS67 was also found in its nonglycosylated form (HILIC fractions #12-#15, CID, see supplemental Table S2), which suggests only a partial occupation of the O-glycosylation site Ser58
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 10. -, -, site Ser58
  • 7. -, -, Ser58
  • 8. -, -, Ser67
Section : Reproducibility of the Proteinase K Digest

Content :
  1. Previous studies on single glycoproteins demonstrated the successful application of Proteinase K in the context of N- and O-glycoproteomics (32, 49–52)
  2. However, its application to complex samples, like human blood plasma, has not been described so far
  3. Here, we have employed Proteinase K to generate (glyco)peptides from the entire (glyco)proteome of a pooled human blood plasma sample that was derived from 20 healthy donors
  4. To assess the reproducibility of such a digest, five independent Proteinase K-treated blood plasma samples (technical replicates) were measured with nanoRP-LC-ESI-IT-MS/MS in preliminary experiments
  5. A comparison of the resulting base peak chromatograms revealed a high reproducibility of these digests, as shown in supplemental Fig
  6. S1
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Glycopeptide Enrichment and Fractionation via HILIC-HPLC

Content :
  1. The HILIC-HPLC fractionation carried out in the present study was optimized for the enrichment of O-glycosylated peptides (data not shown)
  2. In total, 17 HILIC fractions were collected and analyzed by nanoRP-LC-ESI-IT-MS2 (CID)
  3. The acquired fragment spectra were manually screened for the presence of N- and O-glycopeptides, relying on the detection of diagnostic oxonium ions (B-ions, e.g. HexNAc1Hex1NeuAc1; m/z 657.24) and characteristic mono(oligo)saccharide neutral loss fragment ions (Y-ions)
  4. Glycopeptides were detected in five HILIC fractions (#13-#17) (Fig. 2)
  5. The glycopeptides eluted in the range of 9–32 min in RP-LC-MS and clusters of glycopeptides were registered between 12–18 min, 20–22 min, and 25–29 min (exemplarily shown for fraction #15, Fig. 3)
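
The diagnostic oxonium ion quoted above (HexNAc1Hex1NeuAc1, m/z 657.24) can be reproduced from monoisotopic residue masses. The following is a minimal sketch, not the authors' procedure: the residue masses are standard reference values rather than numbers from this study, and the function name `oxonium_mz` is hypothetical.

```python
# Monoisotopic residue masses (Da); standard reference values,
# assumed here and not taken from the study itself.
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282, "NeuAc": 291.09542}
PROTON = 1.00728  # mass of a proton (Da)


def oxonium_mz(composition):
    """Return the m/z of the singly protonated oxonium (B-type) ion."""
    mass = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return mass + PROTON


mz = oxonium_mz({"HexNAc": 1, "Hex": 1, "NeuAc": 1})
print(f"{mz:.3f}")
```

The computed value agrees with the quoted m/z 657.24 to within 0.01, well inside the ±0.3 Da tolerance used for glycan annotation.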
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Determination of the Glycan Composition

Content :
  1. CID-MS2 spectra were carefully inspected and manually annotated with respect to the glycan composition
  2. Major signals in these spectra resulted from consecutive neutral losses (singly and doubly charged species) of the monosaccharides hexose, N-acetylhexosamine and N-acetylneuraminic acid from the intact glycopeptide; in most cases, the applied collision energy induced the complete fragmentation of the glycan moiety while leaving the deglycosylated peptide intact
  3. These fragment ions, along with the corresponding oxonium ions, allowed the glycan composition and the putative peptide mass to be inferred (Fig. 4A)
  4. Detailed analysis revealed that exclusively mucin-type core 1 mono- and disialylated O-linked glycopeptides ((di)sialyl-T-antigen) were present
  5. For the glycan annotation a mass error of ±0.3 Da was accepted
  6. This tolerance was justified because the observed mass errors were about 0.07 Da (median value)
  7. In total, 88 O-glycopeptides were detected and characterized with respect to their glycan composition
  8. The registered glycopeptides covered an m/z range of 507–945 (average m/z 728) and were either doubly (55 peptides) or triply charged (33 peptides)
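
The annotation in this study was done manually; the sketch below only illustrates the underlying idea of matching a glycan mass (the summed neutral losses) to a monosaccharide composition within the accepted ±0.3 Da tolerance. The residue masses are standard reference values, and the function name, the brute-force search, and the upper limit of three residues per monosaccharide are assumptions made for this example.

```python
from itertools import product

# Monoisotopic residue masses (Da); standard reference values (assumed).
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282, "NeuAc": 291.09542}


def match_compositions(glycan_mass, tol=0.3, max_count=3):
    """List compositions whose summed residue mass lies within tol Da."""
    names = list(RESIDUE_MASS)
    hits = []
    # Enumerate every combination of 0..max_count residues per monosaccharide.
    for counts in product(range(max_count + 1), repeat=len(names)):
        mass = sum(RESIDUE_MASS[n] * c for n, c in zip(names, counts))
        if sum(counts) > 0 and abs(mass - glycan_mass) <= tol:
            hits.append(dict(zip(names, counts)))
    return hits


mono = match_compositions(656.23)  # mass consistent with a monosialylated T-antigen
di = match_compositions(947.32)    # mass consistent with a disialylated T-antigen
print(mono)
print(di)
```

Within this small search space each of the two example masses maps to a single composition, which is why a tolerance as wide as ±0.3 Da is still discriminating for short core 1 glycans.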
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Identification of the Peptide Moiety

Content :
  1. To complement the deduced glycan composition with peptide sequence information, CID-MS3 experiments were conducted on putative peptide masses, which were derived from CID-MS2 spectra (Fig. 4A)
  2. In separate LC-MS runs the selected peptide precursor masses (predominantly singly charged) were used to trigger manual CID-MS3 fragmentation
  3. In rare cases, peptide+HexNAc was selected for fragmentation because of the low signal intensity of the peptide species in MS2
  4. CID-MS3 spectra were searched against the human subset of the highly curated and nonredundant protein database UniProtKB/Swiss-Prot
  5. Notably, b- and y-ions derived from peptide backbone cleavages were also detected in some CID-MS2 spectra, which enabled peptide identification (e.g. supplemental Fig
  6. S5: α-2-HS-glycoprotein, m/z 623.23, 3+)
  7. For the 88 detected glycopeptides, 60 corresponding peptides could be identified unambiguously (Table I, Table II)
  8. These 60 peptides could be linked to 22 different proteins, most of them acute phase proteins
  9. As the protein identification is based on a single peptide, validation of the potential peptide hits is of utmost importance
  10. Here, in particular, the protein inference problem (53), which is intrinsic to bottom-up proteomic approaches, had to be considered
  11. To cope with this, peptide spectra were manually revised and only peptide hits with a MASCOT ion score greater than 20 were considered; only in rare cases, and when supported by other evidence, were lower-scoring peptides also accepted
  12. Furthermore, peptide hits needed to exhibit at least one potential O-glycosylation site (Ser/Thr)
  13. If available, knowledge derived from public databases (UniProtKB and UniCarbKB) on already described O-glycosylation sites within the putative peptides or within the entire protein was used to support a potential hit
  14. The peptide identification was further corroborated by redundant identifications, that is, the multiple occurrence of (1) the same glycopeptide in different HILIC fractions, (2) the same peptide with a different glycan moiety, or (3) a peptide harboring the same glycosylation site but differing in peptide length, the latter being attributed to the broad-specific proteolysis (e.g. alpha-2-HS-glycoprotein, 341TVVQP[HexNAc1Hex1NeuAc1]VG348 derived from HILIC fraction #13 and 342VVQP[HexNAc1Hex1NeuAc1]VG348 from fraction #14)
  15. In some cases, though, peptide identification was hampered or inconclusive
  16. One of the main obstacles here was the frequent occurrence of prolines within the (glyco)peptide sequence, which has also been described in the literature
  17. The cyclic structure of proline gives rise to a high signal of the preceding y-ion but in most cases precludes the generation of a subsequent b-ion, thus introducing a sequence gap (54)
  18. This in turn leads to incomplete peptide fragment ion series and the occurrence of dipeptide fragment ions (e.g. PS and SP), which may result in ambiguity in peptide identification
  19. This effect is particularly critical for short peptide sequences, as usually obtained by a broad- or nonspecific digest
  20. The average peptide length of glycopeptides identified in this study is 10 amino acids (aa)
  21. This is significantly shorter than the average length of tryptic peptides (14 aa, based on an in-silico digestion of the human UniProtKB database (55), supplemental Fig
  22. S2)
  23. All this, in conjunction with a nonspecific peptide search, makes reliable peptide identification challenging
  24. To complement the identified O-glycopeptides with nonglycosylated peptides that are also present in blood plasma, CID and ETD fragment spectra of the corresponding HILIC fractions (#1–17) were searched against the human subset of the UniProtKB/Swiss-Prot protein database
  25. In total 111 proteins were identified
  26. CID and ETD spectra provided complementary results; 54 and 45 proteins were identified exclusively with CID and ETD, respectively, and only 12 proteins were identified with both modes
  27. Compared with ETD, though, significantly more peptides were identified with CID (321 versus 150)
  28. The majority of peptides were derived from immunoglobulins, serotransferrin, haptoglobin and serum albumin (supplemental Table S1)
  29. Notably, nonglycosylated peptides corresponding to previously identified O-glycopeptides, e.g. of plasminogen and hemopexin, were also identified (Table I)
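
The two automatic acceptance criteria described in this section (a MASCOT ion score greater than 20 and at least one Ser/Thr in the sequence) can be written as a simple filter. This is a minimal sketch only: the `PeptideHit` class, the function name, and the example sequences and scores are hypothetical, and the manual revision and database evidence used in the study are not modeled.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    sequence: str     # one-letter amino acid sequence
    ion_score: float  # search engine ion score


def is_candidate(hit, min_score=20.0):
    """Keep hits scoring above min_score that contain at least one Ser/Thr."""
    has_site = any(residue in "ST" for residue in hit.sequence)
    return hit.ion_score > min_score and has_site


# Hypothetical hits, invented for illustration only.
hits = [
    PeptideHit("AVPTPVV", 34.0),   # passes both criteria
    PeptideHit("GAPGLPPA", 41.0),  # no Ser/Thr
    PeptideHit("VVQPSVG", 12.0),   # score too low
]
kept = [h.sequence for h in hits if is_candidate(h)]
print(kept)
```

Lower-scoring hits that the authors accepted on the basis of supporting evidence would need a manual override, which this filter deliberately leaves out.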
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Localization of the O-Glycosylation Sites

Content :
  1. To further characterize the identified O-glycopeptides, the corresponding O-glycosylation sites needed to be localized
  2. In a few cases, the use of Proteinase K already generated glycopeptides that exhibit only one possible O-glycosylation site, e.g. 132EGPVV[HexNAc1Hex1NeuAc1]A138 and 567DLIA[HexNAc1Hex1NeuAc2]M572 from kininogen-1 or 234AP[HexNAc1Hex1NeuAc1]HPAPPGLH244 from selenoprotein P. Notably, in the first example a tryptic digest would have generated a peptide with a length of 43 aa (119FVAQCQIPAEGPVVQYDCLGCVHPIQPDLEPILR161), harboring 8 potential O-glycosylation sites
  3. This clearly illustrates a benefit of the Proteinase K digest for the O-glycan site identification
  4. When the O-glycosylation sites could not be inferred directly, glycopeptides were subjected to ETD fragmentation in a separate LC-MS run (Fig. 4B)
  5. The most prominent peaks in the acquired ETD glycopeptide spectra were the unfragmented precursor ion along with charge-reduced species; minor peaks were derived from c- and z-type peptide backbone cleavages
  6. Furthermore, fragment ions indicating either the loss of 43.018 Da (C2H3O·) from the radical cationic species or 42.016 Da (C2H2O) from the even electron species [M+H]+ were consistently detected
  7. In the literature this spectral feature was attributed to the loss of an acetyl-radical from the N-acetyl group of a HexNAc (56, 57)
  8. This in turn can support the discrimination of ETD spectra derived from glycosylated and nonglycosylated species
  9. Strikingly, and in contrast to the general mode of action of ETD, fragmentations of the glycan moiety along the intact peptide backbone were also observed, leading to a complete loss of the O-glycosylated Ser/Thr side-chain
  10. Nevertheless, the resulting fragment ions enabled a verification of the glycan composition as well as the peptide mass
  11. At first, ETD-generated glycopeptide spectra were searched against the human subset of the UniProtKB/Swiss-Prot database using MASCOT, taking the O-glycan modification into consideration (theoretical glycan mass used as variable modification of Ser/Thr)
  12. However, this strategy failed because of the presence of intense signals in the ETD spectrum, which correspond to (I) the precursor ion, (II) the charge-reduced precursor ion, (III) ions resulting from acetyl radical loss, or (IV) glycan fragment ions
  13. These ions might be erroneously interpreted as peptide-derived fragment ions by the search engine, because ETD is supposed to solely produce peptide fragment ions while keeping fragile side-chain modifications, like the glycosylation, intact
  14. To overcome this, glycopeptide spectra were exported to Bruker BioTools for manual spectra annotation
  15. Here, the identified glycopeptides were built in silico, taking into account the corresponding O-glycan moieties as well as all possible O-glycosylation sites
  16. Subsequently, the resulting in silico fragment ions (c- and z-type ions) were matched to their counterparts in the measured ETD-MS2 spectra
  17. To evaluate the spectra annotation and to discern the correct O-glycosylation site, the BioTools spectra matching score along with manual inspection of the respective spectra were considered
  18. Furthermore, public repositories, namely UniProtKB and UniCarbKB, were queried with respect to known O-glycosylation sites within the peptide in question
  19. To further assess the validity of the O-glycosylation site annotation, the site occupancy was predicted using NetOGlyc, an online tool based on machine-learning algorithms that allows the prediction of mucin-type O-glycosylation sites (27)
  20. For 36 of the 60 identified glycopeptides, the quality of the corresponding ETD spectra was acceptable in terms of signal intensity and the number of fragment ions
  21. Overall, 31 O-glycosylation sites and regions were detected, of which 23 sites could be pinpointed (Tables I and II)
  22. Strikingly, 11 previously unknown O-glycosylation sites and regions were registered, of which 8 sites could be pinpointed
  23. Generally, O-glycosylation on threonine residues was observed more frequently than on serine (16× Thr, 7× Ser)
  24. In accordance with the literature, prolines were frequently found in close vicinity to the O-glycosylation site (positions n - 1, n + 1, n + 3), e.g. 267AVP[HexNAc1Hex1NeuAc1]PV272 and 343VQP[HexNAc1Hex1NeuAc1]VGA349 from alpha-2-HS-glycoprotein (30)
  25. In addition, prolines in position n + 2 were also found occasionally, e.g. 20GPVP[HexNAc1Hex1NeuAc1]PPDNI29 from alpha-1 microglycoprotein (protein AMBP)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Identified Glycoproteins : Selected Examples

Content :
  1. In the following, selected examples of identified O-glycopeptides are detailed that feature novel O-glycosylation sites or exhibit remarkable fragmentation characteristics
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : α-2-HS-glycoprotein

Content :
  1. In this study, the majority of identified O-glycopeptides were derived from α-2-HS-glycoprotein, also known as fetuin-A
  2. Fetuin-A is a negative acute phase glycoprotein that is highly abundant in fetal blood plasma
  3. It is involved in transport and storage of substances and features three O-glycosylation sites (Thr256, Thr270, Ser346), which are decorated with sialylated mucin-type core 1 O-glycan structures (30, 58, 59)
  4. In contrast to previous reports (30, 58), the intact O-glycopeptides identified and characterized in the present study describe all three known fetuin-A O-glycosylation sites together with the attached O-glycans (mono- and disialylated mucin-type core 1 O-glycans)
  5. When pinpointing O-glycosylation sites using ETD, the reported ETD BioTools scores can be misleading
  6. This holds true, for instance, for the fetuin-A O-glycopeptide 252QPVTSQPQPE262 (m/z 623.23, 3+) and its three potential O-glycosylation sites, listed with their BioTools scores in parentheses: Thr252 (669), Thr256 (412), Thr257 (362) (supplemental Fig
  7. S7)
  8. According to the score values, Thr252 would be the occupied site; the presence of characteristic ETD fragment ions at m/z 344.01 (c3), 1200.45 (c5), 1287.47 (c6), 1525.57 [(z+1)8], and 1751.51 [(z+2)10], all singly charged, though, clearly indicates the occupancy of Thr256, which is in agreement with literature findings
  9. For the two other described fetuin-A O-glycosylation sites, Thr270 and Ser346, ETD fragmentation was actually not mandatory, because corresponding O-glycopeptides were identified that each harbor only one O-glycosylation site (e.g. Thr270: 267AVPPV272; Ser346: 342VVQPVG348)
  10. Also of note, with respect to the peptide identification, b- and y-ions were detected in the CID-MS2 fragment spectra of the fetuin-A O-glycopeptides 252TQPVSQPQPE262 (m/z 623.23, 3+), 267AVPPVVDPDAPPSPPL283 (m/z 872.72, 3+) and 266EAVPPVVDPDAPPSPPL283 (m/z 915.71, 3+), which already permit the unambiguous peptide identification without consideration of CID-MS3 spectra (supplemental Figs
  11. S5–S7)
  12. Furthermore, internal glycopeptide fragment ions resulting from concerted fragmentations along the peptide backbone and along the glycan moiety were detected in the same CID-MS2 spectra, a low-energy CID glycopeptide fragmentation event that is rarely described in the literature (e.g. 252TQPV(HexNAc1Hex1NeuAc1)SQPQPE262 → 252TQPV(HexNAc)SQ258, m/z 945.35, 1+) (supplemental Fig
  13. S7)
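
The site assignment for the fetuin-A glycopeptide can be illustrated by comparing the reported c-ion signals (c3, c5, c6) with in silico c ions for each candidate site. This is a sketch under stated assumptions, not the BioTools workflow used in the study: the sequence TQPVTSQPQPE is an assumed reconstruction of residues 252–262 from the two partially printed forms in this section, the residue and glycan masses are standard monoisotopic reference values, and the ±0.3 Da tolerance is borrowed from the glycan annotation step.

```python
# Monoisotopic amino acid residue masses (Da); standard reference values.
RESIDUE = {"T": 101.04768, "Q": 128.05858, "P": 97.05276,
           "V": 99.06841, "S": 87.03203, "E": 129.04259}
GLYCAN = 656.22761                 # HexNAc1Hex1NeuAc1 residue mass (Da)
C_ION_OFFSET = 17.02655 + 1.00728  # NH3 plus one proton (singly charged c ion)


def c_ion_mz(sequence, site, n):
    """m/z of the 1+ c_n ion with the glycan on 0-based residue index `site`."""
    mass = sum(RESIDUE[aa] for aa in sequence[:n])
    if site < n:  # the glycan is carried only if the site lies within c_n
        mass += GLYCAN
    return mass + C_ION_OFFSET


def count_matches(sequence, site, observed, tol=0.3):
    """Count observed (n, m/z) pairs explained by a candidate site."""
    return sum(abs(c_ion_mz(sequence, site, n) - mz) <= tol for n, mz in observed)


sequence = "TQPVTSQPQPE"  # assumed reconstruction of residues 252-262
observed = [(3, 344.01), (5, 1200.45), (6, 1287.47)]  # reported c3, c5, c6
candidates = {252: 0, 256: 4, 257: 5}  # residue number -> 0-based index
scores = {res: count_matches(sequence, idx, observed)
          for res, idx in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Under these assumptions only residue 256 explains all three c ions: an unmodified c3 rules out residue 252, and a glycan-carrying c5 rules out residue 257, which matches the manual assignment described above.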
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 3. fetuin-A, -, Thr256, Thr270, Ser346
  • 6. fetuin-A, -, Thr252 (669), Thr256 (412), Thr257 (362)
  • 8. fetuin-A, -, Thr256
  • 9. fetuin-A, -, Ser346
  • 9. fetuin-A, -, sites Thr270 and Ser346
Section : Number of detected glycopeptides in HILIC fractions #13-#17

Content :
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Extracted ion chromatograms (EICs) of diagnostic glycan oxonium ions ([HexNAc+Hex+NeuAc+H]+: 657.24) reveal the clustered elution of O-glycopeptides (*) on a C18 reversed-phase column

Content :
  1. EICs of HILIC fraction #15 are shown as an example
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Fragment ion spectra of the Proteinase K-generated plasminogen APPELTPV373 measured with nanoRP-LC-ESI-MSn (positive ion mode, CID and ETD)

Content :
  1. A (top): For the given O-glycopeptide the CID-MS2 spectrum is shown together with its corresponding precursor ion m/z 718.30 [M+3H]3+ (inset)
  2. The spectrum allows the elucidation of the O-glycan composition (here disialylated T-antigen)
  3. In addition, some internal glycopeptide fragments have also been detected (e.g. b10+HexNAc)
  4. A (bottom): The putative peptide mass (m/z 1205.66 [M+H]+) of the given O-glycopeptide was subjected to CID-MS3 fragmentation
  5. The peptide was identified by MASCOT search (Score: 16, UniProtKB/Swiss-Prot, human)
  6. B: The O-glycosylation site (here Thr365) was pinpointed by means of ETD (BioTools score: 150)
  7. Magnified regions show the isotope pattern of selected peptide fragment ions, confirming the annotation
  8. In addition to peptide fragment ions, fragment ions derived from the glycan moiety were also detected, allowing a verification of the glycan composition
  9. Furthermore, a neutral loss of an acetyl radical from the intact O-glycopeptide was observed, which is typically seen in ETD spectra of glycopeptides
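
The relation between the precursor ion in panel A (m/z 718.30, [M+3H]3+, disialylated T-antigen) and the putative peptide mass selected for CID-MS3 (m/z 1205.66, [M+H]+) can be checked with a short calculation. This is a minimal sketch; the residue masses are standard monoisotopic reference values and the function name `peptide_mh` is hypothetical.

```python
PROTON = 1.00728  # mass of a proton (Da)
# Monoisotopic residue masses (Da); standard reference values (assumed).
RESIDUE_MASS = {"HexNAc": 203.07937, "Hex": 162.05282, "NeuAc": 291.09542}


def peptide_mh(precursor_mz, charge, composition):
    """[M+H]+ of the peptide after subtracting the glycan residue mass."""
    neutral = precursor_mz * charge - charge * PROTON  # neutral glycopeptide
    glycan = sum(RESIDUE_MASS[k] * v for k, v in composition.items())
    return neutral - glycan + PROTON


mh = peptide_mh(718.30, 3, {"HexNAc": 1, "Hex": 1, "NeuAc": 2})
print(f"{mh:.2f}")
```

The result lies about 0.1 below the reported m/z 1205.66, which is consistent with the ±0.3 Da mass tolerance accepted for annotation in this study.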
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 6. CID-MS2, -, Thr365
Section : Over the last few years, mass spectrometry-based glycoproteomics has experienced significant advances in terms of instrumentation, methodology and bioinformatics, resulting in a variety of excellent glycoproteomic publications that highlight the merits of high resolution mass spectra, complementary fragmentation techniques, improved multidimensional glycopeptide enrichment and separation techniques as well as sophisticated software tools (41)

Content :
  1. However, despite these advances, and despite its enormous clinical and pharmaceutical relevance as well as diagnostic potential, our knowledge about the human blood plasma glycoproteome is still very limited
  2. This holds particularly true for the human blood plasma O-glycoproteome
  3. Here several important questions can be raised: Which proteins are O-glycosylated?
  4. Which O-glycans are attached to which sites?
  5. Which dynamics in terms of the O-glycan micro- and macroheterogeneity can be observed in a certain biological context?
  6. What are the biological and biotechnological implications of O-glycosylation?
  7. In the present study we have developed and employed an analytical workflow that allows the explorative, nontargeted analysis of the human blood plasma O-glycoproteome in a site-specific manner
  8. To this end, intact human blood plasma O-glycopeptides, generated by a broad-specific proteolytic digest via Proteinase K, were selectively enriched using HILIC fractionation in order to be analyzed by multistage nanoRP-LC-ESI-IT-MS using low-energy CID as well as ETD (CID-MS2/MS3, ETD-MS2)
  9. This combined workflow was applied to a pooled blood plasma sample derived from 20 healthy donors and allowed for the identification of 31 O-glycosylation sites in 22 proteins, including the detection of 11 previously unknown O-glycosylation sites
  10. We were able to pinpoint 23 O-glycosylation sites, of which eight sites have been described for the first time
  11. The identified O-glycan compositions most probably correspond to mono- and disialylated core-1 mucin-type O-glycans (T-antigen)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Caveats of the Approach

Content :
  1. In contrast to tryptic (glyco)peptides, Proteinase K-generated peptides and glycopeptides cannot be predicted because of the broad cleavage specificity of the enzyme
  2. More important, though, is the reduced peptide length compared with a tryptic digest, as this can leave too few detected fragment ions for unambiguous peptide identification
  3. This problem can be further aggravated by the frequent occurrence of prolines within mucin-type O-glycopeptide sequences, as prolines can introduce additional sequence gaps during mass spectrometry-based peptide sequencing
  4. Also important to note is the increased search space of the search engine because of the use of a nonspecific enzyme, which results in increased ambiguity with respect to the peptide identification (lower identification scores) and longer search times
  5. A confounding factor that relates to the ETD analysis is the predominance of charge state 2+ among the measured O-glycopeptide precursor ions, because ETD fragmentation is more efficient for precursor charge states greater than 2+ (86)
  6. The predominance of charge state 2+ can be explained by a lack of ionizable/basic amino acids (Arg, Lys, His) within the glycopeptides, a characteristic that can be linked to the broad-specific proteolytic digest by Proteinase K (87)
  7. Another caveat is related to the HILIC glycopeptide enrichment: this step was optimized to enrich O-glycopeptides carrying short mucin-type core-1 and -2 O-glycans, as they represent the vast majority of O-glycans on human blood plasma proteins (25)
  8. Hence, O-glycopeptides carrying bigger and thus more hydrophilic O-glycans, such as N-acetyl-lactosamine (LacNAc) extended mucin-type core-2 O-glycans, or O-glycopeptides carrying multiple mucin-type O-glycans, might elute in the subsequent washing phase of the HILIC fractionation and as a consequence cannot be found during the analysis
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Summary and Outlook

Content :
  1. In the present study we have investigated the human blood plasma mucin-type O-glycoproteome of healthy individuals in an explorative and nontargeted manner
  2. To this end, we have conducted a site-specific large-scale O-glycoproteomic analysis, which combines a broad-specific proteolytic digest with HILIC enrichment/fractionation and subsequent multistage mass spectrometry measurement (nano-RPLC-ESI-IT-MSn) with CID and ETD
  3. Centered on the characterization and identification of intact glycopeptides, we could demonstrate the in-depth O-glycoproteomic analysis of a number of important human blood plasma glycoproteins (mainly acute phase proteins), including alpha-2-HS-glycoprotein, fibrinogen, plasminogen and kininogen-1
  4. Our results are in good agreement with previous findings by other research groups, but also add new aspects to the field, e.g. the identification of a couple of novel O-glycosylation sites as well as an assessment of the benefits and drawbacks of using Proteinase K in large-scale mass spectrometric glycoproteomic studies
  5. Explorative site-specific N- and O-glycoproteomic studies of biofluids, like human blood plasma, human milk, urine or cerebrospinal fluid, hold an enormous potential to better understand the implications of protein glycosylation under normal physiological conditions, but also under pathophysiological conditions
  6. By serving as a diagnostic tool, the detection/discovery of relevant glycopeptides (biomarker candidates) can be the basis for targeted quantitative glycoproteomic analyses, which allow for a site-specific monitoring of glycosylation alterations, e.g. during disease progression
  7. Site-specific glycosylation analyses are, moreover, important to produce biopharmaceuticals according to quality by design requirements, in particular if these biopharmaceuticals are produced in heterologous expression systems
  8. In this regard site-specific glycosylation analyses might also enable understanding/controlling important glycan-related features of the final product including its efficacy, half-life, or antigenicity
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Other O-glycoproteomic Studies on Complex Biofluids

Content :
  1. In the recent past efforts have been made to investigate the O-glycoproteome of different complex biological samples
  2. Halim et al., for instance, analyzed the O-glycoproteome of cerebrospinal fluid (CSF) using a sialic-acid capture-and-release protocol (30)
  3. This protocol is based on the sialic acid specific hydrazide capturing of periodate oxidized glycoproteins
  4. Upon tryptic digestion the protocol allows the acid hydrolysis of sialic acid glycosidic bonds in order to release and analyze (formerly) sialylated glycopeptides
  5. To focus on O-glycosylation, the authors included a peptide N-glycosidase F (PNGase F) sample pretreatment step to remove N-glycans
  6. The authors used an automated CID-MS2/-MS3 spectra search protocol for glycopeptide identification (Peptide-GalNAc-Gal) and employed ECD and ETD to pinpoint the glycosylation sites
  7. In total they identified 106 O-glycosylation sites and could pinpoint 67 of these
  8. The identified CSF O-glycopeptides belong to 49 different proteins and were predominantly decorated with structures corresponding to core-1 mucin-type O-glycans
  9. In a previous study the same group also investigated the human urinary N- and O-glycoproteome using the sialic-acid capture-and-release protocol (58)
  10. Unfortunately, the applied protocol does not allow the enrichment of nonsialylated glycoproteins nor does it give any information on the degree of sialylation of the attached O-glycan moieties
  11. This limits the applicability of this procedure, as the degree of O-glycan sialylation is a crucial determinant in the pathogenesis of a number of diseases (22)
  12. In another large-scale glycoproteomics study, conducted by Hägglund et al. in 2007, human plasma proteins derived from Cohn fraction IV of a plasma fractionation were analyzed with respect to occupied N- and O-glycosylation sites (60)
  13. The analyzed Cohn fraction is supposed to contain mainly α-globulins, like plasminogen and haptoglobin, and is depleted of γ-globulins and serum albumin
  14. The authors employed two different enzymatic deglycosylation strategies to pinpoint occupied N-glycosylation sites: (1) PNGase F + H₂¹⁸O; (2) endo-β-N-acetylglucosaminidases (Endo D and Endo H) + different exoglycosidases
  15. These two strategies were applied to HILIC-enriched tryptic (glyco)peptides that were fractionated by strong cation exchange chromatography and eventually measured by LC-ESI-MS/MS using high-energy CID
  16. The authors were able to identify 103 N-glycosylation sites as well as 23 O-glycosylation sites/regions derived from 61 and 11 human blood plasma proteins, respectively
  17. Unfortunately, the occupied O-glycosylation sites could not be pinpointed and no information on the glycan moiety could be deduced
  18. In 2012 Darula et al. reported on the O-glycoproteomic analysis of bovine serum (77)
  19. In this study the authors have combined different protein- and peptide-level prefractionation and enrichment strategies, including jacalin lectin affinity chromatography, mixed-mode chromatography, and electrostatic repulsion hydrophilic interaction chromatography (ERLIC) to enrich tryptic mucin-type O-glycopeptides
  20. After additional use of exoglycosidases to improve glycopeptide characterization, truncated glycopeptides were subjected to LC-ESI-MS/MS with HCD and ETD for automated peptide identification and glycosylation site determination
  21. Overall, the authors could identify and pinpoint 124 glycosylation sites in 51 proteins, including many O-glycosylation sites that have not been described before, albeit at the expense of the intact glycan structure
  22. In a recent publication from Bai et al. an analytical workflow is presented, which allows the mapping of mucin-type O-glycosylation sites on glycoproteins present in human blood plasma (76)
  23. The authors used jacalin lectin affinity chromatography to enrich tryptic O-glycopeptides (peptide+GalNAc), which were treated with PNGase F and different exoglycosidases
  24. In this study 49 O-glycopeptides, belonging to 36 human blood plasma glycoproteins, were identified by LC-ESI-MS/MS (CID)
  25. Overall, the authors could assign 13 O-glycosylation sites unambiguously, of which nine sites have not been described before
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Proteinase K Digest

Content :
  1. The majority of large-scale glycoproteomic studies use trypsin for the generation of (glyco)peptides
  2. Trypsin is the proteolytic gold standard in LC-MS/MS based peptide identification and quantification, as it reproducibly generates predictable peptides that are readily retained on reversed-phase columns and that, in most cases, give enough fragment ions for an unambiguous peptide identification
  3. In terms of glycoproteomics, though, the cleavage specificity of trypsin can be a limiting factor for the identification and the localization of certain glycosylation sites, in particular for densely clustered O-glycosylation sites
  4. Hence, the use of broad- and nonspecific proteases, like Pronase E or Proteinase K, was proposed to reduce the number of nonglycosylated peptides and to make certain glycosylation sites analytically amenable (34)
  5. Proteinase K, for instance, has been successfully used in a number of publications that are centered on the O-glycoproteomic analysis of single proteins; however, the use of Proteinase K in large-scale glycoproteomic studies on complex samples has not been described so far
  6. In the present study we could show that Proteinase K generates (glyco)peptides from a complex sample, like human blood plasma, in a reproducible and nonrandom manner, which is in agreement with a report from Hua et al. (34)
  7. We could show that, most of the time, Proteinase K generates shorter peptides compared with trypsin (supplemental Fig
  8. S7), and that Proteinase K cleaves effectively in-between densely O-glycosylated regions, thus rendering the determination of the occupied O-glycosylation site(s) less difficult
  9. In fact we could show that Proteinase K can generate O-glycopeptides that exhibit only one potential O-glycosylation site, thus allowing for an unambiguous localization of the occupied site
  10. We could clearly show that some O-glycosylation sites could only be identified and pinpointed by the use of Proteinase K, because tryptic peptides would have been too long and would have harbored too many potential O-glycosylation sites
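The argument that shorter peptides leave fewer candidate sites can be made concrete with a small sketch that lists every Ser/Thr (the potential mucin-type O-glycosylation sites) in a peptide. The function name and the sequences are hypothetical and serve only as an illustration; they are not peptides from this study.

```python
def potential_o_sites(peptide: str, start: int = 1) -> list[tuple[str, int]]:
    """Return (residue, position) for every Ser/Thr in the peptide.

    `start` is the protein position of the first residue of the peptide.
    """
    return [
        (aa, start + i)
        for i, aa in enumerate(peptide.upper())
        if aa in "ST"
    ]


# Hypothetical sequences, for illustration only.
long_tryptic_like = "GTPSPVPTTSTR"   # several clustered Ser/Thr
short_fragment = "PVPT"              # residues 5-8 of the peptide above

print(len(potential_o_sites(long_tryptic_like)))
print(potential_o_sites(short_fragment, start=5))
```

In this toy case the long peptide offers six candidate sites, whereas the short fragment offers a single one, so a glycan detected on the fragment can be assigned without further fragmentation evidence.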
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Glycopeptide Enrichment Via HILIC

Content :
  1. Glycopeptides are usually under-represented in a peptide mixture, because of the glycan microheterogeneity
  2. In a tryptic digest of a typical glycoprotein only about 2% to 5% of the peptides are glycopeptides (78)
  3. In addition, the ionization efficiency of glycopeptides is significantly lower compared with their nonglycosylated counterparts, thus making the efficient and selective enrichment of glycopeptides key to most glycoproteomics workflows
  4. The use of HILIC based glycopeptide enrichment methods has proven to be a vital tool in glycoproteomics because of their broad glycan specificity, reproducibility and compatibility with mass spectrometry
  5. A previous report by Zauner et al. showed that Proteinase K-generated glycopeptides can be separated into earlier eluting O-glycopeptides and later eluting N-glycopeptides using HILIC (32)
  6. Based on this publication we have employed HILIC for the selective enrichment and fractionation of human blood plasma O-glycopeptides
  7. Of particular importance here is the removal of highly abundant nonglycosylated peptides derived from albumin and other major (glyco)proteins
  8. Careful manual inspection of CID-MS2 fragment spectra of the acquired HILIC fractions revealed the efficient enrichment of glycopeptides and, indeed, the sole presence of mucin-type core-1 O-glycosylated glycopeptides
  9. N-glycopeptides were not detected, as they were expected to be present in the late eluting HILIC wash fractions because of their generally higher hydrophilicity compared with the most commonly found forms of mucin-type O-glycopeptides (non-, mono- and disialylated core-1 and -2 O-glycopeptides)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Identification of the O-glycan Composition

Content :
  1. For automated glycopeptide spectra filtering and glycan fragment annotation the use of commercial software tools was considered, but turned out to be too error-prone in our case (data not shown)
  2. Hence, in the present work we relied on manual annotation and interpretation of low-energy CID-MS2 fragment spectra in order to elucidate the O-glycan composition, albeit at the expense of throughput and of the possibility to report false discovery rates
  3. In total we were able to characterize 88 O-glycopeptides with respect to their O-glycan composition
  4. The detected O-glycan compositions most likely correspond to mucin-type core-1 mono- and disialylated O-glycans ((di)sialyl-T-antigen)
  5. In agreement with literature, glycopeptides carrying disialylated O-glycans were found in later eluting HILIC fractions (#15–#17), as the additional sialic acid renders the molecule more hydrophilic
  6. Mono- and disialylated glycoforms could usually be discriminated by the presence of distinct oxonium ions: whereas fragmentation of monosialylated O-glycans generated a characteristic oxonium ion at m/z 454.16 (Hex1NeuAc1), disialylated O-glycans gave rise to an additional intense peak at m/z 495.18 (HexNAc1NeuAc1) (supplemental Fig
  7. S6, 266EAVPPVVDPDAPPSPPL283, m/z 818.68 (3+); 267AVPPVVDPDAPPSPPL283, m/z 872.73 (3+))
  8. Furthermore, in disialylated species characteristic fragment ions of the peptide+HexNAc+NeuAc were observed
  9. In a few cases the glycan annotation was compromised by the presence of fragment ions corresponding to hexose rearrangement products (68, 69)
  10. Generally, it is important to note that low-energy CID-MS2 fragmentation of glycopeptides usually does not produce fragment ions that relate to the linkage of the attached monosaccharides
  11. Therefore, validation of the inferred O-glycan structures using dedicated O-glycomics approaches, including for instance (reductive) beta-elimination or hydrazinolysis, is recommended
  12. However, our findings are in good agreement with literature, as mono- and disialylated mucin-type core-1 O-glycans are known to be present on the majority of secreted blood plasma glycoproteins produced by hepatic cells of healthy individuals (79)
  13. Notably, a study on plasma-derived von Willebrand factor showed that, apart from mucin-type core-1 O-glycans (T-antigen), more complex O-glycan structures, including ABH blood group antigen-containing mucin-type core-2 O-glycans ([GlcNAcβ1–6-(Galβ1–3)-GalNAcα-O-Ser/Thr]), can also be present on human blood plasma glycoproteins (80)
  14. In the present work, analyzing the total human blood plasma O-glycoproteome, we could not detect any (glyco)peptide derived from von Willebrand factor, nor could we find any indication for the presence of fucosylated (ABH blood group antigens) and/or LacNAc extended mucin-type core-2 O-glycans
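The two diagnostic oxonium ions quoted in this section (m/z 454.16 for Hex1NeuAc1 and m/z 495.18 for HexNAc1NeuAc1) follow directly from the monosaccharide residue masses. The following is a minimal sketch, assuming singly protonated ions and standard monoisotopic residue masses; the function and variable names are illustrative and not part of the authors' workflow.

```python
# Monoisotopic residue masses (Da) of the monosaccharides discussed here.
RESIDUE_MASS = {
    "Hex": 162.052824,     # hexose, e.g. Gal
    "HexNAc": 203.079373,  # N-acetylhexosamine, e.g. GalNAc
    "NeuAc": 291.095417,   # N-acetylneuraminic acid
}
PROTON = 1.007276


def oxonium_mz(composition: dict[str, int]) -> float:
    """m/z of the singly charged oxonium ion for a glycan composition."""
    residue_sum = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residue_sum + PROTON


print(round(oxonium_mz({"Hex": 1, "NeuAc": 1}), 2))     # 454.16
print(round(oxonium_mz({"HexNAc": 1, "NeuAc": 1}), 2))  # 495.18
```

The HexNAc1NeuAc1 ion requires a sialic acid attached to the GalNAc, which is why it points to the disialylated rather than the monosialylated T-antigen.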
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Glycopeptide Identification

Content :
  1. Low-energy CID-MS2 fragmentation of glycopeptides, as employed in the present work, almost exclusively generates fragment ions corresponding to the fragmentation of the glycan moiety, while leaving the peptide backbone mainly intact
  2. Thus, this type of fragmentation usually does not provide any information on the sequence of the peptide backbone or on the occupied glycosylation site
  3. To identify the peptide we have employed manual CID-MS3 fragmentation on the putative peptide mass, which had previously been inferred from the annotation of the corresponding CID-MS2 spectra
  4. In a few cases the signal of the putative peptide mass was too low to yield sufficient fragment ions
  5. Consequently, the putative peptide+HexNAc ion was subjected to CID-MS3 fragmentation instead
  6. We did not employ an automated CID-MS3 fragmentation procedure, e.g. fragmentation of the three most intense precursor ions in the CID-MS2 spectrum, because we wanted to generate and sum up as many fragment spectra as possible from the selected putative peptide mass, in order to increase spectra quality and therefore the chance of successful peptide identification
  7. By searching the acquired CID-MS3 fragment spectra against the human subset of the UniProtKB/Swiss-Prot protein database, a total of 60 peptides (of 88 detected O-glycopeptides) could be identified unambiguously
  8. Notably, in a few cases also peptide fragment ions present in CID-MS2 spectra allowed for an unambiguous peptide identification (supplemental Fig
  9. S4, 267AVPPVVDPDAPPSPPL283, m/z 872.73 (3+))
  10. Overall, the identified peptides belong to 22 different proteins, primarily acute phase proteins
  11. This constantly growing group of blood plasma proteins fulfills essential functions during inflammation (e.g. coagulation, anti-inflammatory and anti-pathogenic activity), and, accordingly, their expression is known to be either significantly up- or downregulated (positive and negative acute phase proteins) in this context
  12. As a result, this group of proteins attracted a lot of attention as potential cancer biomarkers in recent years (5)
  13. Notably, the identified proteins span a concentration range of 5 orders of magnitude
  14. Therefore, the applied approach seems to be suitable to also detect lower abundant proteins or peptides
  15. Coagulation factors are a group of O-glycosylated proteins that have frequently been identified in other large-scale glycoproteomic studies (30, 58, 60, 77)
  16. In our study there is an indication for the presence of an O-glycosylated peptide derived from Coagulation factor V (HILIC fraction #15, m/z 761.78 (2+), 1453QIPPPDL1460+HexNAc1Hex1NeuAc1; Table II, supplemental Fig
  17. S5)
  18. Interestingly, the detected Coagulation factor V O-glycosylation site (Ser1455) has not been described so far
  19. Unfortunately, our data do not allow an unambiguous identification of this protein
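The inference of the putative peptide mass from an annotated CID-MS2 spectrum amounts to subtracting the glycan residue masses from the neutral precursor mass. The sketch below illustrates this arithmetic with the precursor reported in this section, read here as m/z 761.78 with charge 2+ and the composition HexNAc1Hex1NeuAc1. The function name is hypothetical, and the calculation assumes protonated ions and monoisotopic masses; it is not the authors' software.

```python
# Monoisotopic residue masses (Da).
RESIDUE_MASS = {"Hex": 162.052824, "HexNAc": 203.079373, "NeuAc": 291.095417}
PROTON = 1.007276


def fragment_mz_after_glycan_loss(
    precursor_mz: float,
    precursor_charge: int,
    lost_glycan: dict[str, int],
    fragment_charge: int = 1,
) -> float:
    """m/z of the ion that remains after neutral loss of `lost_glycan`."""
    neutral_precursor = precursor_charge * (precursor_mz - PROTON)
    lost = sum(RESIDUE_MASS[name] * n for name, n in lost_glycan.items())
    neutral_fragment = neutral_precursor - lost
    return (neutral_fragment + fragment_charge * PROTON) / fragment_charge


# Loss of the complete glycan gives the putative bare peptide ion.
peptide_mz = fragment_mz_after_glycan_loss(
    761.78, 2, {"HexNAc": 1, "Hex": 1, "NeuAc": 1}
)
# Loss of Hex and NeuAc only gives the peptide+HexNAc ion.
peptide_hexnac_mz = fragment_mz_after_glycan_loss(761.78, 2, {"Hex": 1, "NeuAc": 1})

print(round(peptide_mz, 2), round(peptide_hexnac_mz, 2))
```

Either value could serve as the target for CID-MS3 isolation, the second one being the fallback described above for cases where the bare peptide signal is too weak.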
*Output_Site_Fusion* (sent_index, protein, sugar, site):
  • 18. Coagulation factor V O-glycosylation, -, Ser1455
Section : General Remarks on Immunoglobulin O-glycoproteomics

Content :
  1. Another O-glycosylated protein that could not be identified in our study is Ig α-1 (IgA1)
  2. IgA1 is a highly abundant human blood plasma glycoprotein that features a cluster of three to five mucin-type O-glycans in the hinge region of the heavy chain (81)
  3. This cluster harbors many prolines, hence corresponding Proteinase K generated peptides might not have been identified unambiguously (the tryptic IgA1 hinge region O-glycopeptide looks as follows: (K)89HYTNPSQDVTVPCPVPTPPTPPTPPTPPCCHPR126)
  4. Furthermore, because of the densely clustered O-glycans a potential IgA1 O-glycopeptide carrying mucin-type O-glycans at each potential site, such as PTPPTPPTPPTPPCC, might be too hydrophilic and consequently might have been among the (glyco)peptides present in the late eluting HILIC wash fraction
  5. It is worth mentioning that in our study we could detect the IgA1 peptide 95QDVTVPCPVP105 in its nonglycosylated form (HILIC Fraction #11, CID, supplemental Table S2)
  6. Therefore, the O-glycosylation site S105 seems to be only partially occupied
  7. Surprisingly, human IgA1 O-glycopeptides have not been identified in any other large-scale glycoproteomic studies (30, 58, 60, 76, 77, 82)
  8. However, there is a targeted glycoproteomic study from Takahashi et al. focusing on IgA1 O-glycosylation (81)
  9. In this study the authors analyzed human plasma derived IgA1 O-glycopeptides (tryptic and nontryptic) with ESI-FT-ICR-MS/MS as well as ESI-LTQ-FT-MS/MS, both in online and offline mode
  10. To pinpoint the O-glycosylation sites the authors have employed activated ion-electron capture dissociation (AI-ECD) and ETD
  11. Another immunoglobulin that is reported to carry mucin-type O-glycans in the hinge region is Ig delta (IgD) (83)
  12. The plasma concentration of IgD is much lower than the concentration of IgA, IgG, and IgM but higher than that of IgE (IgD represents 0.25% of total plasma immunoglobulins)
  13. Apart from the 1982 study by Takayasu et al. (83) on truncated O-glycopeptides (peptide+GalNAc), no O-glycoproteomic data exist at present for intact human IgD O-glycopeptides
  14. Also of particular interest is a recent finding by Plomp et al.: using a targeted glycoproteomics approach these authors could demonstrate, for the first time, that IgG3 is partially O-glycosylated in its hinge region (mucin-type core-1 O-glycans) (84)
*Output_Site_Fusion* (sent_index, protein, sugar, site):
Section : Pinpointing of O-glycosylation Sites

Content :
  1. Pinpointing the correct O-glycosylation sites is a crucial but very challenging task
  2. Proteinase K, in this regard, proved to be beneficial as it can generate short glycopeptides, which exhibit only one potential O-glycosylation site
  3. In case the occupied O-glycosylation site could not be inferred directly, we have employed ETD-MS2 fragmentation
  4. In first attempts database-assisted peptide identification via MASCOT was tested on the acquired ETD glycopeptide spectra, but proved unsuccessful
  5. One reason for this is the presence of intense signals in the ETD-MS2 spectrum, which do not correspond to peptide fragment ions (e.g. unfragmented precursor ions, glycan fragment ions), and which thus compromise automated peptide identification (85)
  6. A possible solution for this is the (manual) removal of these additional m/z-values from the ETD-spectra before running the search algorithm
  7. In the present study, however, this procedure did not improve the database-assisted peptide identification
  8. For these reasons we relied on manual spectra annotation and interpretation using DataAnalysis, Biotools as well as public repositories (UniProtKB and UniCarbKB)
  9. Furthermore, NetOGlyc 4.0 was employed to predict O-glycosylation sites and to support experimental findings
  10. Predicted and experimentally determined O-glycosylation sites were mostly in good agreement for already known O-glycosylation sites; however, support for potentially novel sites could only be found in a few cases
  11. A general shortcoming of glycopeptide enrichment methods is that they are biased toward glycosylated peptides, while underrepresenting potential corresponding aglycosylated counterparts
  12. Hence, in the present study no conclusions with respect to the macro-heterogeneity of the glycoproteins (site-occupancy) can be drawn
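The spectrum clean-up mentioned in this section (removal of precursor and glycan fragment signals from ETD spectra before a database search) can be sketched as a simple peak filter. This is an illustrative sketch only: the function name, the peak list, the exclusion list and the 0.5 m/z tolerance are assumptions and do not reproduce the procedure or settings used in the study.

```python
def remove_peaks(
    peaks: list[tuple[float, float]],
    excluded_mz: list[float],
    tol: float = 0.5,
) -> list[tuple[float, float]]:
    """Drop (m/z, intensity) peaks lying within +/- tol of any excluded m/z."""
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if all(abs(mz - ex) > tol for ex in excluded_mz)
    ]


# Hypothetical ETD peak list as (m/z, intensity) pairs.
peaks = [
    (292.10, 900.0),     # NeuAc oxonium ion
    (454.16, 1500.0),    # Hex1NeuAc1 oxonium ion
    (612.35, 80.0),      # candidate peptide fragment
    (734.26, 20000.0),   # unfragmented precursor
    (1101.52, 60.0),     # candidate peptide fragment
]
excluded = [734.26, 292.10, 454.16]

filtered = remove_peaks(peaks, excluded)
print(filtered)
```

As noted above, such filtering did not improve the database-assisted identification in the present study, so the sketch only illustrates the idea and makes no claim about its benefit.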
*Output_Site_Fusion* (sent_index, protein, sugar, site):

 

 

Protein NCBI ID SENTENCE INDEX