Human lysyl hydroxylase isoforms: Multifunctionality of human LH3 and the amino acids important for its collagen glycosyltransferase activities
Collagens constitute a highly specialized family of extracellular matrix proteins. In addition to the maintenance of the architecture of tissues, they also have other important functions, for instance, in early development and organogenesis, and in the regulation of cell behavior (van der Rest & Garrone 1991, Kielty et al. 1993, Kivirikko 1993, Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995). More than twenty collagen types containing altogether at least 38 distinct polypeptide chains have now been identified, and their genes are dispersed among at least fifteen chromosomes. In addition, there are more than fifteen other proteins that have collagen-like domains (Ayad et al. 1998, Myllyharju & Kivirikko 2001).
All collagen molecules are built up from three polypeptide (α) chains, each with a left-handed helical conformation, that are coiled around each other to form a characteristic right-handed collagen triple helix. In addition, they all contain noncollagenous sequences at their termini, and some collagens also have these sequences as interruptions separating adjacent triple-helical regions, which makes the molecules more flexible. All collagen α chains have repeating -Gly-X-Y- sequences. The occurrence of glycine in every third position is an absolute requirement, as glycine is the only residue with a side-chain small enough to fit the restricted space in the center of the triple helix. Proline is frequently in the X-position and 4-hydroxyproline (Hyp) in the Y-position. These residues are required for the correct conformation of the helix because they limit rotation of the polypeptide chains. The hydrophobic and charged side-chains of residues in the X- and Y-positions are located on the surface of the molecule, allowing collagens to polymerize into precisely ordered structures. Hyp is also essential for the thermal stability of the helix (van der Rest & Garrone 1991, Kielty et al. 1993, Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995).
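The requirement for glycine at every third position can be illustrated with a short check. The following is a minimal sketch, not a method taken from the cited literature: the function name `gly_xy_violations` and both example sequences are hypothetical, sequences are written in one-letter amino acid codes, and hydroxyproline is written as `O`, a convention assumed here for illustration.

```python
def gly_xy_violations(seq):
    """Return 0-based start positions of triplets that do not begin with glycine.

    Assumes `seq` starts at the first residue of a Gly-X-Y triplet. A trailing
    incomplete triplet is still checked for its leading residue.
    """
    seq = seq.upper()
    return [i for i in range(0, len(seq), 3) if seq[i] != "G"]


# Hypothetical example sequences (one-letter codes, O = hydroxyproline).
perfect = "GPOGPOGAOGPK"       # glycine at every third position
interrupted = "GPOGPOAPOGPK"   # Gly replaced by Ala in the third triplet

print(gly_xy_violations(perfect))      # -> []
print(gly_xy_violations(interrupted))  # -> [6]
```

A non-empty result marks the kind of imperfection in the repeat that the nonfibrillar collagens carry in their triple-helical domains. The sketch checks only the glycine position and says nothing about helix stability.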
The collagen superfamily can be divided into two major classes, fibrillar collagens and nonfibrillar collagens, based on their supramolecular assemblies and other features. There is also a group of proteins, referred to as noncollagen proteins, that have collagen-like domains but are not defined as collagens.
Collagen types I, II and III are the classical fibril-forming collagens and account for 80-90% of all collagens in the human body. Type V and XI collagens are also classified as fibrillar collagens on the basis of their homology with type I-III collagens (Ayad et al. 1998). The fibrillar collagens share the same overall molecular structure. They all comprise a large triple-helical domain of about 1,000 amino acids, a highly conserved noncollagenous C-terminus and a variable noncollagenous N-terminus. By forming highly ordered quarter-staggered structures, these collagens provide the major mechanical strength for the body in the skeleton, skin, blood vessels, nerves, intestines, and in the fibrous capsules of organs (Vuorio & Crombrugghe 1990, Pihlajaniemi & Rehn 1995). Research suggests that these major collagens can exist as heterotypic fibrils, with types I and III forming copolymers, and with type V and XI collagens copolymerized largely on the inside of type I and II collagen fibrils, respectively (Ayad et al. 1998).
Collagen types IV, VI-X, and XII-XIX do not form fibrils and are therefore defined as nonfibrillar collagens. They exhibit great heterogeneity in structure, tissue distribution, macromolecular organization, and function. One common feature is that they all have one or more imperfections in the collagenous sequences of their triple-helical domains, which vary in length between about 330 and 1,530 amino acid residues, the shortest being in type VI and the longest in type VII. Both the C- and N-terminal noncollagenous ends are highly variable in sequence and length (Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995).
Network-forming collagens include collagen types IV, VIII and X. Type IV collagen is only expressed in basement membranes, where it is the major component. It aggregates to form three-dimensional networks and participates in tissue integrity and filtration. Types VIII and X are very different in structure from type IV but similar to each other. They both form hexagonal networks and belong to the short-chain collagens, which are approximately half the size of the fibrillar collagens (van der Rest & Garrone 1991, Kielty et al. 1993, Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995).
FACITs are fibril-associated collagens with interrupted triple helices and include type IX, XII, XIV, XVI, and XIX collagens. They contain one or two collagenous domains that attach to the surface of preexisting fibrils of fibrillar collagens. Type IX collagen is expressed in cartilage and is usually found covalently linked, in antiparallel orientation, to type II collagen, the major fibrillar collagen of the same tissue. Types XII and XIV are found not only in tissues rich in type I collagen but also in tissues containing type II collagen. Types XVI and XIX show similarities in structure to the FACITs and are therefore classified in this subgroup. Type XVI collagen has a broad tissue distribution, being localized predominantly in heart, kidney, smooth muscles, intestine, ovary, testis, eye, and arterial walls. Type XIX is expressed in human rhabdomyosarcoma and fibroblast cell lines (Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995, Ayad et al. 1998).
Collagen type VI, present in most connective tissues, is the only collagen known to form beaded filaments. The protein contains a rather short, unique triple-helical domain that accounts for less than half of the total mass of the protein. Another feature is the assembly of collagen type VI monomers into well-defined oligomers, which are the building blocks of the microfibrils found in tissues and cell cultures. These microfibrils are the best-characterized microfibrillar structures in the extracellular matrix (Mayne & Burgeson 1987, van der Rest & Garrone 1991, Prockop & Kivirikko 1995).
Collagen type VII, one of the largest known collagens, forms anchoring fibrils that link the basal surface of epithelial cells with the underlying dermis. This network may provide a special trans-basement-membrane route for the transmission of information from the dermis to the epithelial cells or vice versa (Mayne & Burgeson 1987).
MACITs are the membrane-associated collagens with interrupted triple helices, including collagen types XIII and XVII (Pihlajaniemi & Rehn 1995). Type XIII collagen is found in many tissues and is a transmembrane component of focal adhesion sites (Hägg et al. 2001). Type XVII collagen is located in hemidesmosomes connecting epithelial cells to the matrix in skin, cornea and lung. It has been identified as the autoantigen associated with the blistering skin disease bullous pemphigoid as well as with bullous diseases of other epithelia, including cornea and mucous membranes (Giudice et al. 1991, Tajima & Tokimitsu 1995, Gordon et al. 1997, Aho et al. 1999, Aho & Uitto 1999, Michelson et al. 2000).
Collagen types XV and XVIII both have a highly interrupted triple helix together with large globular domains at both the N- and C-termini. They contain several potential attachment sites for serine-linked glycosaminoglycans and asparagine-linked oligosaccharides. They are widely expressed in the basement membrane zones of most tissues, but type XVIII collagen is found at a much higher level in the liver (Pihlajaniemi & Rehn 1995, Prockop & Kivirikko 1995). Type XV collagen is a structural component of the extracellular matrix needed for stabilizing skeletal muscle cells and microvessels (Eklund et al. 2001), whereas type XVIII collagen is needed for the normal development of the eye (Fukai et al. 2002). Endostatin, the C-terminal proteolytic fragment of type XVIII collagen, has been reported to be an endogenous inhibitor of angiogenesis and tumor growth (O’Reilly et al. 1997).
Complete cDNA sequences for four additional collagen polypeptide chains are currently available, expanding the collagen superfamily to over 20 members. They encode a fibril-forming collagen-like chain, two FACIT collagen-like chains, and a type XIII collagen-like chain (Koch et al. 2001, Myllyharju & Kivirikko 2001). Furthermore, there are at least fifteen other proteins that contain triple-helical collagenous domains but are not defined as collagens. These include the subcomponent C1q of complement, the tail structure of acetylcholinesterase, the pulmonary surfactant proteins SP-A and SP-D, mannan-binding protein, collectin-43, conglutinin, the ficolins, type I and II macrophage scavenger receptors, MARCO protein, an adipose-specific collagen-like factor apM1, a src-homologous-and-collagen (SHC) protein, aggretin and ectodysplasin (Beck & Brodsky 1998, Kivirikko & Pihlajaniemi 1998, Chung et al. 1999, Ezer et al. 1999, Kraal et al. 2000, Myllyharju & Kivirikko 2001).
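The subgroups of collagen types I to XIX described above can be summarized as a simple lookup table. This is only an illustrative sketch: the names `COLLAGEN_SUBGROUPS`, `subgroup_of` and `is_fibrillar` are hypothetical, and the dictionary keys are labels chosen for this example rather than official nomenclature. In particular, the text does not give a group name for types XV and XVIII, so a descriptive label is used for them.

```python
# Subgroup membership as described in the text; keys are labels chosen for this sketch.
COLLAGEN_SUBGROUPS = {
    "fibril-forming": ["I", "II", "III", "V", "XI"],
    "network-forming": ["IV", "VIII", "X"],
    "FACIT": ["IX", "XII", "XIV", "XVI", "XIX"],
    "beaded filament-forming": ["VI"],
    "anchoring fibril-forming": ["VII"],
    "MACIT": ["XIII", "XVII"],
    # No group name is given in the text for these two types (assumed label).
    "XV/XVIII (highly interrupted helix)": ["XV", "XVIII"],
}


def subgroup_of(collagen_type):
    """Return the subgroup label for a Roman-numeral collagen type, or None."""
    for label, members in COLLAGEN_SUBGROUPS.items():
        if collagen_type.upper() in members:
            return label
    return None


def is_fibrillar(collagen_type):
    """True for the types the text classifies as fibrillar collagens."""
    return subgroup_of(collagen_type) == "fibril-forming"


print(subgroup_of("IX"))   # -> FACIT
print(is_fibrillar("V"))   # -> True
```

Every subgroup other than "fibril-forming" falls into the nonfibrillar class, which together covers types IV, VI-X and XII-XIX.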
Collagen biosynthesis is a multistep process that starts with the transcription and translation of the individual collagen genes (Kivirikko & Myllylä 1984, Kielty et al. 1993, Kivirikko 1993). It is characterized by a large number of co- and post-translational modifications, many of which are unique to collagens or collagen-like proteins (Kivirikko & Myllylä 1982, Kivirikko & Myllylä 1984, Kielty et al. 1993, Kivirikko 1993, Prockop & Kivirikko 1995).
The fibril-forming collagens are synthesized as procollagens on the ribosomes of the rough ER (see Figure 1). The intracellular modifications occur when the procollagens are translocated across the ER membrane into the lumen. These modifications include the removal of signal peptides; hydroxylation of prolyl and lysyl residues to 4-Hyp, 3-Hyp, and Hyl residues; glycosylation of certain hydroxylysyl residues to galactosylhydroxylysyl and glucosylgalactosylhydroxylysyl residues; addition of a mannose-rich oligosaccharide to one or both of the propeptides; chain association; disulfide bonding; and formation of a triple helix (Table 1; for reviews see Kivirikko & Myllylä 1982, Kivirikko & Myllylä 1984, Kielty et al. 1993, Ayad et al. 1998). The mechanism of procollagen secretion is poorly understood, but it is known that procollagen follows the classical secretion route for extracellular proteins, passing through the Golgi complex to the extracellular space (Kielty et al. 1993). Extracellular modifications consist of removal of large peptides from both the N- and C-termini of the procollagen, ordered aggregation, and crosslink formation. These events convert the procollagens to collagens and incorporate the collagen molecules into stable cross-linked fibrils or other supramolecular aggregates (Kivirikko & Myllylä 1984, Kielty et al. 1993, Kivirikko 1993, Prockop & Kivirikko 1995, Myllyharju & Kivirikko 2001).
The processing and assembly of other collagens basically follow the same steps as for fibrillar collagens but with some exceptions. For example, the N- and/or C-terminal propeptides of many collagens are not cleaved; some collagens undergo N-glycosylation or have additional processing steps such as the addition of glycosaminoglycan side chains (Prockop & Kivirikko 1995, Myllyharju & Kivirikko 2001).
Figure 1. Biosynthesis of a fibrillar collagen. Procollagen polypeptide chains are synthesized on the ribosomes of the rough ER and secreted into the lumen, where the chains are modified by hydroxylation of certain prolyl and lysyl residues and glycosylation before chain association and triple helix formation. The procollagen molecules are secreted into the extracellular space where the N and C propeptides are cleaved by specific proteases. The collagen molecules then assemble into fibrils that are stabilized by the formation of crosslinks (Modified from Myllyharju and Kivirikko 2001 with permission of Taylor & Francis AB).
Table 1. General steps in collagen biosynthesis and the post-translational processing enzymes
|Biosynthetic step|Enzyme|Biological significance|
|---|---|---|
|A. Transcription and translation|||
|1. Biosynthesis of pre-mRNA||Formation of translatable mRNA|
|2. Translation||Formation of primary structure|
|B. Intracellular modifications|||
|1. Cleavage of signal peptide of pre-proα chain|Signal peptidase|Membrane translocation|
|2. 4-Hydroxylation of proline|Prolyl 4-hydroxylase|Essential for triple helix at 37°C|
|3. 3-Hydroxylation of proline|Prolyl 3-hydroxylase|Unknown|
|4. Hydroxylation of lysine|Lysyl hydroxylase|Essential for glycosylation of hydroxylysine and crosslinks|
|5. O-glycosylation of hydroxylysine|GT|Unknown|
|6. O-glycosylation of galactosylhydroxylysine|GGT|Unknown|
|7. Chain association and disulfide bonding|Protein disulfide isomerase|Essential for triple helix formation|
|8. Triple helix formation||Essential for secretion and later molecular function|
|9. Translocation and secretion of procollagen||Transport|
|C. Extracellular modifications|||
|1. Conversion of procollagen to collagen|Procollagen N-proteinase|Essential for normal fibril formation|
|2. Ordered aggregation||Formation of fibrils or other native structures|
|3. Crosslink formation|Lysyl oxidase|Essential for stability of native structure|

Modified from the tables presented by Kivirikko and Myllylä 1984 and Kielty and coauthors 1993 with permissions from Elsevier Science and Wiley-Liss, Inc., respectively.
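The ordered steps of Table 1 can also be held as a small list of records, which makes it easy to ask, for example, which modifications have an unknown biological significance. This is a minimal sketch for illustration only: the names `Step` and `STEPS` are hypothetical, the entries are transcribed from Table 1, and `None` marks steps for which the table lists no enzyme.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    phase: str               # "A", "B" or "C", as in Table 1
    name: str
    enzyme: Optional[str]    # None where Table 1 lists no enzyme
    significance: str


# Entries transcribed from Table 1, in biosynthetic order.
STEPS = [
    Step("A", "Biosynthesis of pre-mRNA", None, "Formation of translatable mRNA"),
    Step("A", "Translation", None, "Formation of primary structure"),
    Step("B", "Cleavage of signal peptide", "Signal peptidase", "Membrane translocation"),
    Step("B", "4-Hydroxylation of proline", "Prolyl 4-hydroxylase", "Essential for triple helix at 37°C"),
    Step("B", "3-Hydroxylation of proline", "Prolyl 3-hydroxylase", "Unknown"),
    Step("B", "Hydroxylation of lysine", "Lysyl hydroxylase",
         "Essential for glycosylation of hydroxylysine and crosslinks"),
    Step("B", "O-glycosylation of hydroxylysine", "GT", "Unknown"),
    Step("B", "O-glycosylation of galactosylhydroxylysine", "GGT", "Unknown"),
    Step("B", "Chain association and disulfide bonding", "Protein disulfide isomerase",
         "Essential for triple helix formation"),
    Step("B", "Triple helix formation", None, "Essential for secretion and later molecular function"),
    Step("B", "Translocation and secretion of procollagen", None, "Transport"),
    Step("C", "Conversion of procollagen to collagen", "Procollagen N-proteinase",
         "Essential for normal fibril formation"),
    Step("C", "Ordered aggregation", None, "Formation of fibrils or other native structures"),
    Step("C", "Crosslink formation", "Lysyl oxidase", "Essential for stability of native structure"),
]

# Steps whose biological significance is listed as unknown in Table 1.
unknown = [s.name for s in STEPS if s.significance == "Unknown"]
print(unknown)

# Intracellular (phase B) steps with a named enzyme.
print([s.enzyme for s in STEPS if s.phase == "B" and s.enzyme])
```

The first query returns the 3-hydroxylation of proline and the two hydroxylysine glycosylation steps, the latter two being the reactions catalyzed by the GT and GGT activities listed in the table.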