Thursday, March 29, 2007

Escape CSV string in XSLT


To output CSV, a field must be escaped if it contains a double quote, a comma, or a line feed: the field is enclosed in double quotes, and any double quote inside it is doubled. Escaping a CSV string in XSLT is not a straightforward task. After a little searching I found the following solution, which passed my tests. The code is from a thread I found here: http://www.cygwin.com/ml/xsl-list/2001-03/msg00794.html



<!-- Sample of calling the escape template -->
<xsl:template match="*">
  <xsl:call-template name="display_csv_field">
    <xsl:with-param name="field" select="myFieldToEscape"/>
  </xsl:call-template>
</xsl:template>

<!-- Template to escape csv field -->
<xsl:template name="display_csv_field">
  <xsl:param name="field"/>

  <xsl:variable name="linefeed">
    <xsl:text>&#10;</xsl:text>
  </xsl:variable>

  <xsl:choose>

    <xsl:when test="contains( $field, '&quot;' )">
      <!-- Field contains a quote. We must enclose this field in quotes,
           and we must escape each of the quotes in the field value.
      -->
      <xsl:text>"</xsl:text>

      <xsl:call-template name="escape_quotes">
        <xsl:with-param name="string" select="$field" />
      </xsl:call-template>

      <xsl:text>"</xsl:text>
    </xsl:when>

    <xsl:when test="contains( $field, ',' ) or contains( $field, $linefeed )">
      <!-- Field contains a comma and/or a linefeed.
           We must enclose this field in quotes.
      -->
      <xsl:text>"</xsl:text>
      <xsl:value-of select="$field" />
      <xsl:text>"</xsl:text>
    </xsl:when>

    <xsl:otherwise>
      <!-- No need to enclose this field in quotes.
      -->
      <xsl:value-of select="$field" />
    </xsl:otherwise>

  </xsl:choose>
</xsl:template>

<!-- Helper for escaping CSV field -->
<xsl:template name="escape_quotes">
  <xsl:param name="string" />

  <xsl:value-of select="substring-before( $string, '&quot;' )" />
  <xsl:text>""</xsl:text>

  <xsl:variable name="substring_after_first_quote"
                select="substring-after( $string, '&quot;' )" />

  <xsl:choose>

    <xsl:when test="not( contains( $substring_after_first_quote, '&quot;' ) )">
      <xsl:value-of select="$substring_after_first_quote" />
    </xsl:when>

    <xsl:otherwise>
      <!-- The substring after the first quote contains a quote.
           So, we call ourselves recursively to escape the quotes
           in the substring after the first quote.
      -->

      <xsl:call-template name="escape_quotes">
        <xsl:with-param name="string" select="$substring_after_first_quote"/>
      </xsl:call-template>
    </xsl:otherwise>

  </xsl:choose>

</xsl:template>

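By the way, remember to set the stylesheet's output method to text. Otherwise the default XML output method will escape characters such as & and < in the CSV and add an XML declaration:

<xsl:output method="text" encoding="iso-8859-1"/>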
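
To show how the pieces fit together, here is a minimal sketch of a complete stylesheet that writes one CSV line per record. It is only an illustration under some assumptions that are not part of the original thread: the two named templates above are saved in a file called csv-escape.xsl (a hypothetical name), and the input has the shape rows/row with one child element per field (also hypothetical).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: display_csv_field and escape_quotes live in
       csv-escape.xsl (hypothetical file name). -->
  <xsl:import href="csv-escape.xsl"/>

  <xsl:output method="text" encoding="iso-8859-1"/>

  <!-- Assumption: the input document looks like
       <rows><row><a>..</a><b>..</b></row>..</rows> -->
  <xsl:template match="/rows">
    <xsl:for-each select="row">
      <xsl:for-each select="*">
        <!-- Put a comma before every field except the first one -->
        <xsl:if test="position() &gt; 1">
          <xsl:text>,</xsl:text>
        </xsl:if>
        <!-- Escape the string value of the current field element -->
        <xsl:call-template name="display_csv_field">
          <xsl:with-param name="field" select="."/>
        </xsl:call-template>
      </xsl:for-each>
      <!-- End each record with a line feed -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a row whose three fields are Smith, John and He said "hi" and 42 should come out as "Smith, John","He said ""hi""",42. Because the quote test runs first and wraps the whole field in quotes, a field that contains both a quote and a comma is covered by the first branch.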

Reference: http://www.cygwin.com/ml/xsl-list/2001-03/msg00794.html

4 comments:

Stef said...

It works great! Thanks for sharing!

Guogang Hu said...

You are welcome

Anonymous said...

This saved me 2 days' work. Thank you, Bob, for originally posting this (http://www.cygwin.com/ml/xsl-list/2001-03/msg00794.html), and Guogang, for making it accessible.

Really appreciate it. Hope to return the favor to other developers in the near future, as you guys have!

Anonymous said...

Thanks for this post.

It really helps a lot.

But when we have a quote and a comma together in one CDATA section, it doesn't work.