Cross-Site Scripting (XSS) is a misnomer. Originally this term was derived from early versions of the attack that were primarily focused on stealing data cross-site. Since then, the term has widened to include injection of basically any content. XSS attacks are serious and can lead to account impersonation, observing user behaviour, loading external content, stealing sensitive data, and more.
This cheatsheet contains techniques to prevent or limit the impact of XSS. Since no single technique will solve XSS, you will need the right combination of defensive techniques.
Framework Security
Fortunately, applications built with modern web frameworks have fewer XSS bugs, because these frameworks steer developers towards good security practices and help mitigate XSS by using templating, auto-escaping, and more. However, developers need to know that problems can occur if frameworks are used insecurely, such as:
- escape hatches that frameworks use to directly manipulate the DOM
- React’s dangerouslySetInnerHTML without sanitising the HTML
- React cannot handle javascript: or data: URLs without specialized validation
- Angular’s bypassSecurityTrustAs* functions
- Template injection
- Out of date framework plugins or components
- and more
When you use a modern web framework, you need to know how your framework prevents XSS and where it has gaps. There will be times where you need to do something outside the protection provided by your framework, which means that Output Encoding and HTML Sanitization can be critical. OWASP will be producing framework specific cheatsheets for React, Vue, and Angular.
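As one illustration of covering a framework gap, the javascript: and data: URL problem mentioned in the list above can be narrowed with a scheme allowlist before a URL is handed to the framework. The following is a minimal sketch, not an official OWASP or React API: the helper name isSafeHttpUrl is hypothetical, and the assumption is that only absolute http and https URLs are acceptable.

```javascript
// Hypothetical helper (not part of any framework): accept only absolute
// http(s) URLs and reject everything else, including javascript: and data:.
function isSafeHttpUrl(candidate) {
  let parsed;
  try {
    // The URL constructor throws on relative or malformed input.
    parsed = new URL(String(candidate));
  } catch {
    return false;
  }
  // The parser lowercases the scheme, so "JavaScript:" is caught as well.
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}
```

A caller would check isSafeHttpUrl(userUrl) before binding the value to an href or src, and fall back to a fixed safe value otherwise.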
XSS Defense Philosophy
For an XSS attack to be successful, an attacker must be able to insert and execute malicious content in a webpage. Thus, all variables in a web application need to be protected. Ensuring that all variables go through validation and are then escaped or sanitized is known as perfect injection resistance. Any variable that does not go through this process is a potential weakness. Frameworks make it easy to ensure variables are correctly validated and escaped or sanitised.
However, no framework is perfect and security gaps still exist in popular frameworks like React and Angular. Output encoding and HTML sanitization help address those gaps.
Output Encoding
When you need to safely display data exactly as a user types it in, output encoding is recommended. Variables should not be interpreted as code instead of text. This section covers each form of output encoding, where to use it, and when you should not use dynamic variables at all.
First, when you wish to display data as the user typed it in, start with your framework’s default output encoding protection. Automatic encoding and escaping functions are built into most frameworks.
If you’re not using a framework or need to cover gaps in the framework then you should use an output encoding library. Each variable used in the user interface should be passed through an output encoding function. A list of output encoding libraries is included in the appendix.
There are many different output encoding methods because browsers parse HTML, JS, URLs, and CSS differently. Using the wrong encoding method may introduce weaknesses or harm the functionality of your application.
Output Encoding for “HTML Contexts”
“HTML Context” refers to inserting a variable between two basic HTML tags like a <div> or <b>. For example:
<div> $varUnsafe </div>
An attacker could modify data that is rendered as $varUnsafe. This could lead to an attack being added to a webpage. For example:
<div> <script>alert`1`</script> </div> // Example Attack
To add a variable to an HTML context safely in a web template, use HTML entity encoding for that variable.
Here are some examples of encoded values for specific characters:
& &amp;
< &lt;
> &gt;
" &quot;
' &#x27;
If you’re using JavaScript for writing to HTML, look at the .textContent property. It is a Safe Sink: the browser treats the assigned value as text, so any markup in it is displayed rather than parsed.
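The entity mapping above can be sketched as a small JavaScript function. This is a minimal illustration for cases where no framework or encoding library is available; the name encodeForHtml is an assumption, and a maintained encoding library remains preferable in production.

```javascript
// Characters that can change the meaning of markup in an HTML context,
// mapped to the entity forms listed above.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
};

// Hypothetical helper: HTML entity encode a value for use between tags.
function encodeForHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}
```

With this sketch, the example attack string would be rendered as visible text instead of being parsed as a script element.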
Output Encoding for “HTML Attribute Contexts”
“HTML Attribute Contexts” occur when a variable is placed in an HTML attribute value. You may want to do this to change a hyperlink, hide an element, add alt-text for an image, or change inline CSS styles. You should apply HTML attribute encoding to variables being placed in most HTML attributes. A list of safe HTML attributes is provided in the Safe Sinks section.
<div attr="$varUnsafe">
<div attr="x" onblur="alert(1)"> // Example Attack
It’s critical to use quotation marks like " or ' to surround your variables. Quoting makes it difficult to change the context a variable operates in, which helps prevent XSS. Quoting also significantly reduces the character set that you need to encode, making your application more reliable and the encoding easier to implement.
If you’re writing to an HTML attribute with JavaScript, look at the .setAttribute and [attribute] methods because they will automatically HTML Attribute Encode. Those are Safe Sinks as long as the attribute name is hardcoded and innocuous, like id or class. Generally, attributes that accept JavaScript, such as onClick, are NOT safe to use with untrusted attribute values.
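For server-side or string-built templates, HTML attribute encoding might look like the sketch below. It follows the &#xHH; scheme this cheatsheet describes (alphanumerics untouched, everything else encoded); the function name encodeForHtmlAttribute is hypothetical and the output is still meant to be placed inside a quoted attribute value.

```javascript
// Hypothetical helper: encode every non-alphanumeric character as &#xHH;
// so the value cannot close the surrounding quotes or start a new attribute.
function encodeForHtmlAttribute(value) {
  // Array.from iterates by code point, so characters outside the
  // Basic Multilingual Plane are encoded as a single entity.
  return Array.from(String(value), (ch) =>
    /^[A-Za-z0-9]$/.test(ch)
      ? ch
      : `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  ).join("");
}
```

Applied to the attack payload from the example above, the quotes, spaces, and parentheses all become entities, so the browser sees one attribute value rather than an injected onblur handler.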
Output Encoding for “JavaScript Contexts”
“JavaScript Contexts” refers to the situation where variables are placed into inline JavaScript and then embedded in an HTML document. This situation commonly occurs in programs that heavily use custom JavaScript that is embedded in their web pages.
However, the only ‘safe’ location for placing variables in JavaScript is inside a “quoted data value”. All other contexts are unsafe and you should not place variable data in them.
Examples of “Quoted Data Values”
<script>alert('$varUnsafe')</script>
<script>x='$varUnsafe'</script>
<div onmouseover="'$varUnsafe'"></div>
Encode all characters using the \xHH format. Encoding libraries often have an EncodeForJavaScript function or similar to support this.
Please look at the OWASP Java Encoder JavaScript encoding examples for examples of proper JavaScript use that requires minimal encoding.
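The \xHH approach can be sketched as follows. This is an illustrative helper, not the OWASP Java Encoder: the name encodeForJavaScript is assumed, alphanumerics are left as they are, and code units above 0xFF fall back to the \uXXXX form because \xHH only covers two hex digits.

```javascript
// Hypothetical helper: encode a value for use inside a quoted JavaScript
// string literal. Only [A-Za-z0-9] passes through unchanged.
function encodeForJavaScript(value) {
  const input = String(value);
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    const code = input.charCodeAt(i); // UTF-16 code unit
    if (/[A-Za-z0-9]/.test(ch)) {
      out += ch;
    } else if (code < 0x100) {
      out += "\\x" + code.toString(16).toUpperCase().padStart(2, "0");
    } else {
      out += "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}
```

Because quotes, angle brackets, and slashes are all encoded, a payload cannot terminate the string literal or the enclosing script element. The result is only safe inside a quoted data value, as described above.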
For JSON, verify that the Content-Type header is application/json and not text/html to prevent XSS.
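A minimal sketch of setting that header explicitly is shown below. It assumes a response object with the setHeader and end methods of Node's http.ServerResponse; the helper name sendJson is hypothetical.

```javascript
// Hypothetical helper: always label JSON responses as application/json so a
// browser does not interpret the body as an HTML document.
function sendJson(res, payload) {
  res.setHeader("Content-Type", "application/json; charset=utf-8");
  res.end(JSON.stringify(payload));
}
```

Routing every JSON response through one helper like this makes it harder for a handler to fall back to a text/html default by accident.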
Output Encoding for “CSS Contexts”
“CSS Contexts” refer to variables placed into inline CSS, which is common when developers want their users to customize the look and feel of their webpages. Since CSS is surprisingly powerful, it has been used for many types of attacks. Variables should only be placed in a CSS property value. Other “CSS Contexts” are unsafe and you should not place variable data in them.
<style> selector { property : $varUnsafe; } </style>
<style> selector { property : "$varUnsafe"; } </style>
<span style="property : $varUnsafe">Oh no</span>
If you’re using JavaScript to change a CSS property, look into using style.property = x. This is a Safe Sink and will automatically CSS encode data in it.
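Where a user-supplied value must end up in a CSS property value, strict structural validation can complement encoding. The sketch below is one possible approach under stated assumptions: the function name validateCssValue, the two supported properties, and the patterns are all illustrative choices, not an OWASP-defined list.

```javascript
// Hypothetical allowlist: each supported property maps to a strict pattern
// describing the only shapes of value that are accepted.
const CSS_VALUE_PATTERNS = new Map([
  ["color", /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/],
  ["width", /^\d{1,4}(?:px|em|rem|%)$/],
]);

// Returns the value when it matches the pattern for the property,
// or null when the property is unsupported or the value is malformed.
function validateCssValue(property, value) {
  const pattern = CSS_VALUE_PATTERNS.get(property);
  if (!pattern) {
    return null;
  }
  const text = String(value);
  return pattern.test(text) ? text : null;
}
```

A null result would be treated as "use the default style", so nothing unvalidated reaches the stylesheet.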
Output Encoding for “URL Contexts”
“URL Contexts” refer to variables placed into a URL. Most commonly, a developer will add a parameter or URL fragment to a URL base that is then displayed or used in some operation. Use URL Encoding for these scenarios.
<a href="http://www.owasp.org?test=$varUnsafe">link</a>
Encode all characters with the %HH encoding format. Make sure any attributes are fully quoted, same as JS and CSS.
Common Mistake
There will be situations where you use a URL in different contexts. The most common one would be adding it to the href attribute of an <a> tag or to a src attribute. In these scenarios, you should do URL encoding, followed by HTML attribute encoding.
url = "https://site.com?data=" + urlencode(parameter)
<a href='attributeEncode(url)'>link</a>
If you’re using JavaScript to construct a URL Query Value, look into using window.encodeURIComponent(x). This is a Safe Sink and will automatically URL encode data in it.
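The two-step order described above (URL encode the parameter, then attribute encode the whole URL) can be sketched in JavaScript as follows. The helper names are hypothetical, the base URL is the one from the pseudocode above, and the attribute encoder is a small stand-in included only to keep the example self-contained.

```javascript
// Stand-in attribute encoder: every non-alphanumeric becomes &#xHH;.
function encodeForHtmlAttribute(value) {
  return Array.from(String(value), (ch) =>
    /^[A-Za-z0-9]$/.test(ch)
      ? ch
      : `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  ).join("");
}

// Hypothetical helper: step 1 URL encodes the untrusted parameter,
// step 2 attribute encodes the finished URL for the quoted href.
function buildSearchLink(parameter) {
  const url = "https://site.com?data=" + encodeURIComponent(parameter);
  return `<a href="${encodeForHtmlAttribute(url)}">link</a>`;
}
```

The browser decodes the entities back to the URL when it parses the attribute, so the link still works while the parameter cannot break out of either context.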
Dangerous Contexts
Output encoding is not perfect. It will not always prevent XSS. These locations are known as dangerous contexts. Dangerous contexts include:
<script>Directly in a script</script>
<!-- Inside an HTML comment -->
<style>Directly in CSS</style>
<div ToDefineAnAttribute=test />
<ToDefineATag href="/test" />
Other areas to be careful with include:
- Callback functions
- Where URLs are handled in code such as this CSS { background-url : "javascript:alert(xss)"; }
- All JavaScript event handlers (onclick(), onerror(), onmouseover()).
- Unsafe JS functions like eval(), setInterval(), setTimeout()
Don’t place variables into dangerous contexts: even with output encoding, an XSS attack cannot be fully prevented there.
HTML Sanitization
When users need to author HTML, developers may let users change the styling or structure of content inside a WYSIWYG editor. Output encoding in this case will prevent XSS, but it will break the intended functionality of the application. The styling will not be rendered. In these cases, HTML Sanitization should be used.
HTML Sanitization will strip dangerous HTML from a variable and return a safe string of HTML. OWASP recommends DOMPurify for HTML Sanitization.
let clean = DOMPurify.sanitize(dirty);
There are some further things to consider:
- If you sanitize content and then modify it afterwards, you can easily void your security efforts.
- If you sanitize content and then send it to a library for use, check that it doesn’t mutate that string somehow. Otherwise, again, your security efforts are void.
- You must regularly patch DOMPurify or other HTML Sanitization libraries that you use. Browsers change functionality and bypasses are being discovered regularly.
Safe Sinks
Security professionals often talk in terms of sources and sinks. If you pollute a river, it’ll flow downstream somewhere. It’s the same with computer security. XSS sinks are places where variables are placed into your webpage.
Thankfully, many sinks where variables can be placed are safe. This is because these sinks treat the variable as text and will never execute it. Try to refactor your code to remove references to unsafe sinks like innerHTML, and instead use textContent or value.
elem.textContent = dangerVariable;
elem.insertAdjacentText('beforeend', dangerVariable);
elem.className = dangerVariable;
elem.setAttribute(safeName, dangerVariable);
formfield.value = dangerVariable;
document.createTextNode(dangerVariable);
document.createElement(dangerVariable);
elem.innerHTML = DOMPurify.sanitize(dangerVar);
Safe HTML Attributes include: align, alink, alt, bgcolor, border, cellpadding, cellspacing, class, color, cols, colspan, coords, dir, face, height, hspace, ismap, lang, marginheight, marginwidth, multiple, nohref, noresize, noshade, nowrap, ref, rel, rev, rows, rowspan, scrolling, shape, span, summary, tabindex, title, usemap, valign, value, vlink, vspace, width.
For a comprehensive list, check out the DOMPurify allowlist
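One way to enforce the "hardcoded and innocuous attribute name" condition is to gate setAttribute behind an allowlist. The sketch below is an assumption-laden illustration: setSafeAttribute is a hypothetical wrapper, the set contains only a few names taken from the list above, and elem is any object exposing a DOM-style setAttribute method.

```javascript
// A small subset of the safe attribute names listed above (illustrative only).
const SAFE_ATTRIBUTES = new Set([
  "align", "alt", "class", "dir", "height", "lang", "title", "width",
]);

// Hypothetical wrapper: refuse any attribute name that is not allowlisted,
// so event handlers (onclick) and URL attributes (href) never get through.
function setSafeAttribute(elem, name, value) {
  const normalized = String(name).toLowerCase();
  if (!SAFE_ATTRIBUTES.has(normalized)) {
    throw new Error(`Refusing to set non-allowlisted attribute: ${name}`);
  }
  elem.setAttribute(normalized, String(value));
}
```

Throwing rather than silently skipping is a design choice here: an unexpected attribute name usually signals a bug or an attack, and both deserve to be noticed.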
Other Controls
Framework Security Protections, Output Encoding, and HTML Sanitization will provide the best protection for your application. OWASP recommends these in all circumstances.
Consider adopting the following controls in addition to the above.
- Cookie Attributes – These change how JavaScript and browsers can interact with cookies. Cookie attributes try to limit the impact of an XSS attack but don’t prevent the execution of malicious content or address the root cause of the vulnerability.
- Content Security Policy – An allowlist that prevents content being loaded. It’s easy to make mistakes with the implementation so it should not be your primary defense mechanism. Use a CSP as an additional layer of defense and have a look at the Content Security Policy Cheat Sheet.
- Web Application Firewalls – These look for known attack strings and block them. WAFs are unreliable and new bypass techniques are being discovered regularly. WAFs also don’t address the root cause of an XSS vulnerability, and they miss a class of XSS vulnerabilities that operate exclusively client-side. WAFs are not recommended for preventing XSS, especially DOM-Based XSS.
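To make the first two controls concrete, the sketch below builds a Set-Cookie value with the HttpOnly, Secure, and SameSite attributes and defines a restrictive starting Content-Security-Policy value. Both are assumptions for illustration: the helper name, the Lax setting, and the specific policy directives would need to be adapted to the application.

```javascript
// Hypothetical helper: HttpOnly hides the cookie from page scripts, Secure
// limits it to HTTPS, and SameSite restricts cross-site sending.
function buildSessionCookie(name, value) {
  return `${name}=${encodeURIComponent(value)}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

// An illustrative starting policy: same-origin resources only, no plugins,
// and no attacker-controlled <base> element.
const CONTENT_SECURITY_POLICY =
  "default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'";
```

These values would be sent as the Set-Cookie and Content-Security-Policy response headers. As noted above, they limit the impact of XSS but do not replace output encoding or sanitization.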
XSS Prevention Rules Summary
These snippets of HTML demonstrate how to render untrusted data safely in a variety of different contexts.
Data Type: String Context: HTML Body Code: <span>UNTRUSTED DATA </span>
Sample Defense: HTML Entity Encoding (rule #1)
Data Type: String Context: Safe HTML Attributes Code: <input type="text" name="fname" value="UNTRUSTED DATA ">
Sample Defense: Aggressive HTML Entity Encoding (rule #2), Only place untrusted data into a list of safe attributes (listed in the Safe Sinks section), Strictly validate unsafe attributes such as background, ID and name.
Data Type: String Context: GET Parameter Code: <a href="/site/search?value=UNTRUSTED DATA ">clickme</a>
Sample Defense: URL Encoding (rule #5).
Data Type: String Context: Untrusted URL in a SRC or HREF attribute Code: <a href="UNTRUSTED URL ">clickme</a> <iframe src="UNTRUSTED URL " />
Sample Defense: Canonicalize input, URL Validation, Safe URL verification, Allow-list HTTP and HTTPS URLs only (Avoid the JavaScript Protocol to Open a new Window), Attribute encoder.
Data Type: String Context: CSS Value Code: <div style="width: UNTRUSTED DATA ;">Selection</div>
Sample Defense: Strict structural validation (rule #4), CSS hex encoding, Good design of CSS features.
Data Type: String Context: JavaScript Variable Code: <script>var currentValue='UNTRUSTED DATA ';</script> <script>someFunction('UNTRUSTED DATA ');</script>
Sample Defense: Ensure JavaScript variables are quoted, JavaScript hex encoding, JavaScript Unicode encoding, avoid backslash encoding (\" or \' or \\).
Data Type: HTML Context: HTML Body Code: <div>UNTRUSTED HTML</div>
Sample Defense: HTML validation (JSoup, AntiSamy, HTML Sanitizer…).
Data Type: String Context: DOM XSS Code: <script>document.write("UNTRUSTED INPUT: " + document.location.hash );</script>
Sample Defense: DOM based XSS Prevention Cheat Sheet
Output Encoding Rules Summary
The purpose of output encoding (as it relates to Cross Site Scripting) is to convert untrusted input into a safe form where the input is displayed as data to the user without executing as code in the browser. The following chart provides a list of critical output encoding methods needed to stop Cross Site Scripting.
Encoding Type: HTML Entity Encoding Encoding Mechanism: Convert & to &amp;, Convert < to &lt;, Convert > to &gt;, Convert " to &quot;, Convert ' to &#x27;
Encoding Type: HTML Attribute Encoding Encoding Mechanism: Encode all characters with the HTML Entity &#xHH; format, including spaces, where HH represents the hexadecimal value of the character in Unicode. For example, A becomes &#x41;. All alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
Encoding Type: URL Encoding Encoding Mechanism: Use standard percent encoding, as specified in the W3C specification, to encode parameter values. Be cautious and only encode parameter values, not the entire URL or path fragments of a URL.
Encoding Type: JavaScript Encoding Encoding Mechanism: Encode all characters using the Unicode \uXXXX encoding format, where XXXX represents the hexadecimal Unicode code point. For example, A becomes \u0041. All alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
Encoding Type: CSS Hex Encoding Encoding Mechanism: CSS encoding supports both \XX and \XXXXXX formats. To ensure proper encoding, consider these options: (a) Add a space after the CSS encode (which will be ignored by the CSS parser), or (b) use the full six-character CSS encoding format by zero-padding the value. For example, A becomes \41 (short format) or \000041 (full format). Alphanumeric characters (letters A to Z, a to z, and digits 0 to 9) remain unencoded.
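Option (b), the zero-padded six-character format, can be sketched as below. The function name encodeForCss is hypothetical; the sketch leaves alphanumerics unencoded as the chart describes and pads every other code point to six hex digits so no trailing space is needed.

```javascript
// Hypothetical helper: CSS hex encode a value using the full \XXXXXX form.
function encodeForCss(value) {
  // Array.from iterates by code point; the largest code point (10FFFF)
  // fits in exactly six hex digits.
  return Array.from(String(value), (ch) =>
    /^[A-Za-z0-9]$/.test(ch)
      ? ch
      : "\\" + ch.codePointAt(0).toString(16).toUpperCase().padStart(6, "0")
  ).join("");
}
```

Using the fixed-width form avoids the ambiguity of the short format, where a following hex-like character could be read as part of the escape.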