Demystifying Content Security Policy
After I built the site, the next step was to check its performance and security. The log, besides being my learning notebook, is also a test bed for my experiments.
What is CSP?
CSP is an HTTP response header that gives fine-grained control over where resources may be loaded from. With a Content-Security-Policy in place, we can block almost all XSS (Cross-Site Scripting) attacks. Read further for why XSS is a problem.
May 6 - GitHub Pages
GitHub Pages does not let us specify HTTP headers. One workaround is to include <meta http-equiv="Content-Security-Policy" content="..."> as the first child of <head>. Netlify, however, lets us set our own response headers, among other goodies, so I skipped ahead. Out of the box, the grade is a D.
May 8 - Netlify (A false hope)
Setting up a site on Netlify from GitHub is trivial. Point to your repository, enter your build command and publish directory. Done.
Our response headers live in a _headers file at the root of the publish directory. Mozilla suggests starting with default-src 'none'; img-src 'self'; script-src 'self'; style-src 'self' for CSP.
Directive - Feature
default-src - default policy for allowed fallback sources
img-src - for images
font-src - for fonts
script-src - for scripts, i.e. JavaScript
style-src - for styles, i.e. CSS
frame-src - for iframes
connect-src - e.g. XMLHttpRequest, WebSocket
object-src - for plugins, e.g. Flash, Silverlight
base-uri - restricts URLs for <base>

'none' - nothing
'self' - same origin
https://example.com/external.js - specific external resource
https: - only HTTPS
'unsafe-hashes' - only code in event handler attributes, e.g. onclick
'unsafe-inline' - only inline blocks
'unsafe-eval' - eval
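To make the directives and sources listed above concrete, here is a minimal Python sketch that joins them into a header value and wraps that value in a Netlify _headers entry. It is an illustration only, not the setup used for this site: the names build_csp and netlify_headers_entry are made up, and the "/*" path assumes the policy should apply to every page.

```python
def build_csp(directives):
    """Join {directive: [sources]} into one Content-Security-Policy value."""
    parts = []
    for name, sources in directives.items():
        # Each directive is its name followed by space-separated sources.
        parts.append(" ".join([name, *sources]))
    return "; ".join(parts)


def netlify_headers_entry(policy, path="/*"):
    """Render a _headers entry: a path line, then an indented header line."""
    return f"{path}\n  Content-Security-Policy: {policy}\n"


# The starting point Mozilla suggests, as quoted earlier.
starter = {
    "default-src": ["'none'"],
    "img-src": ["'self'"],
    "script-src": ["'self'"],
    "style-src": ["'self'"],
}

if __name__ == "__main__":
    print(netlify_headers_entry(build_csp(starter)), end="")
```

Keeping the policy as a mapping makes it easy to loosen or tighten one directive at a time while watching how the grade changes.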
I began with something like what Mozilla suggests, extended to allow CDNs for third-party scripts. Every <script> is then required to have a hash or a nonce. A nonce is a cryptographically secure random token generated per request for a script block, so a static site cannot serve one. Instead, we include sha256 hashes in integrity attributes (SRI, Subresource Integrity) to ensure the scripts have not been tampered with. This is simple with Hugo templates.
<!-- For inline script blocks -->
{{ with (resources.Get "inline.js" | minify | fingerprint) }}
<script integrity="{{ .Data.Integrity }}">{{ .Content | safeJS }}</script>
{{ end }}
<!-- For external scripts -->
{{ $script := resources.Get "external.js" | minify | fingerprint }}
<script src="{{ $script.RelPermalink }}" integrity="{{ $script.Data.Integrity }}"></script>
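For readers who want to see what such an integrity value is made of, here is a small Python sketch that computes one by hand, next to a nonce for contrast. It is not what Hugo runs internally; the helper names integrity and make_nonce and the sample script body are assumptions for illustration.

```python
import base64
import hashlib
import secrets


def integrity(content, algorithm="sha256"):
    """Return a hash string of the form '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, content.encode("utf-8")).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def make_nonce():
    """A fresh random token; it must change on every response, so it needs a server."""
    return secrets.token_urlsafe(16)


if __name__ == "__main__":
    script = "console.log('hello');"  # hypothetical inline script body
    print(integrity(script))
    print(make_nonce())
```

The hash covers the exact bytes of the script, whitespace included, so it has to be computed after minification, which is why the fingerprint step comes last in the templates above.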
The result is an A+. Yet there is a problem: the error log shows that MathJax contains inline scripts and evals, so we are not done yet.
Hugo can highlight code blocks (no highlight.js), preprocess SCSS (via hugo-extended, no node.js), minify resources, and generate hashes during the build. But it can’t yet generate diagrams (mermaid.js) or typeset math (KaTeX, MathJax). Also, client-side search (FlexSearch in my case) requires JavaScript. Thus, we still need some third-party libraries.
May 9 - Two steps back
I had to add 'unsafe-inline' and the domains of the CDNs to restore full functionality, although the errors about eval turned out to be false alarms. The grade dropped to B+.
May 10 - Onward!
I learned about 'strict-dynamic' and collected the integrity hashes of all inline and external scripts. It worked in Chrome but, sadly, caused many problems in Firefox. After hours of debugging and reading bug reports, I realized that, although Firefox has supported it for years, it is unusable for this purpose. Because CSP 3 is still a working draft, hashes for external scripts are not supported there. Still B+.
'strict-dynamic' - lets trusted scripts load additional scripts
A Bittersweet Victory
It seems CSP 2 (the current W3C Recommendation) supports hashes only for inline scripts, so I needed more fine-grained regexps to tell inline and external scripts apart.
I created a git pre-commit hook that updates the hashes whenever I commit my site. It is written in PowerShell, since I am on Windows. The search uses ripgrep: -oIN prints only the matches, without file names or line numbers, and -r rewrites each match, for the hashes by wrapping it in single quotes. The unique results are then joined on a single line and written to a file.
The first regexp matches all integrity strings, the second keeps only those of inline scripts, and the third lists the sources of external scripts.
hugo --minify
(rg -oIN '<script.*?(sha\d{3}-.{43}=)\"' -r '''$1''' public | sort -unique) -join ' ' | out-file -encoding ASCII -noNewline data/script_hash.txt
(rg -oIN '<script.*?(sha\d{3}-.{43}=)\".*?>[^\n<>]+?</script>' -r '''$1''' public | sort -unique) -join ' ' | out-file -encoding ASCII -noNewline data/inline_script_hash.txt
(rg -oIN '<script.*?src=\"?(http.*?\.js)[ \">]' -r '$1' public | sort -unique) -join ' ' | out-file -encoding ASCII -noNewline data/external_script_source.txt
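For anyone not on PowerShell, the same three extractions can be sketched in Python. This is a rough port, not the hook I actually run, and it differs in a few labeled ways: it reads only *.html files under the publish directory, it uses [^>] instead of .*? so a match cannot run from one tag into the next on a minified line, and it sorts case-sensitively. The names collect, read_pages and write_lists are hypothetical.

```python
import re
from pathlib import Path

B64 = r"[A-Za-z0-9+/]{43}="  # base64 of a 32-byte (sha256) digest

# Integrity value of any <script> tag.
ANY_HASH = re.compile(r'<script[^>]*?(sha\d{3}-' + B64 + r')"')
# Integrity value of inline scripts only: the tag must have a non-empty body.
INLINE_HASH = re.compile(
    r'<script[^>]*?(sha\d{3}-' + B64 + r')"[^>]*>[^\n<>]+?</script>'
)
# Absolute URL of external scripts.
EXTERNAL_SRC = re.compile(r'<script[^>]*?src="?(http[^\s">]*?\.js)[ ">]')


def collect(pattern, pages, quote=False):
    """Return the unique captures of `pattern` in all pages, joined by spaces."""
    found = set()
    for html in pages:
        for match in pattern.finditer(html):
            value = match.group(1)
            # CSP expects hash sources wrapped in single quotes.
            found.add(f"'{value}'" if quote else value)
    return " ".join(sorted(found))


def read_pages(public_dir):
    """Read every generated HTML file (assumption: only *.html matters)."""
    return [p.read_text(encoding="utf-8") for p in Path(public_dir).rglob("*.html")]


def write_lists(public_dir="public", data_dir="data"):
    """Write the three lists to the same file names the commands above use."""
    pages = read_pages(public_dir)
    out = Path(data_dir)
    out.mkdir(exist_ok=True)
    (out / "script_hash.txt").write_text(collect(ANY_HASH, pages, quote=True))
    (out / "inline_script_hash.txt").write_text(collect(INLINE_HASH, pages, quote=True))
    (out / "external_script_source.txt").write_text(collect(EXTERNAL_SRC, pages))
```

Calling write_lists() after hugo --minify would refresh all three files. Like the original regexps, the inline pattern skips script bodies that contain < or >.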
The Hugo layout template index.headers is used to generate _headers. Here is only the part relevant to script-src.
script-src 'sha256-aECzxYUJ57J5H6YymaVqtppSpIqD2Z9YAIAZfd/2xMY=' 'sha256-MktN23nRzohmT1JNxPQ0B9CzVW6psOCbvJ20j9YxAxA=' 'sha256-OBZ1TAxtlr9xf3a+8VMnoX0v39PPCWCsN6DfNkKio/I=' 'self' https://cdn.jsdelivr.net/npm/mathjax@3.1.4/es5/tex-mml-chtml.js https://cdn.jsdelivr.net/npm/mermaid@8.9.3/dist/mermaid.min.js;
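The template itself is not shown here, so the following Python sketch is only a guess at the composition step, based on the order visible in the line above: inline hashes first, then 'self', then the external sources. The function name script_src and the sample values are placeholders.

```python
def script_src(inline_hashes, external_sources):
    """Compose a script-src directive from two space-separated lists."""
    sources = [*inline_hashes.split(), "'self'", *external_sources.split()]
    return "script-src " + " ".join(sources) + ";"


if __name__ == "__main__":
    # Placeholder values standing in for the contents of the generated files.
    hashes = "'sha256-AAA=' 'sha256-BBB='"
    cdn = "https://cdn.example.com/a.js"
    print(script_src(hashes, cdn))
```

Listing full script URLs instead of a whole CDN domain keeps the allowlist as narrow as the hashes do for inline code.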
And the result is an A+. The whole Content-Security-Policy line:
default-src 'none';
base-uri 'self';
manifest-src 'self';
connect-src 'self';
font-src 'self' https://cdn.jsdelivr.net;
img-src 'self' data:;
script-src 'sha256-aECzxYUJ57J5H6YymaVqtppSpIqD2Z9YAIAZfd/2xMY=' 'sha256-MktN23nRzohmT1JNxPQ0B9CzVW6psOCbvJ20j9YxAxA=' 'sha256-OBZ1TAxtlr9xf3a+8VMnoX0v39PPCWCsN6DfNkKio/I=' 'self' https://cdn.jsdelivr.net/npm/mathjax@3.1.4/es5/tex-mml-chtml.js https://cdn.jsdelivr.net/npm/mermaid@8.9.3/dist/mermaid.min.js;
style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net;
object-src 'none'
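A policy this long is easier to review once it is split back into directives. The Python sketch below does that and flags any remaining 'unsafe-*' keywords; parse_csp and unsafe_sources are illustrative names, and the sample policy is a shortened copy of the one above.

```python
def parse_csp(policy):
    """Split a policy string into {directive: [sources]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            directives[tokens[0]] = tokens[1:]
    return directives


def unsafe_sources(directives):
    """Return {directive: [sources]} for every 'unsafe-*' keyword found."""
    flagged = {}
    for name, sources in directives.items():
        risky = [s for s in sources if s.startswith("'unsafe-")]
        if risky:
            flagged[name] = risky
    return flagged


if __name__ == "__main__":
    sample = (
        "default-src 'none'; "
        "style-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; "
        "object-src 'none'"
    )
    print(unsafe_sources(parse_csp(sample)))
```

Run against the full policy, this would point at style-src as the one directive that still relies on 'unsafe-inline'.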