Create a self-caching website ready for offline use with HTML5 and jQuery

Thursday, February 3, 2011

In this tutorial we will create a website that caches its own content, including markup, stylesheets and JavaScript files, as a user browses it. To restore cached files dynamically we use an offline manifest and localStorage. In other words: every page a user has visited online will be available offline too.

Please note that the demo application is written in CoffeeScript and must be compiled to JavaScript. If you have Thor installed, use the command thor compile:js.

How does it basically work?

We use the HTML5 Application Cache (see the next section) to tell the browser which files to cache and make available for offline browsing. We use it minimally, caching only the four files that make up our basic framework:

  • jquery.js - for some convenient things like $.get and $.inArray
  • scow.js - the main script; it caches if online and restores from the cache if offline
  • index.html - the main entry point of our website
  • offline.html - the content of this file is delivered when the user is offline and the requested page is not cached

No matter if online or offline, this framework and our script will be loaded.

User is online

When a user visits a page the script does two things. First, it asynchronously loads every stylesheet and JavaScript file linked in the header. Once the content of an asset is received, it is stored in HTML5's localStorage. Second, the content of the current page, its title and an index of the linked asset files are stored as a JSON-encoded string.

This gives a complete snapshot of the current page.

User is offline

The script checks whether the requested page is cached in localStorage. If not, the content of offline.html is simply displayed, telling the user that the requested file is not cached.

If the page is cached, the assets are loaded into the current page's header so that the stylesheets take effect and the JavaScript files are executed. Next, the script rebuilds the original content and title.

This results in the same page the user has seen online, but it was dynamically loaded from the cache. Viewing the source code will reveal the content of offline.html .
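
The two cases above boil down to a three-way decision. The following is a minimal sketch in plain JavaScript (not the CoffeeScript of the demo); the function name chooseAction and the returned labels are hypothetical and only serve to illustrate the flow.

```javascript
// Hypothetical helper summarising the online/offline flow described above.
function chooseAction(isOnline, isPageCached) {
  if (isOnline) {
    return 'cache';                 // online: snapshot the page and its assets
  }
  // offline: restore the snapshot if we have one, otherwise keep the offline.html notice
  return isPageCached ? 'restore' : 'show-offline-notice';
}

console.log(chooseAction(true, false));  // cache
console.log(chooseAction(false, true));  // restore
console.log(chooseAction(false, false)); // show-offline-notice
```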

HTML5 Application Cache - Fallback File

Application Cache is the key feature for this tutorial and the demo. Basically, we tell the browser which specific files to cache. These files will then be loaded from the application cache, whether the user is offline or online.

There are really good tutorials about the application cache on the web. For an introduction and further reading check out these:

This is the .appcache file, the offline manifest, for the demo application. It tells the browser to cache the four 'framework' files and to use offline.html as the fallback file.

    CACHE MANIFEST
    CACHE:
    index.html
    offline.html
    js/jquery.js
    js/scow.js
    FALLBACK:
    / offline.html
    NETWORK:
    *
    # LAST UPDATE
    # 01/17/12

For this tutorial it's crucial to understand the fallback file mechanism.

An example: a user is offline and tries to load a non-cached page, style.html. Since the file is not cached, the browser falls back to the content of offline.html. The path in the browser still points to style.html and can be read via location.pathname.

Since we know what the visitor originally was trying to load we can restore it from localStorage if it's cached.
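
As a minimal sketch of that lookup, written in plain JavaScript rather than the CoffeeScript of the demo, the helper below derives a storage key from a path such as location.pathname. The name fileNameFromPath and the decision to map a directory request like "/" to index.html are assumptions for illustration, not taken from the demo source.

```javascript
// Hypothetical helper: turn a pathname into the key under which a page is cached.
function fileNameFromPath(pathname) {
  // Take the last path segment, e.g. "/docs/style.html" becomes "style.html".
  const last = pathname.split('/').pop();
  // Assumption: a directory request such as "/" refers to index.html.
  return last === '' ? 'index.html' : last;
}

// In the browser this would be called with location.pathname.
console.log(fileNameFromPath('/style.html')); // style.html
console.log(fileNameFromPath('/'));           // index.html
```

Using only the last segment keeps the key short, but two pages with the same file name in different directories would share a key; storing the full pathname avoids that.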

This is what offline.html looks like. We rely on the cache manifest offline.appcache. jquery.js and scow.js are cached by the application cache, so we load them as usual.

    <html manifest="offline.appcache">
    <head>
      <title></title>
      <script src='js/jquery.js'></script>
      <script src='js/scow.js'></script>
    </head>
    <body><p>This content is not cached. <a href="index.html">Home</a>.</p></body>
    </html>

As you can see, this is a minimal entry point that only loads jQuery and our script. The script then replaces the content with the cached content and adds the cached stylesheets and JavaScript files.

Development environment - A small Sinatra app

Testing an offline application can be tricky, since the application cache serves the cached files whether you are online or offline.

To reset the cache it's not enough to update single files; you have to change the content of the cache manifest offline.appcache.

For more convenient development I wrote a small Sinatra app with a simple appcache reset function. Visiting the /new_cache path generates a new timestamp, which results in an updated cache manifest file and forces a reload.

It also makes sure the cache manifest is delivered with the proper MIME type, text/cache-manifest.
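
The timestamp trick does not depend on Sinatra: the browser re-downloads the cached files whenever the manifest text changes, and changing a comment line is enough. The following plain JavaScript sketch illustrates the idea; renderManifest and its parameters are hypothetical and are not the code of the demo server.

```javascript
// Hypothetical helper: render the manifest text with a version comment at the end.
function renderManifest(files, fallback, version) {
  return [
    'CACHE MANIFEST',
    'CACHE:',
    ...files,
    'FALLBACK:',
    '/ ' + fallback,
    'NETWORK:',
    '*',
    '# LAST UPDATE',
    '# ' + version,          // only this line changes between versions
  ].join('\n') + '\n';
}

const frameworkFiles = ['index.html', 'offline.html', 'js/jquery.js', 'js/scow.js'];
const manifestBefore = renderManifest(frameworkFiles, 'offline.html', 1);
const manifestAfter = renderManifest(frameworkFiles, 'offline.html', 2);

// The file list is identical, yet the manifests differ, which triggers a re-cache.
console.log(manifestBefore !== manifestAfter); // true
```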

We can then add a simple link:

    <a class="btn" href="/new_cache" id="new-cache">Trigger New Cache</a>

The link is only shown in development mode, that is, when working on localhost:

    if location.host.indexOf('localhost') isnt -1  # check if we are in dev mode
      $('#new-cache').show()                       # if so, show the link to trigger a new cache via /new_cache

If you have Thor installed use the command thor server:start to start the dev server.

Check out the source code of the server at GitHub .

Clearing localStorage when the application cache is updated

Since we can now force the application cache to be refreshed, we also need a way to reset localStorage. Application Cache provides several events we can use. We listen for the updateready event, which is fired when the browser has finished downloading the files listed in an updated manifest. At this point we clear the whole localStorage, and with it every file cached by our script.

    applicationCache.addEventListener 'updateready', ->  # the manifest files were newly cached
      localStorage.clear()                               # so clear localStorage as well

Checking if the client is online/offline

This is also HTML5 specific. navigator.onLine returns false when the browser is offline and true otherwise. Note that true only means the browser has a network connection; it does not guarantee that the internet is reachable.

    if navigator.onLine      # check if there is a network connection
      @cacheCurrentFile()    # if so, cache the current file
    else
      @restoreFromCache()    # otherwise try to restore the requested file from cache

Read/write from/to localStorage

Working with HTML5's localStorage is pretty straightforward. We just use .getItem() to read from the storage and .setItem() to write to it.

Additionally we use JSON so that we can store objects in localStorage, which only holds strings. A value is encoded when it is stored and decoded when it is read.

    getStorage: (name) ->                # helper function to read/decode JSON from local storage
      item = localStorage.getItem(name)  # try to read from local storage
      if item isnt null                  # item was found in storage
        item = JSON.parse(item)          # decode the JSON string
      item                               # return the object or null if not found

    setStorage: (name, value) ->         # helper function to write/encode JSON to local storage
      item = JSON.stringify(value)       # create JSON string
      localStorage.setItem name, item    # write JSON string to local storage
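
To try the round trip outside a browser, here is a minimal sketch in plain JavaScript (not the demo's CoffeeScript). The in-memory object returned by createMemoryStorage is a hypothetical stand-in for localStorage, and passing the storage as a parameter is an assumption made so the helpers can run anywhere.

```javascript
// Hypothetical in-memory stand-in for localStorage; like the real one it stores strings only.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
    clear: () => data.clear(),
  };
}

// Read a value and decode it from JSON; returns null if the key is missing.
function getStorage(storage, name) {
  const item = storage.getItem(name);
  return item === null ? null : JSON.parse(item);
}

// Encode a value as JSON and write it.
function setStorage(storage, name, value) {
  storage.setItem(name, JSON.stringify(value));
}

const demoStorage = createMemoryStorage();
setStorage(demoStorage, 'index.html', { title: 'Home', stylesheets: ['css/main.css'] });

console.log(getStorage(demoStorage, 'index.html').title); // Home
console.log(getStorage(demoStorage, 'missing.html'));     // null
```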

Caching asset files

We read the path of every linked stylesheet and JavaScript file from the header of the current page. Then we load the files asynchronously via jQuery's $.get. Once the content of a file is loaded, we store it in localStorage as a simple string.

    loadAssetCallback: (path, content) ->  # is called when a file is fully loaded
      unless @isAssetCached(path)          # check if asset is already cached
        @assets.push path                  # if not, add the path to the assets index
      @setStorage path, content            # cache the file with the path as key
      @setStorage 'assets', @assets        # store the new assets index array
      @updateAssetsIndex()                 # update the demo list of the assets index

    loadAsset: (path) ->                   # load an asset with $.get
      $.get path, (content) =>             # get the file content
        @loadAssetCallback path, content   # content loaded, invoke the callback

    loadAssets: (paths) ->                 # cache either css or scripts
      for path in paths                    # go through all paths
        unless @isAssetCached(path)        # check if asset is already cached
          @loadAsset path                  # start loading the asset

    getAssets: (selector, src) ->          # extract the assets from the header and return an array
      retArr = []                          # array to return
      for el in @headEl.find(selector)     # all elements matching the selector
        retArr.push $(el).attr(src)        # read the attribute containing the source path and store it
      retArr                               # return an array of the file paths
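
The interplay of the assets index and the loader can be tried without a browser. The following plain JavaScript sketch (not the demo's CoffeeScript) mirrors the logic above under some assumptions: createAssetCache, createMemoryStorage and the injected fetchText(path, callback) loader are hypothetical names; in the browser the loader could simply wrap $.get.

```javascript
// Hypothetical in-memory stand-in for localStorage so the sketch runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}

// Hypothetical factory mirroring loadAssets / loadAssetCallback.
function createAssetCache(storage, fetchText) {
  const assets = JSON.parse(storage.getItem('assets') || '[]'); // restore the index if present
  const isAssetCached = (path) => assets.indexOf(path) !== -1;

  function storeAsset(path, content) {
    if (!isAssetCached(path)) assets.push(path);        // keep the index free of duplicates
    storage.setItem(path, JSON.stringify(content));     // cache the content under its path
    storage.setItem('assets', JSON.stringify(assets));  // persist the updated index
  }

  function loadAssets(paths) {
    for (const path of paths) {
      if (!isAssetCached(path)) {
        fetchText(path, (content) => storeAsset(path, content));
      }
    }
  }

  return { loadAssets, isAssetCached };
}

// Usage with a fake loader that answers immediately and records what was requested.
const requestedPaths = [];
const fakeFetch = (path, done) => { requestedPaths.push(path); done('/* ' + path + ' */'); };
const assetStorage = createMemoryStorage();
const assetCache = createAssetCache(assetStorage, fakeFetch);

assetCache.loadAssets(['css/main.css', 'js/app.js']);
assetCache.loadAssets(['css/main.css']); // already cached, so no second request
console.log(requestedPaths.length);      // 2
```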

Caching HTML files

Caching .html files is a bit more complicated than storing a simple string in localStorage. We need the content of the file, its title and an index of all the linked assets. We create an object that holds all of this information and store it as a JSON-encoded string.

    cacheCurrentFile: ->                                        # cache the requested .html file
      cssAssets = @getAssets('link[rel="stylesheet"]', 'href')  # get array of stylesheets
      jsAssets = @getAssets('script', 'src')                    # get array of javascripts
      cacheObj =                                                # this will hold the cached page and all asset references
        bodyHtml: @bodyEl.html()                                # cache the content of the current file
        title: document.title                                   # cache the title
        stylesheets: cssAssets                                  # array with all stylesheets
        javascripts: jsAssets                                   # array with all javascripts
      @loadAssetCallback(@curFileName, cacheObj)                # manually invoke the callback and save the object
      @loadAssets cssAssets                                     # begin to load all the css assets
      @loadAssets jsAssets                                      # same with the javascripts

Restoring assets from cache

When the user is offline we have to restore the cached assets somehow. We do this by reading their content from localStorage and inserting the stylesheets or JavaScript directly into the page's header. The stylesheets then take effect and the JavaScript is executed.

    restoreHeader: (paths, wrapper) ->  # reassemble and include the cached files
      combined = ''                     # include all in one string
      for path in paths                 # go through all cached assets
        content = @getStorage(path)     # try to get the content from local storage
        if content isnt null            # check if the requested file is cached
          combined += content           # add the cached content
      $(wrapper)                        # create a jquery object from the wrapper markup
        .text(combined)                 # set the content
        .appendTo @headEl               # and append it to the head
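
The concatenation step can be looked at in isolation. Below is a minimal plain JavaScript sketch (not the demo's CoffeeScript); combineCached is a hypothetical name, the storage argument only needs a getItem method, and joining with a newline instead of plain concatenation is an assumption explained after the code.

```javascript
// Hypothetical helper: concatenate the cached contents of the given paths.
// `storage` only needs a getItem(key) method returning a JSON string or null.
function combineCached(storage, paths) {
  const parts = [];
  for (const path of paths) {
    const item = storage.getItem(path);
    if (item !== null) {
      parts.push(JSON.parse(item)); // decode the stored string; uncached paths are skipped
    }
  }
  return parts.join('\n');          // assumption: newline separator between files
}

// Usage with a plain object standing in for localStorage.
const cachedFiles = {
  'js/a.js': JSON.stringify('var a = 1 // no trailing newline'),
  'js/b.js': JSON.stringify('var b = 2'),
};
const fakeStorage = {
  getItem: (key) => (Object.prototype.hasOwnProperty.call(cachedFiles, key) ? cachedFiles[key] : null),
};

console.log(combineCached(fakeStorage, ['js/a.js', 'js/missing.js', 'js/b.js']));
```

Joining with a newline is a deliberate deviation from plain concatenation: if a cached script ends in a line comment without a trailing newline, the first line of the next script would otherwise become part of that comment.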

Restore HTML and title from cache

The originally cached content is inserted into the page's body element. The title is restored via the good old document.title.

    restoreFromCache: ->                  # restore requested file and assets from cache
      cached = @getStorage(@curFileName)  # get the cached object with body, title and asset refs
      if cached isnt null                 # check if the file is cached
        @bodyEl.html cached.bodyHtml      # restore body with original dom
        document.title = cached.title     # restore original title
        # restore all stylesheets by including them in the header
        @restoreHeader cached.stylesheets, '<style type="text/css">'
        # restore all javascripts
        @restoreHeader cached.javascripts, '<script type="text/javascript"/>'
