Create a self-caching website ready for offline usage with HTML5 and jQuery

Posted over 1 year ago


Introduction

In this tutorial we will create a website that caches its own content, including markup, stylesheets and javascripts, as a user browses it. To dynamically restore cached files we use an offline manifest and localStorage. In other words: every page a user has visited online will be available offline too.

Please note that the demo application is written in CoffeeScript and must be compiled to JavaScript. If you have Thor installed use the command thor compile:js.

How does it basically work?

We use the HTML5 Application Cache (see the Application Cache section below) to tell the browser which files to cache and make available for offline browsing. We use it minimally, caching only four files as our basic framework:

  • jquery.js - for some convenient things like $.get and $.inArray
  • scow.js - the main script, cache if online, restore from cache if offline
  • index.html - the main entry point of our website
  • offline.html - the content of this file is delivered when the user is offline and the requested page is not in the application cache

No matter if online or offline, this framework and our script will be loaded.

User is online

When a user visits a page the script does two things. First, it asynchronously loads every linked stylesheet and javascript from the header. Once the content of an asset is received, it is stored in HTML5's localStorage. Second, the content of the current page, the title and an index of the linked asset files are stored in a JSON encoded string.

This gives a complete snapshot of the current page.

User is offline

The script checks whether the requested page is cached in localStorage. If not, the content of offline.html is simply displayed, telling the user that the requested file is not cached.

If the page is cached the assets are loaded into the current page header so the stylesheets take effect and the javascripts are loaded. Next it rebuilds the original content and title.

This results in the same page the user has seen online, but it was dynamically loaded from the cache. Viewing the source code will reveal the content of offline.html.
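
The two cases above boil down to a small decision. Here is a minimal sketch of it in plain JavaScript (not the CoffeeScript of the demo). The function name decideAction and the three action labels are hypothetical and only serve as illustration:

```javascript
// Decide what the script does for a page request.
// Both arguments are plain booleans so the logic can be tried anywhere;
// in the browser they would come from navigator.onLine and a localStorage lookup.
function decideAction(isOnline, isPageCached) {
  if (isOnline) {
    return 'cache-current-page';      // snapshot page and assets into localStorage
  }
  if (isPageCached) {
    return 'restore-from-cache';      // rebuild body, title and assets from the snapshot
  }
  return 'show-offline-page';         // leave the content of offline.html in place
}

console.log(decideAction(true, false));  // cache-current-page
console.log(decideAction(false, true));  // restore-from-cache
console.log(decideAction(false, false)); // show-offline-page
```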

HTML5 Application Cache - Fallback File

Application Cache is the key feature for this tutorial and the demo. Basically we tell the browser which specific files to cache. These files then will be loaded from the application cache, offline and online.

There are really good tutorials about the application cache on the web that serve as an introduction and as further reading.

This is the .appcache file, the offline manifest, for the demo application. It tells the browser to specifically cache the four 'framework' files and use offline.html as fallback file.

CACHE MANIFEST

CACHE:
index.html
offline.html
js/jquery.js
js/scow.js

FALLBACK:
/ offline.html

NETWORK:
*

# LAST UPDATE
# 01/17/12

For this tutorial it's crucial to understand the fallback file mechanism.

An example: a user is offline and tries to load a non-cached page style.html. Since the file is not cached, the browser falls back to the content of offline.html. The path in the browser still points to style.html and can be read via location.pathname.

Since we know what the visitor originally was trying to load we can restore it from localStorage if it's cached.
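
As a rough illustration, this is how the requested file name could be derived from a path like location.pathname. It is a plain JavaScript sketch that mirrors what getFileName does in the complete script. The name fileNameFromPath is made up for this example, and the path is passed in as an argument so the snippet also runs outside a browser:

```javascript
// Derive the storage key for a page from a path such as location.pathname.
function fileNameFromPath(pathname) {
  const parts = pathname.split('/');
  const last = parts[parts.length - 1];
  // A request for a directory ("/" or "/docs/") has no file part,
  // so we fall back to index.html.
  return last.length === 0 ? 'index.html' : last;
}

console.log(fileNameFromPath('/style.html')); // style.html
console.log(fileNameFromPath('/'));           // index.html
```

Even though the browser displays the content of offline.html, the key computed from the path is still style.html, and that is the key we look up in localStorage.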

And this is what offline.html looks like. It references the cache manifest offline.appcache. jquery.js and scow.js are cached by the application cache, so we load them as usual.

<!DOCTYPE html>
<html manifest="offline.appcache">
  <head>
    <title>Offline</title>

    <script src='js/jquery.js'></script>
    <script src='js/scow.js'></script>
  </head>
  <body>
    This content is not cached. <a href="index.html">Home</a>.
  </body>
</html>

As you can see, this is a minimal entry point that only loads jQuery and our script. The content is then replaced by the cached content, and the cached stylesheets and javascripts are added.

Development environment - A small Sinatra app

Testing an offline application can be tricky, since the application cache will load the files from the cache no matter if you are online or offline.

To reset the cache it's not enough to update single files. You have to change the content of the cache manifest offline.appcache.

So for more convenient development I wrote a small Sinatra app that has a simple appcache reset function. When visiting the /new_cache path a new timestamp is generated, resulting in an updated cache manifest file and a forced reload.

It also makes sure the cache manifest is delivered with the proper mimetype text/cache-manifest.
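
The server code is not reproduced here, so the following is only a sketch of the timestamp idea in plain JavaScript, not the Sinatra implementation. buildManifest and its parameters are hypothetical names. The point is that the browser compares the manifest text, so changing nothing but the comment at the end is enough to make it count as updated:

```javascript
// Build the text of a cache manifest. The version comment at the end is
// the only part that changes between resets; a different comment is enough
// for the browser to treat the manifest as updated.
function buildManifest(cachedFiles, fallbackFile, version) {
  return [
    'CACHE MANIFEST',
    '',
    'CACHE:',
    ...cachedFiles,
    '',
    'FALLBACK:',
    `/ ${fallbackFile}`,
    '',
    'NETWORK:',
    '*',
    '',
    '# LAST UPDATE',
    `# ${version}`,
    ''
  ].join('\n');
}

const frameworkFiles = ['index.html', 'offline.html', 'js/jquery.js', 'js/scow.js'];
// Using the current time as version yields a new manifest on every reset.
console.log(buildManifest(frameworkFiles, 'offline.html', new Date().toISOString()));
```

Whatever serves this text also has to send it with the text/cache-manifest content type, as mentioned above.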

Then we could have a simple link:

<a class="btn" href="/new_cache" id="new-cache">Trigger New Cache</a>

It is only shown if we are in development mode, that is, working on localhost:

if location.host.indexOf('localhost') isnt -1 #check if we are in dev mode
  $('#new-cache').show()                      #if so show the link to trigger a new cache via /new_cache 

If you have Thor installed use the command thor server:start to start the dev server.

Check out the source code of the server at GitHub.

Clearing localStorage when the application cache is updated

Since we can now re-cache files from the application cache, we also need a way to reset localStorage. Application cache provides several events we can use. We listen for the updateready event, which is fired when the browser has finished downloading the files of an updated offline manifest. At this point we reset the whole localStorage, clearing every file cached by our script.

applicationCache.addEventListener 'updateready', -> #if the manifest files are newly cached
  localStorage.clear()                              #clear also local store
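
localStorage.clear() also removes anything else the site may have stored. If that is a concern, one possible alternative is to remove only the keys listed in the assets index. The following plain JavaScript sketch shows the idea. It is not part of the demo, and clearCachedFiles and createDemoStorage are hypothetical names. The storage object is passed in so the snippet runs outside a browser too:

```javascript
// Remove only the entries listed in the assets index, plus the index itself.
// `storage` needs getItem and removeItem, like window.localStorage.
function clearCachedFiles(storage, indexKey) {
  const raw = storage.getItem(indexKey);
  const paths = raw === null ? [] : JSON.parse(raw);
  for (const path of paths) {
    storage.removeItem(path);   // removing a missing key does nothing
  }
  storage.removeItem(indexKey);
  return paths.length;
}

// Tiny in-memory stand-in for localStorage, for demonstration only.
function createDemoStorage(entries) {
  const data = new Map(Object.entries(entries));
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    removeItem: (key) => { data.delete(key); }
  };
}

const demo = createDemoStorage({
  assets: JSON.stringify(['index.html', 'css/site.css']),
  'index.html': '{"title":"Home"}',
  'css/site.css': '"body{}"',
  'user-theme': 'dark'            // unrelated data that should survive
});

console.log(clearCachedFiles(demo, 'assets')); // 2
console.log(demo.getItem('index.html'));       // null
console.log(demo.getItem('user-theme'));       // dark
```

This works because the script adds every cached page and asset to the assets index, so the index doubles as a list of the keys we own.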

Checking if the client is online/offline

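This is also HTML5 specific. navigator.onLine returns false when the browser is definitely offline and true otherwise. Note that true only means the browser has some network connection, not that the site is actually reachable.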

if navigator.onLine                     #check if there is an internet connection
  @cacheCurrentFile()                   #if so cache the current file
else
  @restoreFromCache()                   #try to restore the requested file from cache

Read/write from/to localStorage

Working with HTML5's localStorage is pretty straightforward. We just use .getItem() to read from the storage and .setItem() to write to the storage.

Additionally we use JSON so we can store objects in the localStorage. It will be encoded when storing a value and decoded when reading a value.

getStorage: (name) ->                     #helper function to read/decode JSON from local storage
  item = localStorage.getItem(name)       #try to read from local storage                        

  if item isnt null                       #item was found in storage                             
    item = JSON.parse(item)               #json encoded object                                   

  item                                    #return the object or null if not found                

setStorage: (name, value) ->              #helper function to write/encode JSON to local storage 
  item = JSON.stringify(value)            #create json string                                    

  localStorage.setItem name, item         #write json string to local storage  
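
For readers who prefer plain JavaScript, here is a rough equivalent of the two helpers. The storage object is passed in as a parameter, which is an assumption of this sketch and not how the demo does it. That way the snippet can be tried outside a browser with the small in-memory stand-in createMemoryStorage, a hypothetical helper:

```javascript
// Minimal in-memory stand-in for localStorage, so the sketch also runs
// outside a browser. It implements only getItem and setItem.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); }
  };
}

// Read a value and decode it from JSON; returns null if the key is missing.
function getStorage(storage, name) {
  const item = storage.getItem(name);
  return item === null ? null : JSON.parse(item);
}

// Encode a value as JSON and write it.
function setStorage(storage, name, value) {
  storage.setItem(name, JSON.stringify(value));
}

const memoryStorage = createMemoryStorage(); // in a browser: window.localStorage
setStorage(memoryStorage, 'assets', ['js/jquery.js', 'js/scow.js']);
console.log(getStorage(memoryStorage, 'assets').length); // 2
console.log(getStorage(memoryStorage, 'missing'));       // null
```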

Caching asset files

We get the path of every linked stylesheet and javascript file from the header of the current page. Then we load the files asynchronously via jQuery's $.get. Once the content is loaded, it is stored in localStorage as a simple string.

loadAssetCallback: (path, content) ->     #is called when a file is fully loaded
  unless @isAssetCached(path)             #check if asset already cached
    @assets.push path                     #add path to assets index
    @setStorage path, content             #cache the file with the path as key
    @setStorage 'assets', @assets         #store the new assets index array
    @updateAssetsIndex()                  #update the demo list of the assets index

loadAsset: (path) ->                      #load an asset with $.get
  $.get path, (content) =>                #get the file content
    @loadAssetCallback path, content      #content loaded, invoke the callback

loadAssets: (paths) ->                    #cache either css or scripts
  for path in paths                       #go through all paths
    unless @isAssetCached(path)           #check if asset already cached
      @loadAsset path                     #start loading the asset

getAssets: (selector, src) ->             #extract the assets from the header and return array
  retArr = []                             #array to return

  for el in @headEl.find(selector)        #all elements matching the selector
    retArr.push $(el).attr(src)           #read the attribute containing the source path and store it in array

  retArr                                  #return an array of the file paths
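
The interplay between the assets index and the asynchronous loading can be tried in isolation. The sketch below is plain JavaScript and makes a few assumptions that differ from the demo: the store is any object with a set method (a Map here), and the loader is passed in as a function returning a promise instead of calling $.get directly. createAssetCache and fakeFetch are hypothetical names:

```javascript
// Tracks which paths are cached and stores each file's content only once.
// `store` is any object with set(key, value); a Map works for a demo.
function createAssetCache(store, initialIndex) {
  const index = initialIndex.slice();
  const cache = {
    isCached(path) {
      return index.indexOf(path) !== -1;
    },
    // Store content under its path unless that path is already indexed.
    add(path, content) {
      if (cache.isCached(path)) return false;
      index.push(path);
      store.set(path, content);
      store.set('assets', index.slice());
      return true;
    },
    // Fetch and cache every path that is not in the index yet.
    // `fetchText(path)` must return a promise for the file content.
    loadAll(paths, fetchText) {
      const missing = paths.filter((path) => !cache.isCached(path));
      return Promise.all(
        missing.map((path) => fetchText(path).then((text) => cache.add(path, text)))
      );
    }
  };
  return cache;
}

const demoStore = new Map();
const assetCache = createAssetCache(demoStore, ['js/jquery.js', 'js/scow.js']);
const fakeFetch = (path) => Promise.resolve('/* content of ' + path + ' */');

assetCache.loadAll(['css/site.css', 'js/jquery.js'], fakeFetch).then((results) => {
  console.log(results.length);                 // 1, only css/site.css was missing
  console.log(demoStore.has('css/site.css'));  // true
  console.log(demoStore.has('js/jquery.js'));  // false, it was already in the index
});
```

Seeding the index with jquery.js and scow.js is what keeps the script from storing them a second time, since the application cache already holds them.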

Caching HTML files

Caching .html files is a bit more complicated than storing a simple string in localStorage. We need the content of the file, the title and an index of all the linked assets. We create an object that holds all this information and store it as a JSON encoded string.

cacheCurrentFile: ->                      #cache the requested .html file
  cssAssets = @getAssets('link[rel="stylesheet"]', 'href') #get array of stylesheets
  jsAssets  = @getAssets('script', 'src') #get array of javascripts
  cacheObj  =                             #this will hold the cached page and all assets references
    bodyHtml   : @bodyEl.html()           #cache the content of the current file
    title      : document.title           #cache the title
    stylesheets: cssAssets                #array with all stylesheets
    javascripts: jsAssets                 #array with all javascripts

  @loadAssetCallback(@curFileName, cacheObj) #manually invoke the callback and save the object
  @loadAssets cssAssets                   #begin to load all the css assets
  @loadAssets jsAssets                    #same with the javascripts
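
To make the shape of the stored object more tangible, here is a small plain JavaScript sketch with made-up values. buildPageSnapshot is a hypothetical helper, and the page body, title and css/site.css are invented for the example:

```javascript
// Build the object that is stored for one page. Assets are referenced by
// path only; their content lives under separate storage keys.
function buildPageSnapshot(bodyHtml, title, stylesheets, javascripts) {
  return { bodyHtml, title, stylesheets, javascripts };
}

const pageSnapshot = buildPageSnapshot(
  '<h1>Style guide</h1>',             // hypothetical page body
  'Style guide',
  ['css/site.css'],
  ['js/jquery.js', 'js/scow.js']
);

const encodedSnapshot = JSON.stringify(pageSnapshot);   // what ends up in localStorage
const restoredSnapshot = JSON.parse(encodedSnapshot);   // what is read back when offline

console.log(encodedSnapshot);
console.log(restoredSnapshot.javascripts.length); // 2
```

Because the snapshot only contains plain strings and arrays, it survives the JSON round trip unchanged.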

Restoring assets from cache

When the user is offline we have to restore the cached assets somehow. We do this by reading the content from localStorage and inserting the stylesheets or javascripts directly into the page's header. The stylesheets take effect and the javascripts are executed.

restoreHeader: (paths, wrapper) ->        #reassemble and include the cached files
  combined  = ''                          #include all in one string

  for path in paths                       #go through all cached assets
    content = @getStorage(path)           #try to get the content from local storage

    if content isnt null                  #check if the requested file is cached
      combined += content                 #add the cached content
  $(wrapper)                              #create a jquery object from the wrapper markup
    .text(combined)                       #set the content
    .appendTo @headEl                     #and append it to the head
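
The concatenation step can be looked at separately from the DOM part. Below is a minimal plain JavaScript sketch of it. combineCached is a hypothetical name and the lookup function is passed in as a parameter. Unlike the demo, this version puts a newline after each file, which is an addition of this sketch:

```javascript
// Concatenate the cached content of every path that is present in the cache.
// `readCached(path)` returns the stored string, or null if nothing is stored.
function combineCached(paths, readCached) {
  let combined = '';
  for (const path of paths) {
    const content = readCached(path);
    if (content !== null) {
      combined += content + '\n';   // newline keeps adjacent files apart
    }
  }
  return combined;
}

const cachedFiles = { 'js/a.js': 'var a = 1; // last line', 'js/c.js': 'var c = 3;' };
const lookUp = (path) => (path in cachedFiles ? cachedFiles[path] : null);

// js/b.js is not cached and is silently skipped.
console.log(combineCached(['js/a.js', 'js/b.js', 'js/c.js'], lookUp));
```

The newline is a small safety measure: if one script ends with a line comment and no line break, the first line of the next script would otherwise become part of that comment.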

Restore HTML and title from cache

The original cached content will be inserted into the page's body element. The title will be restored via the good old document.title.

restoreFromCache: ->                      #restore requested file and assets from cache
  cached = @getStorage(@curFileName)      #get the cached object with body, title and assets refs

  if cached isnt null                     #check if the file is cached
    @bodyEl.html     cached.bodyHtml      #restore body with original dom
    document.title = cached.title         #restore original title

    #restore all stylesheets via including them into the header
    @restoreHeader cached.stylesheets, '<style type="text/css"/>'

    #restore all javascripts
    @restoreHeader cached.javascripts, '<script type="text/javascript"/>'

The complete script

class SCOW
  constructor: ->
    @headEl      = $ 'head'                 #snapshot of the head element
    @bodyEl      = $ 'body'                 #snapshot of the body element
    @curFileName = @getFileName()           #get the requested path
    @assets      = @getStorage('assets')    #initialize assets index
    
    if @assets is null                      #if there is no assets index yet
      @assets = ['js/jquery.js', 'js/scow.js'] #create it, these two assets are cached by appcache
      @setStorage 'assets', @assets         #and store it
      
    applicationCache.addEventListener 'updateready', -> #if the manifest files are newly cached
      localStorage.clear()                  #clear also local store
    
    if navigator.onLine                     #check if there is an internet connection
      @cacheCurrentFile()                   #if so cache the current file
    else
      @restoreFromCache()                   #try to restore the requested file from cache
      
    if location.host.indexOf('localhost') isnt -1 #check if we are in dev mode
      $('#new-cache').show()                #if so show the link to trigger a new cache via /new_cache 
    
    @updateAssetsIndex()                    #output the current asset index
  
  getFileName: ->
    path     = location.pathname.split('/') #get current pathname and split it by /
    filename = path[path.length - 1]        #last part in array is filename
    
    if filename.length is 0                 #check if the root path / is requested
      filename = 'index.html'               #fallback to index.html

    filename                                #return filename
                                                                                               
  getStorage: (name) ->                     #helper function to read/decode JSON from local storage
    item = localStorage.getItem(name)       #try to read from local storage                        
                                                                                               
    if item isnt null                       #item was found in storage                             
      item = JSON.parse(item)               #json encoded object                                   
                                                                                               
    item                                    #return the object or null if not found                
                                                                                               
  setStorage: (name, value) ->              #helper function to write/encode JSON to local storage 
    item = JSON.stringify(value)            #create json string                                    
                                                                                               
    localStorage.setItem name, item         #write json string to local storage                    
  
  updateAssetsIndex: ->                     #output cached files in a list for the demo
    listEl = $ '#cached-files'              #we don't cache this list element globally because it could change in $body
    
    listEl.children().remove()              #remove all previously added list items
    for asset in @assets                    #for every path in the assets index
      listEl.append "<li>#{asset}</li>"     #append a list item containing the path
      
  isAssetCached: (path) ->                  #check if an asset is already cached - in the assets index
    $.inArray(path, @assets) isnt -1        #return true if already cached otherwise false
                                                                                               
  loadAssetCallback: (path, content) ->     #is called when a file is fully loaded
    unless @isAssetCached(path)             #check if asset already cached
      @assets.push path                     #add path to assets index
      @setStorage path, content             #cache the file with the path as key
      @setStorage 'assets', @assets         #store the new assets index array
      @updateAssetsIndex()                  #update the demo list of the assets index
  
  loadAsset: (path) ->                      #load an asset with $.get
    $.get path, (content) =>                #get the file content
      @loadAssetCallback path, content      #content loaded, invoke the callback
  
  loadAssets: (paths) ->                    #cache either css or scripts
    for path in paths                       #go through all paths
      unless @isAssetCached(path)           #check if asset already cached
        @loadAsset path                     #start loading the asset
    
  getAssets: (selector, src) ->             #extract the assets from the header and return array
    retArr = []                             #array to return
    
    for el in @headEl.find(selector)        #all elements matching the selector
      retArr.push $(el).attr(src)           #read the attribute containing the source path and store it in array
      
    retArr                                  #return an array of the file paths
    
  cacheCurrentFile: ->                      #cache the requested .html file
    cssAssets = @getAssets('link[rel="stylesheet"]', 'href') #get array of stylesheets
    jsAssets  = @getAssets('script', 'src') #get array of javascripts
    cacheObj  =                             #this will hold the cached page and all assets references
      bodyHtml   : @bodyEl.html()           #cache the content of the current file
      title      : document.title           #cache the title
      stylesheets: cssAssets                #array with all stylesheets
      javascripts: jsAssets                 #array with all javascripts
    
    @loadAssetCallback(@curFileName, cacheObj) #manually invoke the callback and save the object
    @loadAssets cssAssets                   #begin to load all the css assets
    @loadAssets jsAssets                    #same with the javascripts

  restoreHeader: (paths, wrapper) ->        #reassemble and include the cached files
    combined  = ''                          #include all in one string
      
    for path in paths                       #go through all cached assets
      content = @getStorage(path)           #try to get the content from local storage
      
      if content isnt null                  #check if the requested file is cached
        combined += content                 #add the cached content

    $(wrapper)                              #create a jquery object from the wrapper markup
      .text(combined)                       #set the content
      .appendTo @headEl                     #and append it to the head
  
  restoreFromCache: ->                      #restore requested file and assets from cache
    cached = @getStorage(@curFileName)      #get the cached object with body, title and assets refs
    
    if cached isnt null                     #check if the file is cached
      @bodyEl.html     cached.bodyHtml      #restore body with original dom
      document.title = cached.title         #restore original title
      
      #restore all stylesheets via including them into the header
      @restoreHeader cached.stylesheets, '<style type="text/css"/>'
      
      #restore all javascripts
      @restoreHeader cached.javascripts, '<script type="text/javascript"/>'

jQuery -> (new SCOW)                        #initially create the class when the DOM is ready

Browser support

This has been tested successfully in the latest versions of Chrome, Safari and Firefox. It also works with the browser that comes with Android 2.2.

When I tested it on my mobile phone the browser alerted that there is no internet connection while loading the page successfully in the background. Pretty good trick :).

Don't bother with IE. It may support localStorage from version 8 on, but application cache will not be supported until version 10.

If you have a chance to test it on more browsers/devices like on iPhone/iPad please let us know in the comments if it works.

Things to keep in mind

Usage of the offline manifest

Application cache is only used for the main framework (as described above). The offline manifest must be referenced only in app-cached files such as offline.html and index.html, like so:

<!DOCTYPE html>
<html manifest="offline.appcache">
  <head>
    ...

Any additional .html file must not include the offline manifest because it will be loaded from localStorage.

Limits of localStorage

Storage limits differ between browsers, but you can assume roughly 3-5 MB. For a growing, production-ready site you should keep that in mind. Before caching new files, the script should check whether there is enough space left in the storage.
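
The demo does not do this check yet. One way it could be approached is sketched below in plain JavaScript. Browsers throw an exception from setItem when the storage is full, so the write can simply be wrapped in try/catch. trySetItem, usedCharacters and createLimitedStorage are hypothetical names, and the last one merely simulates a very small storage so the snippet runs outside a browser:

```javascript
// Try to store a value; report failure instead of throwing when the
// storage is full.
function trySetItem(storage, key, value) {
  try {
    storage.setItem(key, value);
    return true;
  } catch (err) {
    return false;
  }
}

// Rough size of everything stored, in characters (keys plus values).
// Uses only length, key() and getItem(), which localStorage provides.
function usedCharacters(storage) {
  let total = 0;
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    total += key.length + storage.getItem(key).length;
  }
  return total;
}

// Simulated storage with a tiny quota, for demonstration only.
function createLimitedStorage(maxChars) {
  const data = new Map();
  return {
    get length() { return data.size; },
    key(i) {
      const keys = Array.from(data.keys());
      return i < keys.length ? keys[i] : null;
    },
    getItem(key) { return data.has(key) ? data.get(key) : null; },
    setItem(key, value) {
      const text = String(value);
      let total = key.length + text.length;
      for (const [k, v] of data) {
        if (k !== key) total += k.length + v.length;
      }
      if (total > maxChars) throw new Error('QuotaExceededError (simulated)');
      data.set(key, text);
    }
  };
}

const limited = createLimitedStorage(20);
console.log(trySetItem(limited, 'a', '12345'));        // true
console.log(trySetItem(limited, 'b', 'x'.repeat(30))); // false
console.log(usedCharacters(limited));                  // 6
```

When a write fails, the script could skip the file or remove older entries first; which of the two is better depends on the site.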

Images are not cached, yet

Currently markup, stylesheets and javascripts are cached and restored. What about dynamically caching images, you may ask? This is possible with Base64 encoded images.

See Encode images for offline usage with HTML5 Canvas.