Kolibri supports web content through the use of HTML5AppNode, which renders the contents of a HTMLZipFile in a sandboxed iframe. Kolibri will load the index.html file, which is assumed to be in the root of the zip file. All href and src attributes must be relative links to resources within the zip file. Because the iframe that renders the content in Kolibri is sandboxed, there are some limitations on the use of plugins and parts of the web API.
HTMLZipFile must have an index.html file at the root of the zip file.
A web application packaged as a HTMLZipFile must not depend on network calls for it to work (it cannot load resources referenced via http/https links).
A web application packaged as a HTMLZipFile should not make unnecessary network calls (analytics scripts, social sharing functionality, tracking pixels). In an offline setting none of these functions would work, so it is considered best practice to “clean up” web apps as part of packaging for offline use.
The web application must not use plugins like swf/flash.
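The requirements above can be checked mechanically before uploading. The following is a minimal sketch using only the Python standard library; the function name `check_html_zip` and the regular expression are illustrative assumptions, not part of ricecooker. It also flags external `<a href>` links, which is consistent with the usability guidelines further down but may be stricter than you need.

```python
import re
import zipfile

# Matches src/href attributes that point at http(s) URLs (a rough heuristic).
EXTERNAL_REF = re.compile(
    r'(?:src|href)\s*=\s*["\'](https?://[^"\']+)["\']', re.IGNORECASE
)


def check_html_zip(zip_source):
    """Return a list of problems found in a zip intended as an HTMLZipFile.

    `zip_source` may be a path or a file-like object (hypothetical helper).
    """
    problems = []
    with zipfile.ZipFile(zip_source) as zf:
        names = zf.namelist()
        # index.html must sit at the root, not inside a subfolder.
        if "index.html" not in names:
            problems.append("index.html is missing from the root of the zip")
        for name in names:
            lower = name.lower()
            if lower.endswith(".swf"):
                problems.append(f"{name}: Flash files are not supported")
            elif lower.endswith((".html", ".htm")):
                text = zf.read(name).decode("utf-8", errors="replace")
                for url in EXTERNAL_REF.findall(text):
                    problems.append(f"{name}: external reference {url}")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee the app works offline, since scripts can still build URLs at runtime.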
A raw HTML example that consists of basic unstyled HTML content taken from the “Additional Online Resources” section of this source page. Note that links are disabled (the blue link is removed and replaced by a display of the target URL). If the links pointed to useful resources (documents, worksheets, sound clips), they could be included in the zip file (deep scraping), with each link changed to a relative path. By modern cheffing standards, this HTML node would be flagged as “deficient” since it lacks basic styling and readability. See the recommended approach to basic HTML styling in the next example.
A basic styled HTML example. The code uses a basic template which was copy-pasted from html-app-starter. This presentation applies basic fonts, margins, and layout to make HTML content more readable. See the section “Usability guidelines” below for more details.
A section from a math textbook that includes text, images, and scripts for rendering math equations.
Proof of concept of a Vue.js App. This is a minimal webapp example based on the vue.js framework. Note the shell script used to tweak the links inside index.html and build.js to make references relative paths.
A PowerPoint slideshow presentation packaged as a standalone zip with PREV/NEXT buttons.
Extracting Web Content¶
Most content integration scripts for web content require some combination of crawling (visiting web pages on the source website to extract the structure), and scraping (extracting the metadata and files from detail pages).
The page Parsing HTML contains some basic info and code examples
that will allow you to get started with crawling and scraping.
You can also watch this cheffing video tutorial
that will show the basic steps of using
BeautifulSoup for crawling a website.
See the sushi-chef-shls code repo
for the final version of the web crawling code that was used for this content source.
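To make the crawling step concrete, here is a minimal sketch that collects the same-site links from one page of HTML. It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies; the class and function names (`LinkCollector`, `same_site_links`) and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute URLs of all <a href> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def same_site_links(html, base_url):
    """Return de-duplicated links that stay on the same host as base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    seen = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in seen:
            seen.append(link)
    return seen
```

A crawler would call this on each fetched page and queue the returned URLs it has not visited yet; fetching, rate limiting, and error handling are left out of this sketch.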
Static assets download utility¶
We have a handy function for fetching all of a webpage’s static assets (JS, CSS, images, etc.), so that, in theory, you could scrape a webpage and display it in Kolibri exactly as you see it in the website itself in your browser.
See the source in
example usage in a simple app: MEET chef,
which comprises articles with text and images, and another example in a complex app: Blockly Games chef, an interactive JS game with images and sounds.
Text should be legible (high contrast, reasonable font size)
Responsive: text should reflow to fit screens of different sizes. You can preview on a mobile device (or use Chrome’s mobile emulation mode) and ensure that the text fits in the viewport and doesn’t require horizontal scrolling (a maximum width is OK but minimum widths can cause trouble).
Ensure navigation within HTML5App is easy to use:
consistent use of navigation links (e.g. side menu with sections)
consistent use of previous/next links
Ensure links to external websites are disabled (remove the <a></a> tag), and instead show the href in brackets next to the link text (so that users could potentially access the URL by some other means). For example: “some other text link text (http://link.url) and more text continues”
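The link-disabling guideline can be sketched as a small text transformation. This is a regex-based illustration only, with a hypothetical function name; for real pages an HTML parser such as BeautifulSoup is more robust, since a regular expression will miss unusual markup.

```python
import re

# Matches <a ... href="http(s)://...">...</a>, up to the first closing </a>.
EXTERNAL_LINK = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\'](https?://[^"\']+)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def disable_external_links(html):
    """Replace external <a> tags with 'link text (url)'.

    Relative links (internal navigation) are left untouched.
    """
    return EXTERNAL_LINK.sub(lambda m: f"{m.group(2)} ({m.group(1)})", html)
```

Only links whose href starts with http:// or https:// are rewritten, so previous/next links inside the app keep working.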
It’s important to “cut” the source website’s content into appropriately sized chunks:
As small as possible, so that resources are individually trackable, assignable, remixable, and reusable across channels and in lessons.
But not too small, e.g., if a lesson contains three parts intended to be followed one after the other, then all three parts should be included in the same HTML5App with internal links.
Use a nested folder structure to represent complex sources. Whenever an HTML page acts as a “container” with links to other pages and PDFs, turn it into a TopicNode (Folder) and put content items inside it.
We also have a starter template for apps, particularly helpful for displaying content that’s mostly text and images, such as articles. It applies some default styling on text to ensure readability, consistency, and mobile responsiveness.
It also includes a sidebar for those apps where you may want internal navigation. However, consider whether it would be more appropriate to turn each page into its own content item and group them together into a single folder (topic).
How to decide between the static assets downloader (above) and this starter template? Prefer the static assets downloader if it makes sense to keep the source styling or JS, such as in the case of an interactive app (e.g. Blockly Games) or an app-like reader (e.g. African Storybook). If the source is mostly a text blob or an article, and particularly if the source styling is not readable or appealing, using the template could make sense, especially given that the template is designed for readability.
The bottom line is to ensure the content meets the usability guidelines above: legible, responsive, easy to navigate, and good-looking (you define “good” :P). Once those are met, use your judgment to pick whichever approach makes sense and that you can use effectively!
Using Local Kolibri Preview¶
The kolibripreview.py script can be used to test
the contents of
webroot/ in a local installation of Kolibri without needing to
go through the whole content pipeline.
Creating a HTMLZipFile¶
No special technique is required to create HTMLZipFile files. As long as the .zip
file contains the index.html in its root (not in a subfolder), it can be used as an
HTMLZipFile and added as a file to an
Since creating the zip files is such a common task of the cheffing process, we
provide two helpers to save you time: the
create_predictable_zip method and
Zipping a folder¶
The create_predictable_zip function can be used to create a zip file from a given directory. This is the recommended
approach for creating zip files since it strips out file timestamps to ensure
that the content hash will not change every time the chef script runs.
Here is some sample code that shows how to use this function:
    import os
    import tempfile

    from ricecooker.utils.zip import create_predictable_zip

    # 1. Create a temporary directory
    webroot = tempfile.mkdtemp()

    # 2. Create the index.html file inside the temporary directory
    indexhtmlpath = os.path.join(webroot, 'index.html')
    with open(indexhtmlpath, 'w') as indexfile:
        indexfile.write("<html><head></head><body>Hello, World!</body></html>")
    # add images to the webroot dir
    # add css files to the webroot dir
    # add js files to the webroot dir
    # ...

    # 3. Zip it! (see https://youtu.be/BODSCrj9FHQ for a laugh)
    zippath = create_predictable_zip(webroot)
You can then use this zippath as follows
zipfile = HTMLZipFile(path=zippath, ...)
and add the
zipfile to a
HTML5AppNode object using its
for a full code sample.
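To illustrate why stripping timestamps keeps the content hash stable, here is a standard-library sketch of a deterministic zip. It is not ricecooker's implementation of create_predictable_zip; the function names, the fixed date, and the use of MD5 as the hash are assumptions made for the demonstration.

```python
import hashlib
import os
import zipfile

# Arbitrary constant timestamp written into every zip entry (assumption).
FIXED_DATE = (2015, 10, 21, 7, 28, 0)


def zip_folder_deterministically(folder, zip_path):
    """Zip `folder` so that identical contents give identical bytes."""
    entries = []
    for root, dirs, files in os.walk(folder):
        for name in files:
            full = os.path.join(root, name)
            arcname = os.path.relpath(full, folder).replace(os.sep, "/")
            entries.append((arcname, full))
    with zipfile.ZipFile(zip_path, "w") as zf:
        # Sorting fixes the entry order regardless of filesystem order.
        for arcname, full in sorted(entries):
            info = zipfile.ZipInfo(arcname, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed file permissions
            with open(full, "rb") as f:
                zf.writestr(info, f.read())
    return zip_path


def md5_of(path):
    """Return the MD5 hex digest of a file's bytes."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()
```

Zipping the same folder twice with this function should give the same digest even if file modification times changed in between, which is the property that lets a chef script rerun without producing a new file each time.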
HTMLWriter utility class¶
ricecooker.utils.html_writer provides basic helper
methods for creating zip files directly in compressed form, without the need for
creating a temporary directory first.
To use the
HTMLWriter class, you must enter the
    from ricecooker.utils.html_writer import HTMLWriter

    with HTMLWriter('./myzipfile.zip') as zipper:
        # Add your code here
To write the main file (
index.html in the root of the zip file), use the
    contents = "<html><head></head><body>Hello, World!</body></html>"
    zipper.write_index_contents(contents)
You can also add other files (images, stylesheets, etc.) using
To check if a file exists in the zipfile, use the
    # Zipfile has "index.html" file
    zipper.contains('index.html')     # Returns True
    zipper.contains('css/style.css')  # Returns False
You can then call
zipfile = HTMLZipFile(path='./myzipfile.zip', ...) and add
zipfile to a
HTML5AppNode object using its
See the source code for more details: ricecooker/utils/html_writer.py.
Conceptually, we could say that
.epub files are a subkind of the
.zip file format, but Kolibri handles them differently, using
H5PFile for more info.