Files
Each ricecooker content node is associated with one or more files stored in a
content-addressable file storage system. For example, to store the file sample.pdf
we first compute md5 hash of its contents (say abcdef00000000000000000000000000)
then store the file at the path storage/a/b/abcdef00000000000000000000000000.pdf.
The same storage mechanism is used on Kolibri Studio and Kolibri applications.
File objects
The following file classes are defined in the module ricecooker.classes.files:
AudioFile # .mp3
DocumentFile # .pdf
EPubFile # .epub
HTMLZipFile # .zip containing HTML,JS,CSS
H5PFile # .h5p
VideoFile # .mp4 (`path` is local file system or url)
WebVideoFile # .mp4 (downloaded from `web_url`)
YouTubeVideoFile # .mp4 (downloaded from youtube based on `youtube_id`)
SubtitleFile # .vtt (`path` is local file system or url)
YouTubeSubtitleFile # .vtt (downloaded from youtube based on `youtube_id` and `language`)
SlideImageFile # .png/.jpg an image that is part of a SlideshowNode
ThumbnailFile # .png/.jpg/.jpeg (`path` is local file system or url)
Base classes
The file classes extent the base classes File(object) and DownloadFile(File).
When creating a file object, you must specify the following attributes:
path(str): this can be either local path likedir/subdir/file.ext, or a URL like ‘http://site.org/dir/file.ext’.language(str orle_utilslanguage object): what is the language is the file contents.
Path
The path attribute can be either a path on the local filesystem relative to the
current working directory of the chef script, or the URL of a web resource.
Language
The Python package le-utils defines the internal language codes used throughout
the Kolibri platform (e.g. en, es-MX, and zul). To find the internal language
code for a given language, you can locate it in the lookup table,
or use one of the language lookup helper functions defined in le_utils.constants.languages:
getlang(<code>) --> lang_obj: basic lookup used to ensure<code>is a valid internal language code (otherwise returnsNone).getlang_by_name(<Language name in English>) --> lang_obj: lookup by name, e.g.Frenchgetlang_by_native_name(<Language autonym>) --> lang_obj: lookup by native name, e.g.,françaisgetlang_by_alpha2(<two-letter ISO 639-1 code>) --> lang_obj: lookup by standard two-letter code, e.gfr
You can either pass lang_obj as the language attribute when creating nodes and files,
or pass the internal language code (str) obtained from the property lang_obj.code.
See [languages][./languages.md] to read more about language codes.
Audio files
Use the AudioFile(DownloadFile) class to store mp3 files.
audio_file = AudioFile(
path='dir/subdir/lecture_recording.mp3',
language=getlang('en').code
)
Document files
Use the DocumentFile class to add PDF documents:
document_file = DocumentFile(
path='dir/subdir/lecture_slides.pdf',
language=getlang('en').code
)
Use the EPubFile class to add ePub documents:
document_file = EPubFile(
path='dir/subdir/lecture_slides.epub',
language=getlang('en').code
)
HTML files
The HTML5ZipFile class is a generic zip container for web content like HTML, CSS,
and JavaScript. To be a valid HTML5ZipFile file, the file must have a index.html
in its root. The file index.html will be loaded within a sandboxed iframe when
this content item is accessed on Kolibri.
Chef authors are responsible for scraping the HTML and all the related JS, CSS,
and images required to render the web content, and creating the zip file.
Creating a HTML5ZipFile is then done using
document_file = HTML5ZipFile(
path='/tmp/interactive_js_simulation.zip',
language=getlang('en').code
)
Use the H5PFile class to add H5P files:
h5p_file = H5PFile(
path='dir/subdir/presentation.h5p',
language=getlang('en').code
)
Videos files
The following file classes can be added to the VideoNodes:
class VideoFile(DownloadFile)
class WebVideoFile(File)
class YouTubeVideoFile(WebVideoFile)
class SubtitleFile(DownloadFile)
class YouTubeSubtitleFile(File)
To create VideoFile, you need the code
video_file = VideoFile(
path='dir/subdir/lecture_video_recording.mp4',
language=getlang('en').code
)
VideoFiles can also be initialized with ffmpeg_settings (dict), which will be used to determine compression settings for the video file.
video_file = VideoFile(
path = "file:///path/to/file.mp4",
ffmpeg_settings = {"max_height": 480, "crf": 28},
language=getlang('en').code
)
WebVideoFiles must be given a web_url (str) to a video on YouTube or Vimeo, and YouTubeVideoFiles must be given a youtube_id (str).
video_file2 = WebVideoFile(
web_url = "https://vimeo.com/video-id",
language=getlang('en').code,
)
video_file3 = YouTubeVideoFile(
youtube_id = "abcdef",
language=getlang('en').code,
)
WebVideoFiles and YouTubeVideoFiles can also take in download_settings (dict) to determine how the video will be downloaded and high_resolution (boolean) to determine what resolution to download.
Subtitle files can be created using
subs_file = SubtitleFile(
path = "file:///path/to/file.vtt",
language = languages.getlang('en').code,
)
Kolibri uses the .vtt subtitle format internally, but the following formats can
be automatically converted: .srt, .ttml, .scc, .dfxp, and .sami.
The subtitle format is inferred from the file extension of the path argument.
Use the subtitlesformat keyword argument in cases where the path does not end
on a format extension:
subs_file = SubtitleFile(
path = "http:/srtsubs.org/subs/29323923",
subtitlesformat = 'srt', # specify format because not in URL
language = languages.getlang('en').code,
)
You can also get subtitles using YouTubeSubtitleFile which takes a youtube_id
and youtube language code (may be different from internal language codes).
Use the helper method is_youtube_subtitle_file_supported_language to test if
a given youtube language code is supported by YouTubeSubtitleFile and skip the
ones that are not currently supported. Please let the LE content team know when
you run into language codes that are not supported so we can add them.
Thumbnail files
The class ThumbnailFile defined thumbnails that can be added to channel,
topic nodes, and content nodes. The extensions .png, .jpg, and .jpeg and supported.
The recommended size for thumbnail images is 400px by 225px (aspect ratio 16:9).
SlideImageFile files
The SlideImageFile class is used in conjunction with the SlideshowNode class
to create powerpoint-like slideshow presentations.
slide_image_file = SlideImageFile(
path='some/local/path/firstslide.png',
caption="The caption text to be displayed below the slide image",
descriptive_text="Description of the slide for users that cannot see the image",
language=getlang('en').code,
)
Use the caption field to provide the text that will be displayed under the slide
image as part of the presentation. Use the descriptive_text field to provide
the “alt text” the image contents for visually impaired users.
File size limits
Kolibri Studio does not impose any max-size limits for files uploaded, but chef authors need to keep in mind that content channels will often be downloaded over slow internet connections and viewed on devices with limited storage.
Below are some general guidelines for handling video files:
Short videos (5-10 mins long) should be roughly less than 15MB
Longer video lectures (1 hour long) should not be larger than 200MB
High-resolution videos should be converted to lower resolution formats: Here are some recommended choices for video vertical resolution:
Use max height of
480for videos that work well in low resolution (most videos)Use max height of
720for high resolution videos (lectures with writing on board)
Ricecooker can handle the video compression for you if you specify the
--compresscommand line argument, or by setting theffmpeg_settingsproperty when creatingVideoFiles. The default values forffmpeg_settingsare as follows:ffmpeg_settings = {'crf':32, 'max_height':480 }
The
ffmpegsettingcrfstands for Constant Rate Factor and is very useful for controlling overall video quality. Settingcrf=24produces high quality video (and possibly large file size),crf=28is a mid-range quality, and values ofcrfabove 30 produce highly-compressed videos with small size.
PDF files are usually not large, but PDFs with many pages (more than 50 pages) can be difficult to views and browse on devices with small screens, so we recommend that long PDF documents be split into separate parts.
Note: Kolibri Studio imposes a file storage quota on a per-user basis. By default
the storage limit for new accounts is 500MB. Please get in touch with the content
team by email (content@le...) if you need a quota increase.