Planet Sakai

January 16, 2018

Michael Feldstein

College Rankings Revisited: What Might an Artificial Intelligence Think?

This post is from guest contributor Steve Lattanzio from MetaMetrics. While we do not tend to cover college rankings at e-Literate, we do care about transparency in usage of data as well as understanding opportunities where technology and data might inform students, faculty, administrators and the general educational community. The following post is an interesting exploration in the usage of the full set of College Scorecard data in a way that is understandable and usable. - ed

Emphasis on might.

Ranking colleges has become a bit of a national pastime. There are many organizations that publish “overall” rankings for our institutions of higher education (such as Forbes, Niche, Times Higher Education, and US News & World Report), each with their own methodologies.

We don’t typically get the complete and precise picture of how these rankings are constructed. The common assertion by critics is that these methodologies, which are definitely subjective, are also quite arbitrary. They may seem complex, often relying on many different variables, but at the end of the day experts and other higher education authorities are making a set of choices about what data should be used and how to weigh those variables. What if those experts were just tweaking what variables to include and how to weigh them until they got results that “feel right” or meet some other criteria they had in mind? Some methodologies go a bit further and outright include human judgments, sounding the fudge-factor alarm. Furthermore, there is reason to be concerned about the fact that none of these rankings exist in a vacuum—it’s very possible that they are, to some extent, reflections of each other (see “herding” in the polling industry). At the same time and counter to herding, there’s a desire to provide a unique twist to rankings which leads to a lack of consensus about what the underlying construct should be behind overall college rankings.

Against this backdrop, we now have access to ever-increasing amounts of data about our colleges. Newly released datasets like the College Scorecard present a vast trove of data to the public, enabling all sorts of new analytics. But while this provides an apparently more objective foundation for analysis, leveraging all of the data can be challenging.

This led MetaMetrics to consider whether we could apply some more current machine learning methods to overcome these issues, the type of methods that we employ every day in our K-12 research. Was it possible to have a computer algorithm take in a bunch of raw data and, through a sufficiently black-box approach, remove decision points that allow ratings to become subjective? Forgive me the gratuitous use of such a buzzword, but could an artificial intelligence discover a latent dimension hidden behind all the noise that was driving data points such as SAT scores, admission rates, earnings, loan repayment rates, and a thousand other things, instead of combining just a few of them in a subjective fashion?

Out of an abundance of concern that the results of this experiment would be misrepresented, we’ll immediately point out that we make no claim that the rankings in this piece are the proper method for ranking these institutions, and we caution anyone against thinking of them as such. It is merely an alternative that we present that might be similar enough to other rankings to validate them, or different enough to invalidate them or this ranking. It is also possible that ranking colleges is an exercise in futility.

The data

Choosing a college is likely to be one of the most consequential decisions, financially and otherwise, of a postsecondary education consumer’s life. In an attempt to bring transparency to higher education and empower young Americans to make a more informed choice, the Obama administration created the College Scorecard in 2015.

The College Scorecard contains thousands of variables for thousands of schools going back almost two decades. It’s a great initiative that allows someone to look at all of the usual suspects, such as average SAT scores, along with very specific things, such as the “percent of not-first-generation students who transferred to a 4-year institution and were still enrolled within 2 years.” The catch, however, is that there is a lot of missing data and only a minority of the possible data elements actually exist. It’s fairly straightforward to search, filter, or sort by specific fields of information for specific schools, but it’s not really clear how you could utilize all of the data. Consequently, most analytic efforts with the College Scorecard are likely to gravitate towards the archetypal and complete variables you would find in a much less ambitious dataset anyway. Our goal is to take advantage of all of the data available in the College Scorecard.

The algorithm

Traditional statistical analyses work best with clean and complete data that have nice linear relationships. These analyses are also going to have trouble handling too many variables at once. But cleaning and curating specific variables in the dataset presents more opportunities for humans to unduly (wittingly or not) impact the final results.

We also find ourselves lacking a dependent variable to model. That is, we aren’t trying to predict one piece of data from a bunch of other data. We built an algorithm to find something not directly observable in the data that’s a driving force behind a lot of the directly observable things in the data. In machine learning, such a task is considered to be “unsupervised learning.”

To tackle this problem, we use neural networks [1] to perform “representational learning” through the use of what is called a stacked autoencoder. I’ll skip over the technical details, but the concept behind representational learning is to take a bunch of information that is represented in a lot of variables, or dimensions, and represent as much of the original information as possible with a lot fewer dimensions. In a stacked neural network autoencoder, data entering into the network is squashed down into fewer and fewer dimensions on one side and squeezed through a bottleneck. On the other side of the network, that squashed information is unpacked in an attempt to reconstruct the original data. Naturally, information is lost during this process, but it’s lost in a deliberate fashion as the AI learns how it can combine the raw variables into new, more efficient, variables that it can push through a bottleneck consisting of fewer channels and still reconstruct as much of the original data as possible. To be clear, the AI isn’t figuring out which subset of variables it wants to keep and which it wants to discard; it is figuring out how to express as much of the original data as possible in brand new meta-variables that it is concocting by combining the original data in creative ways. As noise and redundancies are squeezed out over the many layers of the deep neural network, the hope is that a set of underlying dimensions - ones that represent the most important, overarching features of the data - emerge from the chaos, with one being a candidate for overall college quality.
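To make the bottleneck idea concrete, here is a minimal sketch in Python. It is not the network used for the rankings: it is a single-layer linear autoencoder (one bottleneck channel, no stacking, no nonlinearity, no missing-data handling) trained by plain gradient descent on made-up data. The loadings, learning rate, starting weights, and function names are all assumptions chosen for illustration.

```python
# Toy linear autoencoder: 4 observed variables -> 1 latent code -> 4 reconstructed variables.
# Illustrative only; the stacked network and the College Scorecard data are not reproduced here.

def make_toy_data(n=21):
    """Build rows whose 4 columns are all driven by one hidden factor t in [-1, 1]."""
    loadings = [1.0, 2.0, -1.0, 0.5]  # hypothetical factor loadings
    rows = []
    for i in range(n):
        t = -1.0 + 2.0 * i / (n - 1)
        rows.append([t * w for w in loadings])
    return rows

def encode(x, w_enc):
    """Squeeze one row through the 1-D bottleneck (a weighted sum of all inputs)."""
    return sum(wi * xi for wi, xi in zip(w_enc, x))

def decode(z, w_dec):
    """Unpack the 1-D code back into the original number of variables."""
    return [wj * z for wj in w_dec]

def reconstruction_loss(data, w_enc, w_dec):
    """Mean over rows of the summed squared reconstruction error."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w_enc), w_dec)
        total += sum((h - xi) ** 2 for h, xi in zip(x_hat, x))
    return total / len(data)

def train(data, epochs=300, lr=0.05):
    """Full-batch gradient descent on the reconstruction loss."""
    dim = len(data[0])
    w_enc = [0.1] * dim  # fixed small starting weights keep the run deterministic
    w_dec = [0.1] * dim
    for _ in range(epochs):
        g_enc = [0.0] * dim
        g_dec = [0.0] * dim
        for x in data:
            z = encode(x, w_enc)
            err = [wj * z - xi for wj, xi in zip(w_dec, x)]
            # Gradient of the row's squared error with respect to the code z
            dz = sum(2.0 * e * wj for e, wj in zip(err, w_dec))
            for j in range(dim):
                g_dec[j] += 2.0 * err[j] * z / len(data)
                g_enc[j] += dz * x[j] / len(data)
        w_enc = [w - lr * g for w, g in zip(w_enc, g_enc)]
        w_dec = [w - lr * g for w, g in zip(w_dec, g_dec)]
    return w_enc, w_dec

data = make_toy_data()
initial_loss = reconstruction_loss(data, [0.1] * 4, [0.1] * 4)
w_enc, w_dec = train(data)
final_loss = reconstruction_loss(data, w_enc, w_dec)
codes = [encode(x, w_enc) for x in data]  # one "meta-variable" value per row
```

Note that the encoder weights touch every input column: the code is a blend of all four variables rather than a selection of one of them, which is the distinction the paragraph above draws.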

The results

The nature and context of the representational learning problem dictates how far you can reasonably compress a dataset. In this case, it’s reasonable to compress to as few dimensions as possible where the meanings of the dimensions are still interpretable and we retain some amount of broad ability to reconstruct the original data.

It turns out that we were able to compress all of the information down to just two dimensions, and the significance of those two dimensions was immediately clear.

One dimension has encoded a latent dimension that is related to things such as the size of the school and whether it is public or private (in fact, the algorithm decided there should be a rift mostly separating larger public institutions from smaller schools). The other dimension is a strong candidate for overall quality of a school and is correlated with all of the standard indicators of quality. It seems as if the algorithm learned that higher education, if you must break it down into two things, is best broken down into two dimensions that can loosely be described as quantity and quality.

Below are the top 20 colleges according to the AI and the resultant two dimensions.

   1. Duke University
   2. Stanford University
   3. Vanderbilt University
   4. Cornell University
   5. Brown University
   6. Emory University
   7. University of Virginia
   8. University of Chicago
   9. Boston College
  10. University of Notre Dame
  11. College of William and Mary
  12. University of Southern California
  13. Wesleyan University
  14. Yale University
  15. Massachusetts Institute of Technology
  16. Northwestern University
  17. Bucknell University
  18. University of Pennsylvania
  19. Santa Clara University
  20. Carnegie Mellon University

Top 20 Colleges in the United States, according to our AI. [2][3]

Visualization of AI-based college rankings

College quality between 2005-2014 for the top 10 private and top 10 public schools as of 2014. The line thickness is proportional to the size of the student population.

Chart of quality vs quantity

College quality versus quantity for the top 10 private and top 10 public schools in 2014. Circle area is proportional to the size of the student population. Approximate SAT score contour lines are superimposed.

Most of the schools in the top 20 are present in the top 20 in at least one of the published rankings listed earlier. Six schools are the newcomers: University of Virginia (7), Boston College (9), William and Mary (11), Wesleyan (13), Bucknell University (17), and Santa Clara University (19). Of those, the first four schools are reasonably close to being ranked in the top 20 in at least one other ranking, while the latter two are more surprising.

The most conspicuous name is 19th ranked Santa Clara University, a private school of about 5,000 undergraduate students located in Silicon Valley. It is typically ranked in the low 100s (the consensus still places it in the top 10% of all schools) with its best ranking of 64 by Forbes. However, it is impressing the AI and likely disproportionately benefits from a more holistic use of the data instead of using only the typical metrics used to differentiate top schools.

The most conspicuously missing names are the Ivy League schools Harvard (ranked 31st by the AI), Princeton (51), Columbia (23) and Dartmouth (26) along with Caltech (74) and Rice (25). It seems like blasphemy to rank Harvard and Princeton, arguably the most prestigious colleges in the United States, so far down. Caltech at 73 is probably the most jarring of all. However, we take this opportunity to remind you that the AI is not developing a metric strictly of prestige, reputation, the academic caliber of students, or earnings potential of its graduates, but something else that is different, but related.

Duke, Stanford, and Vanderbilt are at the top of the rankings and in any given year any one of them can take the top spot according to the AI. All three schools are often, if not always, ranked in the top 20 in other published rankings. Duke sometimes makes the top five while Stanford does so more often.

Although it goes against our human instincts, not too much weight should be given to the exact ranking of the top schools—relative to the variation in the rest of the field, the differences in quality are small and it’s very tight at the top.

The caveats and more

Throwing things through a black box is often a double-edged sword. You can avoid certain errors and biases that occur in human thinking, but algorithms often come with their own errors and biases, or at least what we would consider errors and biases. To an algorithm, data is data, and it’s all fair game to use to meet some end. What if the neural network believes higher tuition rates, because they are associated with other favorable school characteristics, place a school higher on the dimension that encodes those things? A human would know that higher costs, without commensurate changes in other metrics, should count against a school. Sure, if corresponding quality was not reflected in other metrics, it’s likely the algorithm would mostly ignore the tuition data, but it might not actually lower the resulting quality output. That’s something humans bring to the table with their broad and vast real-world knowledge.

Even more concerning, what if it uses racial demographics to do the same? Unsurprisingly, an algorithm that’s agnostic to what data it is fed has the potential to be politically and socially insensitive. One may think the solution is to just curate what goes into the black box, but there are often proxies for the same information that the algorithm can exploit. This is a commonly cited, controversial hazard of black box machine learning algorithms that should always be kept in mind.

Additionally, these results are based on data aggregated across entire schools. Each student applying to or attending a school has a unique situation. There is much variation in a student population and the programs offered within a school. A single measure or ranking applied to a whole school does not tell you everything you need to know to make the best college decision, but it can provide some valuable context and some level of accountability for the schools themselves.

Of course, there is the axiom that an analysis can only be as good as the data, and while the AI should be relatively robust to sporadic random data errors, systematic errors are another story.

There are many more nuanced and technical caveats for this type of analysis. It is not perfect and the rankings should not be viewed as infallible. But when viewed among other college rankings, its validity is undeniable. It’s not merely a measure of prestige, and it addresses most of the concerns of critics of college rankings, while undoubtedly raising some new ones. However, the results somewhat “feel right.” The renowned sabermetrician Bill James was credited with saying, “If you have a metric that never matches up with the eye test, it’s probably wrong. And if it never surprises you, it’s probably useless. But if four out of five times it tells you what you know, and one out of five it surprises you, you might have something.” I think we might have something.

Whether you are researching schools to apply to, are curious about your own alma mater, or generally curious, full results can be found in an interactive table, along with other (possibly more useful and less controversial) results that are generated from this type of methodology (such as discovering “hidden” Ivy League schools, value-add metrics, and relatedness of schools).


  1. We actually train an ensemble of neural networks and average for more reliable results.
  2. These rankings are as of 2014, the last year of the College Scorecard that has sufficient data.
  3. Wake Forest University is in the top 20 between the years 2004-2009, but has insufficient data afterwards.

Steve Lattanzio is a Research Engineer at MetaMetrics Inc., working in AI, machine learning, natural language processing, and data science. MetaMetrics is an education research company and is the developer of The Lexile® Framework for Reading and The Quantile® Framework for Mathematics.

The post College Rankings Revisited: What Might an Artificial Intelligence Think? appeared first on e-Literate.

by Steve Lattanzio at January 16, 2018 09:02 PM

Apereo Foundation

ATLAS Now Open!

ATLAS 2018 is now OPEN! Please submit by February 26 2018.

by Michelle Hall at January 16, 2018 09:01 PM

Adam Marshall

ATLAS (Teaching and Learning Awards) 2018 is now OPEN

NB Please get in touch with the WebLearn team if you are interested in entering. We will be very happy to help you with your entry. Oxford has supplied winners in the past. 

Posted on behalf of the ATLAS Committee

The ATLAS committee welcomes submissions from the Apereo open source education community. We invite applications that demonstrate innovative teaching and learning using Sakai (ie WebLearn), OAE, Karuta, Xerte and/or Opencast.

Based on merit, we hope to select up to six winners. Winners will be announced by the end of March 2018 and recognized at the Open Apereo Conference June 3-7, 2018 in Montreal, Quebec, Canada. Registration and travel expenses will be covered for award winners.

For further inquiries, please email ATLAS Chair Luisa Li (
For more information on ATLAS and previous winners, visit:

There are two steps you need to consider to apply for this award:

Step 1: Download the application and rubric.

You will begin the process of applying for the award by completing a brief questionnaire that helps identify the best innovation rubric and application form for your entry.

Click here to start the brief filter questionnaire.

Step 2: Complete and submit the application form by February 26 2018.

You will need to fill out the application form that you have downloaded in Step 1, and save the file as a PDF with your name and the submission date in the file name, for example “ATLAS_cbrown_Feb05_2018.pdf”.  Use the link below to submit your application to the ATLAS awards committee.

Click here to submit your application.

*Notification on Supplementary Videos/Animations for Your Application:

New this year, we accept supplementary videos or animations to further demonstrate merits of your course/project or portfolio. Supplementary video/animations are NOT a requirement. Below are the guidelines for your videos:

  • Videos or animations should be produced to demonstrate innovative teaching and learning of an application and show a direct association with a criterion in the application. They should not be a snippet of actual learning materials used in an instructional unit of your application.
  • Videos or animations supplement the screenshots you shall provide as the evidence to corroborate your rating for each criterion in the application. They do not replace the screenshots.
  • Videos or animations should be a MAXIMUM of 5 minutes. For example, you may provide one 5-min-long video or five 1-min-long videos.
  • We encourage you to provide English captions in these videos or animations to meet accessibility standards and help peer reviewers understand your video content in case you use French or Spanish in the video.
  • Any videos or animations provided should be independently created or you MUST provide information about support of the creation of the resource in the application.
  • Please upload videos to YouTube, Vimeo, or another video sharing platform and make them public. Then share the links to these videos in the corresponding evidence section of each criterion in the application.


by Adam Marshall at January 16, 2018 04:43 PM

Michael Feldstein

Fall 2016 Top 20 Largest Online Enrollments In US – With Trends Since 2012

The National Center for Educational Statistics (NCES) and its Integrated Postsecondary Education Data System (IPEDS) provide the most official data on colleges and universities in the United States. This is the fifth year of data, and we have an opportunity to view trends over time.

Let’s look at the top 20 online programs for Fall 2016 (in terms of total number of students taking at least one online course for grad and undergrad levels combined) in the US. Some notes on the data source:

  • I have combined the categories ‘students exclusively taking distance education courses’ and ‘students taking some but not all distance education courses’ to obtain the ‘at least one online course’ category;
  • IPEDS tracks data based on the accredited body, which can differ for systems – I manually combined most for-profit systems into one institution entity as well as Arizona State University;
  • I have highlighted for-profit institutions in yellow and added sparklines to help visualize trends;
  • See this post for Fall 2016 profile by sector and state.
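The two data-preparation steps in the notes above (adding the two distance education categories, and rolling several accredited bodies up into one system) can be sketched in Python as follows. The institution names, field names, and counts are hypothetical placeholders, not IPEDS values.

```python
from collections import defaultdict

def at_least_one_online(exclusive_de, some_de):
    """Students taking at least one distance education course."""
    return exclusive_de + some_de

# Hypothetical rows: one per accredited body, tagged with the system it belongs to
rows = [
    {"system": "Example System", "exclusive_de": 100, "some_de": 50},
    {"system": "Example System", "exclusive_de": 200, "some_de": 25},
    {"system": "Other College", "exclusive_de": 10, "some_de": 40},
]

# Roll accredited bodies up into one entity per system
totals = defaultdict(int)
for row in rows:
    totals[row["system"]] += at_least_one_online(row["exclusive_de"], row["some_de"])

# Rank systems by combined online enrollment, largest first
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```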

Top 20 Online Enrollment in US

One additional trend to capture is the dramatic change in the (previous) dominance of the University of Phoenix for overall and online enrollment. For perspective I have also labeled Western Governors University and Southern New Hampshire University.

Top 20 Online Enrollment 2012 thru 2016

The post Fall 2016 Top 20 Largest Online Enrollments In US – With Trends Since 2012 appeared first on e-Literate.

by Phil Hill at January 16, 2018 01:09 AM

January 15, 2018

Apereo Foundation

Opencast Community Summit 2018: Registration is Open

The annual Opencast conference will be hosted from February 14 to February 16, 2018 at the University of Vienna. The meeting will allow participants from the Opencast community to present their current activities related to Opencast and academic video management in general.

by Michelle Hall at January 15, 2018 08:38 PM

January 14, 2018

Michael Feldstein

Fall 2016 IPEDS Data: New Profile of US Higher Ed Online Education

The National Center for Educational Statistics (NCES) and its Integrated Postsecondary Education Data System (IPEDS) provide the most official data on colleges and universities in the United States. I have been analyzing and sharing the data since the inaugural Fall 2012 dataset.

Below is a profile of online education in the US for degree-granting colleges and universities, broken out by sector and for each state.

Please note the following:

  • For the most part distance education and online education terms are interchangeable, but they are not equivalent as DE can include courses delivered by a medium other than the Internet (e.g. correspondence course).
  • I have provided some flat images as well as an interactive graphic at the bottom of the post. The interactive graphic has much better image resolution than the flat images.
  • There are two tabs below in the interactive graphic - the first shows totals for the US by sector and by level (grad, undergrad); the second shows a map view allowing filtering by sector.

Table of DE enrollments

There is also a map view of state data colored by number of, and percentage of, students taking at least one online class for each sector. If you hover over any state you can get the basic data. As an example, here is a view highlighting Virginia private 4-year institutions.

State map view of DE enrollments

Interactive Graphic

For those of you who have made it this far, below is the interactive graphic, which can also be found here. Enjoy the data.

The post Fall 2016 IPEDS Data: New Profile of US Higher Ed Online Education appeared first on e-Literate.

by Phil Hill at January 14, 2018 10:18 PM

January 12, 2018

Dr. Chuck

Abstract: Implementing a Standards Compliant Educational App Store with Tsugi

When you combine the IMS LTI, Deep Linking, and Common Cartridge standards and use them together in a coordinated fashion, you can build an Educational App Store that has smooth integration into the major LMS systems using only IMS standards. Tsugi ( is a software framework that reduces the effort required to build IMS standards-compliant applications and integrate them into a learning ecosystem. This presentation will highlight how IMS standards can be used to deploy an educational app store like and talk about how an App Store lays a foundation towards a Next Generation Digital Learning Ecosystem (NGDLE).

by Charles Severance at January 12, 2018 07:28 PM

January 08, 2018

Adam Marshall

WebLearn and Turnitin Courses and User Groups: Hilary Term 2018

IT Services offers a variety of taught courses to support the use of WebLearn and the plagiarism awareness software Turnitin. Course books for the formal courses (3-hour sessions) can be downloaded for self study. Places are limited and bookings are required. All courses are free of charge.

Click on the links provided for further information and to book a place.

WebLearn 3-hour courses:

Byte-sized lunch time sessions:

These focus on particular tools with plenty of time for questions and discussion

Plagiarism awareness courses (Turnitin):

User Group meetings:

by Jill Fresen at January 08, 2018 05:05 PM

January 01, 2018

Dr. Chuck

A Happy Tsugi New Year – A look back and a look ahead to 2018

I figured I should reflect on Tsugi as we move into the new year. It has been over four years since I started the code base that would become Tsugi in 2013.

In 2017, we made a lot of progress so Tsugi can be used by a much broader audience. Some important Tsugi achievements include:

– A place to host open source Tsugi tools at scale for free – – this required new Amazon features and required improving the “App Store” experience for tools-only servers. There is an app store with metadata and screenshots like any app store.

– Adding support for Google Classroom in addition to LTI for LMS integration. I heard that Google Classroom already owns >60% of the K12 market share. I think that over time Classroom will erode market share in K12 market and in time will begin to make inroads into higher education starting with Community Colleges / FE. So strategically, I want Tsugi to have an early presence in that new emerging market.

– Cleaned up the existing tools in the “tsugitools” repo – like the peer grader with an eye to making the tools more usable by folks other than me :)

– Started to lay the legal groundwork to establish the first Tsugi Commercial affiliate. This is a lower priority activity – once the free/open Tsugi / TsugiCloud is solid – I will progress a commercial offering. If demand emerges for a commercial Tsugi offering, it will be quite easy to replicate the TsugiCloud infrastructure for a commercial offering.

Looking forward to 2018, I have a few goals:

– Begin to document and market to build a beta customer base. My first customers will likely be Sakai schools but I will work to get exposure to K12 alpha testers to get a small base of K12 customers. Let me know if you are interested in being an early customer or if you know someone who might want to use TsugiCloud.

– Recruit new open source applications for TsugiTools and host them for free on TsugiCloud

– Focus on cleaning up the developer documentation on to make it easier to develop new applications.

– I will be running a “Tsugi Developer” class on-campus at UMich during Winter semester. This will help improve my documentation and work out the kinks in the developer experience.

– It will be a high priority to build 2-3 new high-quality tools: (1) A threaded discussion tool with grading, (2) A wiki-like tool, perhaps based on HAX from ELMSN, (3) A tool to include H5P content. This fits with the 2018 focus on building tools on the Tsugi infrastructure.

– I am trying to think of something to trigger Tsugi tool development – perhaps a hack-athon or a contest – something to build interest in developing tools.

So it should be an interesting 2018. There is a lot of work to do but a lot of great work to build on.

by Charles Severance at January 01, 2018 07:12 PM

December 28, 2017

Dr. Chuck

App Store Progress on

I have done a bunch of updates to Tsugi’s support for stand alone App Stores (without any kind of lesson content). This is all in production at

I have expanded the contract in register.php for the tools to describe themselves and improved the pattern in .htaccess / tsugi.php to better support the App Store. You can see this all in action at:

Play with “Test” and “Tool URLs”. A much smoother flow and richer experience.

You can see the new patterns for developers to take advantage of this in a relatively simple tool like:

Look at .htaccess / tsugi.php / register.php and the store folder which holds screen shots. Some notes:

– The new and expanded register.php is what drives the pretty store view under /tsugi/store

– The new tsugi.php makes it so every tool has a Canvas configuration URL and can dump its own configuration in JSON (more to come here):

– There are new options in tsugi/config.php to include a privacy url and service level agreement url:

$CFG->privacy_url = '';
$CFG->sla_url = '';

These are important when connecting to Google Classroom so you should have them for your sites. Don’t point to mine – make your own and be honest and thorough.

And while I am on the topic – you might want to take a minute and play with Google Classroom. It is easiest to use a non-enterprise Google account. Some enterprises (like do not let their users use Google Classroom. But my account works fine.

Log in to and make a course. Then go to

And connect to Google Classroom. All of a sudden little green squares show up to let you push tools into Google. Grades flow and everything. Google Classroom flow is pretty nice – but like any proprietary integration – to make it work on the Tool Provider side requires special tooling.

So in summary, if you are a Tsugi tool developer, you might want to up your game in register.php and tsugi.php (adding .htaccess if you don’t already have it) and add some screen shots in a store folder. The App Store falls back nicely to a simpler view until you upgrade your tool to feed the necessary metadata to the expanded store.

Hope you like it and comments welcome.

by Charles Severance at December 28, 2017 09:49 PM

December 21, 2017

Adam Marshall

Write your own tools and utilities using WebLearn’s Entity Broker REST interface

Entity Broker is a REST web service interface to Sakai. It is self-documenting, see, but as you will see, some of the documentation is somewhat lacking.

I stumbled across a blog post that Damion Young made about WebLearn’s (Sakai’s) Entity Broker; he has very kindly filled in some of the missing pieces of the jigsaw.

It is entirely possible to write useful utilities using a combination of JavaScript, HTML and calls to Entity Broker, indeed, this is how the original Mobile Oxford offered a mobile interface to WebLearn. (The current Mobile Oxford no longer offers such an interface.)
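The utilities mentioned here are written in JavaScript, but the underlying HTTP calls look the same from any language. As a rough sketch in Python, assuming Entity Broker is mounted under /direct and selects the response format by file extension (check your own installation's self-documentation before relying on either), a small helper might look like this. The host name, the "site" prefix, the site id, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

def entity_url(base_url, prefix, entity_id=None, fmt="json"):
    """Build an Entity Broker style URL: <base>/direct/<prefix>[/<id>].<fmt>"""
    parts = [base_url.rstrip("/"), "direct", quote(prefix, safe="")]
    if entity_id is not None:
        parts.append(quote(str(entity_id), safe=""))
    return "/".join(parts) + "." + fmt

def fetch_entity(url, session_cookie=None):
    """GET the URL and decode the JSON body; most data needs a logged-in session."""
    request = urllib.request.Request(url)
    if session_cookie:
        request.add_header("Cookie", session_cookie)
    with urllib.request.urlopen(request) as response:
        return json.load(response)

BASE = "https://weblearn.example.org"  # hypothetical host, not a real server
site_list_url = entity_url(BASE, "site")
one_site_url = entity_url(BASE, "site", "abc123")
```

Only the URL construction runs here; fetch_entity is defined but not called, since it needs a real server and a valid session.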

Some of the most recent WebLearn utilities / dashboards have been written in JavaScript, examples,


by Adam Marshall at December 21, 2017 11:07 AM

October 11, 2017

Apereo OAE

Getting started with LibreOffice Online - a step-by-step guide from the OAE Project

As developers working on the Apereo Open Academic Environment, we are constantly looking for ways to make OAE better for our users in universities. One thing they often ask for is a more powerful word processor and a wider range of office tools. So we decided to take a look at LibreOffice Online, the new cloud version of the LibreOffice suite.

On paper, LibreOffice Online looks like the answer to all of our problems. It’s got the functionality, it's open source, it's under active development - plus it's backed by The Document Foundation, a well-established non-profit organisation.

However, it was pretty difficult to find any instructions on how to set up LibreOffice Online locally, or on how to integrate it with your own project. Much of the documentation that was available was focused on a commercial spin-off, Collabora Online, and there was little by way of instructions on how to build LibreOffice Online from source. We also couldn't find a community of people trying to do the same thing. (A notable exception to this is m-jowett who we found on GitHub).

Despite this, we decided to press on. It turned out to be even trickier than we expected, and so I decided to write up this post, partly to make it easier for others and partly in the hope that it might help get a bit more of a community going.

Most of the documentation recommends running LibreOffice Online (or LOO) using the official Docker container, found here. Since we recently introduced a dockerised development setup for OAE, this seemed like a good fit. A downside to this is that you can’t tweak the compilation settings, and by default, LOO is limited to 20 connections and 10 documents.

While this limitation is fine for development, OAE deployments typically have tens or hundreds of thousands of users. We therefore decided to work on compiling LOO from source to see whether it would be possible to configure it in a way that allows it to support these kinds of numbers. As expected, this made the project substantially more challenging.

I’ve written down the steps to compile and install LOO in this way below. I’m writing this on Linux but they should work for OSX as well.

Installation steps

These installation steps rely heavily on this setup gist on GitHub by m-jowett, but have been updated for the latest version of LibreOffice Online. To install everything from source, you will need to have git and Node.js installed; if you don’t already have them, you can install both (plus npm, the Node.js package manager) with sudo apt-get install git nodejs npm. You also need to symlink Node.js to /usr/bin/node with sudo ln -s /usr/bin/nodejs /usr/bin/node, which the makefiles rely on. You’ll need to install several other dependencies as well, so I recommend creating a new directory for this project to keep everything in one place. From your new directory, you can then clone the LOO repository from the read-only GitHub using git clone

Next, you’ll need to install some dependencies. Let’s start with the C++ library POCO. POCO has dependencies of its own, which you can install using apt: sudo apt-get install openssl g++ libssl-dev. Then you can download the source code for POCO itself with wget. Uncompress the source files and, as root, run the following commands from your newly uncompressed POCO directory:

./configure --prefix=/opt/poco
make install

This installs POCO at /opt/poco.

Then we need to install the LibreOffice Core. Go back to the top-level project directory and clone the core repository: git clone. Go into the new 'core' folder. Compiling the core from source requires some more dependencies from apt. Make sure the deb-src line in /etc/apt/sources.list is not commented out. The exact line will depend on your locale and distro, but for me it’s deb-src xenial main restricted. Next, run the following commands:

sudo apt-get update
sudo apt-get build-dep libreoffice
sudo apt-get install libkrb5-dev
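If apt-get build-dep complains that it cannot find source packages, the deb-src line is the first thing to check. Here is a minimal sketch of such a check; the function name has_deb_src is made up for illustration, and it only looks for at least one uncommented deb-src entry without validating the entry itself.

```shell
# Succeeds (exit status 0) if the given sources file contains at least one
# deb-src line that is not commented out. Defaults to /etc/apt/sources.list.
# has_deb_src is a hypothetical helper name, not part of apt.
has_deb_src() {
    sources_file="${1:-/etc/apt/sources.list}"
    grep -Eq '^[[:space:]]*deb-src[[:space:]]' "$sources_file"
}

# Possible usage before running apt-get build-dep:
#   has_deb_src || echo "Uncomment a deb-src line in /etc/apt/sources.list first"
```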

You can also now set the $MASTER environment variable, which will be used when configuring parts of LibreOffice Online:

export MASTER=$(pwd)

Then prepare the source for building with ./ Finally, run make to build the LibreOffice Core. This will take a long time, so you might want to leave it running while you do something else.

After the core is built successfully, go back to your project root folder and switch to the LibreOffice Online folder, /online. I recommend checking out the latest release, which for me was 2.1.2-13: git checkout 2.1.2-13. We need to install yet more dependencies: sudo apt-get install -y libpng12-dev libcap-dev libtool m4 automake libcppunit-dev libcppunit-doc pkg-config, after which you should install jake using npm: npm install -g jake. We will also need a Python library called polib. If you don’t have pip installed, first install it using sudo apt-get install python-pip, then install the polib library using pip install polib. We should also set some environment variables while here:

export SYSTEMPLATE=$(pwd)/systemplate
export ROOTFORJAILS=$(pwd)/jails

Run ./ to create the configuration file, then run the configuration script with: 

./configure --enable-silent-rules --with-lokit-path=${MASTER}/include --with-lo-path=${MASTER}/instdir --enable-debug --with-poco-includes=/opt/poco/include --with-poco-libs=/opt/poco/lib --with-max-connections=100000 --with-max-documents=100000

Next, build the websocket server, loolwsd, using make. Create the caching directory in the default location with sudo mkdir -p /usr/local/var/cache/loolwsd, then change caching permissions with sudo chmod -R 777 /usr/local/var/cache/loolwsd. Test that you can run loolwsd with make run. Try accessing the admin panel at https://localhost:9980/loleaflet/dist/admin/admin.html. You can stop it with CTRL+C.

That, as they say, is it. You should now have a LibreOffice Online installation with maximum connections and maximum documents both set to 100000. You can adjust these numbers to your liking by changing the --with-max-connections and --with-max-documents options when configuring loolwsd.
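If you expect to try several limits, it can help to generate the configure line rather than edit it by hand. The sketch below is one way to do that, assuming $MASTER is set as described earlier and POCO lives in /opt/poco. The function name loolwsd_configure_cmd is my own invention; it only prints the command and does not run it.

```shell
# Print the loolwsd configure command for the given limits.
# $1 = maximum connections, $2 = maximum documents (both whole numbers).
# loolwsd_configure_cmd is a hypothetical helper, not part of LibreOffice Online.
loolwsd_configure_cmd() {
    max_connections="$1"
    max_documents="$2"

    # Refuse anything that is empty or not made up purely of digits.
    for value in "$max_connections" "$max_documents"; do
        case "$value" in
            ''|*[!0-9]*) echo "limits must be whole numbers" >&2; return 1 ;;
        esac
    done

    # MASTER must point at the built LibreOffice core checkout.
    if [ -z "${MASTER:-}" ]; then
        echo "MASTER is not set" >&2
        return 1
    fi

    cmd="./configure --enable-silent-rules"
    cmd="$cmd --with-lokit-path=${MASTER}/include"
    cmd="$cmd --with-lo-path=${MASTER}/instdir"
    cmd="$cmd --enable-debug"
    cmd="$cmd --with-poco-includes=/opt/poco/include"
    cmd="$cmd --with-poco-libs=/opt/poco/lib"
    cmd="$cmd --with-max-connections=${max_connections}"
    cmd="$cmd --with-max-documents=${max_documents}"
    printf '%s\n' "$cmd"
}

# Possible usage from the online folder:
#   loolwsd_configure_cmd 50000 20000
```

Printing the command first makes it easy to eyeball the flags before committing to another long build.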

Final words

Overall, I found this whole experience a bit discouraging. There was a lot of painful trial and error. We are still hoping to use LibreOffice Online for OAE in the future, but I wish it was easier to use. We'll be posting a request in The Document Foundation's LibreOffice forum for a Docker version without the user limits to be released in future.

If you're also thinking about using LOO, or are already, and would like to swap notes, we'd love to hear from you. There are a few options. You can contact us via our mailing list at or directly at

October 11, 2017 11:00 AM

September 18, 2017


Online Video Tutorial Authoring – Quick Overview

As an instructional designer, a key component of my work is creating instructional videos. While many platforms, software packages and workflows exist, here’s the workflow I use:

  1. Write the Script: This first step is critical, though to some it may seem rather artificial. Writing the script helps guide and direct the rest of the video development process. If the video is part of a larger series, inclusion of some ‘standard’ text at the beginning and end of the video helps keep things consistent. For example, in the tutorial videos created for our Online Instructor Certification Course, each script begins and ends with “This is a Johnson University Online tutorial.” Creating a script also helps ensure you include all the content you need to, rather than ad-libbing – only to realize later you left something out. As the script is written, particular attention has to be paid to consistency of wording and verification of the steps suggested to the viewer – so they’re easy to follow and replicate. Some of the script work also involves setting up the screens used – both as part of the development process and as part of making sure the script is accurate.
  2. Build the Visual Content: This next step could be wildly creative – but typically a standard format is chosen, especially if the video content will be included in a series or block of other videos. Often, a 16:9 aspect ratio is used for capturing content, since it can include both text and image content more easily. Build the content using a set of tools you’re familiar with. The video above was built using the following set of tools:
    • Microsoft Word (for writing the script)
    • Microsoft PowerPoint (for creating a standard look, and inclusion of visual and textual content – it provides a sort of stage for the visual content)
    • Google Chrome (for demonstrating specific steps – layered on top of Microsoft PowerPoint) – though any browser would work
    • Screencast-O-Matic (Pro version for recording all visual and audio content)
    • Good quality microphone such as this one
    • Evernote’s Skitch (for grabbing and annotating screenshots), though use of native screenshot functions and using PowerPoint to annotate is also OK
    • YouTube or Microsoft Stream (for creating auto-generated captions – if it’s difficult to keep to the original script)
    • Notepad, TextEdit or Adobe’s free Brackets for correcting/editing/fixing auto-generated captions (VTT, SRT or SBV)
    • Warpwire to post/stream/share/place and track video content online. Sakai is typically used as the CMS to embed the content and provide additional access controls and content organization
  3. Record the Audio: Screencast-O-Matic has a great workflow for creating video content and it even provides a way to create scripts and captions. I tend to record the audio first, which in some cases may require 2 to 4 takes. Recording the audio first makes it easier to place appropriate pauses and to use clear inflection and enunciation of terms. For anyone who has created a ‘music video’ or set images to audio content this will seem pretty doable.
  4. Sync Audio and Visual Content: So this is where the use of multiple tools really shines. Once the audio is recorded, Screencast-O-Matic makes it easy to re-record, retaining the audio portion and replacing just the visual portion of the project. Recording the visual content (PowerPoint and Chrome) is pretty much just listening to the audio and walking through the slides and steps using Chrome. Skitch or other screen capture software may have already been used to capture visual content I can bring attention to in the slides.
  5. Once the project is completed, Screencast-O-Matic provides a one-click upload to YouTube or a save as an MP4 file, which can then be uploaded to Warpwire or Microsoft Stream.
  6. Once YouTube or Microsoft Stream has a viable caption file, it can be downloaded and corrected (as needed) and then paired back with any of the streaming platforms.
  7. Posting the video within the CMS is as easy as using the LTI plugin (via Warpwire) or the embed code provided by any of the streaming platforms.
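One caption chore that can be scripted is moving between formats when a platform wants VTT and the corrected file is SRT. The two differ mainly in that WebVTT needs a WEBVTT header line and uses a period rather than a comma before the milliseconds. Here is a minimal sketch of that conversion, assuming a plain SRT file with no styling tags; the function name srt_to_vtt is hypothetical and this is not part of any of the tools listed above.

```shell
# Convert a simple SRT caption file to WebVTT on standard output.
# Assumes plain SRT cues; styling or positioning tags are passed through untouched.
# srt_to_vtt is a hypothetical helper name used for illustration.
srt_to_vtt() {
    # WebVTT files must start with this header followed by a blank line.
    printf 'WEBVTT\n\n'
    # Strip Windows line endings, then swap the millisecond comma for a period
    # on timing lines such as "00:00:01,000 --> 00:00:04,500".
    tr -d '\r' < "$1" | sed -E \
        's/^([0-9]+:[0-9]{2}:[0-9]{2}),([0-9]{3}) --> ([0-9]+:[0-9]{2}:[0-9]{2}),([0-9]{3})/\1.\2 --> \3.\4/'
}

# Possible usage:
#   srt_to_vtt captions.srt > captions.vtt
```

The numeric cue counters from the SRT file are left in place, since WebVTT accepts them as cue identifiers.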

by Dave E. at September 18, 2017 04:03 PM

September 01, 2017

Sakai Project

Sakai Docs Ride Along

Sakai Docs ride along - Learn about creating Sakai Online Help documentation September 8th, 10am Eastern

by MHall at September 01, 2017 05:38 PM

August 30, 2017

Sakai Project

Sakai get togethers - in person and online

Sakai is a virtual community and we often meet online through email, and in real time through the Apereo Slack channel and web conferences. We have so many meetings that we need a Sakai calendar to keep track of our meetings. 

Read about our upcoming get togethers!

SakaiCamp Lite
Sakai VC

by NealC at August 30, 2017 06:37 PM

Sakai 12 branch created!

We are finally here! A big milestone has been reached with the branching of Sakai 12.0. What is a "branch"? A branch means we've taken a snapshot in time of Sakai and put it to the side so we can improve it, mostly through QA (quality assurance testing) and bug fixing, until we feel it is ready to release to the world and become a community supported release. We have a stretch goal from this point of releasing before the end of this year, 2017.

Check out some of our new features.

by NealC at August 30, 2017 06:00 PM

July 18, 2017

Steve Swinsburg

An experiment with fitness trackers

I have had a fitness tracker of some description for many years. In fact I still have a stack of them. I used to think they were actually tracking stuff accurately. I compete with friends and we all have a good time. Lately though, I haven’t really seen the fitness benefits I would have expected from pushing myself to get higher and higher step counts. I am starting to think it is bullshit.

I’ve had the following:

  1. Fitbit Flex
  2. Samsung Gear Wear
  3. Fitbit Charge HR
  4. Xiaomi Mi Band
  5. Fitbit Alta
  6. Moto 360
  7. Phone in pocket set up to send to Google Fit.
  8. Garmin Forerunner 735XT (current)

Most days I would be getting 12K+ just by doing my daily activities (with a goal of 11K): getting ready for work and children ready for school (2.5K), taking the kids to school (1.2K), walking around work (3K), going for a walk at lunch (2K), picking up the kids and doing stuff around the house of an evening (3.5K) etc.

My routine hasn’t really changed for a while.

However, two weeks ago I bought the Garmin Forerunner 735XT, mainly because I was fed up with the lack of Android Wear watches in Australia as well as Fitbit’s lack of innovation. I love Android Wear and Google Fit and have many friends on Fitbit, but needed something to actually motivate me to exercise more.

The first thing I noticed is that my step count is far lower than with any of the above fitness trackers. Like seriously lower. We are talking at least 30% lower. As I write this I am sitting at ~8.5K steps for the day and I have done all of the above plus walked to the shops and back (normally netting me at least 1.5K) and have switched to a standing desk at work which is about 3 metres closer to the kitchen than my original desk. So negligible distance change. The other day I even played table tennis at work (you should see my workplace) and it didn’t seem to net me as many steps as I would have expected.
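Putting the figures quoted so far side by side gives a feel for the size of the gap. This is only a back-of-the-envelope sketch: the "expected" total is assumed to be the routine breakdown above plus the walk to the shops, and the function names are made up for illustration.

```shell
# Add up any number of step counts.
sum_steps() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    printf '%s\n' "$total"
}

# Whole-number percentage by which "actual" falls short of "expected"
# (integer division, so the result is truncated, not rounded).
percent_lower() {
    expected="$1"
    actual="$2"
    printf '%s\n' $(( (expected - actual) * 100 / expected ))
}

# Routine from the post: morning, school run, work, lunch walk, evening.
routine=$(sum_steps 2500 1200 3000 2000 3500)    # 12200
# Add the walk to the shops and back.
with_shops=$(sum_steps "$routine" 1500)          # 13700
# Compare with the roughly 8500 steps the Garmin reported.
shortfall=$(percent_lower "$with_shops" 8500)    # 37
echo "Older trackers would suggest ${with_shops}; Garmin is ${shortfall}% lower"
```

On those assumptions the Garmin count comes out about 37% lower, which fits the "at least 30%" impression.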

Last night I went for a 30 min walk and snatched another 2K, which is pretty accurate given the distance and my stride length. I think the Fitbit would have given me double that.

This is interesting.

Either the Garmin is under-reporting or the others are over-reporting. I suspect the latter. The Garmin tracker cost me close to $600 so I am a bit more confident of its abilities than the $15 Mi Band.

So, tomorrow I am performing an experiment.

As soon as I wake up I will be wearing my Garmin watch, Fitbit Charge HR right next to it, and keeping my phone in my pocket at all times. Both the watch and Fitbit will be set up for left-hand use. The next day, I will add more devices to the mix.

I expect the Fitbit to get me to at least 11K, Google Fit to be under that (9.5K) and Garmin to be under that again (8K). I expect the Mi Band to be a lot more than the Fitbit.

The fitness tracker secret will be exposed!

by steveswinsburg at July 18, 2017 12:46 PM

June 16, 2017

Apereo OAE

OAE at Open Apereo 2017

The Open Apereo 2017 conference took place last week in Philadelphia and it provided a great opportunity for the OAE Project team to meet and network for three whole days. The conference days were chock full of interesting presentations and workshops, with the major topic being the next generation digital learning environment (NGDLE). Malcolm Brown's keynote was a particularly interesting take on this topic, although at that point the OAE team was still reeling from having a picture from our Tsugi meeting come up during the welcome speech - that was a surprising start for the conference! We noted how the words 'app store' kept popping up again and again in presentations and in talks among the attendees - perhaps this is something we can work towards offering within the OAE soon? Watch this space...

The team also met with people from many other Apereo projects and talked about current and future integration work with several project members, including Charles Severance from Tsugi, Opencast's Stephen Marquard and Jesus and Fred from BigBlueButton. There's some exciting work to be done in the next few weeks... While Quetzal was released only a few days before the conference, we are now teeming with new ideas for OAE 14!

After the conference events were over on Wednesday, we gathered together to have a stakeholders meeting where we discussed strategy, priorities and next steps. We hope to be delivering some great news very soon.

During the conference, the OAE team also provided assistance to attendees in using the Open Apereo 2017 group hosted on *Unity that supported the online discussion of presentation topics. A lot of content was created during the conference days so be sure to check it out if you're looking for slides and/or links to recorded videos. The group is public and can be accessed from here.

OAE team members who attended the conference were Miguel and Salla from *Unity and Mathilde, Frédéric and Alain from ESUP-Portail.

June 16, 2017 12:00 PM

June 01, 2017

Apereo OAE

Apereo OAE Quetzal is now available!

The Apereo Open Academic Environment (OAE) project is delighted to announce a new major release of the Apereo Open Academic Environment: OAE Quetzal, or OAE 13.

OAE Quetzal is an important release for the Open Academic Environment software and includes many new features and integration options that are moving OAE towards the next generation academic ecosystem for teaching and research.


LTI integration

LTI, or Learning Tools Interoperability, is a specification that allows developers of learning applications to establish a standard way of integrating with different platforms. With Quetzal, Apereo OAE becomes an LTI consumer. In other words, users (currently only those with admin rights) can now add LTI-compatible tools to their groups for other group members to use.

These could be tools for tests, a course chat, a grade book - or perhaps a virtual chemistry lab! The only limit is what tools are available, and the number of LTI-compatible tools is growing all the time.

Video conferencing with Jitsi

Another important feature introduced to OAE in Quetzal is the ability to have face-to-face meetings using the embedded video conferencing tool, Jitsi. Jitsi is an open source project that allows users to talk to each other either one on one or in groups.

In OAE, it could have a number of uses - maybe a brainstorming session among members of a globally distributed research team, or holding office hours for students on a MOOC. Jitsi can be set up for all the tenancies under an OAE instance, or on a tenancy by tenancy basis.


Password recovery

This feature has been widely requested by users: the ability to reset their password if they have forgotten it. Now a user in such a predicament can enter their username, and they will receive an email with a one-time link to reset their password. Many thanks to Steven Zhou for his work on this feature!

Dockerisation of the development environment

Many new developers have been intimidated by the setup required to get Open Academic Environment up and running locally. For their benefit, we have now created a development environment using Docker containers that allows newcomers to get up and running much quicker.

We hope that this will attract new contributions and let more people get involved with OAE.

Try it out

OAE Quetzal can be experienced on the project's QA server at It is worth noting that this server is actively used for testing and will be wiped and redeployed every night.

The source code has been tagged with version number 13.0.0 and can be downloaded from the following repositories:


Documentation on how to install the system can be found at

Instructions on how to upgrade an OAE installation from version 12 to version 13 can be found at

The repository containing all deployment scripts can be found at

Get in touch

The project website can be found at The project blog will be updated with the latest project news from time to time, and can be found at

The mailing list used for Apereo OAE is You can subscribe to the mailing list at

Bugs and other issues can be reported in our issue tracker at

June 01, 2017 05:00 PM