Posts tagged ‘api’

December 29th, 2014

Automatic Scaling with Chef and Kaltura API

by Jess Portnoy

Consider the following Kaltura cluster, built with Chef and Amazon’s EC2: a Chef server, 1 load balancer, 2 front nodes, 2 batch nodes, 2 Sphinx nodes, and a single MySQL DB.
Usually, this cluster layout will handle the average load of a medium-sized user-generated video site (or video app) well. But what if, all of a sudden, significantly more videos are uploaded? How can you avoid downtime due to the increase in traffic?

This post demonstrates how to automatically scale a Kaltura cluster based on system-load monitoring, using Opscode Chef and the Kaltura API.

To simulate heavy load, we will use Kaltura’s PHP5 Client Library to call the bulk upload API, adding videos to the transcoding queue. We will also build a Kaltura watchdog script that runs as a cron job and alerts us when the conversion load hits a certain threshold.

When the watchdog alerts on a loaded transcoding queue, a Chef knife command (using its EC2 plugin) launches additional batch instances to handle the load.
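
The watchdog’s threshold check reduces to a small piece of logic. Here is a minimal Python sketch with hypothetical names (the real watchdog script on GitHub is the reference; the knife run list mirrors the recipes used later in this post, while the flavor value is a placeholder):

```python
import subprocess

def check_transcoding_queue(queue_size, warn, crit):
    """Map the current conversion-queue size to a watchdog state."""
    if queue_size >= crit:
        return "CRITICAL"
    if queue_size >= warn:
        return "WARNING"
    return "OK"

def launch_batch_node():
    # On CRITICAL, shell out to knife (with its EC2 plugin) to bootstrap
    # a new batch instance with the needed Chef recipes.
    subprocess.check_call([
        "knife", "ec2", "server", "create",
        "-r", "recipe[nfs],recipe[kaltura::batch]",
        "-f", "m3.large",  # placeholder instance flavor
    ])
```

Run from cron, `check_transcoding_queue` is evaluated every few minutes and `launch_batch_node` fires only on a CRITICAL result.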

Live Demo


What you need


Setting up

To connect the Chef server to the Kaltura cluster and run the Kaltura watchdog script, install the kaltura-base package. To install it, SSH to the Chef machine and run the following as the super user:

Note: This configuration step alone will not start any unneeded Kaltura daemons or expose the Kaltura web interfaces from the Chef server. The kaltura-base package only allows our watchdog script to connect to the rest of the Kaltura cluster and monitor it via the Kaltura API.

Next, also on the Chef machine, edit: /opt/kaltura/app/tests/monitoring/config.ini

To retrieve the account API secret keys for partner id -4, run the following from the command line as a privileged user:

To retrieve the account API admin secret key for partner id -1, run the following from the command line as a privileged user:

To test the watchdog, run the following from the command line as a privileged user:


The watchdog script

See the watchdog code on GitHub (feel free to fork and submit pull requests!).

Save the code to /usr/local/bin/ and make it executable:

Test the watchdog using bulk upload. From the Chef server, run the following:

Run the upload_bulk script a few times to get a conversion queue going.

Normally, you will run the watchdog from crontab, at about a 5-minute interval. To see it in action, let’s run it manually:

Let’s pass very small thresholds to the watchdog to watch it work: 1 for warning and 10 for critical. (Naturally, in production, the numbers will be higher.) From the command line, run the following command:

This runs the watchdog in an endless loop in the current shell, so we can see its output:

As you can see, we successfully launched a new EC2 instance, and applied the nfs and kaltura::batch Chef recipes using chef-client.


What’s next?

To extend this functionality for production, run a manager process that will:

  • Keep monitoring the transcoding queue using the watchdog
  • Keep a list of new batch servers launched when the load gets high
  • When the load calms down, stop the batch daemon on the new transcoding node, wait 20 minutes to make sure the load remains low, and terminate the instance
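
That scale-down bookkeeping can be sketched as follows; the class name, the 20-minute constant, and the structure are our own illustration of the steps above (actual termination would go through the EC2 API or knife):

```python
import time

COOL_DOWN = 20 * 60  # seconds the load must stay low before terminating

class ScaleDownManager:
    """Tracks batch nodes launched for a burst and releases them once
    the transcoding load has stayed low for the full cool-down window."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.extra_nodes = []   # instance ids launched when load got high
        self.low_since = None   # when the load last dropped below warning

    def observe(self, queue_size, warn_threshold):
        if queue_size >= warn_threshold:
            self.low_since = None  # still busy: reset the cool-down timer
            return []
        if self.low_since is None:
            self.low_since = self.clock()
        if self.clock() - self.low_since >= COOL_DOWN:
            released, self.extra_nodes = self.extra_nodes, []
            return released  # caller stops batch daemons, terminates these
        return []
```

The caller runs `observe()` on every watchdog tick and terminates whatever instance ids it returns.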

Note that the same practice can be applied to other cloud infrastructures or VM clusters (such as VMware) using their respective APIs.

If you build on it, please submit a pull request on the GitHub project.

June 27th, 2014

Test Driven Learning Begets Test Driven Development

by Michael 'Flip' McFadden


Like most developers, I was approached by my management to “Make Something Work” without having any prior experience. The job was to connect our Plone/Zope content management system to Kaltura, so web content editors could seamlessly upload and edit video content and metadata managed by the KMC. It wasn’t hard to find the Kaltura Python API Client Library, but once you have the client library, you have to learn how to use it – and, at the same time, learn the features that the KMC provides (see the Kaltura Management Console Training Track).

I can read through the many docs from cover to cover (I usually don’t) and still have the uncomfortable, lost feeling of having no clue what’s going on. And then there’s always the pressure of overcoming the learning curve in a reasonable amount of time.
So I begin by writing “Playground Code”: a directory filled with useless proof-of-concept code that helps me get the hang of a language, an API, or a new concept. This code will never be used in production, which gives me permission to write really bad code while I climb the learning curve.
Learning to stay unattached to code – to throw it all away and start over – was an important step for me. You learn the ‘right way’ to do things by doing them the ‘wrong way’ first. It also helped me figure out exactly where in the docs I should be reading to get done what needed to be done.

In the past few years, I’ve been working a lot with the concept of Test-Driven-Development. In TDD you write very small, encapsulated tests before you actually code the functionality or patch you are implementing. You are, in fact, intentionally writing failing test cases. I found this method very useful for isolating and fixing bugs, but not so much for new large projects or new enhancement development. The requirement that the tests should be atomic and very specific does not lend itself to complicated projects with many moving parts. Until now.

When I found myself having to learn the Kaltura Python Client Library with no prior experience, the concepts of “Playground Code” and Test-Driven Development came together. I simply took my proof of concept, put together some code, threw an assert() statement at the end, and voilà – it’s now a test case!

“How do I connect to a Kaltura server with the Python API?”

The answer was “testConnect()” – easily incorporated into a test suite using Python’s excellent testing framework, unittest. (Then, assert that the call returns something that looks like a response – or, at the very least, does not raise an exception.)
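
In that spirit, a playground snippet promoted to a unittest case might look like the sketch below. The `start_session` stub is hypothetical: it stands in for the real Kaltura session call, which needs a live server:

```python
import unittest

def start_session(secret, user_id):
    """Hypothetical stand-in for the real Kaltura session call; against
    a live server this returns a Kaltura session token (a 'ks')."""
    if not secret:
        raise ValueError("missing secret")
    return "ks_%s" % user_id

class TestConnect(unittest.TestCase):
    def test_connect(self):
        # The playground assert, promoted to a test: we got back
        # something that looks like a session token, and no exception
        # was raised along the way.
        ks = start_session("my-admin-secret", "tester")
        self.assertTrue(isinstance(ks, str) and ks)

if __name__ == "__main__":
    unittest.main()
```

Swap the stub for a real client call and the same test doubles as a connectivity check for your deployment.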

I developed this trivial but important test case at the same time as I learned how to connect to the Kaltura server! My code doesn’t have to be thrown away, nor does it have to be perfect. It can now serve as a proof of concept, a unit test, and a code example for the next developer, all at the same time.

When I got confused by something, I could take my entire test case – an atomic, very specific exercise – and post it to the forums as is, and quickly get a direct answer to what was confusing me. That beats submitting a link to my entire application under an “xxxx not working” title, which would have made it harder for others to review and help.

And then it got even better. The proof-of-concept code grew as I learned more about the API, and a large tests module started forming. I started finding small bugs in the Kaltura Python Client Library – nothing critical, but important to my application – and I was able to patch, test and contribute my code upstream to the Kaltura project.

Through my humble experience (from complete newbie state) with Kaltura’s API and Python Client Library, I was able to submit and contribute a more polished and complete Python Test suite for the Kaltura API Client Library!


Want to join the Kaltura project and become an active contributor? Start Here.

January 15th, 2013

How to Create a Successful Open Source Business Model

by Zohar Babin

This post was originally published on the Computer Weekly Open Source Insider blog. It was written by myself and Dr. Shay David, co-founder of Kaltura.


Open source projects are measured by the size of their developer communities, by market adoption, by the number of downloads and other such metrics; companies are measured in terms of revenue and profits.

Often, attempts at maximising profit conflict with the interests of the community or with the adoption metrics. So how can these competing interests be aligned?

What makes for a successful balance that allows a commercial open source software company to thrive while serving all of its masters?

Open sourcing commercial software poses many challenges, the biggest of which revolves around the meaning of ‘FREE’. At the heart of the matter is the need to release code for free vs. protecting existing business interests, staying ahead of competition, and allowing customers to own their commercial deployments.

There are various ways to make money while developing open source software, including:

● Providing integration and support services (Acquia)
● Selling subscriptions to updates and support (Red Hat)
● Selling proprietary components to segments of the user base (Funambol)
● Selling premium plugins, applications, services and themes (Joomla, WordPress)
● Selling hosting services (the Software-as-a-Service, or SaaS, model adopted by companies such as Acquia and Alfresco)
● Selling the software under a commercial licence and releasing the code under an open source licence simultaneously, aka Dual licensing (MySQL).

As an example, at Kaltura we chose to maintain a combination of business models, leading with a dual-licence model combined with a SaaS offering and an API-centric architecture. Released under the AGPL and a commercial licence, Kaltura has rapidly grown to be the leading media management platform on the market.

Adopting a dual licence approach enables developers and customers to adapt and modify the open source software to their needs, while the commercial licence allows companies to provide customers with the ability to keep derivatives – or to embed the software in proprietary solutions along with warranty, indemnification, and professional services. For those who need help running their system, a SaaS offering provides a set of affordable hosting services.

An open API architecture is also important: it provides platform-agnostic means for developers to create video-centric applications quickly and cost-effectively. As an example, Kaltura’s community and partners program (dubbed the Kaltura Exchange) enables developers and third-party vendors to play an important and valuable role in extending the Kaltura software, building innovative solutions on top of our video platform and expanding its reach to new markets while supporting the ever-growing needs of the existing customer base.

The last few years have shown us that open systems win every time: Android, for example, now outsells the closed iOS 2:1, despite Apple’s huge early lead. Every large organisation in the world now relies on open source software for mission-critical infrastructure. By combining different open business models, software vendors can do well by doing good. Gone are the days where customer retention was achieved by locking data into proprietary formats or developing secret software.

With its extreme transparency, open source forces vendors to focus on creating customer value – or risk becoming obsolete. Entering the virtuous cycle of value creation that customers are willing to pay for, rapid development that leverages the community, and quick cycles to create more value… this is the promise of good open source business models.

June 21st, 2012

Want to Improve Search and Video Recommendations? Try The New Tags Editor

by Thomas Huzij

Tags are awesome, but only if you give them the attention they deserve. Just like kids (or pets), each one is different and deserves special care. But what do you do when you have hundreds of them? We had this problem with the Kaltura Video Portal, and we needed a smarter way to manage metadata. While the KMC (Kaltura Management Console) provides many robust, versatile tools for managing your media and its metadata, simple fields like tags are managed only at the entry level, missing the overall view of your content library.


Introducing Tags Editor…

Tags Editor is a new tool that lets you quickly and efficiently add and remove tags from your media entries. What used to be slow and tedious is now a painless process. Updating the tags for your videos has never been easier.


Before the Tags Editor


  • To update the tags for your videos, you had to go through each individual video in the KMC and edit its tags. This was slow, because the KMC has to track far more metadata than just the list of tags.
  • Updating individual entries could lead to many redundant tags. For example, one video might be tagged with “aquarium” while another might be tagged with “tank”. There was no way to track all of the tags you were using across your entire media.list.
  • If you no longer wanted to use certain tags, you had to track down every video with those tags and remove them manually.


How the Tags Editor makes things easier


  • Keeps track of all the tags in your media.list and counts how often each one shows up, so you know exactly how many videos use a given tag.
  • Rather than typing out tags for each video, you can add tags to a temporary tag database. Once a tag is in that database, adding it to individual videos is much faster, and no redundancy occurs.
  • When you use the “Remove Tags” form, removing a tag from the database actually removes it from all of your videos, so nothing is left over.
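
Under the hood, removing a tag everywhere boils down to rewriting each affected entry’s comma-separated tag string and saving it back. In the tool itself this happens via Kaltura’s PHP5 Client Library with an update call per affected entry; the helper below is our own Python sketch of just the string-rewriting step:

```python
def strip_tag(entry_tags, tag):
    """Remove one tag from an entry's comma-separated tag string,
    preserving the order of the remaining tags."""
    kept = [t.strip() for t in entry_tags.split(",")
            if t.strip() and t.strip() != tag]
    return ", ".join(kept)
```

An entry tagged `"aquarium, tank, fish"` becomes `"aquarium, fish"` after removing `tank`; entries left with an empty string simply end up untagged.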


The libraries used to get the job done


  • Kaltura’s PHP5 Client Library
  • Harvest’s Chosen to make all the delightful and user friendly select boxes
  • Loadmask, a jQuery plugin to mask elements while they’re loading and prevent any hiccups


A trick used to access all the media entries

To generate the list of tags for a user’s entire media.list, there is no choice but to traverse every entry and retrieve its tags. However, doing so without a filter, simply incrementing a pager’s page index, will not end well: the server has a hard limit of 10,000 entries when accessing media.list. A user may in fact have more than 10,000 entries stored, but without the proper filters you cannot simply page through all of them in order. There is, however, a way around this, and it involves a clever use of filters.

As the entries are traversed, we keep track of two properties: their creation times and their entry ids. The page index, on the other hand, is not used at all. Instead of blindly going through each page, we set a filter that arranges the entries by their createdAt times in descending order, so when a call to the API is made to retrieve 500 entries, it retrieves the 500 newest entries it can find.

So now we’re getting somewhere. However, if we simply looped, we would keep getting the same 500 entries over and over. This is where the filter’s “createdAtLessThanOrEqual” and “idNotIn” fields come into play. Each time the loop iterates, the id of every entry examined is added to a list, and we record the oldest createdAt time seen so far. That way, on the next iteration, createdAtLessThanOrEqual and idNotIn ensure that the 500 entries pulled from the server have not been traversed yet. This is much faster than using a pager, and it gets around the server limitation.
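
A sketch of that filter bookkeeping, using plain dicts in place of the real KalturaMediaEntryFilter object (the field names match those described above; the helper itself is ours):

```python
def advance_filter(seen_ids, batch):
    """Given the newest-first batch just fetched, record what we have
    seen and build the filter fields for the next media.list call."""
    seen_ids.extend(e["id"] for e in batch)
    oldest = batch[-1]["createdAt"]  # batch is ordered by -createdAt
    return {
        "orderBy": "-createdAt",             # newest entries first
        "createdAtLessThanOrEqual": oldest,  # never page forward again
        # Excluding every seen id mirrors the post's description;
        # strictly, only the ids sharing the boundary timestamp must
        # be excluded, since older entries are cut off by the time filter.
        "idNotIn": ",".join(seen_ids),
    }
```

Each iteration fetches 500 entries with the current filter, then calls `advance_filter` to build the next one, until a batch comes back empty.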


Source Code

The source code for the tool can be found on our GitHub page.

Feel free to fork it or suggest new features!



You can view a demo of Tags Editor here.


Stay tuned for more API best-practices and apps. To learn more now, check out the Kaltura API Documentation Set and subscribe to the Kaltura Newsletter.