Eyal ZilberbergHow to run commands on Amazon EC2 windows instances during launch time only

In this post I will explain how to run commands on windows EC2 instances during launch time only. While this is easily done on a Linux machine, the solution for windows is not as trivial. First we’ll present the Linux way and explain what we mean by “only during launch time”. Then, we’ll show 3 implementation attempts on windows, until we’ll reach the desired behavior. This post was written for programmers / devops professionals and assumes knowledge of EC2, launching instances and using user data.

(more…)

Read More

Paz BiberMoving to single page application with WebAPI and AngularJS

Single Page Application (SPA). Even the acronym sounds so inviting.
If you are developing a web application you’re probably creating a single page application, or at least you have a plan to make your current application a single page app. In CloudShare, we started a few years ago with ASP.NET. It was nice technology, but it lacked separation between view and code.
Then ASP.NET MVC arrived, mitigated this problem and eased handling AJAX calls with MVC controllers. So we moved to ASP.NET MVC. You can create wonderful web pages simply by writing Controllers, Views and defining the correct Model.
You can even give it a “Single Page Flavour” by invoking ajax calls that retrieve data from a server. But there’s still a problem here. Separation between code and HTML is hard (sometimes impossible) on the server side. Did you ever find yourself writing some c# code in the view just because “the product really needs this, so please make a small change to make it work”? Or client side (jQuery… you know what I’m talking about…) (more…)

Read More

Jonathan SadanHTML5 Remote Access

Chrome, Firefox, Plugins oh my!

Remember browser plugins? Not ‘apps’ or ‘add-ons’ from stores, but plugins. Those installers you used to download in the 90s in order to run special content on the web, such as a game, chat program, or watch some videos. Those were written using NPAPI (the Netscape Plugin API) and ActiveX controls (Internet Explorer’s equivalent). These are APIs that enable web authors to interact in real-time with their users and use any media a computer could deliver. The price though, was to write and maintain your plugin for each and every platform your users had.

Later, with the advent of Flash and Java applet plugins, most of that special content was implemented using only those two. Not incidentally, Flash and Java enable writing the code once and running it on all their supported platforms, unburdening the developer from maintaining multiple code bases.

Then along came HTML5, with the premise of enabling web-pages to become web-apps. Supplying web developers with media, storage, threading, and network APIs. APIs that up until 2009 were only available through the use of plugins. HTML5 made sense too. Web pages looked more and more like applications and less and less like text pages. And why should a plugin become an inseparable part of a system. By definition it plugs-in, it should also be able to plug-out.

Jumping to the present, it looks like it’s an end of an era. The Chromium team announced last year that they’re deprecating NPAPI plugins from Chrome at the beginning of 2014 and completely removing support by the end of that year. Also, Mozilla is marking NPAPI plugins as unwelcome citizens in their ecosystem by making them all ‘Click-To-Play’ by default. And last but not least, ActiveX control support was removed in the ‘New Windows 8 browsing experience’ (the ‘Metro’ application) edition of Internet Explorer available on Windows 8.

If you’re a web developer and this all sounds very irrelevant to you, consider yourself lucky! Unfortunately, this was quite relevant for CloudShare. Here’s why…

Remote Access

Remote Access is a name (and verb) given to protocols such as RDP and VNC that allow a user to use a machine remotely.

The CloudShare web application allows its users to remote access their machines through the browser. With Windows being our most popular OS, RDP is naturally our most popular remote access protocol.

To use RDP through a browser in 2008, CloudShare’s only option was to use NPAPI plugins and ActiveX controls. These plugins allowed Windows clients (browsers) to RDP to a remote machine using a web page element. A marvel at the time. These plugins served us well. At least up until now.

HTML5 Remote Access

Enter the Guacamole project.

Using Canvas and WebSockets HTML5 technologies, the Guacamole project proved any browser can become a client in a remote access protocol. It works something like this:

guac-ses

A machine, acting as a remote-access gateway (a proxy to remote-access other machines) is run with a Guacamole web server and daemon. The user logs into the Guacamole web application, using a browser and chooses his desired machine to connect to. The gateway then acts as a translator between the selected protocol (RDP/SSH/VNC) used by the remote machine and the Guacamole protocol used by the browser.

The actual client is pure HTML, CSS and Javascript, and is served to the browser using the same Guacamole web application.

Multiplatform and Protocol Abstraction

Let’s illustrate our problem. Our remote access page allows a user to connect to a machine using VNC, RDP, or SSH. RDP uses NPAPI plugins or ActiveX controls, and VNC and SSH use Java applets. Each has its own code base and a different interface. To make things more complex, our NPAPI plugin uses Window’s native ActiveX control DLL to connect to a machine using RDP. Different Windows versions provide different DLLs with different behaviors, and non-Windows platforms cannot be supported. This adds the platform dimension to our testing matrix as well.

Using Guacamole not only allows us to maintain one code base to run on any platform a browser runs on, but it also allows us to use any protocol (supported by the Guacamole project) and deliver it on top of a single WebSockets protocol. In other words, not only do we get platform agnostic code, but we also get  the same code to connect to different protocols.

Using Guacamole, our remote-access client becomes much simpler. Changing a protocol is simply a change in configuration and capabilities, not interface or code. As it’s written in Javascript and HTML, it’s cross-platform. With a bit of effort, also cross-browser.

A New Hope (A New Client)

So now we have the task of building a new Remote Access client. The most important gap we needed to fill was security, specifically – authentication and authorization. Guacamole’s gateway application comes with a few out of the box authentication methods that you can install:

Hard coded configuration

An XML file containing your users’ credentials and remote machines. This could work for on-premise use with a small number of users. But it surely isn’t scalable nor maintainable for a Cloud Computing service.

MySQL and LDAP

Managing your users and remote connections using a database schema or Directory Access listings. The problem with those lies with the fact that CloudShare already has users and machines with complex and intricate permissions logic. Mirroring that information and logic is an effort that we would like to avoid.

Security Concerns

Being behind our firewall, we should not forget that Guacamole is a remote-access gateway and as such can access any machine that has SSH, VNC or RDP connections available, even itself. Even without administrative access. This needs to be addressed in our authentication method.

So what do we do?

Three words: ‘One Time Passwords’. Authentication happens in four stages:

  1. A Client requests Cloudshare to connect to a remote machine via Guacamole.
  2. Cloudshare authenticates the client’s request (the user and connection), and saves a temporary cryptographic hash (“password”); or rejects it.
  3. The Client requests connection from Guacamole using the given password.
  4. Guacamole queries Cloudshare and authenticates (or rejects) the request.

This solution removes the need to mirror application logic, and needs minimum maintenance leading to better scalability.

Fortunately Guacamole allows you to write your own authentication method via dynamic class loading, that allowed us to implement the server side of the above solution.

Also, Guacamole’s Javascript client modules are decoupled from Gateway and authentication code. The client modules are the engine that translates the Guacamole protocol instructions to Canvas element images and DOM events, and DOM events back to the Guacamole protocol. This allowed us to embed the client in our Remote Access page and change its authentication method.

A Brighter Web

Hopefully this problem-solution blog was enlightening for you. Guacamole is one of the more inspiring examples of what can be accomplished with HTML5 technologies. Today 90% of all our users use Guacamole to access their machines via VNC and SSH and we’re working hard on bringing RDP into that statistic soon.

Read More

David DvoraJasmine Unit Tests – Testing Legacy Pages

Intro

When approaching testing client side JavaScript code, you first need to ask yourself – “What are we testing?”

We also tried answering it, when we realized that our client-side architecture was mixing DOM manipulation with application logic. This is not surprising for developers that use jQuery in the old fashioned way, as it seems that jQuery actually encourages that kind of behavior because of its syntax and the simplicity of using it.

So which part do we really want to test? This depends on what’s more important for you to verify in your application. One thing we agreed on: we need to separate the client side code into layers in order to make it testable. One layer will be pure application data logic, and the other one will be a wrapper to the code that actually manipulates the DOM (first one uses the latter).

(more…)

Read More

Gilad LazarovichUsing Applitools to radically reduce UI Automation code

We, the CloudShare Quality Assurance team, migrated from a web-based UI Automation implemented using C# and Watin (http://watin.org/) to a solution using C#, Selenium (http://docs.seleniumhq.org/) and Applitools Eyes (http://applitools.com/). With the use of Applitools Eyes, we validate all the UI elements of the application across multiple browsers (Google Chrome, Firefox and Internet Explorer) and across various resolutions.

Applitools Eyes has saved us quite a bit of time and effort from coding (including maintenance), and has allowed us to achieve much greater automated UI coverage. This tool has helped us find quite a few layout issues, flow issues as well as functional issue. As a result, we’ve reduced the overall number of missed bugs and improved the product quality.

We use CloudShare Production environments to host and run all our test environments (for testing both Production and our local Development labs). This includes machines from most modern Windows operating systems, and with all browsers combinations supported (Chrome, Firefox and Internet Explorer 9+). Our automation is executed with each deployment (which happens several times a day), with each Production deployment (every Sunday), and we have full regression cycles happening nightly and throughout the weekend.

Applitools Eyes allows labeling any regions to ignore (useful where we have animated gifs running as an example, or where we use dynamic data), or define floating regions (which can move several pixels depending on page), and these regions automatically propagate to all the screens that include similar elements. Additionally, when we make a change in the application which affects multiple pages in the same way (e.g. changing the header of our site), we only need to review and accept this change for a single window, and Applitools Eyes scans all other windows and automatically approves similar changes.

Following the addition of Applitools Eyes, the amount of Automation code has been greatly reduced (with almost all UI checks are now handled within the screenshot comparison) while increasing browser coverage, resolution coverage, overall test reliability and significantly lowering the time spent on code maintenance. With our previous solution, we only tested using Internet Explorer and missed a good number of bugs that only showed up on one of the other browsers. In addition, we now test with a minimum resolution of 1024*768 and up to 1920*1080 which lets us ensure the UI looks right.

Applitools provides constant functionality updates based on our feedback and was very simple to integrate into our Selenium based project. The latest releases provide OCR support (your can read text from regions labeled within their application and using your application screenshot). In addition, there will be support for several possible scenarios within a screen (when we have warnings on screen as an example during a deployment). As with Selenium, Applitools updates are handled via Nuget packages through Visual Studio.

Please note the following screenshots taken during automation execution and which demonstrate how UI bugs can be found using Applitools Eyes, and how you can configure dynamic data to be ignored when comparing screenshots.

Here’s a screenshot within Applitools that shows our Console access test, you can see the dynamic data regions that are ignored within the screen comparison:

(click image to get full size)

Applitools_Eyes_Console_screenshot_with_region_exclusion
Marking with applitools which UI region should be ignored

Here’s a screenshot within Applitools that shows our RDP access test, you can see the differences highlighted when the dynamic region that has the Windows time is not ignored:

(click image to get full size)

Applitools_Eyes_RDP_screenshot_differences_with_no_region_exclusion_time_different
What happens when clock area isn’t excluded

Here’s a screenshot within Applitools that shows a text overlap/layout issue that was found as part of using the Applitools screenshot comparison (with a resolution of 1024 by 768):

Applitools_Eyes_Text_overlap_layout_issue_detected_with_lower_resolution
Applitools_Eyes_Text_overlap_layout_issue_detected_with_lower_resolution

This is the story about how we reduced our UI Automation code by over 50% while greatly increasing the overall test coverage.

 

Read More

Ido BarkanUsing vagrant and fabric for integration tests

At cloudshare, our service is composed of many components. When we change the code of a given component we always test it. We try to walk the thin line of balancing the amount of unit tests and integration tests (or system tests) so that we can achieve reasonable code coverage, reasonable test run time and, most importantly, good confidence in our code.

Not long ago, we completely rewrote a component called gateway. The gateway runs on a Linux machine, and handles many parts of our internal routing, firewalling, NAT, network load balancing, traffic logging and more. It’s basically a router/firewall that gets configured dynamically to impose different network related rules according to the configuration it knows about. This is done by reconfiguring its (virtual) hardware and kernel modules. It is a python application packaged and deployed using good old debian packaging.

(more…)

Read More

Asaf KleinbortThe Road to Continuous Integration and Deployment

Brief history background:

Six years ago, when I joined CloudShare, I was really impressed by the fact that we release a new version every two weeks. Back then, many companies were still practicing the waterfall methodology and “continuous delivery” was still an unknown term for most of us.

Over time, we evolved. We migrated from SVN to GIT, and began practicing continuous integration using Jenkins. We even had two attempts toward ‘continuous delivery’. Both were purely technical, and handled solely the engineering aspect.

The first was focused on our development pipeline and had a very positive impact: our CI process has matured, we improved our infrastructure to allow much less downtime during deployment, and we upgraded our deployment scripts. However, the release cycle didn’t change.

The second effort was focused on configuration management. We initiated an effort (which is still on-going) of keeping all of our configuration in Git using Chef. As a result, the deployment and configuration management of our very complex service has become significantly more mature. However, again, the release cycle and our dev process in general did not change.

Our old process cons (in short):

Looking at our development process as a whole I identified a few problems and aspects that were ‘out dated’ and needed improvement.

1. We had a two week code freeze for each release. This resulted in developers working on a different version than the testers were testing. Actual time from finishing coding to release was generally 3-4 weeks and never less than 2 weeks.

2. Due to historical reasons, QA was not an integral part of the teams. A very ‘waterfall-like’ structure. The downside of this is described in many places. In my view, as a former developer and a former team leader in CloudShare, the most painful disadvantage in this structure is the fact that a team leader is not independent. Even when implementing a simple work item that was completely under her ‘jurisdiction’, the Team leader needs to depend on the QA team to finish her tasks. And QA teams usually have their own prioritization and plans.

3. We had everything coupled together: delivery schedule was coupled with user story definition, user story implementation and prioritization process.

The above resulted in a risk for our quality. Team leaders and developers were required to handle complex coordination tasks to ensure the quality of what they release. We had too many parallel items ‘in progress’., For a developer/team leader or QA engineer to “not drop the ball” had become a very non-trivial task.

Due to the above (and more) I had been thinking of changing our process for more than a year. But I always had excuses to postpone the change: “I have a new QA manager”; “We have a new product manger, this is not the time for a change”; “I am missing several QA engineers to support the new structure”; “We just need to finish our new build scripts / our new provisioning methods / our new something very technical”; and so on. You get the picture.

Change!

About a month ago, I was (again) challenged by our new product leader: “This is a nice process, but it is really not optimal and up-to-date. Why can’t we release every week?” I finally decided to bite the bullet and lead the change in our development process.

The change is happening right now, so it is much too early for a retrospective or conclusions. I’ll just mention the highlights of what we are doing and will elaborate on our choices and conclusions in different post(s).

What we are doing:

1. We re-organized the teams. QA engineers are now an integral part of each team. The QA team remained very small and is focused mostly on automation tasks. The main advantage here (among many) in my perspective is that every dev team is now able to independently deliver most of its items. Our Team leaders will continue to act as product owners of their teams, but will now have an interdisciplinary and more capable team,

2. We are moving to a delivery cycle of 1 week. The act of the deployment itself (which is pretty straight forward) will be done by all the Dev group members in turns.

3. We are implementing a Kanban method. This will allow us to decouple the delivery cadence from user story implementation. We are still trying to fit every task in no more than two weeks in order to keep our ‘delivery batches’ small, but we do not force a ‘strict time box’ for each work item.

It will also allow us to easily evolve towards continuous delivery for when our deployment pipelines are automatic and mature enough for us to deploy every day – or even several times a day. In other words, we are not waiting for the technology to lead us. We’re building a process that fits current and future deployment capabilities. We decoupled the prioritization process as well. We will have a ‘backlog grooming’ meeting once a month for start. Using Kanban, we will be able to enforce strict limits on the amount of our ‘work in progress’ and to identify our bottlenecks. We are still learning the limits.

Summary:

That’s it. This is our plan. We started last week. I am sure we will hit a lot of bumps in the road, but I am very excited and have a very good feeling about this change. We’ll update on specifics (like our Kanban board columns and limits), lessons learned, and other outcomes in future posts.

 

Read More

Asaf KleinbortCloudShare’s new TechBlog

Hi,

I am happy to introduce the new CloudShare Tech Blog.

It will be focused purely on technical aspects, mainly those that the CloudShare Dev Team encounters in our daily work.

A significant part of CloudShare customers are developers, so part of our customers or prospects might find the ideas and thoughts we will share here interesting. However, this blog is not about marketing to our customers – it is about sharing our ideas, challenges and thoughts with our peers in the software development community.

The idea of starting this blog came to me from the Netflix tech blog. I never really use Netflix, however I read and enjoy their tech blog, specifically during the period when they describe the challenges of migrating their infrastructure to AWS.

The service we develop here is complex and interesting from a software engineering point of view. We have a modern UI, a very complex businesses logic layer, and our own cloud, which is delivered to our customer as a service. We have a lot to share around these topics.

Another interesting aspect (especially for me) is the Development process.
There are many questions here: how to code, how to test, what to test and how much, how often we should release, what our org structure should be in order to provide the best quality and efficiency, and many more.

Since CloudShare’s early days, have been striving for efficiency in our process and have tried to keep up to date with what we thought were the best methodologies.
However, the world is moving as well, and fast. And we are continuously finding out that what is today a state of the art modern approach, can become commodity very quickly. And that there is always room for improvement. We will use this stage to share our thoughts around development methodologies.

I hope we can use this blog to hear back opinions from the developer community as well.
Thank you for reading and participating!

Read More