Thursday, September 7, 2017

Is R production ready?

It has been a long time since our last post, and many things have changed for both of us: new jobs, new adventures, and new challenges. I will not delve into the details, but let's say that although our business domain changed, we are still facing architectural and programming problems that we can all enjoy :)



Intro

Recently I was asked to incorporate time series forecasting into our product. We have been working closely with an in-house team of data scientists. They developed the algorithm, and then it was up to us to decide how to productize it. If you ask me, one of the biggest challenges facing developers in this Big Data era is taking very complex algorithms and methods (usually written by non-product-facing people) and working out the right approaches and tools to make them work in production. An algorithm can provide great results, but when you look closely at the time it takes to run and the load you will be facing, you may find that in practice it is useless.



The problem

So the problem we were facing was creating a time series forecasting solution that runs 24/7. We needed to gather utilization values from a third-party system and create weekly and daily predictions.



Expected load


Number of requests

Our system works in peaks. We can be asked to compute a prediction following a change in the system, or several times a day. I will concentrate on the latter case, since it produces the maximum load: several times a day we needed to serve 4000 forecasting requests, all arriving within a one-minute interval.
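To put the peak in perspective, here is a rough back-of-the-envelope sketch. The request count and the one-minute window come from the text; the worker count and the idea that the peak should drain within the same minute are assumptions made only for illustration, and the class and method names are hypothetical.

```java
// Back-of-the-envelope figures for a single peak.
// 4000 requests and a 60-second window come from the text above;
// the worker count is an assumed parameter, not a measured value.
class PeakLoad {
    /** Average arrival rate in requests per second. */
    static double requestsPerSecond(int requests, int windowSeconds) {
        return (double) requests / windowSeconds;
    }

    /**
     * Time budget per forecast in milliseconds, assuming the whole peak
     * should be processed within the window by the given number of workers.
     */
    static double budgetMillisPerForecast(int requests, int windowSeconds, int workers) {
        return 1000.0 * windowSeconds * workers / requests;
    }

    public static void main(String[] args) {
        System.out.printf("%.1f requests per second%n", requestsPerSecond(4000, 60));
        System.out.printf("%.0f ms per forecast with 8 workers%n",
                budgetMillisPerForecast(4000, 60, 8));
    }
}
```

Under these assumptions the average rate is about 67 requests per second, and 8 parallel workers would each have roughly 120 ms per forecast.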


Data Points Length

We needed to forecast about 3 months into the future, at daily and weekly granularity. In numbers, this means that we had 180 data points and needed to predict ~90 future values.



Possible solutions to run the prediction

We really racked our brains over the best way to tackle this. Remember that the data scientists wrote their code in R.

At first there were suggestions to rewrite their R code in Java or Python. Since this is not our field of expertise, we quickly dropped this approach.

We then looked at the possibility of running the R code using Rserve. Rserve is a TCP/IP server which allows other programs to use R. This solution required us to create another container to run Rserve itself, which would need maintenance and would also add network lag. We could have installed Rserve on the same machine as the client, but that wasn’t possible for security reasons.

Lastly we looked at rpy2. rpy2 is an interface to R running embedded in a Python process. This gave us the freedom we needed and let us use the expertise we have in Python, along with the ability to withstand the load we were expecting. It’s easy to use and configure (Windows needed some hacking, but we were able to overcome it).


Chosen Architecture

So we decided to keep the R code and run the prediction using rpy2. But that was just half of the way. We still needed a supporting ecosystem that would trigger the R code and sustain our required load. We created a dedicated Red Hat VM with 8 cores. On it we installed an Apache web server that used Flask to receive requests from consumers. Python was then used with rpy2 to run the R code. We had a ProcessPoolExecutor, which uses a pool of processes to execute calls asynchronously.

The component that actually sent out prediction requests was another service in our system, written in Java. We decided that it would send asynchronous requests in bulk. It had a retry mechanism and sent a callback address to the Apache web server.
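The sending side described above can be sketched roughly as follows. This is not the actual service code: the class, the request fields, the retry count and the transport abstraction are all hypothetical, and the real HTTP call is replaced by a Predicate so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

// Illustrative sketch only: sends a bulk of forecast requests asynchronously,
// retrying each one a bounded number of times.
class BulkSender {
    /** Hypothetical payload: which series to forecast and where to post the result. */
    static final class ForecastRequest {
        final String seriesId;
        final String callbackUrl;

        ForecastRequest(String seriesId, String callbackUrl) {
            this.seriesId = seriesId;
            this.callbackUrl = callbackUrl;
        }
    }

    /** Tries to send one request up to maxAttempts times; returns true on success. */
    static boolean sendWithRetry(ForecastRequest request,
                                 Predicate<ForecastRequest> transport,
                                 int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (transport.test(request)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // A thrown error counts as a failed attempt; fall through and retry.
            }
        }
        return false;
    }

    /** Sends the whole bulk asynchronously and returns the requests that never succeeded. */
    static List<ForecastRequest> sendBulk(List<ForecastRequest> bulk,
                                          Predicate<ForecastRequest> transport,
                                          int maxAttempts,
                                          int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Boolean>> futures = new ArrayList<>();
            for (ForecastRequest request : bulk) {
                futures.add(CompletableFuture.supplyAsync(
                        () -> sendWithRetry(request, transport, maxAttempts), pool));
            }
            List<ForecastRequest> failed = new ArrayList<>();
            for (int i = 0; i < bulk.size(); i++) {
                if (!futures.get(i).join()) {
                    failed.add(bulk.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<ForecastRequest> bulk = new ArrayList<>();
        bulk.add(new ForecastRequest("series-1", "http://example.invalid/callback"));
        bulk.add(new ForecastRequest("series-2", "http://example.invalid/callback"));
        // Stand-in transport that always succeeds; a real one would POST to the web server.
        List<ForecastRequest> failed = sendBulk(bulk, request -> true, 3, 4);
        System.out.println("Failed requests: " + failed.size());
    }
}
```

Because the forecast result comes back later through the callback address, the sender only needs to know whether the request was accepted, which is why the transport here returns a plain boolean.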

Results


Well, I must say I was really surprised. The system has been running for six months now. The web server is stable and holds the load easily. Moreover, the load keeps growing with no signs of breaking.

As we grow, we expect new challenges. In other words, watch this space :)

Installing and configuring rpy2 on Windows using IntelliJ


This is going to be a quick one. If you need to install rpy2 on Windows, here are a few tips I thought I might share.
How do I install all this on my machine?
  • Install
    • R 3.3
    • Anaconda2 (with Python 2.7)
  • Add to Path:
    • C:\Program Files\R\R-3.3.3\bin
    • <Anaconda2 folder>
    • <Anaconda2 folder>\Scripts
    • <Anaconda2 folder>\Library\bin
  • In your <Anaconda2 folder> run as administrator: "conda install -c r rpy2=2.8.5"
  • Validate that the Python packages are installed in the Anaconda Python (in IntelliJ, use Alt+Enter on the missing import):
    • numpy
    • pandas
    • rpy2
    • etc.
  • Validate existing env variables:
    • R_HOME = C:\Program Files\R\R-3.3.3
    • R_USER = <Anaconda2 folder>\Lib\site-packages\rpy2
  • Navigate to the R_HOME location and run R ("r" in cmd) to install the following libraries (install.packages("<library>")):
    • >install.packages("zoo")
    • >install.packages("timeDate")
    • >install.packages("forecast")
    • >install.packages("xts")
  • Open the code in IntelliJ and configure the IntelliJ Python interpreter to <Anaconda2 folder>\python.exe

Monday, June 15, 2015

Agile Estimation - Are We Missing Something?

A respected item in any development group’s manifesto is enhancing people’s knowledge and expertise. However, it often ends up in the infamous “Important/Not Urgent” quadrant of the Eisenhower matrix and gets less attention than it should. Even if an organization does invest effort in this, it is usually managed separately from the development process.
In this post I will suggest an agile approach for measuring and monitoring knowledge transfer among team members in an agile environment. I believe that implementing this technique will also have a positive effect on the team estimation quality.
The agile team routine revolves around a “Task”. We plan, estimate and implement tasks. Most of them are created when user stories are broken down, but many others exist, such as bugs, merges, refactoring, etc.
Time spent on a typical task can roughly be divided in two: time spent learning and time spent working. Given a task, can you state which portion of it was spent on learning and which on working? Well, you might be able to give a rough estimate, but it is certainly not trivial. This made me wonder if there is an easy way. Here is a mental exercise for you. Imagine you perform a task twice in the exact same way. Each time you start from the exact same point. Only the second time, you are armed with the experience and knowledge you gained the first time.
How much time would it take you to complete the task the second time?
The second time you don’t need to learn. So it is safe to say that all time spent the second time is “working time”. This gives us the following formal definition:
Given a team member M and a task T, the learning factor of T with regard to M is defined as the amount of time it takes M to perform the task the first time divided by the amount of time it takes M to complete the task the second time.
I apologize for the formality. Let’s have a look at two examples in the form of imaginary tasks belonging to an imaginary team of experienced Java developers. We will then try to estimate the learning factor.
Example 1: Use the XStream Java library to map about 50 XML files to Java classes.
What is the learning factor here? Well, if you believe their tutorial (and you should), it takes about 10 minutes to learn how to use XStream. Thus the learning factor here is about 1. If you start this task again, it will take you roughly the same time.
Example 2: Add to a website the option to login with a Google account.
Well, assuming you haven't done this before, most of the time here will be spent on learning. The actual implementation is just a few lines of code. So I would roughly go with ~8 here.
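As a small illustration, the definition can be written down directly. The durations below are made-up numbers chosen only to reproduce the two rough factors from the examples; the class and method names are hypothetical. The second method follows from the definition if we accept that the second run is pure working time.

```java
// Minimal sketch of the learning factor definition; all durations are hypothetical.
class LearningFactor {
    /** First-time duration divided by second-time duration (same unit for both). */
    static double learningFactor(double firstTime, double secondTime) {
        if (firstTime <= 0 || secondTime <= 0) {
            throw new IllegalArgumentException("durations must be positive");
        }
        return firstTime / secondTime;
    }

    /**
     * Share of the first run that was spent learning, assuming the second run
     * is pure working time: (first - second) / first = 1 - 1 / factor.
     */
    static double learningShare(double factor) {
        return 1.0 - 1.0 / factor;
    }

    public static void main(String[] args) {
        // Example 1: a day of mapping work plus ~10 minutes of learning (in minutes).
        double mappingFactor = learningFactor(490, 480);
        // Example 2: two days the first time, a couple of hours the second (in hours).
        double loginFactor = learningFactor(16, 2);
        System.out.println("XML mapping factor:  " + mappingFactor);
        System.out.println("Google login factor: " + loginFactor);
        System.out.println("Learning share at factor 8: " + learningShare(loginFactor));
    }
}
```

With these numbers, a factor of 8 means that seven eighths of the first run went into learning, while a factor close to 1 means almost none did.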


OK. So now we have a nice new definition. What can we do with it? First, let’s have a look at the existing method for planning and estimation.
Spoiler alert: The following paragraph might be considered heresy by some.
Agile methods aim to get all team members to agree on the effort needed for a given task. This is usually achieved by playing Planning Poker. In this game each player reveals his estimate for a task at the exact same time as the rest of the players. The estimates are then discussed in order to reach agreement. If you stop for a moment and think about it, you can’t help but wonder: why?
Agreeing on a single number implies that the effort is a property of the task alone. In practice we know this is not true. The time a person will spend on a task is highly dependent on who this person is. So why should we all agree on a task estimate, especially when we still do not know who will perform the task? It only makes sense that the exact same task will take different amounts of time for different people. In most development groups, people have different skills and expertise. The knowledge gaps can be in technologies, code, business domain, etc. So even if team members agree on the desired solution, it will still take some people more time than others.
How can we do it better?
What if we change the planning poker method just a little? From now on each person states two numbers. The first is an estimate for the working part and the second is an estimate for the learning part. The learning part is personal. Team members only need to agree on the working part.
Why is it better?
It gives us a nice way to measure and monitor learning in our team. We already measure how much time people spend working. What about learning? Using the suggested method, over each sprint we measure the amount of time people spend on learning. We can call it the Learning Chart. It is exactly like the Burndown Chart, only here we measure the number of points people spent learning over time. As with the Burndown Chart, we use the points given at the planning stage.
Here are some examples of learning charts and what they are telling us:
Consider a nearly horizontal learning chart, resulting from mostly tasks with a low learning factor. What does it mean? It means that the team is very efficient and everybody is doing exactly what they are good at. It also means that they are not learning new things and the knowledge is not well spread among team members.
On the other hand, a very steep learning chart might suggest a less efficient team and probably lower-quality work, since most of the time people are doing things they know little about.
So what is a good learning factor?
I actually don’t know the answer to that. I would assume it is at least 2, so that on average people spend half of their time learning. This might also correlate with the satisfaction and challenge your team members experience in their daily work.

What’s next? Try out my new idea and see if it actually makes sense. I will keep you updated…

Saturday, August 24, 2013

Writing a Java Regular Expression Without Reading the ***** Manual


Writing and/or maintaining regular expressions is a part of every developer's routine work. Hey, and we usually can't stand it. It's annoying, the syntax is not humanly memorable, and overall it is an experience that one wants to leave behind as quickly as possible, so he can move on to the actual problem he is facing. I wonder what would happen if we Poker Estimated an RE problem. What would be the deviation between the estimate and the time it really took?
You see, when we need to write a new RE we go through the following steps: 
  1. Visit the Pattern Javadoc for a quick recap of the syntax.
  2. Describe the RE in English. It goes something like: "start with 4 digits, followed by spaces, then the string "DUR", then again some spaces, and finally one digit"
  3. Translate the English description to Java syntax: "\d{4}\s+DUR\s+\d"
  4. Come up with examples. So here it will be something like: "1234 DUR 9" 
  5. Write a test validating the examples, thinking of edge cases, and making sure the RE is valid.
The situation is even worse when one needs to change an existing regular expression. Here we need to translate the RE syntax back to English, apply the changes and translate it back to RE syntax. This is again followed by examples and testing.
We are not alone facing this problem. Several solutions exist to help ease the process (e.g. txt2re). The problems with these solutions are:
  • They always require leaving the IDE.
  • They usually don't help with understanding an existing RE, but rather only help create new ones.
So what do we suggest? We present the Regular Expression Wizard, a new approach for writing and maintaining Java regular expressions. This is a Java-based project that aims to help you write REs fluently using the Wizard Design Pattern.
How simple can it get? Let's write the RE from our previous example using the new wizard. Just create a wizard object, then use static methods to slowly build your own RE, followed by examples for testing.
    // Builds the RE for: 4 digits, spaces, "DUR", spaces, one digit.
    RE_Wizard re = new RE_Wizard();
    String dur = re.start().
            a_character_described_as(a_digit).exactly(4L).then().
            a_character_described_as(a_whitespace_character).once_or_more().then().
            a_fixed_string("DUR").then().
            // "then again some spaces", so this also needs once_or_more()
            a_character_described_as(a_whitespace_character).once_or_more().then().
            a_character_described_as(a_digit).
            // each example is validated against the RE being built
            for_example("1234 DUR 9").for_example("4423   DUR 1").the_end();

Here there is no need for steps 1 (syntax recap), 3 (translating to RE syntax) and 5 (writing a test). Note that if a stated example does not match the regular expression, then an ExampleDoesNotMatchRegularExpression exception will be thrown. All you need to do is describe the RE in English and come up with some examples. The best part comes later on, when you need to change it. Again, you do not need to deal with weird syntax. You only need to know English.
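For comparison, the hand-written route (steps 3 to 5 in the list above) might look roughly like this with plain java.util.regex. The class and helper names are hypothetical; the pattern and the two positive examples are the ones from the text, and the negative examples are added only for illustration.

```java
import java.util.regex.Pattern;

// Sketch of the manual approach: write the RE syntax yourself and check examples by hand.
class DurPatternCheck {
    // Note the doubled backslashes needed inside a Java string literal.
    static final Pattern DUR = Pattern.compile("\\d{4}\\s+DUR\\s+\\d");

    /** Returns true only if the whole candidate string matches the pattern. */
    static boolean matches(String candidate) {
        return DUR.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] examples = {"1234 DUR 9", "4423   DUR 1", "123 DUR 9", "1234DUR 9"};
        for (String example : examples) {
            System.out.println("\"" + example + "\" -> " + matches(example));
        }
    }
}
```

The first two strings match and the last two do not (three digits instead of four, and no space before DUR).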

Let us take another example. Mkyong wrote a post on "10 Java Regular Expression Examples You Should Know". We took the one for creating a regular expression for time in a 24-hour format. 

    // ([01]?[0-9]|2[0-3]):[0-5][0-9]
    RE_Wizard re = new RE_Wizard();
    String timeRE = re.start().start_group().
            any_character_in("01").no_more_then(1L).then().
            any_character_in_the_range("0","9").
            or().
            a_fixed_string("2").then().
            any_character_in_the_range("0","3").then().
            close_group().
            then().
            a_fixed_string(":").then().
            any_character_in_the_range("0","5").then().
            any_character_in_the_range("0","9").then().
            for_example("06:58").
            for_example("6:45").
            for_example("23:12").
            the_end();
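To see what the regular expression in the comment accepts and rejects, here is a small plain java.util.regex check. It is only an illustrative sketch: the class and method names are hypothetical, the three valid times are the examples used above, and the invalid ones are added for contrast.

```java
import java.util.regex.Pattern;

// Checks the 24-hour time RE from the comment above against a few strings.
class TimePatternCheck {
    // Hours: an optional 0 or 1 followed by any digit, or 2 followed by 0-3.
    // Minutes: 00 to 59.
    static final Pattern TIME_24H = Pattern.compile("([01]?[0-9]|2[0-3]):[0-5][0-9]");

    /** Returns true only if the whole candidate string is a valid 24-hour time. */
    static boolean isValidTime(String candidate) {
        return TIME_24H.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"06:58", "6:45", "23:12", "24:00", "12:60"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isValidTime(candidate));
        }
    }
}
```

The group around the two hour alternatives matters: without it, the alternation would split the whole expression, minutes included, into two separate patterns.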

So where can you get hold of this? The wizard code can be found at https://github.com/azarian/wizards. Use it, share it, send us feedback, and forget about losing time writing REs.

Disclaimers
  • We did not implement all of the Java regular expression syntax, mostly due to time limitations. If anyone wishes to contribute, it will be highly appreciated.
  • We do not include instructions on how to use the builder. We hope it is straightforward. If it is not, then we are missing the point, so please let us know.

Saturday, May 25, 2013

The Dev-QA Delicate Relationship

The success of your product is directly influenced by the ability of your QA and Dev teams to work well together. The two are even more tightly coupled in the agile world, where QA and Dev work and deliver as part of the same team. Symbiosis between QA and Dev will shorten delivery time, create a more robust product, and overall increase team member satisfaction.

Saying the above is obvious. However, failing to understand the relationship between QA and Dev will take your product/team in the opposite direction. There is a delicate relationship between the two and a certain tension that must be confronted and not overlooked. Most of you have probably felt it in your workplace. You hear a QA engineer's question thrown into the air, followed by a smug reply that basically tells him he will never understand since he didn't write the code. Or the other way around, when a developer asks a question about the product and the QA engineer gives him a look that says "you really need to get out of your little world. There is a whole cosmos waiting for you..."

There are several symptoms/causes that can help you identify the level of tension in your workplace:

Domain knowledge is mostly in the QA hands

In this situation the developer works in a vacuum. He understands enough to accomplish his tasks, but not enough to make his code reusable. He cannot foresee new advances in the field of interest. He is like a blindfolded ox plowing a long corridor.

Lack of respect

You all know it is there, and on both sides. "This feature was written with so many bugs, my grandma would have written it better", or maybe "How dare he open this bug? It just shows me how little he understands..." Each side digs its own trench while blaming the other side for every single problem the earth has encountered.

Over Testing

There seems to be a tendency to retest the entire product after each change (which should be prevented by proper automated sanity tests, and not by manual checks). Checks are too strict. This slows down product improvement and frustrates developers.

Under Testing

Features are written under pressure, and as such are tested under pressure. Not all edge cases are simulated. This may cause frustration on the QA side, since they are the ones who signed off on the feature.

Who's the Boss?

Developers sometimes see QA as their personal assistants. They might ask the QA to complete tasks that are not directly related to QA but mostly to save "expensive" developer's time. 

Who is to blame?

In places where QA is held responsible for product quality, every bug that was shipped with the product has the potential to spark a new fire. Who is to blame?

What can we do as managers to help reduce this tension?

  • Cross-functional teams. Put them in the same team and make the entire team responsible for the product. As we said before, this is already happening in the agile era.
  • Let them do each other's job. Let the QA do some Dev in the form of writing scripts or anything that will make them understand that bugs are inevitable. Let developers do some QA so they will understand the horror of saying: "I tested it and it is ready for shipment".
  • The layer of team managers should originate from Dev and QA both, thus giving the management a broader perspective.
  • Management must have excellent interpersonal relations and be aware of the tension, confronting it when necessary.

Sunday, May 12, 2013

Gambling in Software


I want to tell you about a meeting we had a few days ago. It reminded me of “The Jack Story” (which was part of an old stage routine of Danny Thomas many years ago).

Here’s how it goes:
A traveling salesman gets stuck one night on a lonely country road with a flat tire and no jack. He starts walking toward a gas station about a mile away, and as he walks, he talks to himself. "How much can he charge me for a jack?" he wonders. "Fifty dollars sounds reasonable. But it's the middle of the night, so maybe there's an after-hours fee. Probably another five dollars. Wait.... He'll probably figure I got no place else to go for the jack. Fifty dollars more."
He goes on walking and thinking, and the price and the anger keep rising. Finally, he gets to the gas station and is greeted cheerfully by the owner: "What can I do for you, sir?" But the salesman will have none of it. "You got the nerve to talk to me, you robber," he says. "You can take your stinkin' jack and shove it..."

The meeting was about a new feature requested by one of our customers. The feature was quite clear and we started talking about how we should implement it. At some point one of the participants claimed that if they need this feature, they will surely need another related feature. A third guy immediately followed with: "if this is the case then we should also implement this feature…". This routine continued for a few rounds until everybody was convinced that this feature was too big and should be rejected.

It seems that, more often than one might think, we follow “The Jack Story” while building software. Fortunately, our story ends well. When we got back to the customer and explained to him why we must reject the feature, he stated that none of our guesses were true and he really only needed the original request. This time we got lucky. No extra work was done and we did not lose any customers.

But it got me thinking. Did we do something wrong?

Now the typical agile practitioner would argue that we simply should not have added new requests to the original user story. Well... obviously my colleagues and I know this argument. We also know that the customer often does not fully understand what he really needs. Moreover (perhaps not in this case), any company sometimes needs to be ahead of the market instead of following it.
Actually, as software engineers we often do more than we are explicitly asked to (over doing). We enhance existing features. We build our code to be more generic and powerful than we currently need. We basically gamble on future needs, and I deliberately use the verb 'gamble' and not the verb 'guess' because there are definite rewards for good bets. Naturally, 'Over Doing' also relates to a person's character. Some will choose the 'Over Doing' approach more often than others. But everyone does it at some level.

Usually, wherever there is a gamble there are measures and statistics. They are needed in order to track our gamble and measure the profit. This is also the case, for example, in software estimation, which in essence also involves gambling. We continuously review our past estimates in order to improve our future ones. But this is not the case with 'Over Doing'. We never mark which part of our work is mandatory for now and which is a gamble on a future need. As a direct result, we never come back to check if we were right.

So I answered myself: No, we did not do anything wrong. We should continue to gamble on future needs. But we also must find a way to document and review our gambling. It will enable us to estimate the profit of our gamble and help us improve future ones, avoid over engineering where it is not needed, and insist on generic code where we see future opportunities.

 

Epilogue


A key tool for a manager is metrics. We already know that traditional metrics in software engineering often provide little help for a project's success. You can read about other metrics we suggested in Effective Unit Testing - Not All Code is Created Equal. In the agile era we are on a quest to find new metrics: new things to measure which might help us navigate our project to a safe shore. This post tries to suggest one such alternative metric which might be useful.


Wednesday, March 13, 2013

The Inner Software Model and the End User


When I build software I always do it aligned with a model. The model evolves with the  software and in many cases defines the boundaries of what can and can't be done (that is without modifying or breaking it). A good model is one which is simple to understand yet powerful enough to allow the introduction of new features.

A good model makes me happy. If I developed it, it will be the first thing I show when presenting my work. If it is someone else's, it will be the first thing that makes me appreciate their work. Actually, I think so highly of the importance of a good model that I have made the mistake of asking my users to learn it too.

Users obviously view the world through their own eyes. Where you might recognize several use cases as the same one, your users might see them as completely different cases.
It seems I am not the only one taking this approach. Remember the first days of the Android OS. One of the first things they were proud of was: "Everything is an application". Indeed, as a software engineer, the fact that every piece of functionality on top of the operating system is modeled as an application is simple yet powerful. But as a user I always shifted uncomfortably in my chair when pressing the applications button and finding the Phone application there. You see, as a user I have a phone device with phone-related functions, and I have the applications, which are an extension to the phone. Finding the phone icon and contacts icon in the applications section confused me, especially in the early days of Android, when the phone application shortcut was permanent. iPhone OS took a different approach, where some of the device functionality was presented to the user as OS features (e.g. Siri).

Another example is the JavaScript language and Object Oriented Paradigm. In this example the user is the JavaScript developer trying to use it as an Object Oriented language. Again you have a powerful simple model (everything is a function) which enables you to implement any Object Oriented principle. However, each concept requires a special usage of the model (Hint: want to define a class? use a function).
On the other hand, Java takes a different approach. One example that comes to mind is the enums introduced in Java 5. Although one can easily implement an enum by hand (see http://www.javacamp.org/designPattern/enum.html), they still decided to include it in the language.
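To make the contrast concrete, here is a minimal sketch of both options side by side: a hand-written "typesafe enum" of the kind people built before Java 5, and the same concept using the built-in enum keyword. The Color example and all names are hypothetical and chosen only for illustration; this is not the code from the linked page.

```java
// Two ways to model a fixed set of constants in Java.
class EnumComparison {
    /** Hand-written typesafe enum: possible with the general class model alone. */
    static final class LegacyColor {
        static final LegacyColor RED = new LegacyColor("RED");
        static final LegacyColor GREEN = new LegacyColor("GREEN");

        private final String name;

        // Private constructor: no instances beyond the constants above.
        private LegacyColor(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    /** The same concept as a first-class language feature. */
    enum Color { RED, GREEN }

    /** Built-in enums also work directly in switch statements. */
    static String describe(Color color) {
        switch (color) {
            case RED:
                return "stop";
            case GREEN:
                return "go";
            default:
                throw new AssertionError("unknown color: " + color);
        }
    }

    public static void main(String[] args) {
        System.out.println(LegacyColor.RED + " vs " + Color.RED);
        System.out.println(Color.valueOf("GREEN") + " means " + describe(Color.GREEN));
    }
}
```

The hand-written version shows that the general model was already powerful enough; the built-in version gives the user a dedicated concept, along with extras such as values(), valueOf() and switch support.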

What is the correct approach? Taking the first approach, in which the model is generic and is also exposed to the user, is cheaper and easier to develop. Yet it will produce less friendly software. So I believe the key consideration here is: who is your user, and will he be able to learn and adapt?

Recently I have started to adopt a hybrid approach. I expose both the general powerful model to the advanced user and a simple domain oriented interface for the average user.

To sum things up, I highly recommend (especially to developers) paying attention to the difference between the users' point of view and the model. Moreover, when deciding on the correct approach, consider both the nature of your users and your resources.