Wednesday, December 10, 2014

I was on call for a week...and it didn't entirely suck!

Yup, it didn't suck. In fact it was actually pretty good -- I'd go so far as to call it kind of exciting and certainly educational. If you've been (or imagined being) on 24/7 call for a week for a high traffic internet service and think I sound insane, allow me to explain...

Earlier this year I started a new job as a software engineer at Sovrn in Boulder, CO.
“sovrn is an advocate of and partner to almost 20,000 publishers across the independent web, representing more than a million sites, who use our tools, services and analytics to grow their audience, engage their readers, and monetize their site.”
In plainer English, this means we offer technology so websites can display ads and earn money. Although I personally think it would be great if more content-based websites could sustain themselves through contributions from their visitors, as Wikipedia and Brainpickings do, the reality is that advertising is a more pragmatic revenue model for many website operators. In fact, advertising revenue helps power much of the internet you and I use every day.

We at sovrn work at serious scale: according to Quantcast, in October of 2014 sovrn ranked as the 4th largest ad network in the world and the 3rd largest in the US.

This means we move billions of records of data all day, every day, from multiple datacenters to a central collection point for processing and analysis. This constant flow of data has to cope with the inevitable network and hardware issues that arise, and the data is ultimately transformed and loaded into warehouse and near-realtime reporting data stores.

Interruptions to any part of the system can cause a variety of issues, from delayed data through capacity challenges and loss of revenue for our customers. One facet of maximizing uptime and dealing with service interruptions is a sophisticated monitoring and alerting system that functions continually, necessitating on call engineers with software development, data management and enterprise IT skill sets.

My first exposure to being on call recently ended and I really did enjoy it. Although serving adverts might sound straightforward, there's a fascinating degree of sophistication involved, and when doing it at scale the problems only get more interesting.

Up until my on call week my understanding of the "big picture" of our operations was limited, having focused primarily on the needs of the team I'm part of. One of the great things about my on call experience was how it gave me a much greater appreciation for how everything fits together, and some exposure to the operational aspects of the data processing pipeline and big data toolset we employ at sovrn. (For the curious, we're using Kafka, Mirrormaker, Zookeeper, Camus, Storm, Cassandra, Hadoop and more.)

Besides getting to see all of this stuff hum along in production, there's a definite air of excitement to dealing with an incident. We use a small set of tools to manage our on call duties including Icinga (for system monitoring), VictorOps (for managing on-call schedules and messaging on-call engineers), HipChat (we use a dedicated channel for production issues, which helps keep all interested parties informed and allows multiple participants to work an incident collaboratively) as well as a wiki for knowledge-base articles.

I've worked in jobs before where the software engineers didn't get anywhere near production -- primarily due to regulatory considerations necessitating a strong separation between development and operations. Although those separations may help address fraud and other similar concerns, they inhibit other very positive things besides the excitement and "big picture" comprehension I've already mentioned.

First, there's a definite camaraderie that emerges from trying to figure out what's going on when you're getting alert after alert one evening on a weekend and have to ask colleagues to help. This necessitates a level of communication and cooperation across teams that might not otherwise happen all that often and is definitely a very positive thing.

Secondly, seeing how your code responds in production is a phenomenal feedback loop for software engineers. You have a lot more skin in the game when you and your colleagues will be receiving alerts for failing systems. Suddenly great logging and debugging characteristics are first class concerns. Nothing will focus the mind on the need for writing high quality, easy to support code quite like this.

Hopefully now that explains my viewpoint and you no longer think I'm completely mad...


Wednesday, June 11, 2014

Oh CenturyLink, thou art a source of much mirth...

Had some trouble getting a connection over VPN for my new job. This resulted in much to-ing and fro-ing with their elite "Internet Escalation Team" (like the Apple store's Genius Bar with less genius, if you see what I mean...)

This exchange yielded several hilarious gems, not least of which:

Honestly Jon, you are FAR more the expert than any of us here, we have no training except to help customers find the application forwarding section in the modem under the advanced setup.

and
We don't have folks “in the know” about VPN stuff. Compare it to the car dealer not really having a way to trouble shoot an “aftermarket” NOS kit someone installed on a car. Like how a realtor can't troubleshoot the new outdoor pool construction being done on a house they sold, that didn't have one at the time of sale. Does your helpdesk have a specific item to address? I don't think we can help you with your customization of the service. VPN is beyond our knowledge.

Thank goodness I had some idea what I was doing and was able to figure out that the problem was the modem I had. After proving that my VPN connection could be established from the house of a neighbor who also used CenturyLink as their ISP, I managed to get CenturyLink to send me a replacement modem and all was well.

Friday, May 30, 2014

Note to self: bash and git setup

Some notes on my bash and git setup

As much as I like IntelliJ and its git integration, it makes me feel like I'm just dealing with the tip of the git-iceberg. So I like to try and use the command line as much as possible as a way to increase my understanding of git and all the mad-clever things it can do. That's led me to want command line completion, including for git-flow (and if you're not using git-flow, why not? ;-), visibility of what branch I'm on in my prompt, and a few aliases and other configuration items to make things easier. It's actually not all that much but it's made a big difference to me.

Here's what the salient parts of my ~/.bash_profile and ~/.gitconfig look like:
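
As a rough illustration of the ~/.bash_profile side, here is a minimal sketch of a prompt that shows the current branch and loads completion scripts. It is a reconstruction under assumptions rather than my exact file: the function names (current_git_branch, git_prompt_segment) and the completion script locations are hypothetical and will vary by install.

```shell
# Print the current git branch name, or nothing when not on a branch
# (for example outside a repository).
current_git_branch() {
    git symbolic-ref --short HEAD 2>/dev/null
}

# Build the "(branch) " piece of the prompt; prints nothing outside a repo.
git_prompt_segment() {
    local branch
    branch=$(current_git_branch) || return 0
    printf '(%s) ' "$branch"
}

# host:directory (branch) $
# Single quotes defer the command substitution until the prompt is drawn.
PS1='\h:\W $(git_prompt_segment)\$ '

# Load git and git-flow completion if the scripts are present.
# Assumption: these file names are placeholders for wherever yours live.
for script in "$HOME/.git-completion.bash" "$HOME/.git-flow-completion.bash"; do
    if [ -f "$script" ]; then
        . "$script"
    fi
done
```

Keeping the branch lookup in its own function means the prompt stays quiet outside a repository instead of printing an error.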


The git ls alias is a succinct answer to, "What's been committed lately?" at a high level. It simply lists the hash, how long ago it was committed, by whom and the commit message. I find it useful with --no-merges (which unsurprisingly omits merge commits from the output) and --author=<author> to limit things to just my work. It helps you answer the questions, "When did I commit feature X?" and "What did I do lately?" The git ll variation gives a little more detail by listing the files changed along with an indication of the type and size of change. Useful when the commit message alone doesn't help me answer "What did I do lately?" ;-)
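
For reference, aliases along these lines can be defined from the command line. This is a sketch of one way to get the behavior described, not necessarily the exact format strings I use: the placeholders %h, %ar, %an and %s are git's abbreviated hash, relative author date, author name and subject, and using --stat for the ll variant is my assumption about how to show the per-file changes.

```shell
# "git ls": one line per commit with hash, age, author and message.
git config --global alias.ls "log --pretty=format:'%h %ar %an %s'"

# "git ll": the same line, followed by the files changed and a
# +/- summary of how much each one changed (assumed to be --stat).
git config --global alias.ll "log --pretty=format:'%h %ar %an %s' --stat"
```

Extra options pass straight through to git log, so something like git ls --no-merges --author="your name" narrows the list to your own non-merge commits.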

Spending more time in a console made me want to be more adept at editing commands here too; I had for years made do with Home and End and the arrow keys. I even increased my key repeat rate to speed up moving back to correct errors with the left-arrow key. Now I've added a couple of things I really needed to improve matters:
  • Enabling vi mode; I knew about this, but hadn't found myself on the command line quite enough to care about it until recently. (Vi is another thing I like to "make" myself use, because proficiency in it really seems to pay dividends).
  • Figuring out how to navigate back and forward one word at a time while editing a command -- crucial for crazy long commands
All that's required is a few lines in your ~/.inputrc file:
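
A minimal sketch of what such a file might contain, written out with a shell heredoc. The key sequences for Ctrl-Left and Ctrl-Right are an assumption (they are common for xterm-style terminals but differ between terminals), and the script writes to a scratch file by default; the INPUTRC_OUT variable is a hypothetical name, and you would point it at ~/.inputrc to apply the settings for real.

```shell
# Assumption: default to a scratch file so nothing is overwritten by accident.
INPUTRC_OUT="${INPUTRC_OUT:-${TMPDIR:-/tmp}/inputrc.example}"

# Quoting 'EOF' keeps the backslashes literal, as readline expects them.
cat > "$INPUTRC_OUT" <<'EOF'
# Use vi key bindings for command line editing
set editing-mode vi
set keymap vi-insert

# Ctrl-Left / Ctrl-Right: move back / forward one word
# (escape sequences vary by terminal; adjust to match yours)
"\e[1;5D": backward-word
"\e[1;5C": forward-word
EOF

echo "Wrote $INPUTRC_OUT"
```

Readline reads the file at startup, so open a new shell (or re-read it with bind -f) after changing ~/.inputrc.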


Monday, May 19, 2014

GREP_OPTIONS='--color=always' is evil

Changing jobs and finding myself doing a lot more in the shell has had me tweaking my working environment a fair bit lately. Part of this involved wanting to highlight matches when using grep since I found myself grepping more in the last few weeks than the entire year prior to that.

The most pedestrian way of doing this is to invoke grep with the --color option, which according to man grep may be set to "never", "always" or "auto". What it fails to point out is the important difference between "always" and "auto" (more on that later), but having experimented briefly with "always" and seen what I expected, I proceeded with the assumption that it was a perfectly reasonable way to do things.

The prospect of typing --color=always all the time was not appealing however, and so after a quick search I hit upon the fact that one can set the environment variable GREP_OPTIONS to contain options to be passed when grep is invoked. My .bash_profile was swiftly modified to include export GREP_OPTIONS='--color=always'. And everything was good.

Except not really.

Later that week I was experimenting with some things from an old BASH book I had hanging around the house for years but never really dug into. One of said things was a function to create an ls command that filtered by date. Call it lsd (the book did) and it'd work thus:

> lsd 'may 11'
May 11 10:01 foo-report
May 11 10:42 cats.db

Except the weird part is that it didn't quite work right for me. The function was defined like this:

lsd() {
    date=$1
    ls -l | grep -i "^.\{38\}$date" | cut -c39-
}

For the uninitiated, that cut command is saying to cut the output from column 39 on through to the end of the line, the idea being to transform the regular output from ls -l which looks like this:

-rw-r--r--  1 jarcher  530219169   76 May 11 10:01 foo-report
-rwxr-xr-x  1 jarcher  530219169  127 May 11 10:42 cats.db

to the shortened date/filename form, column 39 being where the date part starts. What was weird, though, was that my output didn't seem to be cutting in the right place. I found I had to cut further to the right, which puzzled me at the time, but not enough to spend any cycles thinking about why this might be.
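
To make the column arithmetic concrete, here is a small sketch that runs the same grep and cut pipeline over the two sample lines above, held in a variable rather than read from a real directory. The variable names are mine, not the book's.

```shell
# The two sample "ls -l" lines from above; the date starts at column 39.
listing='-rw-r--r--  1 jarcher  530219169   76 May 11 10:01 foo-report
-rwxr-xr-x  1 jarcher  530219169  127 May 11 10:42 cats.db'

# Keep lines whose 39th column begins the requested date (case-insensitive),
# then drop the first 38 columns.
result=$(printf '%s\n' "$listing" | grep -i '^.\{38\}may 11' | cut -c39-)

printf '%s\n' "$result"
# With plain, uncolored grep output this prints:
# May 11 10:01 foo-report
# May 11 10:42 cats.db
```

The fixed column only holds while the owner, group and size fields keep the same widths, so treat the number 38 as specific to this listing.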

The following week I was tooling around with git flow since it seems like feature branches are the order of the day at the new gig (and doing it all through IntelliJ makes me feel like a charlatan). It seems pretty damn good too, although there was a rather obnoxious amount of typing involved such as git flow feature start foo-feature just to get working on a new feature. I suspected some command line completion magick was available and indeed it is.

Here's when things got weird, although I had no clue this was related to --color=always at this point. It seemed as though, suddenly, my git command completion was hosed, big time. One or two things worked, but much did not. Typing git followed by some double-tab action to show possible completions revealed the reason why:

jarcher-mbp15r:bash $ git
Display all 106 possibilities? (y or n)
^[[01;31m^[[K                  flow
a^[[m^[[Kdd                    g^[[m^[[Kc
a^[[m^[[Km                     g^[[m^[[Ket-tar-commit-id
a^[[m^[[Knnotate               g^[[m^[[Krep
a^[[m^[[Kpply                  g^[[m^[[Kui
a^[[m^[[Krchimport             h^[[m^[[Kash-object
etc...

Yup, almost all the subcommands were funky like this. No wonder I couldn't choose between checkout and cherry-pick. Neither of them began with ch anymore...

Believing git flow completion to be the culprit, I googled along those lines. Luckily I turned up one (and only one!) page of interest, where somebody else reported the same symptoms, and a day later that they'd determined the following line in their bash configuration was the culprit: alias egrep='egrep --color=always'

Alarm bells rang. I remembered that I'd recently set things up so my grep command would be invoked with --color=always; I even had this vague recollection that I'd read always meant that when the output was piped to another command the color-inducing control characters would be passed along too. By contrast, auto would only include those color control characters when the results of grepping were destined for immediate output on screen.

I unset my GREP_OPTIONS variable et voila, suddenly it all worked as it should. A quick look at the git completion script confirmed that grep is used in there, explaining to my satisfaction why --color=always was screwing things up.
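
The difference is easy to see in isolation. The sketch below pipes a made-up list of subcommand names (a stand-in for what the completion script processes, not its real input) through grep with each setting and captures the result, just as a script would.

```shell
# Hypothetical sample input standing in for a list of git subcommands.
subcommands='add
checkout
cherry-pick
grep'

# "always" emits color escape sequences even though the output goes into
# a pipe, so whatever reads the pipe receives them as part of the data.
always=$(printf '%s\n' "$subcommands" | grep --color=always '^ch')

# "auto" colors only when stdout is a terminal; here it is a pipe
# (command substitution), so the text comes through clean.
auto=$(printf '%s\n' "$subcommands" | grep --color=auto '^ch')

# cat -v makes the escape characters visible as ^[ sequences.
printf '%s\n' "$always" | cat -v
printf '%s\n' "$auto" | cat -v
```

If you want highlighting without the trap, an alias such as grep='grep --color=auto' keeps color on the terminal and leaves piped output untouched.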

Since this was such an evil little trap I thought it was worth blogging about. Maybe it'll save somebody else from suffering this confusing problem.



Friday, May 3, 2013

Rule 30

Fun with cellular automata, specifically Wolfram's Rule 30.

HTML canvas + Javascript. Other than that and what you see below, no interesting write up (sorry :-)

Monday, August 20, 2012

A (very) brief foray into Spring 3.1

Introduction

As mentioned in my last post, I'm reacquainting myself with Java web application development. This post covers my very brief foray into using Spring 3.1.

I created a basic starter app using a Maven archetype. The generated starter app included Spring MVC 3.1, Spring Security (didn't need this, but it was easy to control what URLs were protected or not), Tiles and JPA/Hibernate/HSQLDB for persistence.

In terms of content and code, there was an index page and its corresponding controller class. Most useful to me though was a User POJO (set up as an entity that worked with persistence) and the corresponding UserRepository and UserService classes. The basic pattern seems to be that controllers interact with services to get done what they need done. In turn, Services (may) talk to Repositories (Repository being Spring-speak for a good old fashioned DAO) to perform the usual create, read, update and delete operations.

The way this starter app was configured was primarily Java based (rather than the more traditional XML based configuration). This was interesting to see, although when googling around for examples one tends not to yet find so many using this approach.

OK, moving on. As mentioned in my original post, I am going to just put together a page that lists job data from a database, as well as providing a form to add more job data.


Building a page to display a table of information retrieved from a database

I needed to create several items in the project to achieve this:

  • A Job POJO – Job.java
  • A controller corresponding to that page – JobController.java
  • A service for the controller to interact with – JobService.java
  • A repository (i.e. DAO) to manage the persistence of jobs – JobRepository.java
  • And of course a web page. Went with a regular JSP called jobs.jsp
The Job POJO had to be annotated appropriately with JPA annotations, thus:

After which it just worked. (Schema creation was covered by the hibernate.hbm2ddl.auto=create setting in persistence.properties – good enough for this experiment).

The JobController had to be annotated as a controller and instructed what request to kick in for with the @RequestMapping annotation. Note also the @Autowired JobService – this is how we inject the JobService here:

Note the index method also has a @RequestMapping annotation. At this point I could have just annotated either the class or the method itself with both the value and method attributes, like this: @RequestMapping(value = "/jobs", method = RequestMethod.GET) -- but instead they are separated in anticipation of a method that will handle the post from the form later on.

Another key thing here is to note our list of jobs is added to that Model. This is how it gets into scope for use on the view (JSP page) you’ll see below.

Moving on to the JobService, which shook out like this:

Note the @Service annotation, which I understand is just a specialized form of @Component, though obviously the intent is clearer. Its presence ensures the class gets picked up during initialization and can participate in dependency injection. (Additionally, @Controller and @Repository fulfill the same duty, ensuring classes marked as such are picked up during the class path scanning of initialization).

Just as the JobController used @Autowired to inject JobService, so JobService uses @Autowired to bring in JobRepository.

The @PostConstruct annotation is an EE annotation that is pretty neat. You can mark one method of a class with this and after dependency injection is complete it will be called, allowing constructor-like behavior. Here I am using it in a slightly wacky way, as a means to stuff a few sample jobs into my database (only one example shown above for brevity).

Finally, the listJobs() method simply gets a list of Jobs from the repository (i.e. the DAO).

This brings us to the JobRepository which looks like this:

First off, note the @Repository class annotation, which indicates that this is basically a DAO (as well as ensuring it gets picked up and is able to be part of the whole DI setup), and the @Transactional class annotation, which makes it able to participate in Spring transaction handling.

Next, the JPA EntityManager is defined with the @PersistenceContext annotation.

Then the two DAO methods we need follow: one for saving Jobs and one for getting a list of them all. The listJobs() method is annotated with @Transactional again, with the readOnly attribute set to true. As I understand it, this is a hint to the transaction infrastructure that the operation is not expected to modify data, rather than a hard guarantee that it can't.

Finally there’s the JSP which looks like this (just showing the table used for presenting the list of jobs, obviously there’s a little more than this):

I don’t think there’s really anything to explain here, assuming familiarity with JSTL.

Adding a form to allow the user to add new jobs

Mostly as an excuse to write a little JavaScript I wanted the form for adding jobs to be hidden until a user chose to add one. At that point it should be revealed, they could fill it out, submit and the page would refresh with the new job added to the list and the form hidden again.

To achieve this I ended up adding:

  • A form with appropriate fields and a submit button
  • A small JavaScript function to toggle the visibility of the form
  • A model attribute to transport the form’s values over to my controller
  • An addJob method to handle the form post

Here are two fragments from the JSP with the JavaScript and the form for adding jobs:

The JavaScript is completely unexciting as I just knocked out the first thing that came to mind.

More interesting is the use of the Spring form taglib (declared with <%@ taglib uri="http://www.springframework.org/tags/form" prefix="f"%>). The first key thing is the modelAttribute attribute of the <f:form> tag which marries up with the following in the JobController:

Through this we have an empty Job POJO ready and waiting to collect the values entered to the form.

The second key thing is that the path attributes of the <f:input> tags match up with the property accessors on the Job POJO.

Finally, here’s the method in the JobController to handle the form post:

This seems pretty straightforward. Note the @RequestMapping annotation indicating this should handle POSTs. The method simply takes the incoming Job and uses the JobService to persist it before redirecting to the jobs page (redirecting after a POST preventing subsequent resubmits of the form).

Thoughts...

Initially I was a little frustrated with Spring since my starter app was clearly geared to be done in the latest and greatest style, with little to no configuration taking place in XML, whereas the vast majority of tutorials out there aren't trying to teach you that. Eventually though, after finding enough resources to get me going, I really quite liked how it shook out. There's a lot of power in those annotations, leading to very little cruft, and pretty simple, readable code. Granted, I suppose there are a lot of older applications still heavy with dense XML and other complications, but if it's headed away from that I like it quite a lot.

Sunday, August 19, 2012

Reacquainting myself with Java web application development

After a few years in a technical management role, I am reacquainting myself with hands-on Java web application development. After all, nothing says "can do" like "have done" -- however there's an almost bewildering amount of choice when it comes to picking frameworks.

I decided I would start by looking at Spring, Tapestry, Wicket and maybe plain EE6 as that would cover the most popular setups people seem to be using. Additionally I have chosen to use NetBeans as my IDE (I always had a soft spot for NetBeans, and it's looking pretty good these days), Maven as my build tool and JPA/Hibernate for persistence.

I decided to build a web app for managing information about job opportunities. Something that might help keep one on top of things amid the phone calls from recruiters, talent acquisition managers and the various phone screens and interviews.

Minimally, this app would have a way to enter information about jobs, and review the list of jobs. This would all be on a single page. Beyond that there are a few other things I may add once I have the basics in the various different frameworks figured out including:

  • Making the form submit / addition of a job AJAX based to eliminate page refresh 
  • Editing job information 
  • Changing status of jobs (applied, phone screen, in person interview etc.) 
  • Bringing in jQuery and looking at some sexy transitions to reveal and hide the "add job" form, sortable table etc. 
  • Paging of the jobs table (granted I hope nobody needs to apply for so many jobs that paging is required…but interested to see how this pans out) 
  • An entirely gratuitous web service of some kind…
OK, so first up is Spring. A blog post on my experiences there will follow shortly.