
Five social login providers quickly reviewed, and one selected for trial

I need to put together a website quickly and at minimal cost. Join the queue, you say. I want to use hosted services as much as possible for functions of my site that are not differentiators. Login and registration is one such area. So I quickly reviewed what Google told me were the main hits for the search term “social login providers”. The first two made me so angry I felt compelled to write about it. Then as I found better offerings, my temper settled, so now this small write-up might be useful to people, rather than just a rant about poor techniques for selling to developers.

Gigya

You need to request a “buyers kit”. Dear god. I’ll judge your solution against this criterion – can I integrate it in a couple of hours. Requesting, obtaining and reading whatever this “buyers kit” is burns your budget of a couple of hours to impress me.

Janrain

After a bit of digging, I get to this. Free for the first 2500 users. Great – but no download or sign-up link. Just a link to see pricing for heavier use. When I click it, I have to fill in a multi-field form so that they can contact me. Do you not see the irony in this? Also, who is this page serving – me or you? Again, I’ll judge your solution against this criterion – can I integrate it in a couple of hours. So again, sorry, you have used your budget of a couple of hours.

Oneall

These guys have pricing on the front page and code samples in the integration guide. After 10 minutes on this site, I figured I’d be able to integrate it into my system in a couple of hours. And the price is reasonable. So possibly on the short list.

LoginRadius

Pricing available to view at a single click, excellent documentation. After 10 minutes on this site, I figured I’d be able to integrate it into my system in a couple of hours. It’s expensive though, at a minimum of $500 a month. I used this a few years back and it was $8 a month, I think. It was good at the time, so don’t forget where you came from, LoginRadius 🙂

Auth0

Pricing available to view at a single click, excellent documentation. After 10 minutes on this site, I figured I’d be able to integrate it into my system in a couple of hours. It’s reasonable value too. I think this is the winner for the initial trial.

Terms of engagement for a fractious technology choice

I’ve just been through the umpteenth difficult technology choice discussion in my career. You know the scene. A team split into two “camps”, each with its own favoured choice of technology. Each camp getting progressively more insular with each passing day of the “debate”. Frosty encounters at the coffee machine.

It’s just a crap way for adults to carry on, and it’s not a work environment I enjoy being in.

So this time I jotted down some reflections that I hope I’ll remember the next time I find myself in this situation, so that I can pull them out and offer them as “terms of engagement”. These seem obvious when you read them. But that’s because you’re probably out of the “heat of battle” as you read them. Team dynamics, ego and the workings of human minds produce this scenario too often for it to be obvious all the time. So it seems we need to remind ourselves over and over.

  • Every debated technology choice is a trade-off. There is not an objectively right answer. By definition this is the case: if there were a clear-cut “right” choice, there would be no debate. But if there is debate, it’s a trade-off. One tool likely appeals to some of the team for characteristics they value highly. The other tool appeals to other members, for characteristics they value, but which are different. Acknowledge this as a team.
  • So making a choice in a debated technology decision is a compromise. Call it as such. Instead of calling it a “technology choice” or “stack choice” or “framework choice”, call it a “technology compromise” or “framework compromise”. The language will communicate the reality of what the team is doing, it will emphasise that it’s a team decision, and it will help to de-escalate emotion.
  • There should be no sulking following the decision. There should be no “I told you so” when the weaknesses of the chosen technology surface.
  • Nobody on your team is stupid. Nobody on your team is “not listening”. They are listening and choosing to offer an alternative view point. And they are plenty smart – just know there are many types of intelligence, and while you might excel in one measure, your team members might excel in others.
  • Realise that if you choose to dwell in the narrative that “they” don’t get it, it’s likely that “they” are dwelling in a narrative of you not getting it. Rise up one level of thinking.
  • If you can, use both technologies for a trial period. Probably at some stage you will need to make the compromise.
  • If the “losing” side cannot bring themselves to work with the technology supported by the “winning” side, the “losing” side are welcome to leave the team. This should be mentioned. A successful team has an inner culture. A team of divided mind will deliver suboptimal results, and working “together” will be a pain. So this needs to be surfaced as a discussed reality too. There might be a team next door who made the opposite technology choice, who are of united mind and who are successful. That team is a better place for the disgruntled members of your team. An organisation should acknowledge this reality, and support the settlement of people into tribes that are aligned with their mindset.

A definition of “devops” written on a train

Here are some essential questions when it comes to delivering software:

• How can I tell when something has gone wrong with my software?
• How can I diagnose what went wrong?
• How can I be sure my fix works?
• How can I get my fix into the live environment?
• How can I tell how much delivery of my system costs?

Devops is the process of making the answer to each of these questions an assured “easily, and quickly”.

So, it means monitoring, alarming and alerting. Of infrastructure, so you know if CPU is exhausted or a disk fails. But also of your app’s behavior – HTTP 5xx error codes, failed delivery of messages, exceptions with particular inputs. Something has to go red when there is a failure.
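As a flavour of what that can look like, here’s a minimal sketch using the AWS SDK for JavaScript: an alarm that goes red when the load balancer sees too many 5xx responses. The ELB name, SNS topic and thresholds are all illustrative, not from any real setup.

    // Sketch: raise an alarm when the backend serves too many 5xx errors.
    // Assumes aws-sdk v2, a classic ELB and an SNS topic for notifications -
    // the names here are made up for illustration.
    const AWS = require('aws-sdk');
    const cloudwatch = new AWS.CloudWatch({region: 'eu-west-1'});

    cloudwatch.putMetricAlarm({
      AlarmName: 'my-app-5xx-errors',
      Namespace: 'AWS/ELB',
      MetricName: 'HTTPCode_Backend_5XX',
      Dimensions: [{Name: 'LoadBalancerName', Value: 'my-app-elb'}],
      Statistic: 'Sum',
      Period: 60,            // one-minute windows
      EvaluationPeriods: 5,  // five bad minutes in a row...
      Threshold: 10,         // ...of more than 10 errors each
      ComparisonOperator: 'GreaterThanThreshold',
      AlarmActions: ['arn:aws:sns:eu-west-1:123456789012:ops-alerts']
    }, function (err) {
      if (err) console.error('failed to create alarm', err);
    });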

It means having access to infrastructure and application logs. The ELK stack or Graylog are good and popular examples. To diagnose what went wrong, I want to view the logs around the time frame of the failure, and get a good error message. Writing good error messages is a cornerstone of Devops strategy. Don’t look down on it. The Ops part of you will thank the Dev part of you one day in the future.
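To make “good error message” concrete, here’s a sketch (the function and its arguments are made up) of the difference between an error you can diagnose from and one you can’t:

    // Bad: throw new Error("delivery failed");
    // Good: carry the context you'll want when reading the logs at 3am.
    // Illustrative sketch - deliverMessage and its arguments are invented.
    function deliverMessage(msg, queue) {
      try {
        queue.publish(msg);
      } catch (err) {
        throw new Error(
          'Failed to publish message ' + msg.id +
          ' to queue "' + queue.name + '" on broker ' + queue.host +
          '. Caused by: ' + err.message);
      }
    }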

It means having tests for your code, and it means investing the effort in making code testable. When it comes to fixing an issue, you write tests to prove the fix. This is not strictly a Devops practice, more an Agile one, but I’m struggling more and more these days to see the difference. And also less inclined to care about the difference. Devops is about caring holistically about continued delivery of service to customers, not about process names.
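For example (a made-up sketch): if the bug report says an order with quantity zero blows up the price calculation, the first thing to write is the test that reproduces it, and the fix is done when the test goes green:

    // Sketch only - a tiny assert-style test that pins down a reported
    // bug before the fix goes in. totalPrice and ./pricing are illustrative.
    const assert = require('assert');
    const { totalPrice } = require('./pricing');

    // Reproduces the report: a zero quantity used to throw.
    assert.strictEqual(totalPrice({ quantity: 0, unitPrice: 34.5 }), 0);
    console.log('fix verified');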

It means having a continuous integration and delivery process. An automated one. So when you check in your fix, it gets built, it gets tested, packaged, and deployed onto an environment that is identical to production, but maybe on a smaller scale. Thereafter the package is promoted to environments progressively more like production, until it gets to production itself.

It means having tools or reports that tell you how much your infrastructure costs. It used to be called Total Cost of Ownership. Whatever happened to that? Every cloud platform I have worked on so far offers this, so I can see at a glance that I’m paying $5,000 a month.

What’s less quantifiable in many systems is how much money the system is making the business. Maybe this will have its day – “FinDevOps”, where the finance team are involved in the software delivery as equal partners, and they want to see a taxi-meter-style “profit for the day” gauge on Grafana. I’ve seen such a thing once.

Anyway, enough day dreaming.

Devops is also a culture. One of end-to-end caring. Having devs on front-line support is a super way to bootstrap this mindset. “If you wrote it, you run it”. Suddenly it gets personal. If a dev realizes he or she is going to be on call on Saturday night, suddenly unit testing makes sense! A fast build makes sense. Continuous Delivery makes sense. Monitoring makes sense. Automation makes sense. Retrospectives following an incident make sense.

I have worked in dev environments that were disconnected from delivery and support, and there has been a cynical attitude to “best practices”. Especially retrospectives. Devs can be very cynical about retrospectives. But if you’re going to get a call in the middle of the night on a Saturday to fix your stuff, you want it done fast and reliably so you can get back to bed. Or back to the pub.

And then on Monday morning you will be very motivated to say “Hey folks, we screwed up with that last delivery. How can we do better?”. Now you’re very motivated to have a retrospective all of a sudden. And to implement the findings!

So Devops is about caring. It’s about not being siloed. It’s not “somebody else’s problem”. It’s a spirit of collective responsibility. And if you go the whole hog and involve the Finance people too – give them a taxi meter so they can get excited and join you – you might be starting a new movement.

Tech-Ignorant Journalists are misinforming the public

Tech-ignorant journalists are allowed to misinform public opinion about important matters like encryption just because they have a platform.

As a tech-head I feel the need to say current government initiatives to back-door encryption are flawed, due to the simple fact that an end-to-end secure messaging app can be written by a competent programmer in a few hundred lines of code. Maybe all us coders should just make one each and open source it?
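To give a sense of how little is involved, here’s a sketch of the cryptographic heart of such an app using the tweetnacl library. Everything else – key exchange UI, transport – is omitted, and the names are illustrative:

    // End-to-end encryption in a handful of lines, using tweetnacl.
    // A sketch of the point being made, not a production messenger.
    const nacl = require('tweetnacl');

    const alice = nacl.box.keyPair();
    const bob = nacl.box.keyPair();

    // Alice encrypts for Bob: only Bob's secret key can open this box.
    const nonce = nacl.randomBytes(nacl.box.nonceLength);
    const message = new TextEncoder().encode('meet at noon');
    const sealed = nacl.box(message, nonce, bob.publicKey, alice.secretKey);

    // Bob decrypts with his secret key and Alice's public key.
    const opened = nacl.box.open(sealed, nonce, alice.publicKey, bob.secretKey);
    console.log(new TextDecoder().decode(opened)); // "meet at noon"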

What’s more, 100% secure email can be configured by anybody who can follow instructions.

So the encryption cat is out of the bag.

This journalist, and all others like them, and the policy makers are getting this one wrong. They are going to impose unnecessary and pointless expense on companies, weaken security for all of us, and snoop on the regular Joe, whilst those who really want to remain private will use other means.

So for some reason, whether it makes a jot of difference or not, suddenly I just feel the need to speak up, having never been even remotely political before.

Runtime configuration for AWS Lambda functions

I have an AWS Lambda function that is scheduled to run once an hour (as described here).

The function FTPs files from a data provider and copies them to S3.

I have a test environment, and a production environment. For each environment, the ftp address and credentials are different.

How can I configure the lambda function so it can be aware of which environment it’s running in, and get the ftp config accordingly?

The best way I can currently find to do that is as follows.

For the test version of the function, I am calling it `TEST-CopyFtpFilesToS3`, and for the production version I am naming the function `PRODUCTION-CopyFtpFilesToS3`. This allows me to pull out the environment name using a regular expression from the environment variable `AWS_LAMBDA_FUNCTION_NAME`.

Then I am storing `config/test.json` and `config/production.json` in the zip file that I upload as code for the function. This zip file will be extracted into the directory `process.env.LAMBDA_TASK_ROOT` when the function runs. So I can load that file and get the config I need.

Some people don’t like storing the config in the code zip file, which is fine – you can just load a file from S3 or use whatever strategy you like.
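For instance, here’s a sketch of reading the same per-environment file from a bucket instead (the bucket name is made up for illustration):

    // Alternative: keep the config out of the code zip and read it from S3.
    // Sketch only - "my-app-config" is an illustrative bucket name.
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    const readConfigurationFromS3 = (environment) =>
      s3.getObject({
        Bucket: 'my-app-config',
        Key: `${environment}.json`
      }).promise()
        .then(response => JSON.parse(response.Body.toString('utf8')));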

Code for reading the file from the zip:

    const fs = require('fs');

    const readConfiguration = () => {
      return new Promise((resolve, reject) => {
        // The function name is e.g. TEST-CopyFtpFilesToS3, so the text
        // before the first dash is the environment name.
        let environment = /^(.*?)-.*/.exec(process.env.AWS_LAMBDA_FUNCTION_NAME)[1].toLowerCase();
        console.log(`environment is ${environment}`);

        // The code zip is extracted into LAMBDA_TASK_ROOT, so the config
        // files packaged alongside the code can be read from there.
        fs.readFile(`${process.env.LAMBDA_TASK_ROOT}/config/${environment}.json`, 'utf8', function (err, data) {
          if (err) {
            reject(err);
          } else {
            var config = JSON.parse(data);
            console.log(`configuration is ${data}`);
            resolve(config);
          }
        });
      });
    };
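And a sketch of how the handler might use it (the actual FTP-to-S3 work is elided; copyFtpFilesToS3 is a hypothetical helper):

    exports.handler = (event, context, callback) => {
      readConfiguration()
        .then(config => copyFtpFilesToS3(config.ftp)) // hypothetical helper
        .then(() => callback(null, 'done'))
        .catch(callback);
    };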

Fast locally, slow on AWS – a systematic approach to solving

It’s fast on my machine, and slow on AWS

We built a system consisting of a bunch of processes running on the JVM, a mix of RabbitMQ and HTTP for interprocess communication, Graylog for log aggregation, and MySQL and PostgreSQL as the data stores. We hosted the development, test and production environments on AWS, and we configured the entire stack using Cloudformation, with all kinds of neat continuous deployment and so on. The curse of it was that, under load, one of the processes was so much slower on the AWS platform than it was on my 2012 Macbook Pro. Basically, on AWS the app ground to a standstill, started failing EC2 instance health checks, and eventually got pulled down after a number of failed ELB health checks. The next instances that came up would invariably meet the same fate. And locally it ripped through the work.

We tried many things to fix the issue, in a pretty haphazard fashion, until we lost patience with it and decided to get systematic about getting to the bottom of it. What’s presented here is a summary of that process, which I reckon is a neat approach to solving these kinds of issues.

Isolate the environment

Because my Macbook Pro is so different to the Ubuntu environment we run on AWS, I decided to make a Docker image containing the entire stack required by the slow system. That meant installing RabbitMQ, Graylog, MySQL, Java 8 and the jar file of the app itself, and restoring the latest backup of production. Once I had that image defined, I ran through a sizeable job and timed it with a stopwatch. Relative to AWS, it was fast – it took 3:50 (3 minutes 50 seconds) to do the work. Just to give you an idea, the AWS setup was taking 25 to 30 minutes, and seldom worked without dead-letter messages and a host of other failure cases.

So I ran up a c4.xlarge, which was closest in spec to my laptop, to see if it was a problem with the basic performance of AWS itself. I installed the Docker image, fired up the container and ran the same test. It took 2:39 to run the same work, so there was not a problem with AWS itself. This was exciting, because it was clear that whatever the problem was, it was in the set of differences between my Docker image and my AWS config – or it was due to the use of Docker itself.

Introduce the old environment, one step at a time

So all I had to do was start introducing parts of the AWS environment into my Docker setup. My first guess was the RDS database. So I edited my config file to use the RDS MySQL instead of the one in the Docker image and re-ran my test. It took 3:31, which was a good bit slower but not the smoking gun. Darn it! All that was left was to spend the next ten hours systematically going through the following:

  • Our AWS setup used magnetic disks; my Docker c4.xlarge was using SSD. Surely this was it. Nope – it ran in 3:40
  • Was it the JVM args? In AWS we ran with the G1 garbage collector, and with no JVM args in Docker. Nope, made almost no difference – it ran in 3:43
  • What about external Graylog instead of the one local to Docker? Yes, no? Well, no – it ran in 3:45
  • There was a difference in the instance sizes. Docker was on a c4.xlarge, the app on AWS was on a t2.medium. Surprisingly not much difference – it ran in 3:52
  • Ok, next. What about external RabbitMQ instead of the local one? Also no – it ran in 3:58
  • Was it due to the fact that in AWS the app was in a private subnet, and my Docker image was in a public one? Nope – it ran in the same time.
  • Could it be due to the fact that the ELB health check against my AWS app was using a database connection for each call, and Docker was not fronted by an ELB at all? Sticking in a static health check page served by “python -m SimpleHTTPServer” made no difference at all.

(Using Docker images and AWS AMIs made the change-test-repeat cycle faster and relatively painless.)

At this stage my Docker image was getting very close to the setup of my AWS app, and was still kicking its ass. And I was starting to tear my hair out. So I switched tack.

Move the old environment towards the new

What differences remained? I wondered if it was something to do with the Packer-created AMI my app was running on. Or maybe it was due to the fact that the stack was created using Cloudformation, while for the Docker test I just launched a plain Ubuntu 14.04 AMI.

So I logged onto one of the AWS app machines, stubbed out its health check with my static Python server, and killed the app process. Now the machine was mine to do with as I pleased. I stuck Docker on it, and ran up the app in Docker. Then I tore down my static health check page, and allowed the Docker process to be the app. Amazingly, the app worked really fast. About 4 minutes, which was in the ballpark. At this stage I started to doubt myself, so I went back and ran the test using just the AWS setup, and almost straight away it started grinding to a halt, failing messages and getting killed by ELB health checks.

So it had something to do with the way the app was launched by my Cloudformation arrangement. Getting close, getting close! And then it hit me. As I was watching the stdout of the struggling AWS app crawling along, I realised that all this output was streaming to /var/log/cloud-init-output.log. And these were magnetic disks, so it might be taking ages to flush to disk. So I changed the launch line in the UserData section of my LaunchConfiguration from

nohup java -jar app.jar

to

nohup java -jar app.jar > /dev/null

I was not concerned about losing log information because everything was being piped to Graylog anyway. Then I re-ran the test, and it came in at the same time as my Docker image – around 4 minutes. God darn it, this was the issue. Too much logging to slow disk, totally stalling the machine. What? Really, something as dumb as that? Oh dear.

Small systematic steps

So anyway, we’re out of the woods with this issue, and I’m mighty glad. The thing I hope I remember when faced with this kind of issue again is to isolate the environment using Docker and creep towards the broken environment one step at a time, measuring the effect of each change until the offender reveals itself. No bug can survive small, systematic steps.

Ethereum for normal devs

Here is what I learned during a hackathon on Ethereum on Saturday. We started with theory and ended with an example. Personally I find it easier to go from the specific to the general, and found that things came together more once we got concrete, so that’s what I’ll do here.

Starting with a very concrete thing, Ethereum has a browser that is a forked version of WebKit, with the Ethereum Javascript API embedded. It’s called AlethZero.

[Screenshot: AlethZero]

The panel in the middle is the browser, showing the Google home page, and all around it are panels showing information about the Ethereum network, and an area to define and execute contracts.

So let’s say we want to make our own coin, MikeCoin. Making your own coin seems to be the Hello World of Ethereum.

Write up the contract as shown below.

init:
	contract.storage[msg.sender] = 10000
code:
	to = msg.data[0]
	from = msg.sender
	value = msg.data[1]
	if contract.storage[from] >= value:
		contract.storage[from] -= value
		contract.storage[to] += value

[Screenshot: code for the contract]

The code you write is in the middle left, starting with init:. Underneath this you see your code compiled into opcodes for the Ethereum virtual machine. If there is a problem with your code, you will see error messages in this panel.

What this contract does is define some storage, a slot in a distributed key-value store, with an initial value of 10000, and the key being the address of the person who sent the message. Who is the person? It will be me, because as soon as I press the Execute button, I will be sending this contract to the blockchain, so I am the sender of the message.

The init: block gets run once when the contract is getting setup on the blockchain. In effect we are defining a wallet with initial funds, all of which are owned by me.

The code: section of the contract is what you subsequently transact with. Essentially it’s stating that “this contract takes two parameters: the first is who we want to send coin to, the second is how much”. The if statement is saying “if the from wallet has enough in it, transfer the nominated value to the nominated wallet”.

So we have coded up a very simple contract that stores funds and allows transfers. When we press Execute, this contract is sent to the blockchain. You will see it in the Pending tab, middle right.

[Screenshot: contract pending]

What this is saying is that my account, starting “607c3”, is sending the contract code to the network, and when mining finishes, my contract will have the address starting with “3726”. When I enable mining (menu item, top left), I see the Pending message disappear, and my contract appear in the Contracts tab. I can double-click this contract to copy its address to the clipboard. So I can see its full address is 37261aa159eb8999164e487a3d29883adc055d9d.

So now let’s write a web page to allow people to send MikeCoin. I can use my existing website development skills to lay out the UI, and the many users of MikeCoin will interact with it using AlethZero (a forked browser, remember). So I can lay out a simple form using Bootstrap:

<div class="container">
  <div class="header">
    <h3 class="text-muted">Sub currency example</h3>
  </div>

  <div class="jumbotron ">
    <div>Amount: <strong id="current-amount"></strong></div>

    <div id="transactions">
      <div class="form-group">
        <input id="addr" class="form-control" type="text" placeholder="Receiver address"/><br>
        <input id="amount" class="form-control" type="text" placeholder="Amount"/><br>
      </div>

      <button class="btn btn-default" onclick="createTransaction();">Send Tx</button>
    </div>
  </div>
</div>

And now I can crank out my javascript skills to show my balance in this wallet:

var contractAddress = "37261aa159eb8999164e487a3d29883adc055d9d"
// Watch the contract for changes, and re-display my balance when it changes.
// Note: the element id matches the "current-amount" element in the markup above.
eth.watch({altered: {at: eth.key, id: contractAddress}}).changed(function() {
    document.getElementById("current-amount").innerText = eth.toDecimal(eth.stateAt("0x" + contractAddress, eth.secretToAddress(eth.key)))
});

This is saying “watch for changes in the contract, and when they happen, get my state in that contract and display it”.

What about sending funds to somebody else? Here is the createTransaction() code:

function createTransaction() {
  var addr = ("0x" + document.querySelector("#addr").value).pad(32);
  var amount = document.querySelector("#amount").value.pad(32);

  var data = (addr + amount).unbin();
  eth.transact({
      from:eth.key,
      to:"0x" + contractAddress,
      data:data,
      gas:10000,
      gasPrice:10
  },function(receipt){
      alert(receipt);
  });
}

You can find docs on the transact function on the wiki, but basically this is saying “send a message from me to the contract, with the data of the message being two items: the destination address and an amount, in keeping with the params the contract expects.”

This is how the page looks in AlethZero.

[Screenshot: the simple TX page in AlethZero]

Let’s say I want to send some coin to somebody else. You will notice in the screenshots that in the bottom left quarter there is an Owned Accounts panel, and I have in there a second account I created, beginning with f024. Rather than bother other people with MikeCoin, I will send coin from my main account beginning with 607c to this second one.

So I put in the destination account and an amount, and press Send Tx.

[Screenshot: about to send funds]

I will see the transaction appear in the Pending panel, then, assuming mining is running, it will disappear. Unfortunately my watch code is not live-updating and I am not sure why, but if I refresh the page I can see the money has left my wallet.

[Screenshot: money has left the wallet]

How can I check that my second account has the money? Well I will cheat a bit and change the code of the app to show the balance of my second account:

    eth.watch({altered: {at: eth.key, id: contractAddress}}).changed(function() {
        document.getElementById("current-amount").innerText = eth.toDecimal(eth.stateAt("0x" + contractAddress, eth.secretToAddress(eth.keys[1])))
    });

And when I refresh the page, I see that I own 45 MikeCoin in my second account.

[Screenshot: coin in the second wallet]

And that is about the most basic Ethereum contract and app, and it’s a very concrete, understandable thing. But what is it we are really dealing with here?

My contract exists as an addressable entity in a distributed blockchain that nobody owns or can shutdown, and it enforces rules about ownership of data that is also distributed. I have a browser that knows how to interact with these contracts and data. The Ethereum team are working towards putting the apps themselves in the blockchain, so they too will be distributed and decentralised – no url to a server that somebody owns.

I went to this hackathon expecting to learn about a “better bitcoin”, but pretty soon I started to think that this is in fact a re-envisioning of the internet, where centralised servers are replaced by a network of peers, URLs are replaced by addresses on the blockchain, HTTP is replaced by a low-latency torrent protocol (it’s called Swarm), and websites are replaced by distributed apps. No individual owns this kind of network; nobody controls it. That seems to me to be the vision.

My feeling towards the end of the day was that it’s less about learning the APIs of Ethereum, and more about grasping what can be built using this kind of tech. For me this requires unlearning some of what I know, and seeing what it is that I am assuming. If the internet can or will be rebuilt in a more decentralised, more ownerless way, what kinds of apps and businesses and economies will grow out of this?

The social media built on this kind of platform will not be Facebook or Twitter. In functional terms it might look the same, but you will not be going to Facebook-owned servers and they will not own your data. Also, there will be less need for a centralised search engine company that implements the rules around what we get to see, and controls the information around what we search for.

Even if Ethereum does not end up redefining the internet, the ideas contained in it show that it’s possible. My taken-for-granted world of HTTP and URLs and webservers is tenuous, and I would be well served not to get too attached to it. Everything changes. So that was “bigger picture lesson one”.

But you know what the biggest bigger-picture lesson for me was? The social element that’s around this technology. I met one guy who no longer has a bank account and lives 100% on bitcoin. He paid for his coffee with @changetip. I met another who is 50% in bitcoin. People are actually doing this, NOW.

The conversations happening around the table were like “Today, the people who get to write the contracts have the power. This way, we all get to write contracts”. It kinda felt to me like the Zeitgeist is shifting towards decentralisation, and Ethereum and tech like it are lagging artefacts bubbling up out of this mind shift. It was very stimulating and rewarding to immerse myself in this mind space. I’m going again on Oct 5th.

Write up by Chris Ellis on the day

A design sketch for a data aggregator and reporting tool

We chatted today in work about generating reports that aggregate many people’s trading positions over many stocks. The way we do it quickly slows down as data grows. So I wondered if we could use some tricks to make a faster reporting system, one that also might be more general, and independent of storage technology.

We can easily enough produce an endpoint that provides a single day’s trades in JSON, a bit like this: /trades/22-March-2014 produces

[
	{"client-id": "joe",
	 "trades": [
		{"quantity": 101, "price": 34.5, "amount": 3484.5, "stock": "AAPL"},
		{"quantity": 50, "price": 32.65, "amount": 1632.5, "stock": "AAPL"},
		{"quantity": -30, "price": 35.1, "amount": -1053, "stock": "AAPL"}
	]},
	{"client-id": "mary",
	 "trades": [
		{"quantity": -1000, "price": 2.78, "amount": -2780, "stock": "BPL"}
	]}
]

What we chatted about is providing another service that hits this url each day and auto-aggregates the data for us. This aggregating system would be configured as follows (using pseudo-json):

{
	client-id: {
		trades.stock: {
			sum: quantity,amount
		}
	} 
}

so now we could go to

/aggregate/trades/22-March-2014

and it would show us the summed position of each client for each stock like this:

[
	{"client-id": "joe",
	 "trades": [
		{"quantity": 121, "amount": 4064, "stock": "AAPL"}
	]},
	{"client-id": "mary",
	 "trades": [
		{"quantity": -1000, "amount": -2780, "stock": "BPL"}
	]}
]
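To sketch how the aggregator might apply that config (my reading of it: group by client-id, then by trades.stock, summing quantity and amount within each group):

    // Sketch of the per-day aggregation step. The config is interpreted
    // as "within each client, group trades by stock and sum the named
    // fields". Shapes match the example data above.
    function aggregateDay(clients) {
      return clients.map(client => {
        const byStock = {};
        client.trades.forEach(t => {
          const agg = byStock[t.stock] || {quantity: 0, amount: 0, stock: t.stock};
          agg.quantity += t.quantity;
          agg.amount += t.amount;
          byStock[t.stock] = agg;
        });
        return {"client-id": client["client-id"], trades: Object.values(byStock)};
      });
    }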

What about if we wanted to see an aggregated report of trades between 01-Nov-2012 and 22-March-2014? This aggregating system could also auto-aggregate in blocks of time, so there would be an aggregate for each week of the year, each month of the year, and each year. If we coupled this with another little RESTful service – let’s say

/range?from=01-Nov-2012&to=22-March-2014

which would return how many days, weeks, months and years there are between these dates:

[
	{month:Nov-2012},
	{month:Dec-2012},
	{year:2013},
	{month:Jan-2014},
	{month:Feb-2014},
	{week:1-March-2014},
	{week:2-March-2014},
	{week:3-March-2014},
	{day:22-March-2014}
]

We can now go down through this list, getting the aggregate for each block of time, and aggregating it with the previous one – folding the results into each other, as sketched below. All the constituent aggregates are prepared, so it’s a quick lookup for each, and a quick process to aggregate them. It should be possible to make a system like this work for any JSON data, and it should be able to support several kinds of aggregating functions.
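A sketch of that fold, reusing aggregateDay from above. fetchJson is a hypothetical helper that GETs a URL and parses the body, and the per-block aggregate URLs are assumed to follow the /aggregate/trades/... pattern shown earlier:

    // Look up the prepared aggregate for each block in the range,
    // then fold them together into one result.
    async function aggregateRange(from, to) {
      const blocks = await fetchJson(`/range?from=${from}&to=${to}`);
      const aggregates = await Promise.all(blocks.map(block => {
        // e.g. {month: "Nov-2012"} -> a per-block aggregate URL
        const [unit, value] = Object.entries(block)[0];
        return fetchJson(`/aggregate/trades/${unit}/${value}`); // assumed URL shape
      }));

      // Folding: concatenate each client's trades across all blocks,
      // then run the same aggregation over the combined list.
      const combined = {};
      aggregates.flat().forEach(client => {
        const id = client["client-id"];
        combined[id] = combined[id] || {"client-id": id, trades: []};
        combined[id].trades.push(...client.trades);
      });
      return aggregateDay(Object.values(combined));
    }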

I’m sincerely hoping the demand for this requirement in our system continues, as it would be fun to build something like this. It feels like the right separation of concerns. Of course, there is every chance something like this is already out there – in which case I hope somebody points that out to me.

How about a reputation system for bitcoin wallets?

Would it be possible to build a system that maintains reputation for a bitcoin wallet address? The reputation system would itself be in the blockchain, so no central authority. Wallet addresses could be claimed by individuals or organisations (not sure how).

So when I’m making a payment using my wallet, it could say “The wallet you are transferring to is owned by amazon.com and has a 100% reputation”. Versus “The wallet you are transferring to is unclaimed and has a 23% reputation”. This would make wallets more user-friendly and tactile – one of the biggest challenges bitcoin has if it’s to achieve mass adoption.

Spring and Maven reduce feedback

I got a moment of clarity today on why I am generally against things like Maven and Spring.

Our project used to be assembled using a massive Builder class. It was maybe a thousand lines long, certain methods had to be called before other methods, to make sure the relevant objects were created in a proper sequence, and it was hard to follow. Spring advocates asserted that this abomination would be solved by going the Spring route.

Around the same time, our build was becoming unmanageable. Specifically, the number of dependencies was getting too large, and too complex to understand. We had jars shared across projects, and ran into divergent needs. Maven advocates asserted that this abomination would be solved by going the Maven route.

Both situations have something in common. In the first, the Builder abomination was telling us “your app is too complex, split it up, or simplify it”. In the second, the awful build script with all the dependencies was saying the same thing – “your app is too complex, split it up, or simplify it”.

In both cases we experienced pain, we knew something was hurting. But rather than listen to the pain and try to understand what it was saying to us, we chose to medicate the pain away using tools.

Maven kinda seems to help with dependency management, with declarative and transitive dependencies, but now we have a 50MB WAR file. It contains libraries totally unrelated to what we are doing – like JFreeChart, and we chart nothing – that come in transitively and are never used. Few people on the team know this, or seem to care. Mentioning that we have such a fat app is met with a shoulder shrug. We prefer to keep away from the pain.

Similarly, now that we’re on Spring, there is no single horrible Builder class that you swear at every time you have to change it. Instead there are many smaller XML files, and autowiring and annotations that make the wire-up happen. The organisation has invested in an artefact repository with people looking after it, etc. All these smaller parts and activities seem to feel less painful. But I think the sum of pain is at least the same, and the complexity is at least the same. It all just has the seductive quality of being less in our face.

So as I sat there today for several minutes watching Maven download jars, I realised I want the pain that it’s shielding me from back in my face. Don’t medicate me away from pain with these abstractions. In the human body, pain is feedback calling attention to something that needs to be fixed. The wise response is to pay attention, not to medicate. So this is why I am against Maven and Spring and the like. They attempt to cover over things that I want direct contact with, things that I want to feel, things that give me feedback. If my app is hard to configure, I want the feedback. If my app is a 50MB WAR with a ton of dependencies, I want the feedback.

So I’d prefer to strip these things out and get more down to the metal. It would be painful, certainly, but I’d welcome that. The app would be the better for it.