A design sketch for a data aggregator and reporting tool

April 3rd, 2014 No comments

We chatted today at work about generating reports that aggregate many people's trading positions over many stocks. The way we do it now slows down quickly as data grows. So I wondered if we could use some tricks to make a faster reporting system that might also be more general, and independent of storage technology.

We can easily enough produce an endpoint that provides a single day's trades in JSON, a bit like this: /trades/22-March-2014 produces

[
	{client-id: joe, 
	 trades:[
		{quantity:101, price:34.5, amount: 3484.5, stock:AAPL},	
		{quantity:50, price:32.65, amount: 1632.5, stock:AAPL},	
		{quantity:-30, price:35.1, amount: -1053,  stock:AAPL}	
	]},
	{client-id: mary, 
	 trades:[
		{quantity:-1000, price:2.78, amount:-2780, stock:BPL}	
	]}
]

What we chatted about is providing another service that hits this url each day and auto-aggregates the data for us. This aggregating system would be configured as follows (using pseudo-json):

{
	client-id: {
		trades.stock: {
			sum: quantity,amount
		}
	} 
}

so now we could go to

/aggregate/trades/22-March-2014

and it would show us the summed position of each client for each stock like this:

[
	{client-id: joe, 
	 trades:[
		{quantity:121, amount: 4064, stock:AAPL}	
	]},
	{client-id: mary, 
	 trades:[
		{quantity:-1000, amount:-2780,  stock:BPL}	
	]}
]
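The aggregation step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `aggregate_trades` and its parameters are invented here, and the input is the day's trades from above written out as real Python dicts rather than pseudo-json.

```python
from collections import defaultdict


def aggregate_trades(clients, group_key="stock", sum_fields=("quantity", "amount")):
    """Sum the chosen fields of each client's trades, grouped by group_key."""
    result = []
    for client in clients:
        # One running total per distinct group value (here: per stock).
        totals = defaultdict(lambda: dict.fromkeys(sum_fields, 0))
        for trade in client["trades"]:
            bucket = totals[trade[group_key]]
            for field in sum_fields:
                bucket[field] += trade[field]
        result.append({
            "client-id": client["client-id"],
            "trades": [{**sums, group_key: key} for key, sums in totals.items()],
        })
    return result


# The sample day from /trades/22-March-2014, as Python data.
day = [
    {"client-id": "joe", "trades": [
        {"quantity": 101, "price": 34.5, "amount": 3484.5, "stock": "AAPL"},
        {"quantity": 50, "price": 32.65, "amount": 1632.5, "stock": "AAPL"},
        {"quantity": -30, "price": 35.1, "amount": -1053, "stock": "AAPL"},
    ]},
    {"client-id": "mary", "trades": [
        {"quantity": -1000, "price": 2.78, "amount": -2780, "stock": "BPL"},
    ]},
]

for client in aggregate_trades(day):
    print(client)
```

Price is deliberately left out of the sums, since adding prices together means nothing; the configuration above only asks for quantity and amount.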

What about if we wanted to see an aggregated report of trades between 01-Nov-2012 and 22-March-2014? This aggregating system could also auto-aggregate in blocks of time, so there would be an aggregate for each week of the year, each month of the year, and each year. If we coupled this with another little restful service – let's say

/range?from=01-Nov-2012&to=22-March-2014

which would return the days, weeks, months and years that make up the range between these dates:

[
	{month:Nov-2012},
	{month:Dec-2012},
	{year:2013},
	{month:Jan-2014},
	{month:Feb-2014},
	{week:1-March-2014},
	{week:2-March-2014},
	{week:3-March-2014},
	{day:22-March-2014}
]
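One way the range service might compute such a list is a greedy walk that always takes the largest whole block that fits. The sketch below is a guess at the logic, not a spec: the name `split_range` is invented, and it assumes "week n" of a month means days 7n-6 to 7n, which is what the example above suggests.

```python
import calendar
from datetime import date, timedelta

# Assumption: weeks are 7-day blocks starting on the 1st, 8th, 15th and 22nd.
WEEK_STARTS = (1, 8, 15, 22)


def split_range(start, end):
    """Cover [start, end] with the largest whole year/month/week/day blocks."""
    blocks = []
    cursor = start
    while cursor <= end:
        year_end = date(cursor.year, 12, 31)
        last_day = calendar.monthrange(cursor.year, cursor.month)[1]
        month_end = date(cursor.year, cursor.month, last_day)
        week_end = cursor + timedelta(days=6)
        if cursor.month == 1 and cursor.day == 1 and year_end <= end:
            kind, block_end = "year", year_end
        elif cursor.day == 1 and month_end <= end:
            kind, block_end = "month", month_end
        elif cursor.day in WEEK_STARTS and week_end <= end:
            kind, block_end = "week", week_end
        else:
            kind, block_end = "day", cursor
        blocks.append((kind, cursor, block_end))
        cursor = block_end + timedelta(days=1)  # continue after this block
    return blocks


for kind, first, last in split_range(date(2012, 11, 1), date(2014, 3, 22)):
    print(kind, first, last)
```

Each tuple holds the block kind plus its first and last day, so the caller can build whatever label the aggregate endpoint expects.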

We can now go down through this list, getting the aggregate for each block of time, and aggregating with the previous one – folding the results into each other. All the constituent aggregates are prepared, so it's a quick lookup for each, and a quick process to aggregate them. It should be possible to make a system like this work for any JSON data, and it should be able to support several kinds of aggregating functions.
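The folding itself could be as small as the sketch below. It assumes each prepared aggregate has been flattened into a dict keyed by (client-id, stock); that shape, the names `merge` and `fold`, and the numbers in the example are all made up for illustration.

```python
from functools import reduce


def merge(left, right, sum_fields=("quantity", "amount")):
    """Combine two aggregates keyed by (client-id, stock) by adding their sums."""
    merged = {key: dict(sums) for key, sums in left.items()}  # copy, don't mutate
    for key, sums in right.items():
        target = merged.setdefault(key, dict.fromkeys(sum_fields, 0))
        for field in sum_fields:
            target[field] += sums[field]
    return merged


def fold(aggregates):
    """Fold a sequence of per-block aggregates into one overall aggregate."""
    return reduce(merge, aggregates, {})


# Hypothetical prepared aggregates for two consecutive blocks of time.
blocks = [
    {("joe", "AAPL"): {"quantity": 10, "amount": 300.0}},
    {("joe", "AAPL"): {"quantity": -4, "amount": -120.0},
     ("mary", "BPL"): {"quantity": -1000, "amount": -2780.0}},
]
print(fold(blocks))
```

This works because a sum of sums is still a sum. An aggregating function like an average would not fold this way directly; each block would need to store the sum and the count so the average can be derived at the end.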

I’m sincerely hoping the demand for this requirement in our system continues, as it would be fun to build something like this. It feels like the right separation of concerns. Of course, there is every chance something like this is already out there – in which case I hope that gets pointed out to me.

Categories: Uncategorized Tags:

How about a reputation system for bitcoin wallets?

April 3rd, 2014 No comments

Would it be possible to build a system that maintains reputation for a bitcoin wallet address? The reputation system would itself be in the blockchain, so no central authority. Wallet addresses could be claimed by individuals or organisations (not sure how).

So when I’m making a payment using my wallet, it could say “The wallet you are transferring to is owned by amazon.com and has a 100% reputation”. Versus “The wallet you are transferring to is unclaimed and has a 23% reputation”. This would make wallets more user friendly and tactile – one of the biggest challenges bitcoin has if it’s to achieve mass adoption.

Categories: Uncategorized Tags:

How to find where a person is singing in an audio recording

August 29th, 2012 No comments

I need your help. This is an audio waveform of somebody singing the first four lines of Happy Birthday. Click it if you want to see it bigger.

The first line should begin about 1.5 seconds in, and should be finished before 4.5 seconds in. This is because the singer is singing in response to a karaoke style lyrics scroller. There should be about a 1.5 second gap between lines, so the second line of the song should start at about 6 seconds in, and each line should be no more than 3 seconds long. I say should here, but it’s somewhat inexact, due to people’s sense of timing, and machine performance characteristics etc. There could be a 0.5 second variation in any of these timings, based on testing.

Now, what I want to do is identify in the waveform where singing starts and ends for each of the four lines of the song. So I want 4 pairs of numbers. The first might be (1.3, 4.1), meaning the singing starts 1.3 seconds into the waveform, and ends 4.1 seconds into the waveform, making a 2.8 second singing clip. The second pair might be (6.1, 8.6) etc.
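To make the desired output concrete, here is a very naive baseline: short-time energy thresholding, where the waveform is cut into small frames and any frame whose RMS level is above a threshold counts as singing. It is only a sketch of the shape of the problem, not a robust answer; the function name, frame size, threshold and gap values are all assumptions, and the input is a made-up synthetic signal rather than a real recording.

```python
def find_singing(samples, sample_rate, frame_seconds=0.05, threshold=0.1, min_gap=0.5):
    """Return (start, end) pairs in seconds where frame RMS reaches the threshold.

    Quiet stretches shorter than min_gap seconds are treated as part of the
    same sung line, so a breath mid-line does not split it in two.
    """
    frame_len = int(sample_rate * frame_seconds)
    segments = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms = (sum(x * x for x in frame) / frame_len) ** 0.5
        if rms < threshold:
            continue  # quiet frame, not singing
        start, end = i / sample_rate, (i + frame_len) / sample_rate
        if segments and start - segments[-1][1] < min_gap:
            segments[-1] = (segments[-1][0], end)  # extend the current line
        else:
            segments.append((start, end))  # a new line begins
    return segments


# Synthetic test signal: loud from 1.5s to 4.0s and from 6.0s to 8.5s.
rate = 1000
signal = [
    0.5 * (-1) ** n if 1500 <= n < 4000 or 6000 <= n < 8500 else 0.0
    for n in range(10 * rate)
]
print(find_singing(signal, rate))
```

A real recording has background noise, so a fixed threshold like this will be fragile; the expected timings from the lyrics scroller could be used to sanity-check whatever pairs come out.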

How can you help? I’ve been hand-rolling all kinds of amateur algorithms to pull this off, with a modicum of success (that I feel chuffed about :) ), but I need something more robust than what I have been able to do. So what I am looking for is pointers either to papers or code for algorithms that are suited to this problem, or even the name of potentially suitable algorithms and I’ll do the research myself. Or else pointers to open-source tools that can do this, so I can dig around and see what they do.

My promise to you is, if any interesting learning comes from this process, I will report back on what I learned.

Keep well!

Categories: Coding Tags:

A little fantasy

June 16th, 2012 No comments

You know the way powerful people in movies say “Leave Us” when they want to be alone with somebody they want to give a right good talking to? I want to do that in my next meeting. To somebody talking shite, or being treasonous in some other way. Following the offending statement, I’d go all quiet, fix them with a steely stare and, in a deep and steady voice, say “Leave Us”, keeping the stare going, while the others scuffle out, backwards. Now we’re alone. I imagine the room dimly lit, by candlelight or burning torch ideally, but I suppose the dimmer switch will have to do. Health and Safety. I’d arrange the lighting further so only one side of my face is visible, the other in shadow, eye sockets deepened, maybe a bit of stubble. Nobody messes with somebody framed like that. I suppose at this stage it would be appropriate for me to tut-tut ominously, stand up and step slowly the long way around the table (still in half-shadow) stopping behind the seat of my wide-eyed subject. And pause. Unsheathe my company pen, fix it purposefully across their throat, lean in and whisper menacingly…well, whatever it is I want to be clear about. “You WILL come for extra hours on Monday” or “I KNOW you took the last of my pink post-it notes.”, it depends. And now, another episode of Game of Thrones. Leave Us!

Categories: Uncategorized Tags:

Spring and Maven reduce feedback

April 5th, 2012 No comments

I got a moment of clarity today on why I am generally against things like maven and spring.

Our project used to be assembled using a massive Builder class. It was maybe a thousand lines long, certain methods had to be called before other methods, to make sure the relevant objects were created in a proper sequence, and it was hard to follow. Spring advocates asserted that this abomination would be solved by going the Spring route.

Around the same time, our build was becoming unmanageable. Specifically the number of dependencies was getting too large, and too complex to understand. We had jars shared across projects, and ran into divergent needs. Maven advocates asserted that this abomination would be solved by going the maven route.

Both situations have something in common. In the first, the Builder abomination was telling us “your app is too complex, split it up, or simplify it”. In the second, the awful build script with all the dependencies was saying the same thing – “your app is too complex, split it up, or simplify it”.

In both cases we experienced pain, we knew something was hurting. But rather than listen to the pain and try to understand what it was saying to us, we chose to medicate the pain away using tools.

Maven kinda seems to help with dependency management, with declarative and transitive dependencies, but now we have a 50MB WAR file. It contains libraries totally unrelated to what we are doing – like jfreechart and we chart nothing – that come in transitively and are never used. Few people on the team know this, or seem to care. Mentioning that we have such a fat app is met with a shoulder shrug. We prefer to keep away from the pain.

Similarly, now that we’re on Spring, there is no single horrible Builder class that you swear at every time you have to change it. Instead there are many smaller xml files, and autowiring and annotations that make the wire-up happen. The organisation has invested in an artefact repository with people looking after it etc. All these smaller parts and activities seem to feel less painful. But I think the sum of pain is at least the same, the complexity is at least the same. But it all has the seductive quality of being less in our face.

So as I sat there today for several minutes watching maven download jars, I realised I want the pain that it’s shielding me from back in my face. Don’t medicate me away from pain with these abstractions. In the human body pain is feedback calling attention to something that needs to be fixed. The wise response is to pay attention, not to medicate. So this is why I am against maven and spring and the like. They attempt to cover over things that I want direct contact with, things that I want to feel, things that give me feedback. If my app is hard to configure, I want the feedback. If my app is a 50MB war with a ton of dependencies, I want the feedback.

So I’d prefer to strip these things out and get more down to the metal. It would be painful, certainly, but I’d welcome that. The app would be the better for it.

Categories: Uncategorized Tags:

Acceptance Testing a Web Application – Part 2

July 1st, 2010 2 comments

In this episode we morph the simple acceptance test we wrote in the previous episode into an acceptance test that specifies what our application should do when quoting a stock price. We learn that sometimes initial acceptance tests are, in a sense, about exploration of the domain, and how to specify it, and that giving yourself permission to explore is important. We get to see what it means if an acceptance test can be satisfied with a hard coded implementation and why “Given/When/Then” is a sensible structure for such tests. We discover that test driving like this uncovers domain concepts and relationships, and we introduce “stubbing” to help us control our external environment.

When playing this video, make sure to play it fullscreen, so you can read the text.

Categories: Announcement, Screencasts Tags:

Software development is deeply personal

May 28th, 2010 8 comments

“I think we should use Spring.”
“Over my dead body…”

“Maven is the only way to go.”
“Maven makes me want to hurt people…”

“I think we should have simple data objects and put all the logic in services.”
“No way, Eric Evans is a god and you need to read his bible…”

I’ve seen, heard and been involved in too many of these “discussions” recently. They’re wars really. I’ve not seen one come to a satisfying close. At best the warring parties walk away from each other without coming to blows, grumbling about how stupid the other is, and go on with whatever approach they happen to prefer anyway – the “right” way.

The damage from these kinds of wars can be contained if they are across teams, but when they are between team members it can destroy the team. If half the team wants Spring, and the other half doesn’t, it can make for some very awkward pairing, “code wars”, terrible morale, and impaired productivity. And a loss of that very precious life experience – happiness.

For a long time this bothered me a lot, and I banged my head trying to find a solution. Surely there is some way “we can all just get along”? Well, recently I’ve stopped. I saw that there’s no point. There is no solution. Because there is no problem. Some people like Windows, some like Linux, some like OS X. Is there a solution to this? No, because it’s not even a problem. It’s a blessing. A Linux die-hard could be a Linux die-hard for life, and good luck to them. Some people prefer BMW, some swear by Mercedes Benz, others by Skoda. Some people are Hindu, some are Protestant, others Muslim. Good luck to the lot of them.

Therefore, coming back to software development, it’s more important, in my mind, to be honest about a team’s tooling and development culture, and to hire members that fit. If a team is all about Spring, there is no point hiring a developer who openly states that they hate it. If a team is all about domain driven design, there is no point hiring a developer who states that separating data objects and service objects is the only way to go.

And for you as an individual developer, you need to be honest with yourself and with others about what your instincts are, and find teams in which you’re a fit, rather than an annoyance. And if you do hate Spring (presumably with reason), be honest enough to say it. Have the courage of your convictions. There can be a fear accompanying this kind of personal honesty that makes you think “damn, if I say I hate Spring, maybe I won’t get this job I’m interviewing for. Or maybe they will think I am close-minded.” Well maybe you won’t get the job and maybe they will think you’re close-minded. As long as you know your reasons for hating Spring, then you can rest assured that you have made a personal, informed choice. As for not getting the job, well – phew! – you’ve saved yourself and the interviewing company much unhappiness, and you stand a chance of finding a team on which you’ll flourish.

One qualification that is essential to add, and which keeps open the door to development and learning, is this: any individual or team that finds themselves aligned to a particular approach must still be open to listening respectfully to advocates of other approaches, and maybe even being friends with them :-) Keep questioning and reading and studying and talking to people and trying things out, and stay honest with yourself, because at the end of the day, maturing as a software developer is never about becoming finally right, but about becoming increasingly less wrong.

Categories: Uncategorized Tags:

Acceptance Testing a Web Application

March 21st, 2010 Comments off

In this episode we put in place the framework necessary to acceptance test a web application. The purpose of an acceptance test is explained. Then we evolve some code to start and stop our WAR file in embedded jetty, and drive the application using WebDriver, making some simple asserts. We check in and make sure it goes green on our continuous integration server.

When playing this video, make sure to play it fullscreen, so you can read the text.

Categories: Announcement, Screencasts Tags:

Introducing Continuous Integration

January 17th, 2010 2 comments

It’s easy to make mistakes when you’re checking in software to a source code control system. You can forget to include necessary files, you can break tests, you can introduce compilation errors. This is bad enough when you are working alone, but can be very annoying in a team context, as several people may have to down tools to locate a problem that may have been introduced days before it was found.

Continuous integration addresses this issue by setting up an automated team member, a machine, whose sole purpose is to continuously check out the code, compile it, test it, and make sure it’s of the required quality.

This screencast explains all you need to know to get continuous integration understood and implanted as a practice with your team. It covers the following topics:

* where continuous integration fits in the software development process
* what problems it helps to solve or avoid
* what is a build server
* downloading and installing TeamCity
* build breakage notification strategies, especially large build monitors
* downloading and installing the Piazza TeamCity plugin
* and as a side issue, removing duplication in an ant file.

The quality of the text in the video is substandard. Subsequent videos are HD quality. It’s best to watch in fullscreen.

Categories: Announcement, Screencasts Tags:

Introducing Source Code Control

December 18th, 2009 Comments off

YouTube Preview Image
YouTube Preview Image
This episode of Software Success Disciplines talks about the essential component of source code control.  It comes in two parts.

The first part describes the theory of source code control, the benefits it offers, the kinds of processes it involves, and how it enables teamwork.  This information is communicated using a mix of “blackboard” sessions, and a simplistic on-disk model of what a source code control server might do internally.

The second part of the screencast settles on Subversion as our source code control tool of choice.  It talks about repository management, and then shows how to import your project, and check it out again so you can avail of Subversion facilities.

To best view the videos, watch them on YouTube. Right-click on one of the thumbnails above, and select “Watch on YouTube”.

Total run time: 15 minutes

Categories: Announcement, Screencasts Tags: