The cat lingers, and a story of data analysis

This is a story about data, and about how we collect data, but more about the tools we use to analyse that data and how apt they are for the results we want.

It is a story about finding the limits in analysis and trying to discover and write new ways of analysing data. It is a story about how data can define the tools we use to analyse it.

And it is a story about cats.

Leo Hector

I track my cat using a collar-based tracker that records his position every ten seconds to an XML file, which can be taken off the device when he returns home. It is a fairly low-technology solution but Leo is a fairly low-technology cat.

Leo quickly lost his first tracker – he has lost twelve of the sixteen he has had in his two years – but using the second for a period of time showed me how ill-suited it was to the task it was performing.

Cat trackers – as a rule – save data as GPX files. GPX files are most commonly used by cyclists and drivers for route planning. A GPX file describes points, inferring the distance between those points from their position in the dataset. That “M606” precedes “M62”, which in turn precedes “M1”, in the code is how we use a GPX file for route mapping.
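For anyone who has not opened one, here is a minimal sketch in Python of what reading a GPX track might look like. It assumes the tracker writes GPX 1.1 with UTC timestamps and no fractional seconds, which may not match every device. The sample track is made up and the function name `read_track` is only for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# GPX 1.1 namespace. Assumption: a tracker writing GPX 1.0 uses a different one.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# A tiny made-up track, not Leo's real data.
SAMPLE_GPX = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="53.79000" lon="-1.80000"><time>2015-06-08T07:00:00Z</time></trkpt>
    <trkpt lat="53.79005" lon="-1.80002"><time>2015-06-08T07:00:10Z</time></trkpt>
    <trkpt lat="53.79011" lon="-1.80007"><time>2015-06-08T07:00:20Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def read_track(gpx_text):
    """Return a list of (time, lat, lon) tuples in file order."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.findall(".//gpx:trkpt", GPX_NS):
        stamp = pt.find("gpx:time", GPX_NS).text
        # Assumes timestamps like 2015-06-08T07:00:00Z.
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        points.append((when, float(pt.get("lat")), float(pt.get("lon"))))
    return points


track = read_track(SAMPLE_GPX)
for when, lat, lon in track:
    print(when.time(), lat, lon)
```

Nothing in the file says "this is a route" beyond the order of the points, which is the assumption the rest of this story pokes at.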

Route mapping is a lot about movement and almost entirely unconcerned with stopping. If we had a GPX file of the M606/M62/M1 journey we could assume that there would be a correlation between the timestamps in the data and the speed of the car.

As a file format GPX is about movement.

Cats are often still.

When Movement Gets Warm

It is very useful and often surprising to see where a cat goes when he or she is alone. The distance covered is remarkable and will give you a newfound respect for your moggy after you slob on the sofa following a ten-yard walk from the car and he has done 10km in a day.

The early tracking of Leo gives a good idea of where he goes. We see a line that draws into the field behind some houses. These short tracks gave some reasonable information, but when we began to track Leo for full days the information became less useful.

An all-day wander is a spaghetti junction of movement around the same points. It is not uninteresting, but rather than looking at a day like a route we should look at it more like a heat map. This is the failing of GPX analysis tools when they are used for cats and not for cars.

Heat maps for single car journeys are useless – you never go over the same point twice – but for cats a heat map tells you some interesting information. Where does your cat most like to go? Where does he avoid? Does he or she really go into the neighbour’s garden as often as the neighbour suggests?
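A rough sketch of the heat map idea in Python, under the assumption that it is enough to drop each GPS fix into a grid cell and count. The cell size and the function name `heat_counts` are made up for this example, and so are the points.

```python
from collections import Counter


def heat_counts(points, cell_deg=0.0001):
    """Count GPS fixes per grid cell. points is a list of (lat, lon) pairs."""
    counts = Counter()
    for lat, lon in points:
        # Snap each fix to the nearest grid cell of size cell_deg degrees.
        cell = (round(lat / cell_deg), round(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up fixes: four around one spot, one a little further north.
fixes = [
    (53.79000, -1.80000),
    (53.79000, -1.80000),
    (53.79000, -1.80000),
    (53.79001, -1.80001),
    (53.79050, -1.80000),
]
counts = heat_counts(fixes)
hottest_cell, visits = counts.most_common(1)[0]
print(hottest_cell, visits)
```

A cell of 0.0001 degrees is roughly 11 meters north to south and narrower east to west at UK latitudes, so it is a coarse grid. Because the tracker ticks on a timer, a high count means time spent in a cell as much as repeat visits to it.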

Let It Linger

Another way GPX data is poor is that it assumes that the aim is to move between points, while cats often do not move at all. Often they find a place outside to sleep and can remain still for minutes, even hours, which barely registers in the tracking tools built for cars and bikes.

In Leo’s tracking there is no way of noticing that he might have stopped.

Also the tracker which Leo uses assumes a sight of the sky most of the time. The reasons for this are fairly obvious, but cars do not go into houses and cats do, which leaves gaps in the gathered data. The tracker can go offline for hours at a time and if one traces where this happens it normally occurs within ten seconds of a cat flap.

And so we looked at tracking two events: the linger and the undercover.

The Linger

The linger is a time when Leo remains in one position. GPS tracking (at this cost) is not sophisticated enough to avoid jumps, so a tolerance for slight moves in his fine latitude and longitude data has to be made. This is also useful for showing times when Leo has slowed, perhaps when stalking something.

A linger has to have some length too. To stop for a second or two is not the same as waiting to investigate something. A linger has to last at least 90 seconds with a catch that ignores any bad data that may come in the middle of a linger making it appear as if Leo has teleported 100m East and then returned in ten seconds. One bad data event will be tolerated.

The output for a linger need only be a view of the map where the linger occurred.

The Undercover

The undercover is when the dataset has no entries for a period of time. If Leo goes into a house, or under tree cover, and the tracking signal is lost then the gap between data is not obvious when using GPX analysis tools based on route planning.

We need to know where Leo has gone undercover – probably into a cat flap – and where he has come out of being undercover – the same cat flap – and how long he spent away. This will tell us if he has been home for a two hour nap or raided a neighbour’s house for a two minute food steal.

This is a simple job of looking for any time when the difference in timestamps on the data is longer than it should be and then noting where this occurred.
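A minimal sketch of that gap hunt in Python. It assumes the track is already a list of (time, lat, lon) tuples in time order. The eight-minute threshold is borrowed from the figure mentioned later in this piece, and the name `find_undercover` and the sample ticks are invented for the example.

```python
from datetime import datetime, timedelta


def find_undercover(track, min_gap=timedelta(minutes=8)):
    """Find silences in a track. track is a list of (time, lat, lon) in time order."""
    events = []
    # Compare each tick with the one that follows it.
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(track, track[1:]):
        gap = t1 - t0
        if gap >= min_gap:
            events.append({
                "start": t0,                             # last tick before the silence
                "minutes": gap.total_seconds() / 60,     # how long the silence lasted
                "lost_at": (lat0, lon0),                 # where the signal was lost
                "found_at": (lat1, lon1),                # where it came back
            })
    return events


# Made-up ticks: three normal ones, a 28 minute silence, then two more.
start = datetime(2015, 6, 8, 8, 0, 0)
back = start + timedelta(seconds=20) + timedelta(minutes=28)
track = [
    (start, 53.7900, -1.8000),
    (start + timedelta(seconds=10), 53.7900, -1.8001),
    (start + timedelta(seconds=20), 53.7901, -1.8001),
    (back, 53.7901, -1.8001),
    (back + timedelta(seconds=10), 53.7902, -1.8002),
]
events = find_undercover(track)
for e in events:
    print(e["start"].time(), "under cover for", round(e["minutes"]), "minutes")
```

Keeping both the last position before the silence and the first position after it is what lets us say whether he went in and came out by the same cat flap.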

And so I created Leo Lingers, which shows some pen pics of lingering and undercovering, and we get a picture of a cat who likes to sleep and hang about a lot.

No surprises. When you are this photogenic you do stop to strike a pose.

What this tells us about data

Data has an increasing importance in our society. On a surface level one understands that – for example – Amazon uses it to match one to a book one might like and that supermarkets use it to try to understand why people buy more sprouts on a Thursday afternoon, and we are fine with that.

Fine in that we assume that if that data does not lead to anything especially interesting – such as the wrong book – or insightful – such as not knowing why sprouts are sold – then the problems are distant and commercial. If Amazon or Tesco fail someone will fill their place.

But data is being used in financial markets, and to model financial markets, and those models are used to shape societies and even to misshape societies. If the analysis of the data is good – and if the tools used to analyse it are apt – then we can expect good things to follow.

But what about a fundamental assumption such as that route tracking and cat tracking work in the same way? It is a benign issue, but what if somewhere in a complex model of an economy there is a similar assumption that is flawed? What if we are missing something in our analysis of data because our meta-analysis of how we can analyse data is based around assumptions that are too simple and not nuanced around the data itself?

We get better at data. We get better at collecting data and better at analysing it and creating analysis which is specific to the data, but we only do this when we find previous analysis unsuitable for our needs, as I have with tracking Leo using GPX files. And while Evgeny Morozov’s To Save Everything, Click Here is a flawed book, his assertion that as a society we are becoming more enamoured with digital solutionism, to the point where we begin to accept it unquestioningly, is hard to deny.

We need to get better at understanding the limits of the analysis of the datasets we have, and better at creating better tools for analysis.

Or we could sleep for four hours. I know what Leo would do.

Fourteen: Where does Leo linger?

Leo has been poorly. His tummy has been bad and he has been listless and he has not been eating his dinner. I’ve been worried about him and what could be making him poorly and so I thought I might see if this data – which tells us where Leo goes – might also be used to tell us where he stays. If we know where he stays then we might note if he is eating something bad somewhere or he is stopping in unusual places.

And so while I have the Leo movement data today I have also written something to tell me where he stops.

It was a wet day and overcast.

First we start off with what might be bad data – Leo is all the way to the East of the map – but he seems to start running around his own back garden before heading off in a new direction, south down Bullgrave Woods and towards St. Anthony’s School (Stay away Leo, I hated it!), where he heads into the large playing field behind the houses at the bottom of Crescent Walk.

He does not go as far up as “Pannies House” – the home of my mate Richard when we were kids – but he is behind that row of houses. He exits pretty quickly and works his way back to the gardens around Crescent Walk. He has an explore of the back gardens going further West than normal too.

He stops at the edge of the houses rather than going to his favourite field and then returns back to run around my garden again.

So far, so Leo, but let us look further.

First, Leo went just under 5km. That is just over half his usual distance on full-day tracking, underlining his being not that well, or at least not himself.
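The post does not say how the distance figure was worked out. One common way, sketched here in Python as an assumption and not as the method actually used, is to add up the great-circle (haversine) distance between each pair of consecutive points. The function names and test points are made up.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_length_m(points):
    """Total length of a track given as a list of (lat, lon) pairs."""
    return sum(haversine_m(*a, *b) for a, b in zip(points, points[1:]))


# One degree of longitude along the equator and back again.
there_and_back = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
total = track_length_m(there_and_back)
print(round(total / 1000, 1), "km")
```

One caution with a sum like this: GPS jitter while a cat sits still adds small phantom steps, so the total is likely to overstate how far he really walked.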

Lingering

I’ve written something to find out where Leo lingers, and where he goes off tracking (into a house, for example, or under tree cover).

Some technical stuff for those interested. Lingering is when Leo’s GPS ticker – which ticks every ten seconds – reports being within four meters according to the Long/Lat reported. If Leo has twelve ticks all within four meters he has lingered.

If a tick is outside of four meters there is a tolerance (for bad data), but as soon as a second or third tick is outside then he is no longer considered to be lingering.
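Here is one way the linger rule could be sketched in Python. It is my reading of the description and not the plugin itself: the first tick of a candidate linger is the anchor, ticks within four meters of it count, one stray tick is forgiven, and a second stray ends the linger. The function names, the flat-earth distance shortcut and the sample ticks are all assumptions for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def distance_m(a, b):
    """Approximate distance in meters between two (lat, lon) pairs; fine over short hops."""
    mid_lat = math.radians((a[0] + b[0]) / 2)
    dy = math.radians(b[0] - a[0]) * EARTH_RADIUS_M
    dx = math.radians(b[1] - a[1]) * EARTH_RADIUS_M * math.cos(mid_lat)
    return math.hypot(dx, dy)


def find_lingers(points, radius_m=4.0, min_ticks=12, max_bad=1):
    """Return (first_index, last_index) for each linger in a list of (lat, lon) ticks."""
    lingers = []
    i = 0
    while i < len(points):
        anchor = points[i]
        good, bad, last_good = 1, 0, i
        for j in range(i + 1, len(points)):
            if distance_m(anchor, points[j]) <= radius_m:
                good += 1
                last_good = j
            else:
                bad += 1
                if bad > max_bad:  # a second stray tick ends the linger
                    break
        if good >= min_ticks:
            lingers.append((i, last_good))
            i = last_good + 1  # carry on after the linger
        else:
            i += 1
    return lingers


# Made-up ticks: walking, sitting still with one "teleport", walking away.
spot = (53.7915, -1.8000)
ticks = (
    [(53.7900, -1.8), (53.7905, -1.8), (53.7910, -1.8)]
    + [spot] * 6
    + [(53.7915, -1.7990)]
    + [spot] * 6
    + [(53.7920, -1.8), (53.7925, -1.8), (53.7930, -1.8)]
)
lingers = find_lingers(ticks)
print(lingers)
```

With ten-second ticks the gap between the two indexes gives a rough duration for each linger. A weakness of this sketch is that a bad first tick makes a bad anchor, so a real version might anchor on an average of the first few ticks.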

Undercover detection finds when Leo’s tracker does not report for an extended period of time (around eight minutes) by subtracting the timestamps of consecutive ticks. I’m refining the plugin and will stick it on GitHub soon.

  1. 07:53 Leo Lingers here
  2. 09:39 Leo Lingers here
  3. 09:54 Leo Lingers here
  4. 02:56 Leo Lingers here

  1. 08:01 Leo under cover for 28 minutes.
  2. 08:32 Leo under cover for 10 minutes.
  3. 09:05 Leo under cover for 6 minutes.
  4. 09:11 Leo under cover for 13 minutes.
  5. 09:56 Leo under cover for 11 minutes.
  6. 10:12 Leo under cover for 38 minutes.
  7. 10:51 Leo under cover for 30 minutes.
  8. 11:25 Leo under cover for 70 minutes.
  9. 12:36 Leo under cover for 6 minutes.
  10. 12:42 Leo under cover for 34 minutes.
  11. 01:21 Leo under cover for 5 minutes.
  12. 01:34 Leo under cover for 66 minutes.
  13. 02:41 Leo under cover for 7 minutes.
  14. 03:22 Leo under cover for 179 minutes.

If we look along the maps we see Leo hanging around in a garden, then on a road (daft Leo) and then at the corner house at the bottom of Crescent Walk. These are the locations where Leo has stopped moving long enough that he must have shown an interest in something. His last linger is in the bottom of the garden. Remember this is where the linger started. A linger could be being stopped, or just walking slowly or circling a point.

An improvement to the lingering would be to note how long he has lingered in a location.

If we look where he has gone under cover we see the first two are for dense tree cover. The third he seems to go under a roof (perhaps) and the fourth may be similar, as may the fifth. It seems that tracking drops out at a few points and it might be a good improvement to show where it comes back. I suspect that the under cover shows more the limitations of the tracker than much about the kitten.

Mystery solved

Leo is going into a neighbour’s house and eating biscuits and meat that have been left down for too long. Hopefully we can get him back on his own diet and he can get back to being himself.

Thirteen: A year of living outside

Monday the 8th of June 2015 was the first anniversary of Leo’s cat flap being installed, and of the start of his life as a cat who can come and go as he pleases. For one year we have watched him dart out on a morning, stroll back at night and – if we are very good – bound up the garden to see us.

On the first anniversary Leo was tracked for an all-day wander. Shall we see where he went?

Leo was out for seven hours, and in that time he went over 9km. We found him in his donut on the bed when we got home. He had a long route today so we will look at his highlights.

Sometime in the morning Leo goes into Bullgrave Woods. His path takes him over the road and down to where Bradford Beck runs through the Woods. Leo spends a good deal of time on the banks of the beck running back and forth. He must be playing at this time, probably at hunting, and he must feel safe. The area itself is a little exposed to other animals and people and Leo would not spend so long in it if he did not feel safe.

Just as he does in the back garden he will be wary if anyone else is present and find cover.

He may have been “playing” with small animals (mice, etc) at this point.

He finishes his game and climbs up the Scholemoor side of Bullgrave Woods, but only briefly and never breaking tree cover. He rather quickly comes back and breaks for home where (and this is hard to see above) he goes into the house for a couple of hours and probably goes to sleep.

When he comes out he returns to where his patrol left off, which is the edge of Hunters Park Avenue where the Woods reach a field, and walks a different route to return to the side of the Beck. He then returns to the street for a fairly lengthy wander around some houses and probably another sleep.

His last mission of the day takes him to the higher part of the wood near the field, where he seems to climb trees. His altitude differs by too many meters for him to just be going up and down the banking. It was a hot day and he did not stray into the field or towards the Quarry. Instead he headed back for another sleep.

So a very Leo-like day. A little play, a lot of wandering, and a good few sleeps.