Request: Statistics for Active Cache Types

Discussion about the Geocaching Australia web site
andrewbt
3000 or more caches found
Posts: 22
Joined: 12 July 15 11:59 pm
Location: Canberra

Request: Statistics for Active Cache Types

Post by andrewbt » 07 July 17 11:28 pm

Hi GCA peeps

I'm not sure if there is one already (I had a good look but couldn't find it), but I'd like a graph showing the number of caches of each type.

And also, if possible, type by status: active, active/disabled, all, archived etc.

Also, is it possible to make the raw data used in the graphs available?

Not that I'm asking for much :-" :lol: :lol:

CraigRat
850 or more found!!!
Posts: 7015
Joined: 23 August 04 3:17 pm
Twitter: CraigRat
Facebook: http://facebook.com/CraigRat
Location: Launceston, TAS

Re: Request: Statistics for Active Cache Types

Post by CraigRat » 08 July 17 9:48 am

http://geocaching.com.au/stats/graphs/all/au/

Then go to 'hides by type' in the dropdown.

We don't typically break it down by status, however.

I'll add it to the development list.

Re: Request: Statistics for Active Cache Types

Post by andrewbt » 09 July 17 6:33 pm

Thanks CraigRat!

No rush, I'm just curious about the breakdown of caches on GCA, is all. I could just run queries and get the counts that way too...

I will have to learn how to automate some of the stuff not yet in the API :)

caughtatwork
Posts: 17013
Joined: 17 May 04 12:11 pm
Location: Melbourne

Re: Request: Statistics for Active Cache Types

Post by caughtatwork » 10 July 17 9:18 am

If you want data that is not in the API, please let us know what you are looking for. We would prefer to provide a simple delivery mechanism rather than have you scrape data.

You can't have everything at once, however, because that is simply too much data. The cache table is just under 1GB and the logs are just over 8GB. We can probably provide the basics of caches and logs without descriptions and text, as those are the largest parts of the data.

We would not be in an easy position to provide the graphs in data / download form. The class we use doesn't cater for a data output. It assumes we are always drawing a graph. This can be done, but the question comes back to effort for return. If you can draw the raw data and perform your own calculations you can avoid delays in the future where a new piece of data is required.

If you do want the data as you have asked in the OP, then you can use the API.

http://geocaching.com.au/api/services/s ... 0&offset=0
This will return 500 moveable geocaches from -37|145 (which is near me). If you want more, just update the offset parameter in groups of 500. Don't forget to provide your API key.

Only those which are available
http://geocaching.com.au/api/services/s ... =Available

Only those which are archived
http://geocaching.com.au/api/services/s ... s=Archived

Only those which are temporarily unavailable
http://geocaching.com.au/api/services/s ... navailable

You can change up the type to see what you want. It may be a little more effort to set up, but you then have a very powerful data provision tool for your analytics. You can then use these geocache codes to drill down into the cache attributes and logs.
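
For anyone wanting to try the offset paging described above, it could be sketched roughly like this. It is only an illustration, not GCA's own client code: `fetch_page` is a hypothetical callable you would write yourself to request the search URL with your API key and a given offset, then return the decoded list of caches. The page size of 500 comes from the post; stopping on a short page is an assumption.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int], List[dict]], page_size: int = 500) -> List[dict]:
    """Collect every result by stepping the offset in groups of page_size.

    fetch_page is a hypothetical helper: given an offset, it should call the
    search service (with your API key) and return one page of results.
    """
    results: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset)
        results.extend(page)
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < page_size:
            break
        offset += page_size
    return results
```

Running the same loop once per status filter would give the type-by-status counts from the lengths of the returned lists.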

I doubt the cache description and log text are important to you for statistics purposes so I'll see if I can include an "exclude" parameter to stop you having to download all of the text data for the caches and logs. Then you should be able to use the API for just stats.

Re: Request: Statistics for Active Cache Types

Post by andrewbt » 11 July 17 8:08 pm

Thanks caughtatwork

I was just thinking of making a caching dashboard, and I'd like to pull the counts etc of queries... rather than the list of the caches etc.

eg... I want to write something that tells me every day "You have x caches left in your home state, there are x active moveables you haven't found based off your query etc.. there's x caches total.. etc etc etc"

Keep in mind that's me being a total novice at this sort of stuff, and for me it's a great way to learn. I agree though - the API is much better than a scraper.

Re: Request: Statistics for Active Cache Types

Post by caughtatwork » 12 July 17 9:45 am

There are a number of ways you can achieve the same outcome, some more painful than others :-)

We have been looking into having these sorts of user generated statistical queries run against the database but we always come back to a root issue in that if someone screws up the way they construct the query, they send the DB into an endless search and retrieve. A poorly constructed query of logs and caches could return up to 5,000,000,000,000 rows.
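
To show how a query gets that large, here is a toy sketch using an in-memory SQLite database. The table names, column names and sample rows are all made up for illustration and are not the GCA schema. Leaving out the join condition pairs every cache with every log, so the row count is the product of the two table sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE caches (code TEXT PRIMARY KEY, type TEXT);
    CREATE TABLE logs (id INTEGER PRIMARY KEY, cache_code TEXT, log_type TEXT);
    INSERT INTO caches VALUES ('A', 'Traditional'), ('B', 'Moveable'), ('C', 'Traditional');
    INSERT INTO logs (cache_code, log_type) VALUES
        ('A', 'Found'), ('A', 'Found'), ('B', 'Found'), ('C', 'Note');
""")

# Missing join condition: every cache is paired with every log (3 x 4 rows).
runaway_rows = conn.execute("SELECT COUNT(*) FROM caches, logs").fetchone()[0]

# With the join condition, each log matches only its own cache (4 rows).
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM caches JOIN logs ON logs.cache_code = caches.code"
).fetchone()[0]

print(runaway_rows, joined_rows)
```

With three caches and four logs this prints `12 4`. The same multiplication applied to the full cache and log tables is what produces the enormous row counts.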

Some of these statistics may be available from your cacher page, e.g. http://geocaching.com.au/cacher/statist ... t/general/ From there, click (or call directly) the XML data source (the orange icon next to the General heading, http://geocaching.com.au/cacher/statist ... eneral.xml ) and you will be presented with an XML file of those statistics, which you can then parse.
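
Parsing that kind of XML file could look something like the sketch below. The real layout of the statistics XML is not shown in this thread, so the sample document and its element names are invented; the function just collects the text of every leaf element using Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the real layout of the GCA statistics XML is not shown here.
SAMPLE_XML = """<statistics>
  <finds>3012</finds>
  <hides>14</hides>
</statistics>"""


def leaf_values(xml_text: str) -> dict:
    """Map each leaf element's tag to its stripped text."""
    root = ET.fromstring(xml_text)
    return {
        el.tag: (el.text or "").strip()
        for el in root.iter()
        if len(el) == 0  # no child elements, so this is a leaf
    }


print(leaf_values(SAMPLE_XML))
```

If the real file repeats tag names, a dict keyed on the tag would overwrite earlier values, so a list of (tag, text) pairs may suit better.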

Of course that only covers those statistics which we have precalculated. We can always add interesting statistics that can be downloaded by this method but you will need to wait in a queue for the developers to build them.

We are also looking at providing two data sources: one of logs (without the log text) and one of geocaches (without short and long descriptions). If we provide these as CSV files, ZIPped, they should come in at around 10MB or less. I haven't checked what the size would be in JSON form, but I'm thinking that CSV might be better for the non-technical to manipulate in Excel. We would probably restrict access to once a day (to avoid flooding), and the files would be restricted to GCA geocaches only (all GC caches would be too much).
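
If the zipped CSV files do appear, the type-by-status counts from the original request could be tallied along these lines. This is a sketch under assumptions: the file name inside the ZIP and the `type` and `status` column names are hypothetical, since no format has been published.

```python
import csv
import io
import zipfile
from collections import Counter


def count_by_type_and_status(zip_bytes: bytes, member: str,
                             type_col: str = "type",
                             status_col: str = "status") -> Counter:
    """Tally rows of a zipped CSV by (type, status).

    The member name and column names are assumptions, not a published format.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        with zf.open(member) as raw:
            # Decode the binary stream as text so csv can read it.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            return Counter((row[type_col], row[status_col]) for row in reader)
```

The returned `Counter` is keyed on (type, status) pairs, so the count for one combination is a single lookup.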

This might not address your requirements, though, as it will only look at GCA caches. We are not likely to ever provide access to the raw data for GC geocaches. The data is ours to use, but not ours to share.

If you are interested in exploring some more of the GCA statistics, please let me know what you would like and we'll see if we can make it happen.

Re: Request: Statistics for Active Cache Types

Post by andrewbt » 13 July 17 10:49 pm

Thanks caughtatwork, I'll have a play around and let you know :D
