There are a number of ways you can achieve the same outcome, some more painful than others.
We have been looking into having these sorts of user-generated statistical queries run against the database, but we always come back to one root issue: if someone screws up the way they construct the query, they send the DB into an endless search-and-retrieve. A poorly constructed query across logs and caches could return up to 5,000,000,000,000 rows.
Some of these statistics may be available from your cacher page, e.g.
http://geocaching.com.au/cacher/statist ... t/general/ . From there, if you click (or call directly) the XML data source (the orange icon next to the General heading,
http://geocaching.com.au/cacher/statist ... eneral.xml ), you will be presented with an XML file of those statistics which you can then parse.
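As a minimal sketch of what "parse" might look like, here is a Python example. The URL and the element names are placeholders, not the real ones; substitute the XML data source link from your own statistics page (the truncated address above), and inspect the actual tag structure before relying on it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical URL -- replace with the XML data source linked from the
# orange icon next to the General heading on your own cacher page.
XML_URL = "http://geocaching.com.au/cacher/statistics/EXAMPLE/general.xml"

with urllib.request.urlopen(XML_URL) as response:
    tree = ET.parse(response)

# Walk every element and print tag/text pairs -- a safe way to discover
# the file's structure before writing anything more specific.
for element in tree.getroot().iter():
    if element.text and element.text.strip():
        print(element.tag, "=", element.text.strip())
```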
Of course, that only covers the statistics we have precalculated. We can always add interesting statistics that can be downloaded by this method, but you will need to wait in the queue for the developers to build them.
We are also looking at providing two data sources: one of logs (without the log text) and one of geocaches (without the short and long descriptions). If we provide these as ZIPped CSV files, they should come in at around 10MB or less. I haven't checked what the size would be in JSON form, but I'm thinking CSV might be easier for the non-technical to manipulate in Excel. Access would probably be restricted to once a day (to avoid flooding) and to GCA geocaches only (all GC caches would be too much).
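If that extract does eventuate, consuming it could be as simple as the following sketch. Nothing here exists yet: the download URL, the archive name, and the column layout are all hypothetical placeholders.

```python
import csv
import io
import urllib.request
import zipfile

# Hypothetical URL for the proposed daily ZIPped-CSV extract of GCA logs.
EXPORT_URL = "http://geocaching.com.au/data/gca-logs.zip"

with urllib.request.urlopen(EXPORT_URL) as response:
    archive = zipfile.ZipFile(io.BytesIO(response.read()))

# Assume the archive holds a single CSV file; take whichever member is there.
member = archive.namelist()[0]
with archive.open(member) as raw:
    reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8"))
    rows = list(reader)

print(f"{len(rows)} rows, columns: {reader.fieldnames}")
```

The same file would open directly in Excel after unzipping, which is the main argument for CSV over JSON for non-technical users.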
This might not address your requirements, though, as it will only cover GCA caches. We are not likely to ever provide access to the raw data for GC geocaches. The data is ours to use, but not ours to share.
If you are interested in exploring some more of the GCA statistics, please let me know what you would like and we'll see if we can make it happen.