I am having trouble accessing the GitHub timeline from BigQuery.
I was using the following query:
SELECT repository_name, actor_attributes_company, payload_ref_type, payload_action, type, created_at FROM githubarchive:github.timeline WHERE repository_organization = 'foo' and created_at > '2014-07-01'
and everything was working great. Now, it looks like the githubarchive:github.timeline table is no longer available. I've been looking around and I found another table:
SELECT repository_name, actor_attributes_company, payload_ref_type, payload_action, type, created_at FROM publicdata:samples.github_timeline WHERE repository_organization = 'foo' and created_at > '2014-07-01'
This query works but returns zero rows. When I remove the created_at restriction it works, but it only returns a few rows from 2012, so it looks like this is just sample data.
Does anyone know how to pull live timeline data from GitHub?
Indeed, publicdata:samples.github_timeline has only sample data. For the real GitHub Archive documentation, look at http://www.githubarchive.org/
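As a rough illustration of what a query against the live archive might look like, here is a minimal sketch that builds the query text in Python. It assumes the archive is exposed as per-month tables named like `githubarchive.month.YYYYMM`, with nested `repo.name` and `org.login` fields replacing the old flat `repository_name` and `repository_organization` columns. Those table and field names are assumptions, as is the helper name `build_timeline_query`, so check them against the documentation linked above before relying on them.

```python
import re
from datetime import date


def build_timeline_query(org: str, month: date) -> str:
    """Build a BigQuery standard SQL query for one organization and one month.

    Hypothetical helper: the table layout (githubarchive.month.YYYYMM) and the
    nested field names (repo.name, org.login) are assumptions, not confirmed
    by the original post.
    """
    # Accept only letters, digits and hyphens, so the value can be placed
    # inside a quoted SQL literal without any escaping.
    if not re.fullmatch(r"[A-Za-z0-9-]+", org):
        raise ValueError(f"unexpected organization name: {org!r}")

    # One table per calendar month, e.g. githubarchive.month.201407
    table = f"githubarchive.month.{month:%Y%m}"

    return (
        "SELECT repo.name AS repository_name, type, created_at\n"
        f"FROM `{table}`\n"
        f"WHERE org.login = '{org}'\n"
        "ORDER BY created_at"
    )


# Print the query text for organization 'foo' and July 2014.
print(build_timeline_query("foo", date(2014, 7, 1)))
```

The printed text can be pasted into the BigQuery console with standard SQL enabled. Restricting the query to a single monthly table also keeps the amount of data scanned small compared with scanning the whole archive.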
I wrote an article yesterday about querying it:
Sample query:
As Mikhail points out, there's also another dataset with all of GitHub's code: