Changelog Master Feed

GitHub and Google on Public Datasets & Google BigQuery (Changelog Interviews #209)

Arfon Smith from GitHub, and Felipe Hoffa & Will Curran from Google joined the show to talk about BigQuery — the big picture behind Google Cloud’s push to host public datasets, the collaboration between the two companies to expand GitHub’s public dataset, adding query capabilities that have never been possible before, example queries, and more!

Sponsors:

  • Toptal – Take control of your career and join the best at Toptal. Email Adam at [email protected] for a personal introduction to our friends at Toptal.
  • Linode – Our cloud server of choice! This is what we built our new CMS on. Use the code changelog20 to get 2 months free!
  • Full Stack Fest 2016 – Early Bird tickets available until July 15. Use the code THECHANGELOG after July 15 to save 75 EUR (before taxes).
Featuring:

    • Arfon Smith – Website, GitHub, X
    • Felipe Hoffa – GitHub, X
    • Will Curran – Website
    • Adam Stacoviak – Website, GitHub, LinkedIn, Mastodon, X
    • Jerod Santo – Website, GitHub, LinkedIn, Mastodon, X

    Show Notes:

    This show was produced in collaboration with GitHub and Google to announce the big expansion to GitHub’s public dataset on BigQuery.

    • The Changelog #144: GitHub Archive and Changelog Nightly with Ilya Grigorik
    • GitHub announcement
    • Google Cloud Blog announcement
    • Google Open Source Blog announcement
    • Felipe Hoffa - GitHub on BigQuery: Analyze all the code
    • GitHub public dataset — This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.
    • NOAA Global Surface Summary of the Day Weather Data
    • USA Name Data
    • Google BigQuery
    • Gist: BigQuery Examples from Arfon Smith
    • Shawn Pearce (Google) - the unsung hero at Google who did all the hard work getting the data pipeline working for this new dataset
    • Email [email protected] to talk with Will and BigQuery’s public dataset team
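To make the "searchable with regular expressions" point from the dataset description concrete, here is a minimal sketch in Python that only builds the text of such a query. It does not contact BigQuery. The table name, the column names (`sample_repo_name`, `sample_path`, `content`, `binary`), and the helper `build_regex_query` are assumptions for illustration and are not taken from the episode; check them against the dataset's schema before running anything.

```python
# Assumed names: the table and its columns below reflect one understanding of
# the public dataset's layout and should be verified before use.
SAMPLE_TABLE = "bigquery-public-data.github_repos.sample_contents"


def build_regex_query(limit: int = 10) -> str:
    """Return SQL text that finds files whose content matches @pattern.

    The regular expression is left as a named query parameter (@pattern)
    so that it is supplied separately instead of being spliced into the SQL.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT sample_repo_name, sample_path\n"
        f"FROM `{SAMPLE_TABLE}`\n"
        "WHERE NOT binary\n"  # skip files flagged as binary (assumed column)
        "  AND REGEXP_CONTAINS(content, @pattern)\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into a SQL editor of your choice.
    print(build_regex_query(limit=5))
```

The pattern itself would be passed as a string parameter named `pattern` when the query is submitted. Since the full dataset is described as 3TB+, a smaller sample table (if one is available) is a sensible place to try a pattern first.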

Changelog Master Feed, by Changelog Media