The Python Podcast.__init__

Working In The Code Mines: Mining Software Repositories With PyDriller



Summary

A large portion of the software industry has standardized on Git as the version control system of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today!
  • Your host, as usual, is Tobias Macey, and today I’m interviewing Davide Spadini about PyDriller, a framework for mining software repositories.

Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what PyDriller is and how the project got started?
    • How is PyDriller different from other Git frameworks?
  • What kinds of information can you discover by mining a software repository?
    • Where and how might the collected information be used?
  • What are the limitations of the capabilities offered by Git for investigating the repository?
  • What are the additional metrics that you are able to extract using PyDriller?
  • Can you describe how PyDriller itself is implemented?
    • How has the project evolved since you first began working on it?
  • I noticed that for testing PyDriller you crafted a set of repositories to serve as test cases. What has been the most complex or challenging aspect of writing meaningful tests to ensure a reasonable coverage of this problem domain?
  • What would be required to add support for other version control systems?
  • How have you used PyDriller in your own research?
  • What are some of the most interesting, unexpected, or innovative ways that you have seen PyDriller used?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with PyDriller?
  • What do you have planned for the future of PyDriller?

Keep In Touch
  • Website
  • ishepard on GitHub
  • @DavideSpadini on Twitter

Picks
  • Tobias
    • pre-commit
  • Davide
    • Fall Guys

Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links
  • PyDriller
  • Delft
  • Git
  • GitPython
  • PyGit2
  • RepoDriller
  • Mining Software Repositories Conference
  • Lizard
  • Hadoop
  • Mercurial
    • Podcast Episode
  • Subversion
  • CVS
  • Neo4J
  • GraphRepo

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

The Python Podcast.__init__ by Tobias Macey
