Topics covered in this episode:
- Making PyPI’s test suite 81% faster
- People aren’t talking enough about how most of OpenAI’s tech stack runs on Python
- PyCon Talks on YouTube
- Optimizing Python Import Performance
- Extras
- Joke

Watch on YouTube
Sponsored by Digital Ocean: pythonbytes.fm/digitalocean-gen-ai Use code DO4BYTES and get $200 in free credit
Michael: @mkennedy.codes (bsky)
Brian: @brianokken.bsky.social
Show: @pythonbytes.fm (bsky)
Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.
Finally, want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list. We'll never share it.
Brian #1: Making PyPI’s test suite 81% faster
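One step covered in this item is database isolation under pytest-xdist, where each test worker gets its own database. As a rough sketch of that idea (not Warehouse's actual configuration), the helper below derives a per-worker database name from the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in its worker processes. The function name `worker_database_name` and the base name `warehouse_test` are hypothetical, chosen only for illustration.

```python
import os


def worker_database_name(base="warehouse_test", environ=None):
    """Return a database name unique to the current pytest-xdist worker.

    `base` is a hypothetical name used for illustration. pytest-xdist sets
    PYTEST_XDIST_WORKER (for example "gw0") in each worker process; the
    variable is absent when tests run without xdist.
    """
    # Allow a mapping to be passed in so the helper is easy to exercise
    # without touching the real process environment.
    env = os.environ if environ is None else environ
    worker = env.get("PYTEST_XDIST_WORKER")
    # With xdist: one database per worker. Without xdist: the shared base name.
    return f"{base}_{worker}" if worker else base
```

A session-scoped fixture could call a helper like this once per worker and create or connect to that database, so parallel workers never share tables.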
Alexis Challande
- The PyPI backend is a project called Warehouse.
- It’s tested with pytest, and it’s a large project, thousands of tests.
- Steps for speedup
  - Parallelizing test execution with pytest-xdist
    - 67% time reduction
    - --numprocesses=auto allows for using all cores
    - DB isolation: cool example of how to configure Postgres to give each test worker its own DB
    - They used pytest-sugar to help with visualization, as xdist defaults to quite terse output
  - Use Python 3.12’s sys.monitoring to speed up coverage instrumentation
    - 53% time reduction
    - Nice example of using COVERAGE_CORE=sysmon
  - Optimize test discovery
    - Always use testpaths
    - Sped up collection time. 66% reduction (collection was 10% of time)
    - Not a huge savings, but it’s 1 line of config
  - Eliminate unnecessary imports
    - Use python -X importtime
    - Examine dependencies not used in testing.
    - Their example: ddtrace
      - A tool they use in production, but it also has a couple pytest plugins included
      - Those plugins caused ddtrace to get imported
      - Using -p no:ddtrace turns off the plugin bits
- Notes from Brian:
  - I often get questions about whether pytest is useful for large projects.
    - Short answer: Yes!
    - Longer answer: But you’ll probably want to speed it up
  - I need to extend this article with a general purpose “speeding up pytest” post or series.
  - -p no:NAME can also be used to turn off any plugin, even builtin ones.
    - Examples include
      - nice-to-have developer-focused pytest plugins that may not be necessary in CI
      - CI reporting plugins that aren’t needed by devs running tests locally

Michael #2: People aren’t talking enough about how most of OpenAI’s tech stack runs on Python
- Original article: Building, launching, and scaling ChatGPT Images
- Tech stack: The technology choices behind the product are surprisingly simple; dare I say, pragmatic!
  - Python: most of the product’s code is written in this language.
  - FastAPI: the Python framework used for building APIs quickly, using standard Python type hints. As the name suggests, FastAPI’s strength is that it takes less effort to create functional, production-ready APIs to be consumed by other services.
  - C: for parts of the code that need to be highly optimized, the team uses the lower-level C programming language.
  - Temporal: used for asynchronous workflows and operations inside OpenAI. Temporal is a neat workflow solution that makes multi-step workflows reliable even when individual steps crash, without much effort by developers. It’s particularly useful for longer-running workflows like image generation at scale.

Michael #3: PyCon Talks on YouTube
- Some talks that jumped out to me:
  - Keynote by Cory Doctorow
  - 503 days working full-time on FOSS: lessons learned
  - Going From Notebooks to Scalable Systems
    - And my Talk Python conversation around it. (edited episode pending)
  - Unlearning SQL
  - The Most Bizarre Software Bugs in History
  - The PyArrow revolution in Pandas
    - And my Talk Python episode about it.
  - What they don't tell you about building a JIT compiler for CPython
    - And my Talk Python conversation around it (edited episode pending)
  - Design Pressure: The Invisible Hand That Shapes Your Code
  - Marimo: A Notebook that "Compiles" Python for Reproducibility and Reusability
    - And my Talk Python episode about it.
  - GPU Programming in Pure Python
    - And my Talk Python conversation around it (edited episode pending)
  - Scaling the Mountain: A Framework for Tackling Large-Scale Tech Debt

Brian #4: Optimizing Python Import Performance
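One technique this item covers is the lazy import: moving a slow-to-import module into the function that needs it. Here is a minimal sketch of that pattern, with the standard-library `json` module standing in for a genuinely slow dependency. The names `timed_import` and `parse_payload` are hypothetical and used only for illustration.

```python
import importlib
import time


def timed_import(module_name):
    """Import a module by name and return (module, seconds spent importing).

    A module that is already cached in sys.modules will report a time
    close to zero, so measure in a fresh interpreter for real numbers.
    """
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


def parse_payload(text):
    """Parse JSON text, importing json only when the function is called."""
    # Deferred import: the cost is paid on first call, not when this
    # file is imported (for example, during pytest collection).
    import json

    return json.loads(text)
```

The same idea applies to tests: an import placed inside a fixture or test function runs only when that fixture or test is used, so it adds nothing to collection time.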
- Mostly pay attention to #'s 1-3
- This is related to speeding up a test suite, speeding up necessary imports.
- Finding what’s slow
  - Use python -X importtime
  - Ex: python -X importtime -m pytest
- Techniques
  - Lazy imports
    - move slow-to-import imports into functions/methods
  - Avoiding circular imports
    - hopefully you’re doing that already
  - Optimize __init__.py files
    - Avoid unnecessary imports, heavy computations, complex logic
- Notes from Brian
  - Some questions remain open for me
    - Does module aliasing really help much?
  - This applies to testing in a big way
    - Test collection imports your test suite, so anything imported at the top level of a file gets imported at test collection time, even if you are only running a subset of tests using filtering like -k or -m or other filter methods.
    - Run -X importtime on test collection.
    - Move slow imports into fixtures, so they get imported when needed, but NOT at collection.
- See also:
  - option -X in the standard docs
  - Consider using import_profile

Extras
- PEPs & Co.
  - PEP is a “backronym”, an acronym where the words it stands for are filled in after the acronym is chosen. Barry Warsaw made this one up.
  - There are a lot of “enhancement proposal” and “improvement proposal” acronyms now from other communities
- pythontest.com has a new theme
  - More colorful. Neat search feature
  - Now it’s excruciatingly obvious that I haven’t blogged regularly in a while
    - I gotta get on that
  - Code highlighting might need tweaking for dark mode
- git-bug
- Pyrefly follow up