These are my notes from the PyCon 2018 videos. I love the Python community, and especially the conference talks. This year I am going to take notes on my favorite talks and post them here.
This is an incomplete, in-progress post.
- Always profile before making any optimizations.
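As a minimal sketch of what "profile first" can look like, the standard library's `cProfile` and `pstats` modules are enough to see where time goes. The function `slow_sum_of_squares` and the helper `profile_call` are hypothetical names made up for this illustration, not something from the talk.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop, used only as something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum_of_squares, 100_000)
print(report)
```

The report lists each function with its call count and cumulative time, which is usually enough to decide whether an optimization is worth attempting at all.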
Vectorize with NumPy
- Looping in Python can be slow
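A small sketch of the idea, assuming NumPy is installed: the same element-wise computation written once as a Python loop and once as a single array expression. The function names are hypothetical and chosen only for this example.

```python
import numpy as np


def squares_loop(values):
    """Square each value with an explicit Python loop."""
    out = []
    for v in values:
        out.append(v * v)
    return out


def squares_vectorized(values):
    """Square each value with one NumPy array operation."""
    arr = np.asarray(values)
    return arr * arr  # the loop runs inside NumPy's compiled code


data = list(range(1_000))
loop_result = squares_loop(data)
vector_result = squares_vectorized(data)
```

Both versions produce the same values. The vectorized one moves the per-element work out of the interpreter, which is where the speedup typically comes from on large arrays.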
Use specialized data structures.
- sparse package
- Add types
- Fortran Like Speed
- heavy dependencies
- distributed tasks
- Can be executed locally or on a cluster
Look for an existing package
- resist the urge to reinvent the wheel
This was a great talk, not only about test-driven development on existing code bases, but also about how to be a good steward of code. Justin talks about how to clean up an existing code base and leave it better than you found it. Start by improving the parts that you touch: write tests and improve docstrings whenever you make a change to a particular feature. As you clean up the code base and it matures, consider taking a sprint day to write tests and improve documentation. Doing this after you are familiar with the project will make it much easier. You will also improve your understanding of the parts that you have not touched along the way.
One of the biggest takeaways I heard in this talk was: do not assume that the last person to touch the code was any less capable than you. They likely did what they did for a reason, so until you have strong test coverage across the project, take it easy with rewriting everything they did and make only the necessary changes. Your changes could have an impact on other parts of the code base that you are not familiar with.
Jason gave a great talk about teaching kids to code, based on his experiences with FIRST LEGO League. He found that the event has the best of intentions, but favors schools with larger budgets that can afford to order many different kits. He has found himself deep down a rabbit hole looking for an affordable alternative that can be built with the inexpensive Raspberry Pi Zero and controlled with the cheapest tablets. He is currently working on a programming language called wildcard that can be programmed with paper. This really reminds me of Robot Turtles, a game that I play with my 5-year-old son. He really likes to play it. I will definitely be following this project to see if this is something that I can do with him when it's ready.
Ask the right questions before writing the first line of code. Even the simplest questions, such as averages, have many possible pitfalls along the way. In this talk Alex discusses how to prepare your data before averaging. He introduces some new jargon. I am not sure that this jargon made the idea any easier for me to understand or discuss, and it may take some time for it to sink in and become useful. I feel that plain English is more effective because anyone can understand it, for example: "find the daily average sales by seller".
- the collapsed/aggregated data relevant to this analysis
- we are overriding the primary key (i.e. what a table defines as an observation)
- the original number of rows
**Grouping key: the key defining a group**
- e.g. "for each Seller" is (Seller), "for each Country and City" is (Country, City)
- this defines how many rows are in the result
**Observation key: a unit of observation for this analysis**
- e.g. "daily average" is (Date), "across regions" is (Region)
- this defines how many rows are in the denominator
Collapsing Key - Grouping Key = Observation Key
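One way to read this relationship is as a set difference over column names. The following is only an illustrative sketch using the Date and Seller columns from these notes; the variable names are my own and not from the talk.

```python
# Columns that define one row after collapsing/aggregating the raw data.
collapsing_key = ("Date", "Seller")

# Columns that define one row in the final result.
grouping_key = ("Seller",)

# Whatever is in the collapsing key but not in the grouping key is the
# unit being averaged over (the denominator).
observation_key = tuple(col for col in collapsing_key if col not in grouping_key)

print(observation_key)  # ('Date',)
```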
Calculate the Average Daily Sales for each Seller.
Collapsing Key: (Date, Seller) Grouping Key: (Seller) Observation Key: (Date)
    SELECT Seller, AVG(total)
    FROM (
        SELECT DATE, SELLER, SUM(ApplesSold) AS total
        FROM Apples
        GROUP BY DATE, SELLER  -- Collapsing Key
    ) as t
    GROUP BY Seller  -- Grouping Key
I am interested in trying out this technique of using the second groupby. I typically use an unstack instead, but that relies on having the order of the Collapsing key correct.
    (
        df  # assumed: a DataFrame with Date, Seller, and ApplesSold columns
        .groupby(['Date', 'Seller'])  # Collapsing Key
        ['ApplesSold']
        .sum()
        .groupby(level='Seller')  # Grouping Key
        .mean()
    )
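To make the two-step groupby concrete, here is a self-contained sketch with a tiny made-up data set. The DataFrame name `apples`, the seller names, and all of the numbers are invented for illustration. It also computes a single-groupby average to show how the two answers can differ.

```python
import pandas as pd

# Hypothetical sales records; Ann has two separate sales on the first day.
apples = pd.DataFrame(
    {
        "Date": ["2018-05-01", "2018-05-01", "2018-05-01", "2018-05-02", "2018-05-02"],
        "Seller": ["Ann", "Ann", "Bob", "Ann", "Bob"],
        "ApplesSold": [3, 5, 4, 2, 8],
    }
)

# Step 1: collapse to one row per (Date, Seller), the collapsing key.
daily_totals = apples.groupby(["Date", "Seller"])["ApplesSold"].sum()

# Step 2: average the daily totals per Seller, the grouping key.
average_daily_sales = daily_totals.groupby(level="Seller").mean()

# For comparison: averaging raw rows per Seller gives the average sale
# size, not the average daily sales.
naive_average = apples.groupby("Seller")["ApplesSold"].mean()

print(average_daily_sales)
print(naive_average)
```

For Ann the daily totals are 8 and 2, so her average daily sales are 5.0, while the naive per-row average is 10/3. Collapsing first is what makes the denominator "days" rather than "rows".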
I was really impressed by the professional level of presentation from Devishi at such a young age! She gave a great talk about teaching Python to young people. This talk really resonated with me as a father of two young children. She advocated for Python to be taught more frequently and earlier in schools. In her opinion, once students have a basic grasp of algebra they should start using Python rather than a higher-level abstraction like Scratch. On the other hand, she argued that Java tends to make computer science unapproachable and too difficult for students. It is too large of a jump and tends to steer students away.