Category Archives: Programmer Notebook

Report Card: Python Essential Training on Lynda

To refresh and reinforce some of the basics, I recently completed the Python Essential Training course on Lynda.  This is very similar to the Python 3 Essential Training course I had taken previously, but with a different section on Classes and Object Oriented programming in Python.  To be honest, I can’t tell which of the two courses is supposed to be the newer one.

This course is taught by Bill Weinman the IT educator, not to be confused with Bill Weinman the Hollywood film and sound editor or Bill Wyman of the Rolling Stones.  Bill has done a good job here.  The lectures are flawlessly delivered.  His presentation materials are simple, clear, and direct.

I liked Bill’s approach to starting out with Object Oriented programming.  He starts with working examples of the initial building blocks, but with some starting “best practices.”

In my post about the previous course, I expressed disappointment with the section on regular expressions and some confusion in the section on conditionals.  This course avoids those disappointments by largely avoiding the topics.  Regular expressions don’t appear to be covered at all.  The section on conditionals is shorter and doesn’t appear to cover the unusual code structure I pointed out before.

The course incorporated four and three-quarter hours of video in 73 short lectures, with 18 supporting example program files.

This course has no exercises, final exams, grades, or certifications.


(Image courtesy of jarmoluk at Pixabay)


My Favorite MatPlotLib and Seaborn References

Here are some references I’ve found particularly useful when developing or debugging Python code with MatPlotLib and Seaborn.

User’s Guides

These focus on techniques for using specific methods.  They are generally stronger than beginner’s introductions.

https://matplotlib.org/users/beginner.html

http://matplotlib.org/users/index.html

Reference Guides

These provide encyclopedic reference on the details of function and syntax.

API Reference:

https://matplotlib.org/api/pyplot_api.html

Quick Reference:

This one starts out as a tutorial, but the bottom quarter is a great Quick Reference:

https://github.com/rougier/matplotlib-tutorial

Glossary

http://matplotlib.org/glossary/

Cookbooks

http://scipy.github.io/old-wiki/pages/Cookbook/Matplotlib.html

https://matplotlib.org/users/recipes.html

http://scipy-cookbook.readthedocs.io/

Tutorials

The better organized on-line tutorials easily serve as user’s guides.

https://matplotlib.org/users/tutorials.html

https://matplotlib.org/users/pyplot_tutorial.html

https://www.labri.fr/perso/nrougier/teaching/matplotlib/

https://pythonprogramming.net/matplotlib-python-3-basics-tutorial/

http://www.scipy-lectures.org/intro/matplotlib/matplotlib.html

https://github.com/rougier/matplotlib-tutorial


(Python logo provided courtesy of Python Software Foundation, used here under the “nominative use rules” of their policy.)

My Favorite NumPy and SciPy References

Here are some references I’ve found particularly useful when developing or debugging Python code with NumPy and SciPy.  (I can’t avoid the temptation to use the Australian pronunciation:  “Skippy.”)

User’s Guides

These focus on techniques for using specific methods.  They are generally stronger than beginner’s introductions.

https://docs.scipy.org/doc/numpy/user/index.html

https://docs.scipy.org/doc/numpy-1.11.0/user/

http://www.scipy-lectures.org/intro/numpy/index.html

http://csc.ucdavis.edu/~chaos/courses/nlp/Software/NumPyBook.pdf

 

 

Reference Guides

These provide encyclopedic reference on the details of function and syntax.

Language Reference:

https://docs.scipy.org/doc/numpy/

https://docs.scipy.org/doc/numpy/numpy-ref-1.12.0.pdf

https://docs.scipy.org/doc/numpy/reference/index.html

 

Style Guides

These provide some best practices on structuring the code.

http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html

https://github.com/numpy/numpy/blob/master/doc/example.py

https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt

 

Glossary

https://docs.scipy.org/doc/numpy/glossary.html

Cookbooks

http://scipy-cookbook.readthedocs.io/

Tutorials

The better organized on-line tutorials easily serve as user’s guides.

https://docs.scipy.org/doc/numpy/user/quickstart.html

http://www.tutorialspoint.com/numpy/

http://cs231n.github.io/python-numpy-tutorial/

https://github.com/rougier/numpy-tutorial

http://www.scipy-lectures.org/intro/index.html

https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html

https://www.dataquest.io/blog/numpy-tutorial-python/

 

 


(Python logo provided courtesy of Python Software Foundation, used here under the “nominative use rules” of their policy.)

Tell Me I’m an Idiot!

I am eagerly studying new math and science, learning new methods, and tinkering with new tools.  I’m showing my work in public (here and on GitHub).  I welcome coaching, constructive criticism, and insight into more efficient and effective ways of accomplishing results!

I would especially welcome mentoring in data science, and guidance on more important or effective data sets to work upon.

So, call or write and tell me I’m an idiot.  But please show me where I’m being an idiot and how to work smarter!

Sincerely,

Carl Gusler

Austin, Texas

carl.gusler@gmail.com


(Image courtesy of stevepb at Pixabay)

Sharing My Python Student Notebooks

I am sharing my “work in progress” data science student notebooks.  These are my notes and test cases from various classes and projects I’ve been working on.  These are built using the Jupyter interactive notebook technology, so you can read, run, revise, and explore.

To share, I have placed my student notebooks on GitHub.  I will be adding future notebooks to the collection and will increase the content and readability of the existing notebooks.

I have taken a good number of on-line courses and read a fair number of books on Python and data science.  However, I find the only truly effective way to learn a new language and new techniques is to invent new and unique scenarios and work through them myself.  These notebooks show the scenarios I created from scratch to practice certain Python features and data science techniques.  I certainly welcome you to look over my shoulder, but encourage you to press forward and create your own learning scenarios.

To date, the notebooks are organized as follows:

Python Basic Topics

  • Code Formatting
  • Strings
  • Lists
  • Tuples
  • Dicts
  • Sets
  • Iterators, Iterables, and Generators
  • Lambdas, map(), and reduce()
  • Printing
  • Useful Functions

Python Intermediate Topics

Default Dictionaries

NumPy

MatPlotLib

I hope you find these useful!


(Python logo provided courtesy of Python Software Foundation, used here under the “nominative use rules” of their policy.)

Report Card: Python 3 Essential Training on Lynda

To reinforce some of the basics, I recently completed the Python 3 Essential Training course on Lynda.

This course is taught by Bill Weinman the IT educator, not to be confused with Bill Weinman the Hollywood film and sound editor or Bill Wyman of the Rolling Stones.  Bill has done a good job here.  The lectures are flawlessly delivered.  His presentation materials are simple, clear, and direct.

I only experienced two disappointments.  The section on regular expressions was too brief.  This is certainly an area requiring its own long course, but adding a couple of additional short videos and examples would have been more useful.

The second disappointment concerned a lack of closure on conditionals.  Bill explained that Python does not include the case or switch command.  These are popular conditional logic structures found in C, Perl, Java, and the various Unix/Linux shells.  Bill promised to show how to create something similar in Python.  I expected him to demonstrate something simple or interesting with the if / elif / else structure, but he ended up talking about populating dictionaries.  I couldn’t tell if he went off on a tangent or I missed something subtle but brilliant.  (Afternote:  I was still scratching my head on that one until I read Mark Lutz’s book Learning Python and came to understand how it was possible to use a dictionary to build a conditional structure for setting a variable at run time.)
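For anyone else scratching their head over the dictionary trick, here is a minimal sketch of the general idea.  It is not Bill's example or the one from the book; the function names, keys, and values are all made up for illustration.  The first function sets a value at run time by key lookup, and the second stores small functions as the values so that each "case" can do some work.

```python
def branch_price(choice):
    """Pick a value by key, the way a switch/case statement would."""
    # Hypothetical menu: each key acts as a "case" label.
    prices = {
        "spam": 1.25,
        "ham": 1.99,
        "eggs": 0.99,
    }
    # dict.get() plays the role of the "default" branch.
    return prices.get(choice, "not on the menu")


def dispatch(command):
    """Run a different function per key, for cases that need behavior."""
    # Hypothetical commands: each value is a callable, not a plain value.
    actions = {
        "start": lambda: "starting",
        "stop": lambda: "stopping",
    }
    # Look up the callable first (with a fallback), then call it.
    return actions.get(command, lambda: "unknown command")()


print(branch_price("ham"))
print(dispatch("stop"))
```

The if / elif / else chain would work just as well for three cases.  The dictionary form mostly pays off when the list of cases is long or is built up while the program runs.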

The course incorporated six and a half hours of video in 85 short lectures, with 18 supporting example program files.

This course has no exercises, final exams, grades, or certifications.


(Image courtesy of jarmoluk at Pixabay)

Guns, Armor, Speed, and Steel

 

One of the enduring arguments among naval historians is the correct categorization of the German capital ships Scharnhorst and Gneisenau.  Were these “small” battleships, as the Germans maintained, or were they actually the last battlecruisers, as their British adversaries claimed?  I have made a separate posting covering my data analysis of this question.

In order to practice some Excel chart-building skills, and to explore some of the more detailed Excel controls, I have prepared a workbook that includes a number of scatter charts.  I created these charts from a blank workbook in an empty file, so all the work is my own.  My humble work is available for your inspection on both GitHub and Google Drive.

The Data

Naturally, I begin with an Excel table with the key characteristics of 20th century heavy gun capital ships.  The data in the table comes from the book The Complete Encyclopedia of Battleships by Tony Gibbons.  The table contains three pieces of fictional data, all shown in red.  These three red data points represent two key points in “alternate history” for two significant planned ships that never put to sea.

  • The German battlecruiser Mackensen was built and launched, but was never crewed and commissioned due to resource constraints and changing priorities with the onset of World War I.
  • The entry named Schlactshiff represents an alternate history where the Scharnhorst and Gneisenau would have been completed a year later after waiting for the availability of the planned 15-inch guns.  This would have allowed them to be built as originally planned.

These two fictions are highlighted in the charts to allow comparison but prevent misrepresentation.

The class of ships in question is included in the table twice with the same data.  One entry is included with the name Gneisenau classified as battleships, as the Germans claimed.  The second entry is named with the generic word Schlactkreuzer as battlecruisers, as the British claimed.

The Charts

Battleship Gun Timeline Chart

This timeline highlights the main issue with the Gneisenau class.  Their 11-inch guns make them the most visible data outlier for ships designated as battleships.  Excel scatter plots are chosen almost exclusively in this analysis as bar charts give an erroneous impression of “sales volume” and line charts are not suitable for this data series.

I intentionally chose a different color scheme for each chart in the workbook.  My objective is more about practice with the tools and less about optimal readability.  I chose a “battleship gray” color scheme for this first chart.  I chose a gray background with black axes, black labels, and with “sea blue” data points.  I included black grid lines for major scale and white grid marks for minor scale.  I like the “doughnut” option for data points as it resembles looking down gun barrels.  The darker “doughnuts” signify multiple table entries for that data point.  I used the “glow” chart element to highlight our class as well as the fictional data point.  I added text labels with the ship class names for some of the outliers and some of the most interesting data points.  I chose a linear trend line automatically calculated by Excel, in the same color as the data points.  Years of school have drilled into me that one should include axis titles and units, preferably starting at zero for the vertical axis.

German Battlecruiser Speed History Chart

This timeline highlights one dimension of the most interesting British claim about the Gneisenau class.  Key characteristics of the Gneisenau class closely align with an extrapolation over time of those same characteristics for the previous generations of German battlecruisers.  In other words:  whether intentional or not, in 1939 the German naval architects had built ships with characteristics that their fathers would likely have contemplated for the next generation of battlecruisers.  In this chart, that key characteristic is the most important capability of a battlecruiser:  speed.

I chose an “ocean” color scheme here.  I selected white circular data points.  My high-school chemistry teacher taught that we should use large data points in charts for visibility and because the precision implied in a tiny data point is usually illusory.  (Appropriate in this case as maximum ship speed varies over time, depending on such things as engine maintenance and how many barnacles have attached to the hull.)  I included white grid lines for major scale and no grid marks for minor scale.  Again, I used the “glow” chart element to highlight our class as well as the fictional data point.  I added text labels with the names for the key classes.  I chose a linear trend line automatically calculated by Excel, in the same color as the data points.  The outlying position of the ship class in question forces the position of the trend line.  This could be creating a kind of bias in the calculated trend.  Therefore, I created a line object with black to white color gradient and hand placed it as a linear extrapolation of the local maxima of the relevant data in order to show a more independent trend line.  (I believe my added line reinforces the main point that extrapolating past speeds would lead to a speed close to the actual maximum speed of the Gneisenau class.)
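The concern about the subject class pulling the calculated trend toward itself can be checked numerically.  The sketch below is only an illustration under stated assumptions:  it uses an ordinary least-squares fit rather than my hand-placed line through the local maxima, and the numbers are placeholders, not values from the workbook.  It fits one line to every point, fits a second line to the earlier points only, and then extrapolates the second line to the position of the last point.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Placeholder numbers only, NOT values from the workbook.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.0, 5.0, 7.0, 9.5]  # the last point stands in for the class in question

# Trend using every point, including the one under examination.
slope_all, intercept_all = fit_line(xs, ys)

# Trend using only the earlier points, then extrapolated forward.
slope_prior, intercept_prior = fit_line(xs[:-1], ys[:-1])
predicted = slope_prior * xs[-1] + intercept_prior

print(f"trend with every point:    slope={slope_all:.2f}")
print(f"trend from earlier points: slope={slope_prior:.2f}")
print(f"extrapolated {predicted:.2f} vs. recorded {ys[-1]:.2f}")
```

If the extrapolated value lands close to the recorded one, the earlier points alone already imply the later point, which is the argument the hand-placed line is making.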

German Battlecruiser Armor History Chart

This timeline highlights another dimension of the interesting British claim that the Gneisenau class had key characteristics that align with an extrapolation over time of those same characteristics for the previous generations of German battlecruisers.  In this chart, that key characteristic is the one that historically most distinguished German battlecruisers from British ones:  armor.

Merely to practice making chart object color choices in Excel, I chose a “German flag” color scheme here.  (The flag of German republics has been gold, black, and red.)  I selected square black data points, and made a deliberate exception of black diamond data points for the reference class Mackensen.  I included gray grid lines for major scale and no grid marks for minor scale.  Again, I used the “glow” chart element to highlight our subject class as well as the fictional data point.  I added text labels with the names for the key classes.  I chose a linear trend line automatically calculated by Excel, in the same color as the data points.  This time I made the trend line dotted.

German Battlecruiser Gun Size History Chart

This timeline highlights another characteristic for German battlecruisers.  In this chart, that key characteristic under review is main gun size.

Merely to practice making chart object color choices in Excel, I chose a “Prussian flag” color scheme here.  (All the changing flags of the German navy during this historical period were red, black, and white.)  I selected white “doughnut” data points, and made a deliberate exception of white diamond data points for the reference class Mackensen.  I included very light gray grid lines for major scale and no grid marks for minor scale.  Again, I used the “glow” chart element to highlight our subject class as well as the fictional data point.  I added text labels in white with the names for the key classes.  The data pattern discouraged the use of a linear trend line.  The 11-inch gun size of the Gneisenau class clearly fell well below any trend of the previous generations.

Contemporary Battleship Table

This table and its associated charts show and compare the characteristics of battleships built within about a decade of the Gneisenau class.  I used the default Excel chart settings, but used the “glow” chart element to highlight our subject class, the fictional data point, and the characteristics of two specific contemporary battleship classes (the Dunkerque class and the Yamato class).

Clustering Data

As the main question concerns proper categorization, I used the k-means statistical technique to mathematically group all the ships into categories.  This table provides the key characteristics input into the clustering algorithm.  The “Country” and “Type” columns are included in the table, but were not used as inputs to the clustering algorithm.

This worksheet also provides a pre-visualization of the relationship between the key characteristics in order to help the eye see how clustering might unfold.  I used one of the Excel alternative charting styles to provide a simple clean appearance.
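For readers who have not met k-means before, the following is a minimal sketch of the technique in plain Python.  It is not the tool or the settings used to produce the results in the workbook.  The function name, the two-column placeholder points, and the choice to start from the first k points are all assumptions made to keep the example short; practical implementations usually pick the starting centroids at random and often rescale each column first.

```python
import math


def k_means(points, k, iterations=100):
    """Group points (tuples of floats) into k clusters with Lloyd's algorithm."""
    # Simplifying assumption: use the first k points as the starting centroids.
    centroids = list(points[:k])
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Placeholder points only, NOT ship data: two obvious groups in two dimensions.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
]
labels, centroids = k_means(points, k=2)
print(labels)
```

The cluster numbers themselves carry no meaning; only the grouping does.  That is why the results table uses nicknames to tell the clusters apart.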

Clustering Results

This table shows the results of the clustering analysis performed to define five clusters.  I have provided nicknames to help distinguish the clusters as defined.  There are additional separate columns showing the results of clustering analysis for up to twelve clusters.

The results are analyzed in more detail in the accompanying separate post.  However, the results are interesting and pretty clearly defined.  The analysis clearly and successfully segregates all the historical “true” battlecruisers (Cluster 3).  The analysis does not include the Gneisenau class in this cluster.

Though the clustering analysis did not incorporate dates, the analysis pretty clearly segregates a last “generation” of battleships (Cluster 2).  To my surprise, the analysis does include the Gneisenau class in this cluster.

Afterword

I welcome any thoughts or suggestions you might have on the data, the use of the Excel tool, the analysis, and the results.  Feel free to contact me directly.


(Image courtesy of United States Navy.)