
I love Visio! I hate Visio!

I love Visio.  It produces great diagrams, with wonderful detail and clarity.  You can’t build, manage, operate, or support a data center without Visio.

There are a few things I really hate about using Visio with MIS organizations (especially MIS management.)

The main thing my clients hate about Visio is the same thing my previous employer hated:  the licensing cost combined with the closed file format.  The more good VSDX files you have circulating in an MIS organization, the more people are clamoring for a licensed software copy so that they can even see the danged diagrams.  Microsoft has shrewdly used some of their traditional product packaging techniques to make it so that your typical end-user can’t see the contents without thinking he or she needs the full Visio software package.  (Yes, there are a couple of solutions for your more experienced user.)  Circulate an updated diagram of the data center floor plan and suddenly everybody in the MIS organization and in the data center building will be clamoring for Visio on their laptop or desktop (or both.)

The most amusingly annoying thing I hate about using Visio is “architect elitism.”  Due to the licensing cost, many organizations choose to distribute Visio licenses only to the few people actually expected to be producing VSDX diagrams.  The assumption is that the producers will distribute PDF copies to the diagram viewers.  At IBM, licenses were only distributed to a short list of IT architects.  This quickly established a concept of elitism connected to who had or did not have the software.  I attended many design meetings where the “peons” would complain about not being able to see the design and the favored few responded “You must not be an architect.  Only architects use Visio, and architects only use Visio.”  I truly believe that if software were still distributed in boxes, some of these guys and gals would have carried around the empty Visio product box so that everybody could tell they were one of The Chosen™.

One quick way to embarrass yourself in front of clients is to distribute VSDX files at an Executive-level design review.  Executives don’t tolerate the attitude that they would be able to appreciate the brilliance of your design if only they had the foresight to load Visio on their laptop.  Executives have long loved PowerPoint.  Naturally, this leads to large populations of Partners, Managers, and IT Sellers wanting to see everything in PowerPoint.

After spending enough tedious hours stuffing Visio diagrams into PowerPoint presentations, I concluded that most diagrams should start out as, and remain, PowerPoint diagrams.  Visio is really indispensable when you need “everything on one chart,” such as a data center floor plan, a rack elevation diagram, or some network diagrams.  However, many presentations and discussions either take a semi-abstract view of the big picture, or only need to show the details about one small aspect of the big picture.  For drawing IT diagrams, PowerPoint has the features that Visio designers use 80% of the time, and about 60% of Visio drawing tasks are essentially the same as in PowerPoint.  Frankly, the best rack elevation diagrams that I’ve seen were built in spreadsheets.  (Which would you rather revise 5 times per day during a deployment project: a cabling spreadsheet or a cabling diagram?)

I believe that there are only a few situations when you truly need Visio:

  • If the diagram will be physically plotted on giant paper and posted on the wall for everybody to review,
  • If the client demands detailed network, cabling, or elevation diagrams that must be simultaneously complete at the large scale and the small scale,
  • If the client demands detailed diagrams that must be dimensionally accurate (such as floor plans and some rack elevations), or
  • If you are preparing high resolution detailed diagrams for publication.

In my dad’s day, quality detailed diagrams were the work of skilled draftsmen and graphic artists, but that is a topic for another day.  May all your diagrams be true and clear, and may your graphic arts skills ever increase!


(Historical image courtesy of NASA.)



Austin R User Group Monthly Meetup – May 2017

Jon Loyens gave a good overview and demonstration of data.world.

Jeff Jenkins gave a good overview on machine learning, especially unsupervised learning.  (I liked his chess movie analogies, especially the message “Don’t be afraid to bring your queen out early.”)

Jesse Sampson gave a good talk and demonstration of predictive analytics providing real business value.

Emily Bartha gave an encouraging lesson on how to create your first R package (theoretically in only one line of code.)

Jennifer Davis gave an entertaining presentation on machine learning.

Thanks to data.world for hosting us, and to all the sponsors (RStudio, Continuum, Bamboo, data.world) for feeding us.

PyData Austin Meetup – April 2017

Jaya Zenchenko and Andy Terrel gave two solid introductions to real-world uses of Jupyter Notebooks.

Timothy Renner gave a fast but interesting and entertaining demonstration on how to find Bigfoot using clustering and Bokeh.

Thanks to Andy Terrel and Katrina Riehl at HomeAway for hosting and feeding us!

Mining, Data Mining, Earthquakes and Explosions

I love learning new things every day!  Today, filtering data led me to some fun discoveries about mining and earthquakes.

I’m currently working on some personal projects in data analysis and visualization.  I downloaded some earthquake data from the US Geological Survey (USGS).  I requested only data for earthquakes in the US, but a fair number of seismic events in Canada and Mexico crept into the data.  While filtering out that data, I discovered that many of the events were recorded as happening near Princeton, British Columbia.  I love hiking and exploring in the great Pacific Northwest, so I took an interest in what might be generating this activity.  The Cascades are home to many volcanic features, and I was excited to think that one I hadn’t heard of might be restive.  A quick glance at the map showed no such features, but a cursory Google search led to a few interesting discoveries.
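For the curious, here is a minimal sketch of that filtering step in Python with pandas.  It assumes a CSV export from the USGS earthquake catalog, which includes a “place” column naming the nearest locality; the file name here is made up.

    import pandas as pd

    # Load a CSV export from the USGS earthquake catalog (hypothetical file name).
    quakes = pd.read_csv("usgs_us_earthquakes.csv")

    # Events outside the US usually carry the country name in "place",
    # e.g. "5km NE of Princeton, Canada".
    foreign = quakes["place"].str.contains("Canada|Mexico", case=False, na=False)
    print(f"Dropping {foreign.sum()} events located in Canada or Mexico")

    us_quakes = quakes[~foreign]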

The first thing Google led me to was an indication that while these events may not be generating headlines, they have not gone unnoticed.  One website shows that there have been about two dozen events in a specific area near Princeton over the last month.  An older article in Canadian media showed previous earthquakes in the same area in 2015, with the intriguing statement that earthquakes in the interior of British Columbia are “quite rare.”  Google eagerly led me to a YouTube video whose producer had observed an earlier cluster of seismic events and concluded they were evidence of earthquakes caused by “fracking.”  (Here in Texas, we hear quite a bit about fracking.)

The video producer was pretty adept at showing the indicated location on Google Earth, where the presence of some structures led him to instantly conclude that we were viewing “fracking on top of a mountain.”  He somehow managed to segue to a more valid observation that the mountain site was actually a copper mine (which was confirmed by a Canadian government press release and other sources.)  The producer then moved on to some commentary about mining waste, coal production, and eventually back to fracking.  At that point, I returned to my own research with the USGS.

I learned that there are a number of mining activities that can be detected by seismographs.  The USGS makes an interesting distinction between seismic events produced directly by mining and earthquakes caused by mining.  The USGS calls energetic mining events that are directly detected by their seismographs “mining seismicity.”  The USGS calls true geologic events triggered by mining “mining induced earthquakes.”  Over the years, the USGS has adapted its rules on when detected blasting and other such events are included in which of its earthquake catalogs and other seismic event catalogs.  Likewise, it has adapted its event catalogs for underground nuclear testing events.

The language used is precise because of the need for clarity on cause and effect.  There is no question that earthquakes can trigger events in mines, such as collapses.  There is no question that some mining activity (such as blasting) is a seismic event.  There is also substantial evidence that mining (and other earth-transforming activities like dam-building) can trigger earthquakes by redistributing forces or changing rock properties.

Returning to my own project, I discovered that the events in the USGS data I was working with were actually tagged as “earthquake,” “explosion,” or “quarry blast.”  I have not yet determined how the USGS makes a distinction between those last two tags.
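Continuing the sketch above: in the same USGS CSV export, those tags appear in a “type” column, so separating them is straightforward.

    # Tally the event types present in the filtered data.
    print(us_quakes["type"].value_counts())

    # Split the true earthquakes from the mining-related events.
    natural = us_quakes[us_quakes["type"] == "earthquake"]
    blasts = us_quakes[us_quakes["type"].isin(["explosion", "quarry blast"])]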

As a bonus factoid, I learned that the hypothetical limit to earthquake power would be about magnitude 12 on the Richter scale, as it would take a fault larger than the Earth itself to create a more powerful quake.

When I have completed my earthquake project, I will share the results.


(Image courtesy of Sebastian Pichler at Unsplash.)

Time for Special Ethics in Machine Learning?

Will Oremus posted an interesting article yesterday on Slate called Move Fast and Break Trust about the risks of pilot-phase machine learning.  The article touches on machine learning deployments that are drawing attention and a little concern.

The first deployment described is Google Home, the new mischievous “elf on the shelf” that listens to you instead of watching you.  In its product description, Google Home is described as “… a voice-activated speaker powered by the Google Assistant” and Google encourages you to “ask it questions.”  If you ask it objective questions about the time and the weather, you could expect reasonable answers.  Adrianne Jeffries reports in The Outline that if you ask it questions about politics, current events, public figures, or any number of other interesting topics, Google Home will be happy to confidently answer you based on a broad range of erroneous information and conspiracy theories that are likely to show up in Google searches.  In other words, Google Home can be a wonderful font of in-home “fake news.”

The second deployment described is Uber’s controversial deployment of self-driven cars in San Francisco last year.  One self-driving car was filmed running a red light and driving through a pedestrian crosswalk.  Oremus’ article talks about how machine learning algorithms require a pilot phase in which they are expected to make many errors, with the expectation that their performance will improve to desirable (acceptable?) levels after sufficient “real world experience” has been acquired.  San Francisco would clearly be a dangerous place for humans if a large number of autonomous vehicle producers were all doing early-phase deployments at the same time, or if a single producer was running one deployment with a large number of vehicles in order to speed up the “learn-in” time.  Oremus makes a soft call for regulation of this kind of testing.

Oremus’ article touches on one particularly interesting detail.  Google has been working on autonomous vehicles in a methodical way, including the machine learning and testing parts.  Google has presumably captured plenty of real-world street-level data, not only from its autonomous vehicle project but also from its long-established street mapping operations.  In order to catch up quickly, Uber and Tesla chose a more aggressive path, putting code into production and on the streets quickly in order to accelerate data collection and machine learning.

We can be optimistic that new machine learning methods will make this kind of pilot phase faster and safer.  However, perhaps it is time for a code of ethics drafted specifically for machine learning.  I would welcome your thoughts on what it should contain and how it would work.  In these scenarios, it is common to think of Isaac Asimov’s Laws of Robotics, but they have limited immediate utility; we need something more concrete that can be used to directly influence the coding and the testing.


(Image provided by geralt at Pixabay.)