All posts by Greg Laden

Which month has the most snow in Minnesota? Not March

March 3, 2019 | Severe Weather and Other Disasters | Minnesota Weather, Snow, Snowy march | Greg Laden

But in a way, March might be the snowiest month anyway. Or not. You can be the judge.

How to extract pages from a PDF file

March 3, 2019 | Technology | extract PDF, Linux, PDF | Greg Laden

If you have a PDF file and need to extract a subset of pages, creating a new PDF file with those pages in it, you can do that.

I like PDF Labs’ PDFtk, aka the PDF Toolkit. It is not OpenSource, and it comes in both a free version and a non-free pro version. I’ve tried the free version (example below) and was impressed. Next time I need to do a lot of PDF work I’ll probably fork out the 399 for the pro version. (That’s 399 pennies, quite cheap.) It is developed by Sid Steward, the author of PDF Hacks: 100 Industrial-Strength Tips & Tools.

So, for example, I can get pages 11-20 of a larger file called big.pdf extracted into a smaller file called extracted.pdf like this:

pdftk A=big.pdf cat A11-20 output extracted.pdf

That line of code makes almost no sense to me, but it works.
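For what it is worth, the command reads a bit more sensibly when it is taken apart piece by piece. Here is a minimal sketch in Python that builds the same command from its parts and runs it only if pdftk is installed. The function names (build_pdftk_extract_cmd, extract_pages) are my own inventions for illustration, not anything that comes with PDFtk.

```python
import shutil
import subprocess


def build_pdftk_extract_cmd(src, first_page, last_page, dest):
    """Return the pdftk argument list that copies a page range into a new file."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("page range must satisfy 1 <= first_page <= last_page")
    return [
        "pdftk",
        f"A={src}",                    # give the input file the handle "A"
        "cat",                         # the operation: concatenate pages
        f"A{first_page}-{last_page}",  # which pages of handle A to take
        "output",
        dest,                          # where to write the result
    ]


def extract_pages(src, first_page, last_page, dest):
    """Run pdftk if it is installed; raise a clear error otherwise."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk is not on the PATH")
    cmd = build_pdftk_extract_cmd(src, first_page, last_page, dest)
    subprocess.run(cmd, check=True)


# Build (but do not run) the command for pages 11-20 of big.pdf.
cmd = build_pdftk_extract_cmd("big.pdf", 11, 20, "extracted.pdf")
print(" ".join(cmd))
```

So the "A" is just a nickname for the input file, and "A11-20" means pages 11 through 20 of the file nicknamed A.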

I learned about this tip at a Linux Journal Tech Tip page on extracting pages from a PDF, where you will find several other approaches.

The Umbrella Academy

March 1, 2019 | Books | Netflix, Umbrella Academy | Greg Laden

How can you not love a TV series where the most together character in the room is a chimpanzee that can talk?

I recommend the Netflix series The Umbrella Academy, and I hope Netflix does not do to this what they did to the Marvel shows they recently created, which was to sell them to a different pay to stream network that I do not subscribe to.

The Umbrella Academy is based on a set of graphic novels. There are multiple versions, but this is a suggested in-order set to look at if it interests you.

Number 1: The Apocalypse Suite, which includes numbers 1 through 5, though this volume goes through number 6 and has other material in it:

The Umbrella Academy, Vol. 1

Number 2: Dallas. Yes, there is a Kennedy connection.

The Umbrella Academy: Dallas

Number 3: Hotel Oblivion.

The Umbrella Academy Volume 3: Hotel Oblivion

Key Moments in the Cohen Testimony to the House Oversight Committee

March 1, 2019 | Politics | AOC, House Oversight, Michael Cohen Testimony | Greg Laden

… with a special focus on some of the moments not otherwise noted in the press.

8: One of the best exchanges. For context, this is after some Republican yammered on and on for a while, repeating their message that Congress has better things to do than to preserve Democracy and stuff:

10: Among the best moments. This is @AOC’s coup de grace:

11:

12: This is the moment when we get to see the angry and offended white men scolding the black women. For context, before you throw Cummings under the bus, it is in fact against the rules for a member of Congress to say anything about another member of Congress while in session or a hearing.

13: The big, largely ignored finish, in which Congressman Cummings pwns them all. Starts off slow, turns into one of the Great Speeches. Not a dry eye in the House. As it were.

Minnesota Winter Myths

March 1, 2019 | Physical Science and Math, Severe Weather and Other Disasters | Minnesota Winter | Greg Laden

Minnesota established its national reputation as a snowy and cold state because of a series of real and fictional events. During this time, the population of Minnesota has grown considerably. I’ll tell you why this matters after I show you the important data. We will then use this newfound understanding to evaluate a recent viral video in the light of changing climate.

1940, Armistice Day Blizzard (145 dead). Population: 2.7 million

1970, Blizzard episode of the Mary Tyler Moore show (no casualties). Population: 3.8 million

1991, Halloween Blizzard (22 dead, 100 injured). Population: 4.3 million

2019, The Great Snows of 2019 (casualties not yet counted). Population: 5.7 million

The average total snowfall for the Twin Cities is 47 inches over the winter, over the last century or so. Through 1979 the average was 43.7 inches. Since then, the average has been 53.4 inches. That is an increase of a little over 20%, as expected, likely owing to added moisture in the atmosphere caused by global warming.
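If you want to check the arithmetic on those two averages, here is a tiny sketch in Python. The helper name percent_change is just something I made up for this example; the two numbers are the ones quoted above.

```python
def percent_change(before, after):
    """Percent change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Twin Cities seasonal snowfall averages quoted above, in inches.
avg_through_1979 = 43.7
avg_after_1979 = 53.4

increase = percent_change(avg_through_1979, avg_after_1979)
print(f"{increase:.1f}% more snow")  # prints "22.2% more snow"
```

That works out to about 22%, which squares with the round figure of roughly 20%.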

For comparison, the average total snowfall in Buffalo, New York is 94 inches. The average annual snowfall in Boston is 42 inches, more like Minnesota. It is said that Minnesota gets a lot of snow. But really, Minnesota is mostly a semi-dry state, where agriculture only happens with irrigation, and the snowfall is half what it is on the other side of the Great Lakes, and about the same as the east coast. (The east coast is wetter, but more of that falls as rain or, as is the case of Boston, dense slush.)

Since the famous Armistice Day blizzard, which surely contributed significantly to Minnesota’s reputation, the population of the state has doubled. Since the Mary Tyler Moore days, when Minnesota became known to most other Americans, population has gone up by something like 50%. Indigenous Minnesotans don’t reproduce that fast, and many move away (to California, mostly), so the number of people who are totally new to the area, often from tropical or at least warmer areas, is much larger than one might think.

Plus, Minnesotans are known to be masters of passive-aggressive. But this also means they are masters of another trait: Deep denial.

For all these reasons, the weather of Minnesota matters little, and the reputation not at all, as a foundation for the ability of Minnesotans to handle winter. Which brings us to the following video, which YOU MUST WATCH TO THE END:

Conclusions: Look out the window before you leave your garage!

End Robocalls

February 26, 2019 | Technology | ban robocalls, robocall | Greg Laden

I don’t know about you, but I’m getting an increasing number of robocalls. Most of the calls I get are robocalls. I have stopped answering my phone unless it is from my wife, daughter, or other relative or person whose phone is IDed.

I was speaking with someone last week who works in a business where phone calls are critically important, and the office has several phone lines. Each phone line gets a continuous stream of robocalls. There are times, frequently, when this business that relies on phone contact with clients turns off its entire phone system for an hour or two. That seems to temporarily reduce the number of robocalls, allowing for a brief period when customers can get through.

This madness must end!

I have a proposal to end robocalls.

The US Congress passes a law that eliminates robocalls entirely. You simply can’t ever do them.

The robocall lobby will object, and fight, and make it impossible for such a bill to be passed. So, I have an additional set of provisions to help get a bill like this through.

1) The ban on robocalls can not be lifted in any way for five years. That should give time for all the equipment to get old and all the people in the business to drift off.

2) If an exception is allowed, say for emergency calling systems, it can only be allowed on a state by state basis and only for a maximum of six months, but extendable. This way, any lifting of the ban will require re-evaluation and thus, it is possible that it will be less abused than similar laws have been in the past.

In addition to banning all robocalling, it will likely be necessary to ban phone communication to or from countries that send out illegal robocalls.

How to do science with a computer: workflow tools and OpenSource philosophy

February 25, 2019 | Science, Technology | Book review, Books, Computers, LaTeX, python, r, science work flow, scripting, software, Technology | Greg Laden

I have two excellent things on my desk, a Linux Journal article by Andy Wills, and a newly published book by Stefano Allesina and Madlen Wilmes.

They are:

Computing Skills for Biologists: A Toolbox by Stefano Allesina and Madlen Wilmes, Princeton University Press.

Open Science, Open Source, and R, by Andy Wills, Linux Journal

Why OpenSource?

OpenSource science means, among other things, using OpenSource software to do the science. For some aspects of software this is not important. It does not matter too much if a science lab uses Microsoft Word or if they use LibreOffice Writer.

However, since it does matter if you use LibreOffice Calc as your spreadsheet, and as long as you are eschewing proprietary spreadsheets anyway, you might as well use the whole OpenSource office package, LibreOffice or equivalent: the presentation software, the word processor, and the spreadsheet.

OpenSource programs like Calc, R (a stats package), and OpenSource friendly software development tools like Python and the GPL C Compilers, etc. do matter. Why? Because your science involves calculating things, and software is a magic calculating box. You might be doing actual calculations, or production of graphics, or management of data, or whatever. All of the software that does this stuff is on the surface a black box, and just using it does not give you access to what is happening under the hood.

But, if you use OpenSource software, you have both direct and indirect access to the actual technologies that are key to your science project. You can see exactly how the numbers are calculated or the graphic created, if you want to. It might not be easy, but at least you don’t have to worry about the first hurdle in looking under the hood that happens with commercial software: they won’t let you do it.

Direct access to the inner workings of the software you use comes in the form of actually getting involved in the software development and maintenance. For most people, this is not something you are going to do in your scientific endeavor, but you could get involved with some help from a friend or colleague. For example, if you are at a University, there is a good chance that somewhere in your university system there is a computer department that has an involvement in OpenSource software development. See what they are up to, find out what they know about the software you are using. Who knows, maybe you can get a special feature included in your favorite graphics package by helping your new found computer friends cop an internal University grant! You might be surprised as to what is out there, as well as what is in there.

In any event, it is explicitly easy to get involved in OpenSource software projects because they are designed that way. Or, usually are and always should be.

The indirect benefit comes from the simple fact that these projects are OpenSource. Let me give you an example from the non-scientific world. (It is a made up example, but it could reflect reality and is highly instructive.)

Say there is an operating system or major piece of software competing in a field of other similar products. Say there is a widely used benchmark standard that compares the applications and ranks them. Some of the different products load up faster than others, and use less RAM. That leaves both time (for you) and RAM (for other applications) that you might value a great deal. All else being equal, pick the software that loads faster in less space, right?

Now imagine a group of trollish deviants meeting in a smoky back room of the evile corporation that makes one of these products. They have discovered that if they leave a dozen key features that all the competitors use out of the loading process, so they load later, they can get a better benchmark. Without those standard components running, the software will load fast and be relatively small. It happens to be the case, however, that once all the features are loaded, this particular product is the slowest of them all, and takes up the most RAM. Also, the process of holding back functionality until it is needed is annoying to the user and sometimes causes memory conflicts, causing crashes.

In one version of this scenario, the concept of selling more of the product by using this performance tilting trick is considered a good idea, and someone might even get a promotion for thinking of it. That would be something that could potentially happen in the world of proprietary software.

In a different version of this scenario the idea gets about as far as the water cooler before it is taken down by a heavy tape dispenser to the head and kicked to death. That would be what would certainly happen in the OpenSource world.

So, go OpenSource! And, read the paper from Linux Journal, which by the way has been producing some great articles lately, on this topic.

The Scientist’s Workflow and Software

You collect and manage data. You write code to process or analyze data. You use statistical tools to turn data into analytically meaningful numbers. You make graphs and charts. You write stuff and integrate the writing with the pretty pictures, and produce a final product.

The first thing you need to understand if you are developing or enhancing the computer side of your scientific endeavor is that you need the basic GNU tools and command line access that come automatically if you use Linux. You can get the same stuff with a few extra steps if you use Windows. The Apple Mac system is in between, with the command line tools already built in, but not quite as in-your-face available.

You may need to have an understanding of Regular Expressions, and how to use them on the command line (using sed or awk, perhaps) and in programming, perhaps in Python.
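To give a flavor of what that looks like in practice, here is a small sketch using Python’s built-in re module. The raw lines and the field names (sample, temp, mass) are made up for illustration; your instrument’s output will be sloppy in its own special way.

```python
import re

# Hypothetical raw lines, as an instrument or a tired grad student might produce them.
raw_lines = [
    "sample_01   temp= 21.5C  mass=  3.20 g",
    "sample_02   temp=19.0C   mass=12.75g",
    "# calibration run, ignore",
    "sample_03   temp= -4.25C mass= 0.80 g",
]

# One pattern: a sample id, then a temperature, then a mass, sloppy spacing allowed.
pattern = re.compile(
    r"^(?P<sample>sample_\d+)\s+"
    r"temp=\s*(?P<temp>-?\d+(?:\.\d+)?)C\s+"
    r"mass=\s*(?P<mass>\d+(?:\.\d+)?)\s*g$"
)

records = []
for line in raw_lines:
    match = pattern.match(line)
    if match is None:
        continue  # skip comments and anything else that does not fit
    records.append((match["sample"], float(match["temp"]), float(match["mass"])))

for record in records:
    print(record)
```

The named groups do the documenting: anyone reading the pattern later can see which chunk of the line is supposed to be the temperature and which is the mass.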

You will likely want to master the R environment because a) it is cool and powerful and b) a lot of your colleagues use R so you will want to have enough under your belt to share code and data now and then. You will likely want to master Python, which is becoming the default scientific programming language. It is probably true that anything you can do in R you can do in Python using the available tools, but it is also true that the most basic statistical stuff you might be doing is easier in R than Python since R is set up for it. The two systems are relatively easy to use and very powerful, so there is no reason to not have both in your toolbox. If you don’t choose the Python route, you may want to supplement R with gnu plotting tools.

You will need some sort of relational database setup in your lab, some kind of OpenSource SQL language based system.
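As a minimal sketch of what "relational" buys you, here is an example using SQLite through Python’s standard sqlite3 module. SQLite is my pick for the illustration only, since it needs no server; a lab might just as well choose another SQL system. The table and column names (site, measurement, mass_g) are hypothetical.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind; a lab would use a file path.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE site (
        site_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE measurement (
        measurement_id INTEGER PRIMARY KEY,
        site_id        INTEGER NOT NULL REFERENCES site(site_id),
        mass_g         REAL NOT NULL
    );
    """
)

conn.executemany(
    "INSERT INTO site (site_id, name) VALUES (?, ?)",
    [(1, "north plot"), (2, "south plot")],
)
conn.executemany(
    "INSERT INTO measurement (site_id, mass_g) VALUES (?, ?)",
    [(1, 3.0), (1, 5.0), (2, 10.0)],
)

# The relational part: join the two tables and summarize per site.
rows = conn.execute(
    """
    SELECT site.name, COUNT(*), AVG(measurement.mass_g)
    FROM measurement
    JOIN site ON site.site_id = measurement.site_id
    GROUP BY site.name
    ORDER BY site.name
    """
).fetchall()
conn.close()

for name, n, mean_mass in rows:
    print(name, n, mean_mass)
```

The point is that the site names live in one place and the measurements in another, and the query stitches them together on demand.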

You will have to decide on your own if you are into LaTeX. If you have no idea what I’m talking about, don’t worry, you don’t need to know. If you do know what I’m talking about, you probably have the need to typeset math inside your publications.

Finally, and of utmost importance, you should be willing to spend the upfront effort turning your scientific workflow into scripts. Say you have a machine (or a place on the internet or an email stream if you are working collaboratively) where some raw data spits out. These data need some preliminary messing around with to discard what you don’t want, convert numbers to a proper form, etc. etc. Then, this fixed-up data goes through a series of analyses, possibly several parallel streams of analysis, to produce a set of statistical outputs, tables, graphics, or a new highly transformed data set you send on to someone else.

If this is something you do on a regular basis, and it likely is because your lab or field project is set up to get certain data certain ways and then do certain things to it, then ideally you would set up a script, likely in bash but calling GNU tools like sed or awk, or running Python or R programs, and making various intermediate files and final products and stuff. It is worth bothering to make the first run of these operations take three times longer to set up, so that all the subsequent runs take one one-hundredth of the time to carry out, or can be run unattended.
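Here is a bare-bones sketch of that clean, analyze, report shape. I have kept the whole thing in Python rather than bash plus sed or awk, purely so it fits in one self-contained block. The data, the column names, and the function names (clean, analyze, report) are all invented for illustration.

```python
import csv
import io
import statistics

# Stand-in for whatever the machine spits out; a real script would open a file instead.
RAW = """sample,reading
a1,10.0
a2,n/a
a3,12.0
a4,
a5,14.0
"""


def clean(raw_text):
    """Step 1: discard rows without a usable number and convert the rest to floats."""
    readings = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            readings.append(float(row["reading"]))
        except ValueError:
            continue  # "n/a", blanks, and other junk are dropped
    return readings


def analyze(readings):
    """Step 2: reduce the cleaned data to the numbers that go in the report."""
    return {
        "n": len(readings),
        "mean": statistics.mean(readings),
        "stdev": statistics.stdev(readings),
    }


def report(summary):
    """Step 3: format the final product."""
    return "n={n} mean={mean:.2f} stdev={stdev:.2f}".format(**summary)


summary = analyze(clean(RAW))
print(report(summary))  # prints "n=3 mean=12.00 stdev=2.00"
```

Each step is its own function, so when the machine changes its output format you only have to touch clean, and the rest keeps working.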

Nothing, of course, is so simple as I just suggested … you will be changing the scripts and Python programs (and LaTeX specs) frequently, perhaps. Or you might have one big giant complex operation that you only need to run once, but you KNOW it is going to screw up somehow … a value that is entered incorrectly or whatever … so the entire thing you need to do once is actually something you have to do 18 times. So make the whole process a script.

Aside from convenience and efficiency, a script does something else that is vitally important. It documents the process, both for you and others. This alone is probably more important than the convenience part of scripting your science, in many cases.

Being small in a world of largeness

Here is a piece of advice you won’t get from anyone else. As you develop your computer working environment, the set of software tools and stuff that you use to run R or Python and all that, you will run into opportunities to install some pretty fancy and sophisticated development systems that have many cool bells and whistles, but that are really designed for team development of large software projects, and continual maintenance over time of versions of that software as it evolves as a distributed project.

Don’t do that unless you need to. Scientific computing is often not that complex or team oriented. Sure, you are working with a team, but probably not a team of a dozen people working on the same set of Python programs. Chances are, much of the code you write is going to be tweaked to be what you need it to be and then never change. There are no marketing gurus coming along and asking you to make a different menu system to attract millennials. You are not competing with other products in a market of any sort. You will change your software when your machine breaks and you get a new one, and the new one produces output in a more convenient style than the old one. Or whatever.

In other words, if you are running an enterprise level operation, look into systems like Anaconda. If you are a handful of scientists making and controlling your own workflow, stick with the simple scripts and avoid the snake. The setup and maintenance of an enterprise level system for using R and Python is probably more work before you get your first t-test or histogram than it is worth. This is especially true if you are more or less working on your own.

Culture

Another piece of advice. Some software decisions are based on deeply rooted cultural norms or fetishes that make no sense. I’m an emacs user. This is the most annoying, but also, most powerful, of all text editors. Here is an example of what is annoying about emacs. In the late 70s, computer keyboards had a “meta” key (it was actually called that) which is now the alt key. Emacs made use of the meta key. No person has seen or used a meta key since about 1979, but emacs refuses to change its documentation to use the word “alt” for this key. Rather, the documentation says something like “here, use the meta key, which on some keyboards is the alt key.” That is a cultural fetish.

Using LaTeX might be a fetish as well. Obliviously. It is possible that for some people, using R is a fetish and they should rethink and switch to using Python for what they are doing. The most dangerous fetish, of course, is using proprietary scientific software because you think only if you pay hundreds of dollars a year to use SPSS or BMD for stats, as opposed to zero dollars a year for R, will your numbers be acceptable. In fact, the reverse is true. Only with an OpenSource stats package can you really be sure how the stats or other values are calculated.

And finally…

And my final piece of advice is to get and use this book: Computing Skills for Biologists: A Toolbox by Allesina and Wilmes.

This book focuses on Python and not R, and covers LaTeX which, frankly, will not be useful for many. This also means that the regular expression work in the book is not as useful for all applications as might be the case with a volume like Mastering Regular Expressions. But overall, this volume does a great job of mapping out the landscape of scripting-oriented scientific computing, using excellent examples from biology.

Computing Skills for Biologists can and should be used as a textbook for an advanced high school level course to prep young and upcoming investigators for when they go off and apprentice in labs at the start of their career. It can be used as a textbook in a short seminar in any advanced program to get everyone in a lab on the same page. I suppose it would be a treat if Princeton came out with a version for math and physical sciences, or geosciences, but really, this volume can be generalized beyond biology.

Stefano Allesina is a professor in the Department of Ecology and Evolution at the University of Chicago and a deputy editor of PLoS Computational Biology. Madlen Wilmes is a data scientist and web developer.

Hawking’s Black Holes and Baby Universes Cheap

February 25, 2019 | Uncategorized | Cheap Book, hawking | Greg Laden

Black Holes and Baby Universes: And Other Essays is cheap in Kindle form right now.

In his phenomenal bestseller A Brief History of Time, Stephen Hawking literally transformed the way we think about physics, the universe, reality itself. In these thirteen essays and one remarkable extended interview, the man widely regarded as the most brilliant theoretical physicist since Einstein returns to reveal an amazing array of possibilities for understanding our universe.

Building on his earlier work, Hawking discusses imaginary time, how black holes can give birth to baby universes, and scientists’ efforts to find a complete unified theory that would predict everything in the universe. With his characteristic mastery of language, his sense of humor and commitment to plain speaking, Stephen Hawking invites us to know him better—and to share his passion for the voyage of intellect and imagination that has opened new ways to understanding the very nature of the cosmos.

Book Note: Preet Bharara Doing Justice

February 24, 2019 | Books | Book, Preet Bharara, SDNY | Greg Laden

This is available for pre-order and it is probably going to be great. I’ve not seen it, but Bharara was a highly accomplished SDNY prosecutor and here he is writing about that role. This isn’t about the Trump Crime Family prosecutions and investigations, as so many books these days are, but this may be an important book to read to understand the bigger picture. Thought you’d like to know about it.

Doing Justice: A Prosecutor’s Thoughts on Crime, Punishment, and the Rule of Law

Preet Bharara has spent much of his life examining our legal system, pushing to make it better, and prosecuting those looking to subvert it. Bharara believes in our system and knows it must be protected, but to do so, we must also acknowledge and allow for flaws in the system and in human nature.
The book is divided into four sections: Inquiry, Accusation, Judgment and Punishment. He shows why each step of this process is crucial to the legal system, but he also shows how we all need to think about each stage of the process to achieve truth and justice in our daily lives.
Bharara uses anecdotes and case histories from his legal career–the successes as well as the failures–to illustrate the realities of the legal system, and the consequences of taking action (and in some cases, not taking action, which can be just as essential when trying to achieve a just result).
Much of what Bharara discusses is inspiring–it gives us hope that rational and objective fact-based thinking, combined with compassion, can truly lead us on a path toward truth and justice. Some of what he writes about will be controversial and cause much discussion. Ultimately, it is a thought-provoking, entertaining book about the need to find the humanity in our legal system–and in our society.

By the way, for those who enjoyed this movie and/or book, Preet Bharara is the real life person of one of the key characters in it.

More Proof That Donald Trump Is A Con Artist

February 24, 2019 | Politics | 4th of July parade, Con Artist, Don the Con, Donald Trump, PT Barnum, Salmon, West Wing Clip | Greg Laden

The moment this tweet of Donald Trump’s came out, everyone saw it for what it was and laughed at him. I have no additional insight beyond the obvious, other than to say that it will be difficult for the editors of the American Heritage Illustrated Dictionary to decide if a picture of Don the Con, or this tweet, should be displayed next to the definition of “Grifter.”

Anyway, here is the tweet:

HOLD THE DATE! We will be having one of the biggest gatherings in the history of Washington, D.C., on July 4th. It will be called “A Salute To America” and will be held at the Lincoln Memorial. Major fireworks display, entertainment and an address by your favorite President, me!

— Donald J. Trump (@realDonaldTrump) February 24, 2019

And here is a small West Wing clip that I am somehow reminded of, from Season 3, Episode 8, “The Indians in the Lobby”:

Peculiar Children and Murder on a Train

February 24, 2019 | Other | Cheap Book, Miss Peregrine, poirot | Greg Laden

Two books suddenly cheap in Kindle form that may interest you:

Miss Peregrine’s Home for Peculiar Children (Miss Peregrine’s Peculiar Children Book 1)

Murder on the Orient Express: A Hercule Poirot Mystery (Hercule Poirot series Book 10)

JK Rowling Book Cheap

February 24, 2019 | Books | Cheap Kindle Book, Cormoran Strike, Cuckoo's Calling, JK Rowling, Robert Galbraith | Greg Laden

Cormoran Strike is author Robert Galbraith’s fictional character, a UK Military Police veteran now eking out a living as a private eye. He works with an assistant who some would mistakenly regard as totally out of her league in the hard boiled noir world of private detecting. (But they would be wrong.) Each of the stories about Cormoran Strike and Robin Ellacott is set in a different subworld of Great Britain, including London’s version of Hollywood, and the world of writers and agents. The stories are fiendishly clever, the bad guys cleverly fiendish, and the protagonists compelling and disarming. I very strongly recommend reading all of them.

Author Robert Galbraith is, of course, JK Rowling, author of the Harry Potter series and a few other books.

At this moment, the first of the books is on sale cheap (just under 4 bucks) in Kindle form. You should read The Cuckoo’s Calling (Cormoran Strike Book 1).

Then, not on sale at this time but for your information, read:

The Silkworm (Cormoran Strike Book 2)

Career of Evil (Cormoran Strike Book 3)

Lethal White (A Cormoran Strike Novel)

Cheap early Carl Hiaasen book!

February 23, 2019 | Uncategorized | Carl Hiaasen, Cheap Book | Greg Laden

Fans of Carl Hiaasen who have not yet read his book Trap Line can get it right now cheap in Kindle form.

Though he is one of Key West’s most skilled fishing captains, Breeze Albury barely ekes out a living on the meager earnings of his trade. Meanwhile, Cuban and Colombian drug smugglers thrive all around—and they have their sights set on Albury and his fishing boat.

After the smugglers cut his three hundred trap lines and crush his livelihood, Albury is forced to run drugs to survive. But when he gets busted by the crooked chief of police and becomes a target of the drug machine’s brutal hit men, Albury becomes a vigilante on the seas of Florida, unleashing a fiery and relentless vengeance on the most dangerous criminals south of Miami.

Along with Powder Burn and A Death in China, this is one of the early suspense thrillers written by Carl Hiaasen and Bill Montalbano, a writing team praised for their “fine flair for characters and settings” (Library Journal). Perfect for fans of the Doc Ford novels by Randy Wayne White, Trap Line is an action-packed preview of Hiaasen’s stellar Florida-set crime novels including Sick Puppy, Tourist Season, and Razor Girl.

See THIS for more info on the author.

How the current attack on the Democratic Party candidates works

February 21, 2019 | Politics | Election 2020, Hacking, Russian trolls | Greg Laden

Dune, ersatz Holmes, cheap books

February 21, 2019 | Uncategorized | Dune, Holmes Cheap Books | Greg Laden

A Study in Charlotte (Charlotte Holmes Novel Book 1) by Brittany Cavallaro is the first in a series of Holmes Not Holmes books. The series consists of:

A Study in Charlotte (Charlotte Holmes Novel Book 1)

The Last of August (Charlotte Holmes Novel Book 2)

The Case for Jamie (Charlotte Holmes Novel Book 3)

And, Frank Herbert’s Dune Messiah is also cheap on Kindle.