{"id":26890,"date":"2009-08-14T11:15:23","date_gmt":"2009-08-14T11:15:23","guid":{"rendered":"http:\/\/scienceblogs.com\/gregladen\/2009\/08\/14\/taming-twitter-with-the-comman\/"},"modified":"2009-08-14T11:15:23","modified_gmt":"2009-08-14T11:15:23","slug":"taming-twitter-with-the-comman","status":"publish","type":"post","link":"https:\/\/gregladen.com\/blog\/2009\/08\/14\/taming-twitter-with-the-comman\/","title":{"rendered":"Taming Twitter with the Command Line"},"content":{"rendered":"<p>I thought I was done with the command line for the week, but then I did something cool that I thought I&#8217;d share with you.  Linux users only &#8230; others will think this is silly &#8230;  join me below the fold.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/i0.wp.com\/scienceblogs.com\/gregladen\/wp-content\/blogs.dir\/472\/files\/2012\/04\/i-f30dea614b07d492a8c391a4cdad55a3-a-tweetoutput.jpg?w=604\" alt=\"i-f30dea614b07d492a8c391a4cdad55a3-a-tweetoutput.jpg\" data-recalc-dims=\"1\" \/><\/p>\n<p><!--more--><br \/>\nOK, are we alone?  Good.  It&#8217;s nice to be away from all those Windows Symps for a while.  Oh, I see there&#8217;s a couple of Mac OSX users lurking in the back of the room.  Come right over and join us; this will probably work for you too!<\/p>\n<p>Here is my problem.  I use Twitter to promote my blog, and I therefore follow almost 3,000 people on Twitter.  That means that every few seconds there is an update on my &#8220;people I follow&#8221; list, and it is almost always something I am not even slightly interested in.<\/p>\n<p>This makes Twitter kind of useless for me.  Just the other day, I did something that a lot of people apparently found terribly objectionable (I have no idea what it was) and later got an email from a friend that said &#8220;Hey, did you see what so and so and so and so and so and so said about you on Twitter?&#8221;<\/p>\n<p>&#8220;Of course not&#8221; was my reply.  &#8220;I can&#8217;t use Twitter.  
Too much data, not enough good data.  If something happens on Twitter there is no way for me to know it.&#8221;<\/p>\n<p>Then, yesterday, I heard: &#8220;Hey, did you hear that so and so and so and so are getting married?  It was the talk on Twitter!&#8221;<\/p>\n<p>&#8220;Of course not&#8221; was my reply&#8230; and so on and so forth, you get the point.<\/p>\n<p>I know there are many possible solutions to this problem (feel free to mention them in the comments if you like) but having a second Twitter account (one of the obvious solutions) with my actual &#8216;friends&#8217; on it was not one I wanted to pursue.  Instead I wanted to filter the data.<\/p>\n<p>Yes, there are some software solutions out there to filter data, and maybe some day I&#8217;ll use one of them, but I really felt that this was a case of too much mucking around for a simple solution. What I really wanted to do was no more than this:<\/p>\n<ol>\n<li>Produce a file of recent tweets by those I &#8220;follow.&#8221;  <\/li>\n<li>Manually maintain a file of the names of people whose tweets I want to actually see.  An a-list, if you will. <\/li>\n<li>Write a script that would use these two data sources to come up with a list of tweets by the smaller list of people, culled from the larger fire hose list.  <\/li>\n<li>Format that list minimally so it shows up in a web browser as a local page.<\/li>\n<li>Put a button on my Gnome toolbar that makes all this happen.<\/li>\n<\/ol>\n<p>I call it a-tweet.  Example results are depicted in the graphic above, and the script looks like this:<\/p>\n<p><code>#! 
\/bin\/bash<\/p>\n<p>twidge lsrecent -lsu > ~\/twidge_data\/data<\/p>\n<p>awk 'NR==FNR {u[$1];next} ($2 in u)' ~\/twidge_data\/a-list ~\/twidge_data\/data > ~\/twidge_data\/a-tweets<\/p>\n<p>cut -f2,4 ~\/twidge_data\/a-tweets |<\/p>\n<p>sed 's\/^\\(.*\\)\\t\\(.*\\)\/&lt;DT>\\1&lt;\\\/DT>&lt;DD>\\2&lt;\\\/DD>&lt;br>&lt;br>\/g' > ~\/twidge_data\/a-tweet.html<\/p>\n<p>firefox ~\/twidge_data\/a-tweet.html<\/code><\/p>\n<p>This script is not all squishy and one-liney like it could be.  I use intermediate files and drawn out constructs so that I can later mess with it more easily.  This also makes it easier to explain.<\/p>\n<p>The first line uses Twidge.  Twidge is a great find.  <code>sudo apt-get install twidge<\/code> will add it to your system.<\/p>\n<p>After you install twidge you will run<\/p>\n<p><code>twidge setup<\/code><\/p>\n<p>which will ask you for your user name and password for the account you want to twidge around with.<\/p>\n<p>Twidge lets you do cool things with Twitter on the command line.  <a href=\"http:\/\/software.complete.org\/software\/projects\/show\/twidge\">Learn about it here.<\/a>  Among other things, you can post tweets from the command line, or get a list of your followers or followees, and so on.  I found it by just searching for &#8220;twitter&#8221; in my package manager.<\/p>\n<p>So let&#8217;s break it down.<\/p>\n<p><code>twidge lsrecent -lsu > ~\/twidge_data\/data<\/code><\/p>\n<p>Twidge always works by using &#8216;twidge&#8217; followed by a command.  In this case, the command is lsrecent, which lists (ls) recent tweets.  There are arguments one can use to make this list come out differently, but the -l argument results in a one-tweet-per-line tab delimited list of tweets, and is essential for this script to work.  The -s and -u parameters make Twidge keep track of where it last looked and give you stuff only since then.  
Not shown but available is the -all option.<\/p>\n<p>The -all option (not shown) is &#8230; optional.  If you have a couple of hundred followees and want to see the most recent among, say, a couple of dozen, you probably don&#8217;t need it.  However, I&#8217;m scanning for about twenty twits among nearly three thousand, and I may check only every few days.  Twidge tends to be conservative, only fetching the most recent several dozen tweets.  I believe the -all option may work better for the scenario I just described.   In any event, you can play with it because it is your command line.  You can even create different versions of the script to run under different circumstances.<\/p>\n<p>Here, I dump the output into a file in a special subdirectory where I keep the twidge-related data.  The data file that holds these data is called, enigmatically, &#8220;data.&#8221;  After the guy on Star Trek.<\/p>\n<p>Next line:<\/p>\n<p><code>awk 'NR==FNR {u[$1];next} ($2 in u)' ~\/twidge_data\/a-list ~\/twidge_data\/data > ~\/twidge_data\/a-tweets<\/code><\/p>\n<p>This line takes two files, a-list and data, and using the programming language awk filters the data list based on matches between the second field (which happens to be user name) and the list of user names in the file a-list.  The output, the filtered subset of tweets that are only by &#8220;a-list&#8221; twits (if that is what they are called) is then dumped into the file &#8220;a-tweets&#8221; with any old content in that file blotto&#8217;d out of existence.<\/p>\n<p>The next line is:<\/p>\n<p><code>cut -f2,4 ~\/twidge_data\/a-tweets |<\/p>\n<p>sed 's\/^\\(.*\\)\\t\\(.*\\)\/&lt;DT>\\1&lt;\\\/DT>&lt;DD>\\2&lt;\\\/DD>&lt;br>&lt;br>\/g' > ~\/twidge_data\/a-tweet.html<\/code><\/p>\n<p>This is two (well, really, more, but I&#8217;m simplifying slightly) commands connected with a pipe.  
The first command uses cut to isolate the second and fourth fields from the data, which happen to be the user name and the tweet itself.  Left off, then, are a blank field, an ID number, and the date\/time stamp.  One could argue for leaving on the date\/time stamp, but I chose not to.<\/p>\n<p>This stream of data, consisting of one username\/tweet pair per line, is then sent to sed, which inserts code to make an HTML definition list.  This formats the data as I like it.  It also inserts two HTML line breaks after each tweet.<\/p>\n<p>This sed command could be used in a computer science textbook to illustrate all the strangeness of sed.  I love sed.<\/p>\n<p>This stream of HTML formatted data is then dumped, unceremoniously, into a file where, if opened with a web browser, it will be formatted as I want it.  Since the filename is created with an &#8216;html&#8217; extension, opening the file directly will likely open it in an appropriate app, depending on how your desktop is configured.<\/p>\n<p>The last line simply opens up a Firefox instance or tab with the file.  If there were no tweets from the a-list twits, then there is nothing in the file and you get blankness.  Otherwise, you get something that looks like the picture posted above.<\/p>\n<p>There is a lot that can be done to enhance this. URLs in the tweets could be identified (using sed) and packaged as links, for instance.  A bit of code allowing responses or retweets can be added.  Eventually, with enough mucking around, one can have a full featured application (like those that may or may not be available) but made entirely from scratch.<\/p>\n<p>Suggestions for mods welcome!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I thought I was done with the command line for the week, but then I did something cool that I thought I&#8217;d share with you. 
Linux users only &#8230; others will think this is silly &#8230; join me below the fold.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"1","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[67,130],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p5fhV1-6ZI","jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/posts\/26890"}],"collection":[{"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/comments?post=26890"}],"version-history":[{"count":0,"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/posts\/26890\/revisions"}],"wp:attachment":[{"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/media?parent=26890"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/categories?post=26890"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gregladen.com\/blog\/wp-json\/wp\/v2\/tags?post=26890"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}