There is a lot that goes into producing Talk of the Sound each day. I am pretty sure folks have no idea of what is involved. It is pretty cool if you are into that sort of stuff so I am going to share a bit on how we get it done.
As a follow-up to my recent post on how Talk of the Sound is integrating social media into the site, I thought I might explain just how we get so much content into Talk of the Sound and do so on a 24/7 basis. Let me qualify this post by saying it is a somewhat technical discussion, so it is not intended for a general audience. I am also hoping that some readers may have suggestions on how to improve the process of editing content for Talk of the Sound.
To start with, the content we publish can be broken into three groups:
(1) Talk of the Sound content generated on Talk of the Sound by Talk of the Sound contributors (i.e., registered users of the site) who publish articles (i.e., posts) and comments (i.e., user-generated content).
(2) Third-party news and information sources such as online news sites, news aggregators like Google News, government web sites and so on.
(3) Third-party user-generated content from social media sources like Twitter, Facebook, Flickr, Instagram, Tumblr, etc.
The goal with this content is to present news and information about New Rochelle (and a bit about related Westchester County and New York State news) to our readers, and to provide a single point of contact for anyone wanting a complete picture of news and information about New Rochelle.
We do that primarily by organizing a collection of news headline text-links by subject matter and pushing those news headline text-links out to readers in a variety of ways: a dynamically-updated display on the home page, a daily snapshot sent each morning at 3 am in our newsletter and continuously throughout the day on Twitter and Facebook.
The idea is that you can get this information as you like it: via a single web page (our home page), via email or via social media (Twitter/Facebook). The one area that needs work is our Facebook page; we will get around to that eventually.
For Talk of the Sound content exclusively, there are many other ways to “discover” stories.
Every single new article published on Talk of the Sound is organized under the “Talk of the Sound Articles” tab at the top of the home page. The most popular stories of the day are displayed in a “Most Popular Today” block on the home page. The most recently commented upon articles are displayed in a “Most Recent Comments” block on every page on the site. The most recent 24 hours' worth of stories are included in the email newsletter. News articles are sent out as tweets throughout the day.
For all content, from Talk of the Sound and third-party news and information sources and social media sources, there are three main ways to get it.
The Talk of the Sound home page contains a series of text-link headlines, organized by subject matter, to give our readers a comprehensive view of what is happening in their world.
Mailchimp, the email newsletter software we use, automatically pulls all of the text-link headlines on the home page at 3 a.m. and incorporates them into the daily newsletter along with the last 24 hours of Talk of the Sound articles. If there are no new articles then no newsletter is sent; this sometimes happens on a Sunday or holiday if I am taking a day off.
Each headline is sent as an individual tweet.
There is no simple way to do all this for a number of reasons, but the most important reason is that every web site “packages” the information about its articles differently. Without getting too deep into the weeds, web sites are typically displaying information that is stored in multiple places on their own servers and often pulling information from other companies' servers. We do that too: we have ads running from Google and Amazon, we have a deal with Iona College to display sports information, we pull Flickr images and tweets and much more. This information (text, images or video) is organized for you to read in a browser by a database that keeps track of what goes where.
Where it gets messy is that every article you see on the web with a headline is placing that text into a field called “Title”, the date of publication in a field called “Date”, and so on. Almost every web site packages that data so that it can be “pulled” into another web site. For example, Twitter wants me to display Talk of the Sound tweets on our home page because that promotes Twitter; YouTube allows me to embed code they provide into an article on Talk of the Sound because my displaying video from their site on my site promotes YouTube. The New York Times, Journal News, City of New Rochelle web site and pretty much every other site have enabled RSS feeds, which take the information on each article on their site and organize it into a data format called XML. Social media services offer something called an API, which allows them to share information with programmers and developers of applications that integrate their service with a web service or application. For example, iPhone/iPad apps like Tweetbot use the Twitter API.
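For readers who think in code, here is a minimal sketch in Python of what "pulling" those fields out of an RSS feed looks like. The sample feed, its URLs and the function name parse_items are made up for illustration; real feeds carry many more fields and do not always fill them in the same way.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document (hypothetical site and links).
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example City News</title>
    <item>
      <title>Council Approves Budget</title>
      <link>http://example.com/budget</link>
      <pubDate>Mon, 06 Jan 2014 09:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Library Hosts Poetry Reading</title>
      <link>http://example.com/poetry</link>
      <pubDate>Tue, 07 Jan 2014 18:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return one dict per <item> with its title, link and publication date."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when a field is missing entirely
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
        })
    return items


for entry in parse_items(SAMPLE_RSS):
    print(entry["date"], "|", entry["title"], "|", entry["link"])
```

Every item comes back as the same three named fields, which is what makes it possible for one site to display another site's headlines.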
So, lots of sharing of data but not necessarily a lot of standardization. Well, there is standardization but there are often competing standards, much the same way video tape machines fought a standards war between Betamax and VHS. And even with standardization, people make errors when inputting data, so sometimes you can get blank fields or invalid information types placed into fields (e.g., letters placed in a field for MM/DD/YYYY). And there are always those folks who use ALL CAPS, which is considered bad form (and annoying to some readers).
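The kinds of input errors just described are easy to check for in code. The sketch below is only an illustration under my own assumptions: the function name find_problems and the "title" and "date" keys are hypothetical, and it assumes the date is supposed to arrive as MM/DD/YYYY.

```python
from datetime import datetime


def find_problems(item):
    """Return a list of human-readable problems found in one feed item."""
    problems = []
    title = item.get("title", "").strip()
    if not title:
        problems.append("blank title")
    elif title.isupper():
        problems.append("title is ALL CAPS")
    try:
        # Raises ValueError for blanks, letters, or impossible dates
        datetime.strptime(item.get("date", ""), "%m/%d/%Y")
    except ValueError:
        problems.append("date is not MM/DD/YYYY")
    return problems


print(find_problems({"title": "SNOW CLOSES SCHOOLS", "date": "ab/cd/efgh"}))
```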
The net effect of this is that if I want to monitor every possible source that might mention or reference New Rochelle or certain proper nouns (e.g., the word “Wykagyl” or the name “Noam Bramson”) then I will be culling through a lot of sources. On top of that, once I identify a text-link I want to share, there is a virtually unlimited number of ways in which that data might have been organized, so I cannot simply dump it into Talk of the Sound. It would be a mishmash of styles and very distracting to readers.
Every link I wish to manually share with readers needs to be “cleaned” and “tagged” so that it can be organized properly by category (i.e., “tag”) and will display an actual headline, formatted correctly and without any additional clutter. This might mean shortening a headline that is too long, changing all upper case letters to be properly capitalized, or stripping out promotional statements that might follow a headline, such as the name of the source web site. I could also do this for the automatically generated feeds but that would be far too time consuming; someday I might be able to add this capability but I would need resources that I do not have at this time (i.e., staff), so readers will see some poorly formatted headlines on the home page in sections that automatically pull feeds from certain sources (more on that below).
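I do this cleaning by hand, but the three steps just described can be roughed out in Python. Treat this as a sketch under my own assumptions, not a finished tool: the 80-character limit, the function name clean_headline and the guess that a source name follows a " - " or " | " separator are all hypothetical, and the capitalization step is crude (it will mangle words like "DON'T").

```python
import re

MAX_LENGTH = 80  # hypothetical limit; pick whatever fits the page layout


def clean_headline(raw, max_length=MAX_LENGTH):
    """Tidy a raw headline: strip the source suffix, fix ALL CAPS, shorten."""
    # Collapse runs of whitespace and trim the ends
    text = " ".join(raw.split())
    # Drop a trailing " - Source Name" or " | Source Name" segment
    text = re.sub(r"\s+[-|]\s+[^-|]+$", "", text)
    # Only re-capitalize headlines that are entirely upper case
    if text.isupper():
        text = text.title()
    # Shorten at a word boundary and mark the cut
    if len(text) > max_length:
        text = text[:max_length].rsplit(" ", 1)[0].rstrip(" ,;:") + "..."
    return text


print(clean_headline("SNOW STORM CLOSES SCHOOLS | Example News"))
print(clean_headline("Council Approves Budget - The Journal News"))
```

A headline that legitimately ends in " - something" would lose that ending, which is one reason a human still needs to look at the result.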
Now you know that when you see a headline (in most cases) on the home page or being tweeted (always) or in the email newsletter (in most cases) that I have selected that story from among hundreds of millions we monitor each day using filters on sites like Google News or Twitter or from other sites that pre-monitor the content before we see it. Those hundreds of millions get streamlined down to several thousand a day that I personally skim looking for stories that I believe will interest Talk of the Sound readers (and occasionally because they amuse me or are about Notre Dame football – Editor’s privilege 🙂).
To display the headlines on the home page, I use a service called Feedroll which provides embeddable JavaScript code that displays text-links in a block format. Each news category on the home page (e.g., Crime & Punishment, General News, City Hall & BID, etc.) has its own RSS feed, organized over at Feedroll and displayed on Talk of the Sound. This can sometimes have a hiccup for various reasons, which is why you sometimes see no headlines listed in a particular category and why the headlines do not load instantly when you arrive at the site: they are being pulled from Feedroll and, I think, they are based in Italy. So there can be a distance issue, and there can be a problem with how the XML has been used by a particular news source. If the XML is “broken” for a particular news item it can prevent Feedroll from displaying properly, and this may remain the case until that news item drops out of the rotation displayed on our home page. We display anywhere from 4 to 10 items in a category so in a small category with four links, a bad link will mess up the entire category until that link becomes the fifth link and is thus no longer being served up by Feedroll.
The RSS feeds that flow into Feedroll have to first be organized so that they display the most recent articles at the top of the block and then cascade down in reverse chronological order (newest first, oldest last). I also need to make sure to avoid duplicates and otherwise filter the news items I select for publication via Talk of the Sound.
To do that, a series of different feeds are run through a service called Yahoo! Pipes which is a service that allows users to “mash up” RSS feeds.
In my case, I am typically mashing up multiple feeds to create one feed to Feedroll.
The first is a feed based on a category within Talk of the Sound. For example, we assign a category “City Hall” for all stories about the government of the City of New Rochelle or “Fundraisers and Events” for all non-official public events ranging from a poetry reading at the New Rochelle Public Library to the 325th Anniversary Dinner-Dance.
The second is one or more feeds from a specific source which are automatically assigned a category by virtue of being placed in a particular Yahoo Pipe. For example, all news items from the Westchester County Board of Legislators are in the “Westchester County” pipe or all items from area newspapers on local pro sports teams are in the “Pro Sports” pipe.
The third is a feed based on a manually assigned category (i.e. “tag”) assigned within a service called Delicious, a social bookmarking site. Social bookmarking is just like bookmarking in a browser when you want to save a particular web page for future reference but the pages you have bookmarked can be seen by anyone on the web.
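The mash-up of these three kinds of feeds (merge them, drop duplicates, newest first, keep a handful) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Yahoo! Pipes works internally; the function name mash_up, the item fields and the sample links are all hypothetical.

```python
from datetime import datetime


def mash_up(*feeds, limit=10):
    """Merge feeds, drop duplicate links, sort newest first, keep `limit` items."""
    seen = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if item["link"] in seen:
                continue  # the same story already arrived from an earlier feed
            seen.add(item["link"])
            merged.append(item)
    merged.sort(key=lambda item: item["date"], reverse=True)
    return merged[:limit]


# Made-up items standing in for the three kinds of feeds
site_feed = [
    {"title": "Council Meets", "link": "http://example.com/a",
     "date": datetime(2014, 1, 6, 9, 0)},
]
source_feed = [
    {"title": "County Budget", "link": "http://example.com/b",
     "date": datetime(2014, 1, 7, 8, 0)},
    {"title": "Council Meets (copy)", "link": "http://example.com/a",
     "date": datetime(2014, 1, 6, 9, 5)},
]
bookmark_feed = [
    {"title": "Snow Storm", "link": "http://example.com/c",
     "date": datetime(2014, 1, 5, 12, 0)},
]

for item in mash_up(site_feed, source_feed, bookmark_feed):
    print(item["date"].isoformat(), item["title"])
```

Duplicates are matched on the link alone here, so the first feed passed in wins when two feeds carry the same story.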
You can see Talk of the Sound’s Delicious bookmarks here: https://delicious.com/newrochelletalk.com.
Note, you will also see the 2013 Annual Report for Talk of the Sound from Delicious — it is interesting because it provides word cloud and interactive diagrams and charts describing who we link most and the most popular links we have bookmarked. You will also see that Talk of the Sound is among the top 2% of users worldwide of this service which surprised even me.
In any case, we use Delicious for text-link headlines from third-party sources like Google News or Twitter or myself by directly adding an article I am reading in my browser. They are all gathered there where they can be “cleaned” — headlines trimmed up and tags assigned. For the City of New Rochelle I do not use “City Hall” but just “city”, for emergency responder stories I use just “cops”.
Delicious is great but has one thing I do not like — that I cannot edit links to clean and tag them using their iPhone app. It can be done on my iPhone via the web browser on the iPhone but the interface is a little clunky.
Realize that I am doing A LOT of this so even small imperfections in my workflow can mean lots of wasted time, and that I am doing this at any time of the day or night and from any location. In other words, I need my workflow optimized for my iPhone.
There is another type of service out there for reading nicely formatted versions of web stories, especially useful for mobile devices, such as Instapaper, Readability and Read It Later (renamed “Pocket”). I use them all but for my workflow, Pocket includes the ability to edit and tag text-link headlines. That’s the good news (for me). The bad news is that they do not offer RSS feeds based on tags. So, I need to get all manually selected text-link headlines into Pocket so I can open that app on the iPhone and click a button to prepare that item to be sent to Delicious. There I can clean and tag the headline and send it on to Delicious properly tagged so it will display in the correct category on the home page.
I manually identify headline text-links to share with Talk of the Sound readers in a variety of ways but the two most common are that a link is shared on Twitter (I use Tweetbot) or published as an RSS item and read by me in an RSS Reader (I use Reeder on my iPhone and iPad and ReadKit on my Mac). Two other ways are that I come across an article while surfing the web or a reader sends me an email with a story they think might interest me. Tweetbot, Reeder and ReadKit are all integrated with Pocket so I can click a button in those apps and the headline text-link appears in Pocket moments later.
Of course, if Delicious offered editing through their iPhone app or Pocket offered RSS feeds based on tags, my workflow would be one step simpler but that is not the case. These things do change all the time — I have tweaked this workflow many, many times — and I fully expect that one or the other or both apps will eventually offer these capabilities.
So, this explains how I manually identify stories (which I keep calling “headline text-links”) that I want to share and what happens once I have manually identified a link (as opposed to the other two content options where the text-link headlines automatically flow into the home page and into the email newsletter) and cleaned it up and tagged it.
Delicious becomes my repository of all manually selected, cleaned, tagged headline text-links. From there those headlines get pulled into an RSS feed and mashed up with other feeds and pumped through Feedroll and into JavaScript code embedded in my home page and then displayed to readers in a Drudge-link format. There is also the option to subscribe directly to any of these RSS feeds although I do not expect many folks will do that in 2014.
In addition, these manually selected, cleaned, tagged headline text-links can be automatically published, in various ways, as tweets on Twitter (and these tweets are also displayed, along with a wide variety of retweets, in the upper right corner of the home page).
Alongside this, I am also interested in monitoring “New Rochelle” photos and videos which I wrote about the other day: As 2014 Begins, Talk of the Sound Increases Social Media Integration
In that article I describe how I was monitoring social media for mentions of New Rochelle hashtags (e.g., #NewRo, #NewRochelle, #IsaacEYoung, #AlbertLeonard, #Iona, etc.) and by geolocation (tweets or instagrams posted within 10 km of the center of New Rochelle).
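For the curious, the two filters described above (a watched hashtag, or a location within 10 km) boil down to something like the Python sketch below. The services do this filtering for me, so this is purely illustrative: the center coordinates are my rough approximation of New Rochelle, and the function names and the shape of a "post" are hypothetical.

```python
import math

# Rough, assumed coordinates for the center of New Rochelle (illustration only)
CENTER_LAT, CENTER_LON = 40.91, -73.78
RADIUS_KM = 10.0
EARTH_RADIUS_KM = 6371.0
HASHTAGS = {"#newro", "#newrochelle", "#isaaceyoung", "#albertleonard", "#iona"}


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_relevant(post):
    """True if the post uses a watched hashtag or was posted inside the radius."""
    words = {w.lower().rstrip(".,!?") for w in post.get("text", "").split()}
    if words & HASHTAGS:
        return True
    location = post.get("location")  # (lat, lon) tuple, or None if not geotagged
    if location is None:
        return False
    return distance_km(CENTER_LAT, CENTER_LON, *location) <= RADIUS_KM


print(is_relevant({"text": "Snow day! #NewRo"}))
print(is_relevant({"text": "Nice sunset", "location": (40.93, -73.79)}))
print(is_relevant({"text": "Times Square", "location": (40.758, -73.9855)}))
```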
Mostly I am using these photos with Flickr – to display a Flickr Badge of New Rochelle photos in the right rail of the home page and create photo montages for events like the recent snow storm using Dopiaza’s badge maker.
I would prefer to do this directly from Instagram so as to have a more direct link to the user who uploaded the photo but they do not offer that as far as I can tell. They also do not seem to have a way to search their own geolocation service. So, to really monitor Instagram (which has lots of stuff I do not want like spam and people uploading pictures of text, usually trite messages) and filter out the good stuff I need a bridge out of Instagram and into Flickr so I can take advantage of Flickr’s robust set of tools and apps to display photos.
Enter IFTTT.
IFTTT stands for “If This Then That” and is quite a powerful service for my purposes.
They are partnered with dozens of companies to integrate social media services, photo services, home automation services, communications services like telephone, SMS texting, RSS and email, operators like date, time, weather and geolocation, and media outlets like the New York Times and ESPN.
They offer, as of today, 77 channels. By combining various actions associated with these channels, a user like me can create a “recipe” where IF this happens then THAT happens. The IF is a “trigger” and the THAT is an action. For example, IF the New York Times publishes a new story on its home page then SEND ME A TEXT TO MY PHONE, or IF a tweet in my account is favorited then archive that tweet in Evernote, or IF my phone arrives at my house, turn on the front porch light.
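If it helps to see the trigger/action idea as code, here is a toy model in Python. To be clear, this is not IFTTT's API or how the service is built; the Recipe class, the event dictionaries and the returned strings are all made up to show the shape of "IF this THEN that".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Recipe:
    name: str
    trigger: Callable[[dict], bool]  # the IF: does this event match?
    action: Callable[[dict], str]    # the THAT: what to do about it


def run_recipes(recipes, event):
    """Run every recipe whose trigger matches; return what the actions produced."""
    return [r.action(event) for r in recipes if r.trigger(event)]


# Two of the examples from the text, modeled with hypothetical event types
recipes = [
    Recipe(
        name="New story to text message",
        trigger=lambda e: e.get("type") == "new_story",
        action=lambda e: "text to phone: " + e["title"],
    ),
    Recipe(
        name="Favorited tweet to archive",
        trigger=lambda e: e.get("type") == "tweet_favorited",
        action=lambda e: "archive: " + e["text"],
    ),
]

print(run_recipes(recipes, {"type": "tweet_favorited", "text": "hello"}))
print(run_recipes(recipes, {"type": "new_story", "title": "Council Meets"}))
```

Each recipe is independent of the others, which is why it is easy to keep adding new ones without touching the old ones.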
In my case, I am using IFTTT as part of my news web site workflow to save links from sources like Twitter (Tweetbot) or RSS (Reeder), clean them up by writing a clear headline (i.e., title) without extraneous text from the source web site, tag them and then send them on to Delicious where I can make RSS feeds based on tags. The items in the feeds, now tagged based on subject matter, are then displayed on our home page grouped by subject as a series of text-links using Feedroll. I also leverage that clean feed of items with IFTTT, sending all of the links to Buffer from which they are sent to Twitter in staggered intervals about 20 times over the course of the day, 24/7.
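To give a feel for what "staggered intervals about 20 times over the course of the day" means in practice, here is a small Python sketch that spaces links evenly across 24 hours. Buffer has its own scheduling settings and may not space things this way; the function name stagger and the evenly spaced slots are my assumptions for illustration.

```python
from datetime import datetime, timedelta


def stagger(links, start, slots_per_day=20):
    """Pair each link with a send time, spacing the slots evenly over 24 hours."""
    gap = timedelta(days=1) / slots_per_day  # 20 slots a day is one every 72 minutes
    return [(start + i * gap, link) for i, link in enumerate(links)]


queue = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
for send_time, link in stagger(queue, datetime(2014, 1, 6, 0, 0)):
    print(send_time.strftime("%H:%M"), link)
```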
I have three IFTTT recipes that monitor three areas of New Rochelle for geolocated Instagram photos and send them to Tumblr where I can skim through the haul of instagrams; if I am interested in one of them I can click an IFTTT link in the Tumblr post that takes me to that particular Instagram where I can favorite it.
I have another IFTTT recipe that sends all favorited Instagrams to Flickr (and Flickr automatically displays those photos in a badge on the Talk of the Sound home page). For now, to see the original Instagram where that photo came from you would have to click the image to get to Flickr and then click the image/link there to get to Instagram. Clunky but I want to make sure the person who took the photo can be credited in some way, even though this is less than ideal.
I have another IFTTT recipe that sends all Talk of the Sound articles via RSS to Buffer, and as noted above, Buffer collects text-links and then sends them out in staggered intervals to Twitter.
I have four weather recipes in IFTTT that send to Twitter a daily weather report each morning at 6 a.m., an alert when it starts to rain or snow, and an alert when the pollen count goes above 8.
It’s not IFTTT but I should mention that I have Mailchimp set up to send an alert to Twitter when the newsletter is published each night at 3 a.m.
In a perfect world, I would build all these capabilities from scratch and run everything on my own server to have more control and faster load times but until some benefactor decides to drop bags of cash at my doorstep this will have to do.
For now, that about covers all that goes into producing the content you see on Talk of the Sound and on our Twitter account.
As I mentioned above, if a reader has any good ideas on how to improve this workflow or otherwise enable me to do a better job covering New Rochelle I hope you will share that with me.