2010-03-06

Advantages Of FeedBurner And Some Web Feed History

Perm url with updates: http://xahlee.org/js/feedburner.html

Xah Lee, 2010-03-06

Today, i have to make a decision on whether to use the FeedBurner free service for managing my web feeds.

Advantages Of FeedBurner

Advantages

  • Provides statistics. For example, how many subscribers you have.
  • Possible to monetize. You can have ads on your feed. Without some type of feed service, adding ads to a web feed is quite difficult.
  • Provides better feed interface. For example, users can click on a link to add your feed directly to their feed readers, such as Google Reader. Also, feed pages (RSS/Atom) often cannot be read by a browser directly. FeedBurner turns your feed into html so it can be read by those who are not tech savvy.

Disadvantages

You lose total control of your feed. There are several aspects.

Your domain name is no longer in the feed url. For example, your feed address may be like this: http://xahlee.org/js/blog.xml. When using FeedBurner, your feed url is this: http://feeds.feedburner.com/XahsWebProgramingBlog. If you are a big business, you probably want the feed to be on your domain, because it gives a better sense of being YOURs.

You lose complete control of the formatting. If you are a big publisher, or a very popular blog writer, you probably want to be in total control of how your site or feed looks, down to the layout, color, date format, etc. With a feed service, you lose that.

Relying on a 3rd party. If the feed management service provider goes out of business, you'd be in trouble, because your feed url is known to the public only by the FeedBurner address. If they go out of business, it would be problematic to update your readers about your new feed address.

A Bit Of History Of Web Feed

For those of you who don't know, a web feed is a web syndication system. It started in the late 1990s. In those dot com days, the web started to have what are called web portal sites, the earliest being Yahoo. Basically, you log in to your Yahoo page, and you can read all of today's news, weather, local news, stock quotes, etc, in one place, all customized to you.

Yahoo isn't a news company. So, it needs to get this info from other sources. One method is thru email and web pages. So, Yahoo staff have to constantly read email or web pages and extract some sort of summary from them, so that it can be displayed to you in a nice way. Often, this process is manual, not well defined, and labor intensive. Any news publisher, from big ones such as CNN, AP, BBC, Fox News, to small local ones, to specialized news in the science community, gaming community, or any other, has this problem. So, a web feed format called RSS was born (which stood for Really Simple Syndication).

RSS is a file format. What it does is simply provide a clearly marked format for entries, each with summary, content, link, date updated, author, etc. This file is published at a url (public or private). So, the partners of the publisher can just grab this file, and easily choose the sections they need to republish or distribute. This saves the original publisher the trouble of dealing with various formats and communicating with its partners, and it saves web portal sites the endless labor of reading, re-writing, and editing in order to republish articles.
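To make the "clearly marked format" concrete, here is a minimal sketch of how a partner site might pull the entries out of a feed file, using Python's standard library. The sample feed, its titles and its example.com urls are made up for illustration, and the function name read_entries is hypothetical. A real feed would be fetched from the publisher's url and may carry more fields.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document. Titles and urls are invented.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Made-up feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/first</link>
      <pubDate>Sat, 06 Mar 2010 10:00:00 GMT</pubDate>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/second</link>
      <pubDate>Fri, 05 Mar 2010 09:00:00 GMT</pubDate>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>
"""


def read_entries(rss_text):
    """Return one dict per <item>, holding the fields a republisher would use."""
    root = ET.fromstring(rss_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "date": item.findtext("pubDate", default=""),
            "summary": item.findtext("description", default=""),
        })
    return entries


if __name__ == "__main__":
    for entry in read_entries(SAMPLE_RSS):
        print(entry["date"], "|", entry["title"], "|", entry["link"])
```

Because every entry is marked up the same way, the partner needs no manual reading or re-writing. It just picks the fields it wants.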

That was in the late 1990s. Over the past decade, the internet grew fast. We have blogs, social networks, instant messaging, online maps, ajax, youtube, Skype, twitter, and google with its array of web services. The use of RSS has changed. Remember, RSS used to mean Really Simple Syndication. The keyword is Syndication. The word is used among publishing businesses. It means that a piece of work, such as a news article, column, or comic strip, is republished in several different publications as a business partnership. RSS was mostly for those publishing houses.

Today, thanks primarily to google, any web page that is constantly updated has a web feed. With the popularity of feeds, RSS is now used by every joe blog writer. It no longer stands for some Syndication. The term RSS itself has faded into obscurity. Today, we just have “web feed”, and all major browsers have a built-in ability to read feeds directly, without going thru some portal site.

(Note, the tech spec of the RSS file format has also gone thru changes. For example, see: Atom Webfeed.)

How Does FeedBurner Know Reader Stat?

So, in about the mid 2000s, there sprang FeedBurner. What is it? There being so many web feeds, people began to have a need to manage the feeds, starting with big web companies. They need to know, for example, how many readers they have for a feed.

This is interesting. When i looked into FeedBurner in deciding whether i should use it, one of the advantages i found is that it gives subscriber stats. But this appeared to me contradictory. For example, my web feed is published on my website. So, i know exactly how many times my feed file is accessed, from my web server log. So why is it that i can't know how many subscribers i have? A bit of thinking resolved this. Remember, web feed, as a syndication system, was designed to be accessed by software agents for the purpose of republishing your article, not to be read directly by readers. So, you do not have a way to know how many readers you have. For example, many people read blogs from their Google account. Google grabs the feed from my site, and republishes it for these people to read. So, to me, there's only 1 access to my feed, but i could have hundreds of readers.

But then how does FeedBurner know how many people read your feed? Well, it can't know exactly. However, it does provide far better tracking than can be discerned from your web server log, because FeedBurner, being in the feed business, knows about the technologies of other feed reading applications, or is in close connection with other businesses that use feeds. For example, FeedBurner might have access to the stats that web portals or feed reader businesses keep on readers who subscribed to your feed. For example, i don't know how many people are reading my blog from Google, Yahoo, MSN, etc, but FeedBurner knows, because it probably has connections with these companies. (Note: FeedBurner has been owned by Google since 2007.)
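The gap between feed accesses and actual readers can be illustrated with a small log-counting sketch in Python. Everything in it is an assumption made for illustration: the log lines are invented, the agent names ExampleAggregator and DesktopReader are hypothetical, and the "N subscribers" note in the User-Agent is a convention that some feed fetchers are known to use, not something every aggregator does. This is not a description of how FeedBurner itself computes its stats.

```python
import re
from collections import Counter

# Invented log lines in Apache "combined" format. The agent names and the
# "N subscribers" note are assumptions made for this sketch.
SAMPLE_LOG = '''\
1.2.3.4 - - [06/Mar/2010:10:00:00 -0800] "GET /js/blog.xml HTTP/1.1" 200 5120 "-" "ExampleAggregator/1.0 (+http://aggregator.example/; 120 subscribers)"
1.2.3.4 - - [06/Mar/2010:11:00:00 -0800] "GET /js/blog.xml HTTP/1.1" 200 5120 "-" "ExampleAggregator/1.0 (+http://aggregator.example/; 120 subscribers)"
5.6.7.8 - - [06/Mar/2010:10:30:00 -0800] "GET /js/blog.xml HTTP/1.1" 200 5120 "-" "DesktopReader/2.3"
9.9.9.9 - - [06/Mar/2010:10:45:00 -0800] "GET /index.html HTTP/1.1" 200 900 "-" "Mozilla/5.0"
'''

LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "GET (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)
SUBSCRIBERS_RE = re.compile(r'(\d+) subscribers?')


def feed_stats(log_text, feed_path):
    """Return (raw hits on the feed, estimated readers) for one feed path."""
    hits = Counter()
    readers = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group("path") != feed_path:
            continue
        agent = m.group("agent")
        hits[agent] += 1
        reported = SUBSCRIBERS_RE.search(agent)
        # An agent that reports no count is assumed to be a single reader.
        readers[agent] = int(reported.group(1)) if reported else 1
    return sum(hits.values()), sum(readers.values())


if __name__ == "__main__":
    hits, readers = feed_stats(SAMPLE_LOG, "/js/blog.xml")
    print("raw hits:", hits, "estimated readers:", readers)
```

The raw hit count says little: the aggregator fetches the file a couple of times but stands for many readers. A feed service that knows the conventions of many such agents can make a better estimate than a plain log count.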

Monetizing Feeds

Another advantage is that FeedBurner can monetize feeds. Meaning, that ads will appear on your feed, and you get money from them.

The question for me is, why can't i put ads on my feed myself? For example, i can use Google's Adsense service. The problem is, most ad services today deal with ads on web pages, not web feeds. Web feed has not yet become a popular medium for ads. So, putting ads on your feed isn't well supported. FeedBurner, being a feed business, has the means to do that, both with respect to the business of running ads and the technology of inserting ads into your feed file format.

Better Feed Interface

Notice today that many feed pages have web share widgets that allow readers to easily add your feed to their feed reader. With such a widget, users can just click a button and they are subscribed to your feed. Without these widgets, a user usually has to copy your feed url, switch to their feed reader application or web site, click another button to add a feed, paste your link, and click again to submit. Having a web share widget is quite a convenience for your potential readers, and such convenience may increase your readership a lot.

Web share widgets to add feeds to various web portals.

You could of course add these widgets yourself. However, FeedBurner just makes it easier. For example, the popularity of different web portals or feed readers comes and goes. This year, ReadMe.com might be a popular web portal, and PalmPot might be a popular device to read feeds; next year, they may fade into obscurity. You will need to keep your web share widgets up to date with the computing industry fashion. For example, there was Netscape, Lycos, AltaVista, Inktomi, Infoseek, Excite. These were top sites of the dot com days. Today, most are forgotten. As of this year, we have Twitter, iPhone, Facebook, and tens of other portals or social networking sites, each with tens of thousands of users but otherwise unknown to you. It is hard to keep up. FeedBurner manages the list of web share widgets and API for you.

2010-03-05

Bird Flight V Formation As Geometry Problem Of Max Visual Contact

perm url with updates: http://xahlee.org/math/bird_flight_v_formation.html

Xah Lee, 2010-03-05

Was reading Wikipedia V formation, that is, the formation of flight of geese, ducks. The interesting thing is, why they fly in that formation. The Wikipedia article is quite interesting, but too short.

The reasons given by Wikipedia are basically two: (1) for the whole group, the V formation is more efficient in (energy cost)/(distance traveled), and increases flight distance by up to 71%. (2) Visual contact is part of the reason for the V formation.

It is interesting to note that the birds in a V formation rotate. The position in front is the most tiring. Though, Wikipedia doesn't say how often they rotate, or how they rotate, or by what biological or behavioral means they signal a rotation, e.g. instinctively by bio-clock?

I thought a bit about the visual contact reason. It is essentially a mathematical problem. Each bird has max visual contact in front or to the sides. You have a group of birds. The problem is to find an arrangement so that the visual contact between the birds as a group is maximized. Because the eyes of birds are essentially to the sides (and forward), this pretty much means the best arrangement must lie in the horizontal plane the birds are in. Clearly, max visual contact can't be the only reason for the V formation, because a half circle formation is better. In a V formation, each half is a line, so the bird in front of you blocks your view of the rest of the line. Also, i thought about an inverted V formation. I think an inverted V might be a better formation with respect to maximizing the whole group's visual contact.

To be a bit precise, let's assume there are 5 birds. A goose's field of vision is 300° forward (a random guess). A human's field of vision is perhaps 160°. To compute the max visual contact, we may weight each bird's visual contact with another bird by the degree to which that bird is in focus. For example, if person A stands in front of you, and person B stands on your left barely visible, sure, both are in visual contact, but you'd give a much higher score for your visual contact with A than with B. For the 5 birds, let's name them A B C D E. For bird A, we compute the weighted score for visual contact of A with B, then A with C, etc. So we have 4 numbers. Do this for each bird. So in the end we have 4*5 = 20 numbers. Averaging them gives us the visual contact score of the whole group.
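The scoring just described can be sketched in a few lines of Python. The weight function here is an assumption, since the text does not fix one: a bird directly ahead scores 1, the score falls off linearly with the off-axis angle, and it is 0 at or beyond the edge of the field of vision. The coordinates of the two sample formations are also made up, and blocking of sight by other birds is ignored.

```python
import math


def contact_weight(observer, target, fov_degrees):
    """Weight of the observer's visual contact with the target, from 0 to 1.

    Each bird is a tuple (x, y, heading in degrees). The linear falloff
    is an assumed weight function, not something fixed by the text.
    """
    ox, oy, heading = observer
    tx, ty, _ = target
    bearing = math.degrees(math.atan2(ty - oy, tx - ox))
    # Smallest absolute angle between heading and bearing, in [0, 180].
    off_axis = abs((bearing - heading + 180.0) % 360.0 - 180.0)
    half_fov = fov_degrees / 2.0
    if off_axis > half_fov:
        return 0.0
    return 1.0 - off_axis / half_fov


def group_score(birds, fov_degrees):
    """Average of the n*(n-1) pairwise weights, e.g. 20 numbers for 5 birds."""
    scores = [contact_weight(a, b, fov_degrees)
              for i, a in enumerate(birds)
              for j, b in enumerate(birds) if i != j]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # All birds head "up" the page (90 degrees). Positions are illustrative.
    v_formation = [(0.0, 0.0, 90.0), (-1.0, -1.0, 90.0), (1.0, -1.0, 90.0),
                   (-2.0, -2.0, 90.0), (2.0, -2.0, 90.0)]
    line_abreast = [(float(x), 0.0, 90.0) for x in range(-2, 3)]
    print("V formation :", group_score(v_formation, 300.0))
    print("line abreast:", group_score(line_abreast, 300.0))
```

With a function like group_score in hand, comparing a V, an inverted V, and a half circle becomes a matter of listing their coordinates, though the ranking will depend on the assumed weight function.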

Instead of solving the max visual contact problem for birds, let's change the problem to humans, which makes it more interesting to us. And, let's just simplify the problem to people standing on the ground. So, given n persons, what's the best arrangement for max visual contact of the whole group? (assume we cannot turn our heads)

Clearly, when the problem is put this way, one immediately sees that the size of the group matters, and the distance between persons also matters. For example, if you have 1000 persons, the best arrangement may not be some V formation. A denser, grid formation may do much better. This is probably because we haven't taken distance into account in visual contact. If a person is 1 km away, you may not consider him to be in visual contact.

So, at this point, we might want to change our problem by limiting the number to, say, 100. Then we don't have to worry about the visual distance much.

However, the math problem without the size constraint or the visual distance complexity is still an interesting one. Here's another try at the pure math formulation: Given n points in a plane, assume each point has a forward visual field of α°, and assume point A has better visual contact with point B if B is more directly ahead of A. What is the best arrangement of the points so that visual contact is maximized for the whole group?

To give the problem a slightly more practical touch, we may consider the dots to be circles of radius 1, so that we introduce the issue of blocked sight into the problem. This change basically means whatever arrangement results will be scaled up, so that the distance between birds can be, say, 1000. The larger, the better. This makes the problem less interesting, so we'll introduce another constraint, that the distance between the 2 most distant birds cannot be more than, say, 100. Also, a full view of a circle is better than a half view. So, we'll have to adjust the weight of the visual contact so that a bird directly in front of you is better than a bird to the side, but also, a full view of a bird is better than a half view. The exact weighted score formula for visual contact needs to be worked out a bit.

2010-03-04

Why Emacs is still so useful today

Perm url with updates: http://xahlee.org/emacs/emacs_power_story.html

Xah Lee, 2010-03-04

This essay tells a little anecdote about why emacs is still essential and superior among today's tools.

Problem

Today, i need to rework some of the markup in a html page Wallpaper groups: References and Related Web Sites. The page is a bibliography. Each entry had markup like this:

<div class="entry">
<span class="w">Title</span>: <b>Regular Polytopes</b>
<span class="w">Author</span>: H.S.M. Coxeter<br>
<span class="w">Publisher</span>: Dover<br>
<span class="w">Date</span>: 1973.<br>
<b>Notes:</b> 3rd ed.<br>
<span class="w">Subject</span>: Symmetry<br>
<span class="w">Comment</span>: A standard reference...<br>
</div>

This page was written in 1997, and the markup was slightly updated a few times in the past decade. The markup is not optimal. For example, all the title, author, etc info is marked up by the same “<span class="w">...</span>”. The style of “span w” is just bold, used across my website. A better one would be something like this:

<div class="entry">
<div class="title">Regular Polytopes</div>
<div class="author">H.S.M. Coxeter</div>
<div class="publisher">Dover</div>
<div class="date">1973</div>
Notes: 3rd ed.<br>
<div class="subject">Symmetry</div>
<div class="comment">A standard reference...</div>
</div>

together with CSS style sheet, like this:

div.entry {margin:1ex; padding:1ex}
div.entry > div.author {color:green}
div.entry > div.author:before {content:"Author: ";color:black;font-weight:bold}
div.entry > div.title {font-style:italic;color:#4d378b}
div.entry > div.title:before {content:"Title: ";color:black;font-weight:bold; font-style:normal}
div.entry > div.publisher:before {content:"Publisher: ";color:black;font-weight:bold}
div.entry > div.date:before {content:"Date: ";color:black;font-weight:bold}
div.entry > div.subject:before {content:"Subject: ";color:black;font-weight:bold}
div.entry > div.comment:before {content:"Comment: ";color:black;font-weight:bold}

would make it much better, because each of the Title, Author, etc is semantically marked. This means that machines can trivially process it and understand it, and the styling of each of the Title, Author, etc can be easily changed. (Such markup is called an HTML Microformat, a step towards the semantic web.)

There are 34 such entries. So, how does one go about this little task? If you look at the markup, it is fairly regular. So, perhaps you can write a little python script to process it. However, if the markup is not 100% regular, the scripting approach won't work. Some entries may not have a date line, some are journals and not books so may be missing a publisher, some have a line about library location... Each time you run your script, it will choke on little exceptions, then you loop back to fixing the script.
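To make the scripting approach concrete, here is a minimal sketch of what such a python script might look like. It is only an illustration under assumptions: the names convert_line and convert_entry are made up, and the regex expects every field line to have the exact “<span class="w">Label</span>: value” shape. Any line that does not fit is passed thru unchanged and reported, which is exactly where the manual loop of fixing begins.

```python
import re

# Matches one "regular" field line of the old markup, for example:
#   <span class="w">Author</span>: H.S.M. Coxeter<br>
FIELD_RE = re.compile(
    r'^<span class="w">(?P<label>\w+)</span>:\s*(?P<value>.*?)\s*(?:<br>)?$'
)

# Lines that are expected to pass thru untouched.
WRAPPER_LINES = {'<div class="entry">', '</div>', ''}


def convert_line(line):
    """Return (new_line, True) for a regular field line, else (line, False)."""
    m = FIELD_RE.match(line)
    if m is None:
        return line, False
    css_class = m.group("label").lower()
    # Drop a bold wrapper around the whole value, as found on the title line.
    value = re.sub(r'^<b>(.*)</b>$', r'\1', m.group("value"))
    return '<div class="%s">%s</div>' % (css_class, value), True


def convert_entry(text):
    """Convert one entry; also return the lines that need a human to look at."""
    converted, needs_review = [], []
    for raw in text.splitlines():
        new_line, ok = convert_line(raw.strip())
        converted.append(new_line)
        if not ok and new_line not in WRAPPER_LINES:
            needs_review.append(new_line)
    return "\n".join(converted), needs_review


if __name__ == "__main__":
    sample = """<div class="entry">
<span class="w">Title</span>: <b>Regular Polytopes</b>
<span class="w">Author</span>: H.S.M. Coxeter<br>
<b>Notes:</b> 3rd ed.<br>
</div>"""
    new_text, leftovers = convert_entry(sample)
    print(new_text)
    print("needs review:", leftovers)
```

The Title and Author lines convert cleanly, but the Notes line, which uses different markup, falls thru to the review list. With 34 entries, each with its own little irregularities, that list is where the time goes.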

So, unless your text to be processed is a valid format with a grammar and semantic specification, the script approach will likely end up taking longer than manually doing multiple passes of find replace. If you take the time to make your text regular first then write the script to process it, that probably won't save you time.

This is where emacs comes in. Emacs has several find and replace commands, by regex or by plain string, on a text selection, or an entire file (buffer), or multiple files. The beauty is that it all works in an interactive way, with the option to proceed in batch when you see a clear pattern.

Text-Soup Situation And Lumberjack Tasks

Text-Soup Situation

For a coder or sys admin, the vast majority of the time, the text editing he needs to perform is of this text-soup nature. Sure, if your code is in some strict environment, such as coding Java in a company on a big project with strict code structure, you might use some IDE's built-in feature to “refactor”. However, the vast majority of texts that exist in the world are not in some such “strict” format or environment, and the ways you need to process them do not fall into some nicely categorized transformation. (As an illustration, XML became widely adopted precisely because it avoids being a text-soup.) All those unix startup shell scripts, make file scripts, sys admin scripts, software installation scripts, are of this text-soup nature. Almost all programing language source code is of this text-soup nature. All unix config files, soup. All man page source code, texinfo, TeX/LaTeX source code, soup. All publications, journals, magazines, essays, books, their source text is text-soup. And if we look at the web, probably 99.99% of the html that exists today is text-soup, and is not even valid html.

So, essentially all text in digital form is text-soup. The only notable exceptions are texts in a format with a lexical grammar and at least some degree of semantic grammar, such as XML.

Lumberjack-Tasks

Even if your text is in some strict format, the task you need to do on the text will 99% likely not be some known transformation, so it cannot be automated. For example, if you have hundreds of valid “HTML 4.01 strict” files, and you need to transform them into valid “XHTML 1.0 strict” format, probably 99% of the time this task will not be possible with a script without some type of AI that makes human decisions. Because, the lexical structure of HTML is specified, but its semantics are not. (For some detailed example, see: HTML Correctness and Validators).

Today, perhaps a significant percentage of the world's texts are html, and the vast majority are not valid html. Plain text is worse. In general, for automatic text processing, the text needs to be in a format with a SEMANTIC grammar (of course, with a lexical grammar to begin with). The more semantic grammar, the narrower the scope, the more specific the context, the easier automation is. (But in general, even such a format cannot eliminate the lumberjack-tasks situation, because a format with a semantic grammar is just one point of view. Most of the time, real-world text processing tasks aren't nicely academically defined.)

Why Emacs But Not Other Editors

So, majority of daily tasks on text can't be automated by some scripting or IDE built-in tools. What about other text editors?

The power of emacs is that it has an integrated scripting language designed for text processing, and its commands are all oriented toward keyboard operation, together with keyboard macros that can record and repeat commands. This means that the daily lumberjack-tasks on text-soup become semi-automatic. The integrated elisp covers the parts that can be automated by machine; the interactive nature and keyboard macros cover the parts that cannot be completely automated. The vast majority of text editors don't have an integrated scripting language. The vast majority are GUI based, so they aren't suitable for heavy, programing oriented, text processing tasks.

(The one that has the same nature as emacs is vi. vi, together with its unix shell tools environment, can become as useful a tool as emacs for dealing with text-soup tasks.)

For some examples of actual tasks and solutions done with emacs and not possible with other text editors or IDEs, see:

Flowing List Items

Perm url with updates: http://xahlee.org/js/css_flow_list.html

(css effect may not show in blogger. Visit the link above to see full effect.)

Xah Lee, 2010-03-03

This page is a html/css tutorial, showing you how to format list items.

By default, list items are formatted like this:

  • Cat
  • Dog
  • Bird

Here's the code:

<ul>
<li>Cat</li>
<li>Dog</li>
<li>Bird</li>
</ul>

This is plain. Sometimes you have a list of items, but you want them displayed in different format.

Flowing List Items

For example, if you want it to flow like this:

  ♥ Cat    ♥ Dog    ♥ Bird

Here's how you do it:

<style type="text/css">
li.mystyle { float:left; list-style-type:none;
  border:solid thin red; margin:1ex;}
li.mystyle:before {content:"♥ "}
</style>

<ul>
<li class="mystyle">Cat</li>
<li class="mystyle">Dog</li>
<li class="mystyle">Bird</li>
</ul>

<hr style="visibility:hidden; clear:both">

The trick to make it flow is the “float:left”. The float means making the content float, and when there are several elements that are all floating, they flow in the same way as a sequence of inline images does (i.e. a sequence of “<img ...>” tags).

Also, notice that we changed the default bullet into a heart. This is done by “list-style-type:none”, which means do not automatically add any bullet. Then, we add our own, by “li.mystyle:before {content:"♥ "}”. This lets you use any unicode character as a bullet.

At the end of list, we added this:

<hr style="visibility:hidden; clear:both">

This is because we want to stop the flowing behavior. If we don't use it, anything that comes after the “</ul>” will be placed right after the last list item. The “clear:both” means clear any previous css float “left” or “right”.

The flowing list is particularly useful if you have thumbnails and you want to flow them. For an example, see: Xah's Visual Arts Gallery.