2010-02-05

photo beauties

Architecture beauties: Hyperboloid Towers (photos)

Star Spangled Bikes.

cicada lyristes plebejus

Cicada. Source

perm url: Cicada Photo.

Beautiful photography: Planetary Art.

Silkroad Watchtower Ruin (photo)

2010-02-02

A Complexity In URL Encoding

perm url with updates: http://xahlee.org/js/url_encoding.html

Xah Lee, 2009-06-02, 2010-02-01

[2010-02-03 addendum: this article is incorrect. It is actually simple to check whether the url is a cgi script, by looking for “?”. If there is one, any “&” before the “?” should be “%26”, and any “&” after it should be “&amp;”.]

Discovered a subtle issue with automating url encoding. If a url contains the ampersand “&” char, and the url is to go inside a html doc as a link, then there's no automated procedure that determines correctly whether the char should be encoded as “%26” or “&amp;”.

If the char is used as part of the file name or path, then it should be encoded as “%26”, but if it is used as a separator for CGI parameters, then it should be encoded as “&amp;”.

The ampersand char is a reserved char in percent-encoding. Therefore it must be percent-encoded if it is used in normal file path names. When it is NOT part of a path name, but is used as a CGI parameter separator, with the GET method of HTTP request, then it must be left as it is. Now, in html, the ampersand char must be encoded as the html entity “&amp;” when adjacent chars are not space (basically). So, it becomes “&amp;”.
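The two layers of encoding described above can be sketched with Python's standard library. This is only an illustration: the host, path and query values are made up for the example, and it assumes Python 3.

```python
from html import escape
from urllib.parse import quote

# Hypothetical example values, not from any real site.
path = "/docs/R&D notes.html"   # here "&" is part of the file name
query = "a=1&b=2"               # here "&" separates CGI parameters

# Layer 1: percent-encode the path only. quote() leaves "/" alone by
# default and turns "&" into "%26" and space into "%20".
url = "http://example.com" + quote(path) + "?" + query
print(url)

# Layer 2: escape the whole url for use inside a html attribute.
# The only "&" left is the parameter separator, which becomes "&amp;".
href = escape(url, quote=True)
print('<a href="%s">link</a>' % href)
```

The point of doing it in two steps is that each “&” gets its encoding while its role (path char or separator) is still known.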

Of course i knew the above, but my realization is that, the purpose of the char used in url cannot be syntactically determined with 100% accuracy.

This is interesting to me because i work in html using emacs, and i have written personal elisp code that automatically turns a url into a link. The upshot is that, in theory, this lisp code cannot do that with 100% accuracy.

Of course, in practice, none of this matters. Just use “&” plainly and it works in all browsers.
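The rule from the 2010-02-03 addendum can be sketched as a small function. The name url_to_href is hypothetical, and two things are assumptions of this sketch rather than something the article states: the input is a raw url that has not been encoded yet, and a url without any “?” is treated as all path.

```python
def url_to_href(url):
    """Encode "&" in a raw url for use as a html link target.

    Before the first "?" the "&" is taken as part of the path and
    becomes "%26"; after it, "&" is taken as a CGI parameter
    separator and becomes "&amp;".
    """
    # partition() splits at the first "?"; if there is none,
    # sep and tail are empty strings and the whole url is the head.
    head, sep, tail = url.partition("?")
    return head.replace("&", "%26") + sep + tail.replace("&", "&amp;")


print(url_to_href("http://example.com/a&b.html?x=1&y=2"))
print(url_to_href("http://example.com/a&b.html"))
```

Feeding it a url that already contains “%26” or “&amp;” would encode the “&” in “&amp;” a second time, so it should only see raw input.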

Python's Reference And Internal Model Of Computing Languages

perm url with updates: http://xahlee.org/comp/python_ref_problem.html

Xah Lee, 2010-02-02

In Python, there are 2 ways to clear a hash (a “dict” in Python's terminology): “myHash = {}” and “myHash.clear()”. What is the difference?

The difference is that “myHash = {}” simply creates a new empty hash and assigns it to myHash, while “myHash.clear()” actually clears the hash that myHash is pointing to.

What does that mean? Here's code that illustrates:

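# python
# 2 ways to clear hash and their difference

# way 1: rebind the name. bb still points to the old hash.
aa = {'a': 3, 'b': 4}
bb = aa
aa = {}
print(bb)  # prints {'a': 3, 'b': 4}

# way 2: clear in place. aa and bb point to the same hash, so both see it.
aa = {'a': 3, 'b': 4}
bb = aa
aa.clear()
print(bb)  # prints {}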

This happens because of the “reference” concept: “bb = aa” makes bb refer to the same hash as aa, not to a copy of it. The opposite alternative is that everything is a copy, as for example in most functional langs (with respect to programmer-visible behavior, not how it is implemented).
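To make the contrast with copy behavior concrete, here is a small sketch using an explicit shallow copy. The name cc is introduced only for this illustration.

```python
aa = {'a': 3, 'b': 4}
bb = aa        # a second name for the same hash (a reference)
cc = dict(aa)  # a new hash with the same content (a shallow copy)

print(bb is aa)  # True: one object, two names
print(cc is aa)  # False: a separate object

aa.clear()       # empties the one object that aa and bb share
print(bb)        # {}
print(cc)        # {'a': 3, 'b': 4}
```

In a language where assignment always behaves like the “cc” line, the difference between the two ways of clearing would not be visible to the programmer.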

From the aspect of the relation between the language's source code and the program behavior it produces, this “reference”-induced behavior is similar to dynamic scope vs lexical scope. The former is un-intuitive and hard to understand, the latter more intuitive and clear. The former requires an internal model of the language, the latter corresponds more to source code WYSIWYG. The former is easy to implement and appears in early langs, the latter is more popular today.

As with many languages that have a concept of references, or pointers, this is a complexity that hampers programming progress. The concept of using some sort of “reference” as part of the language is due to implementation efficiency, starting with C and others in the 1960s up to the 1990s. But as time went on, this concept in computer languages is gradually disappearing, as it should.

Other langs that have this “reference” concept as an ESSENTIAL part of the semantics of the language include: C, C++, Java, Perl, Common Lisp. (Of course, each uses different terminology, and each lang faction will gouge the other faction's eyes out about what is the true meaning of terms like “reference”, “object”, “list/sequence/vector/array/hash”, and absolutely pretend other meanings do not exist, partly due to their ignorance of langs other than their own, partly due to male power struggle nature.)

Languages that do not have any “reference” or “object”, or otherwise do not require the programmer to have some internal model of source code, are: Mathematica, PHP, most unix shells, PowerShell. (Others may include TCL, Haskell, OCaml, possibly assembly langs.)

For some detail on this, see: Jargons And High Level Languages and Hardware Modeled Computer Languages.