Differences

This shows you the differences between two versions of the page.

intertwingle [2008-11-15 17:09] 81.188.78.24intertwingle [2008-11-15 17:13] 81.188.78.24
Line 2: Line 2:
 reformatted from http://archive.org's [[http://web.archive.org/web/*/http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html|recollection]] of http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html reformatted from http://archive.org's [[http://web.archive.org/web/*/http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html|recollection]] of http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html
  
-blue sky: miscellaneous 
  
-vast volumes of email +====vast volumes of email==== 
- May 18th+ 
 +May 18th 
 Submitted by Jamie Zawinski <jwz@mozilla.org> to Miscellaneous. Submitted by Jamie Zawinski <jwz@mozilla.org> to Miscellaneous.
  
- ``Intertwingularity is not generally acknowledged -- people keep pretending they can make things deeply hierarchical, categorizable and sequential when they can't. Everything is deeply intertwingled.''  +"Intertwingularity is not generally acknowledged -- people keep pretending they can make things deeply hierarchical, categorizable and sequential when they can't. Everything is deeply intertwingled." -- Ted Nelson  
- -- Ted Nelson  +
  
 In the following, I outline a potential project to make it easier to deal with a massive volume of personal messages: excavating, traversing, relating, reporting, annotating. In the following, I outline a potential project to make it easier to deal with a massive volume of personal messages: excavating, traversing, relating, reporting, annotating.
Line 15: Line 15:
 I call this hypothetical program ``Intertwingle.'' I call this hypothetical program ``Intertwingle.''
  
-    * introduction. +  * introduction. 
-    * links are legion. +  * links are legion. 
-    * searches are intersections. +  * searches are intersections. 
-    * implementation. +  * implementation. 
-          parser. +    parser. 
-          database. +    database. 
-          query tool. +    query tool. 
-          presentation tools.  +    presentation tools.  
-    * future. +  * future. 
  
-introduction.+===introduction.===
  
-    Intertwingle can be seen as a unification of a search tool and an address book. It is not, however, a mail reader. The presentation of query results could be done through a mail reader, but the intention is that one's choice of mail reader should be orthogonal to the use of this tool. The two kinds of tools just happen to operate on the same data.+Intertwingle can be seen as a unification of a search tool and an address book. It is not, however, a mail reader. The presentation of query results could be done through a mail reader, but the intention is that one's choice of mail reader should be orthogonal to the use of this tool. The two kinds of tools just happen to operate on the same data.
  
-    The design philosophy is that any time there is a visual representation of an object, the corresponding object should be accessible with a gesture: That chasing links is easier than composing search terms (but both are needed.)+The design philosophy is that any time there is a visual representation of an object, the corresponding object should be accessible with a gesture: That chasing links is easier than composing search terms (but both are needed.)
  
-    The target audience is individuals who have a lot of mail. The target audience is not inhabitants of the corporation, it is people. Needs which are specific to IS Managers, or to Enterprise Directory Services are not of interest. This is about the general problem of handling lots of individual mail. (Whether in the context of personal mail, or job-related mail, the problem is the same: you've got a lot of it; now what do you do?)+The target audience is individuals who have a lot of mail. The target audience is not inhabitants of the corporation, it is people. Needs which are specific to IS Managers, or to Enterprise Directory Services are not of interest. This is about the general problem of handling lots of individual mail. (Whether in the context of personal mail, or job-related mail, the problem is the same: you've got a lot of it; now what do you do?)
  
-    Sharing is an interesting problem, and may be addressed, but I feel it is explicitly secondary in priority to solving the problem in the non-shared domain. (But, we should think about it up front, because that kind of thing tends to be hard to retrofit.) +Sharing is an interesting problem, and may be addressed, but I feel it is explicitly secondary in priority to solving the problem in the non-shared domain. (But, we should think about it up front, because that kind of thing tends to be hard to retrofit.) 
 links are legion. links are legion.
-    The sheer multitude of representations-of-objects yields a colossal number of potential links to follow, which is why I anticipate link-chasing to be a (usually) far easier method of excavation than searching. For example, here are the headers of a typical message:+The sheer multitude of representations-of-objects yields a colossal number of potential links to follow, which is why I anticipate link-chasing to be a (usually) far easier method of excavation than searching. For example, here are the headers of a typical message:
  
-    Date:  Sun, 3 Jul 94 16:40:07 PDT +Date:  Sun, 3 Jul 94 16:40:07 PDT 
-    From:  Jamie Zawinski <jwz@mcom.com> +From:  Jamie Zawinski <jwz@mcom.com> 
-    To:  eng +To:  eng 
-    Subject:  printing +Subject:  printing 
-    In-Reply-To:  Chris Houck's message of Sun 3-Jul-94 13:19:23 -0700 <9407032019.AA18853@neon.mcom.com> +In-Reply-To:  Chris Houck's message of Sun 3-Jul-94 13:19:23 -0700 <9407032019.AA18853@neon.mcom.com> 
-    Message-ID:  <19940703093034.jwz@islay.mcom.com> +Message-ID:  <19940703093034.jwz@islay.mcom.com> 
-    References:  <9407032019.AA18853@neon.mcom.com>+References:  <9407032019.AA18853@neon.mcom.com>
  
-    There is a great deal of structure there:+There is a great deal of structure there:
  
-          Sun, 3 Jul 94 16:40:07 PDT +  Sun, 3 Jul 94 16:40:07 PDT 
-              This is a representation of a point in time. From here one can envision traversing to a list of other messages within some range of that moment: that hour, that day, that month, that year.+  This is a representation of a point in time. From here one can envision traversing to a list of other messages within some range of that moment: that hour, that day, that month, that year.
  
-          Jamie Zawinski <jwz@mcom.com> +  Jamie Zawinski <jwz@mcom.com> 
-              This is a description of a particular person. From here one should be able to easily get to information related to that person: an address book entry, or a list of all messages sent by them, or sent to them, or any number of other annotations.+  This is a description of a particular person. From here one should be able to easily get to information related to that person: an address book entry, or a list of all messages sent by them, or sent to them, or any number of other annotations.
  
-          Jamie Zawinski +  Jamie Zawinski 
-              This is a name, not a person, and names are notoriously non-unique. From here it would be useful to get to a list of all known people who have claimed that name (from the set of people who are message senders or recipients.)+  This is a name, not a person, and names are notoriously non-unique. From here it would be useful to get to a list of all known people who have claimed that name (from the set of people who are message senders or recipients.)
  
-          jwz@mcom.com +  jwz@mcom.com 
-              This is an email address, not a person, and while one email address is usually not used by more than one person, it's quite common for one person to have many email addresses (or many variations on the same address.) From here it would be useful to get to a list of all known people who have used that address (from the set of people who are message senders or recipients) and from there to the set of other addresses used by that person or those people. One might also find it useful to get a list of messages associated with this address (while excluding messages from other addresses of the same person.)+  This is an email address, not a person, and while one email address is usually not used by more than one person, it's quite common for one person to have many email addresses (or many variations on the same address.) From here it would be useful to get to a list of all known people who have used that address (from the set of people who are message senders or recipients) and from there to the set of other addresses used by that person or those people. One might also find it useful to get a list of messages associated with this address (while excluding messages from other addresses of the same person.)
  
-          eng +  eng 
-              This is an email address, yet it happens to be a mailing list. There is no one person associated with it, yet the set of operations one might like to perform on it is very similar.+  This is an email address, yet it happens to be a mailing list. There is no one person associated with it, yet the set of operations one might like to perform on it is very similar.
  
-          printing +  printing 
-              This is unstructured text, and what one does with unstructured text is attempt to match patterns in it. There are any number of other properties associated with this particular piece of text: it is in a header field called Subject in a message from Jamie Zawinski, on Sunday, July 3rd, and so on. All of these are interesting properties that are within one or two link-hops of the text itself. Their proximity is what makes them interesting.+  This is unstructured text, and what one does with unstructured text is attempt to match patterns in it. There are any number of other properties associated with this particular piece of text: it is in a header field called Subject in a message from Jamie Zawinski, on Sunday, July 3rd, and so on. All of these are interesting properties that are within one or two link-hops of the text itself. Their proximity is what makes them interesting.
  
-          Chris Houck +  Chris Houck 
-              A name, as above.+  A name, as above.
  
-          Chris Houck's message +  Chris Houck's message 
-              An ambiguous reference to a message. From here, one should be able to get to the set of all messages from someone who claimed the name Chris Houck.+  An ambiguous reference to a message. From here, one should be able to get to the set of all messages from someone who claimed the name Chris Houck.
  
-          Chris Houck's message of Sun 3-Jul-94 13:19:23 -0700 +  Chris Houck's message of Sun 3-Jul-94 13:19:23 -0700 
-              Another reference to a message, probably less ambiguous.+  Another reference to a message, probably less ambiguous.
  
-          <9407032019.AA18853@neon.mcom.com> +  <9407032019.AA18853@neon.mcom.com> 
-          <19940703093034.jwz@islay.mcom.com> +  <19940703093034.jwz@islay.mcom.com> 
-              These also are references to particular messages, the least ambiguous representations so far; however, they are still slightly ambiguous, since message IDs refer to original messages: there could be multiple copies of these messages with slightly different headers or other annotations within the message-store. +  These also are references to particular messages, the least ambiguous representations so far; however, they are still slightly ambiguous, since message IDs refer to original messages: there could be multiple copies of these messages with slightly different headers or other annotations within the message-store. 
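The structure listed above can be pulled out of the example headers with very little code. What follows is a minimal sketch in Python, using only the standard library `email` package; the function name `link_targets`, the dictionary keys, and the `MESSAGE_ID` pattern are illustrative assumptions, not part of the proposal itself.

```python
import re
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# The example headers from the text, followed by an empty body.
RAW = """\
Date: Sun, 3 Jul 94 16:40:07 PDT
From: Jamie Zawinski <jwz@mcom.com>
To: eng
Subject: printing
In-Reply-To: Chris Houck's message of Sun 3-Jul-94 13:19:23 -0700 <9407032019.AA18853@neon.mcom.com>
Message-ID: <19940703093034.jwz@islay.mcom.com>
References: <9407032019.AA18853@neon.mcom.com>

"""

# Assumption: anything shaped like <left@right> in a reference header
# is treated as a message ID.
MESSAGE_ID = re.compile(r"<[^<>\s]+@[^<>\s]+>")


def link_targets(raw):
    """Return the objects one could link to from a message's headers."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    references = (msg["In-Reply-To"] or "") + " " + (msg["References"] or "")
    return {
        "moment": parsedate_to_datetime(msg["Date"]),  # a point in time
        "name": name,                                  # a name, not a person
        "address": address,                            # an address, not a person
        "recipient": msg["To"],                        # here, a mailing list
        "subject": msg["Subject"],                     # unstructured text
        "message_id": msg["Message-ID"],
        # Deduplicated IDs of the messages this one refers to.
        "refers_to": sorted(set(MESSAGE_ID.findall(references))),
    }


if __name__ == "__main__":
    for key, value in link_targets(RAW).items():
        print(key, "=", value)
```

Each key in the result is a starting point for a traversal; the free-text part of In-Reply-To ("Chris Houck's message of ...") is deliberately ignored here, since resolving it is exactly the ambiguous lookup described above.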
  
-    Any time there is a link, one can imagine an equal but opposite counter-link: when we talk of reaching lists of objects above, the object by which we reached that list will always be a member of the list. And if A is three hops away from D, then D is three hops away from A, and traversal in both directions should be possible.+Any time there is a link, one can imagine an equal but opposite counter-link: when we talk of reaching lists of objects above, the object by which we reached that list will always be a member of the list. And if A is three hops away from D, then D is three hops away from A, and traversal in both directions should be possible.
  
-    However, the object at the other end of the link does not necessarily encode the reverse path in its usual visual representation. For example, while messages point to the message to which they are a reply, the parent doesn't (in itself) point to the children. This implicit relationship must be made explicit: it must be easy to get from a message to the set of messages which refer to it. All links must be bidirectional.+However, the object at the other end of the link does not necessarily encode the reverse path in its usual visual representation. For example, while messages point to the message to which they are a reply, the parent doesn't (in itself) point to the children. This implicit relationship must be made explicit: it must be easy to get from a message to the set of messages which refer to it. All links must be bidirectional.
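One way to make the implicit reverse relationship explicit is to record every link twice, once in each direction. The sketch below is only an illustration of that idea; the class `LinkIndex`, its method names, and the relation label `"in-reply-to"` are hypothetical.

```python
from collections import defaultdict


class LinkIndex:
    """Store every link together with its counter-link."""

    def __init__(self):
        self._forward = defaultdict(set)   # (source, relation) -> targets
        self._backward = defaultdict(set)  # (target, relation) -> sources

    def add(self, source, relation, target):
        # Recording both directions at insert time keeps traversal
        # symmetric: nothing has to be searched for later.
        self._forward[(source, relation)].add(target)
        self._backward[(target, relation)].add(source)

    def targets(self, source, relation):
        return set(self._forward.get((source, relation), ()))

    def sources(self, target, relation):
        return set(self._backward.get((target, relation), ()))


if __name__ == "__main__":
    index = LinkIndex()
    reply = "<19940703093034.jwz@islay.mcom.com>"
    parent = "<9407032019.AA18853@neon.mcom.com>"
    index.add(reply, "in-reply-to", parent)
    print(index.targets(reply, "in-reply-to"))   # child -> parent
    print(index.sources(parent, "in-reply-to"))  # parent -> children
```

The parent message never mentions its replies, yet `sources` answers "which messages refer to this one?" directly.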
  
-    Further structure exists outside of the message headers themselves:+Further structure exists outside of the message headers themselves:
  
-        * Messages live in folders.+* Messages live in folders.
  
-              o Folders have names.+  o Folders have names.
  
-              o Folders are sometimes arranged in a hierarchy.+  o Folders are sometimes arranged in a hierarchy.
  
-              o Folders tend to store messages linearly, in a particular order: thus, each message has ``previous'' and ``next'' relationships with other messages. +  o Folders tend to store messages linearly, in a particular order: thus, each message has ``previous'' and ``next'' relationships with other messages. 
  
-        * Messages can contain other messages (forwarded messages, or digests.) Each such message is a message in its own right, but the containment relationship can be important.+* Messages can contain other messages (forwarded messages, or digests.) Each such message is a message in its own right, but the containment relationship can be important.
  
-        * Messages have bodies.+* Messages have bodies.
  
-              o The bodies can contain unstructured text.+  o The bodies can contain unstructured text.
  
-              o The bodies can contain text that is named, for example, an attached text file which has a file name or description specified in its attachment headers.+  o The bodies can contain text that is named, for example, an attached text file which has a file name or description specified in its attachment headers.
  
-              o The bodies can contain binary objects which, while not textually searchable, are named and described.+  o The bodies can contain binary objects which, while not textually searchable, are named and described.
  
-              o Bodies can contain hyperlinks. Plain-text messages might happen to have detectable URLs in them, and HTML messages have many mechanisms for referring to other objects. This implies that it would be interesting to traverse from a message, to information about a web page that it refers to, and back to a set of messages which refer to objects on that server. +  o Bodies can contain hyperlinks. Plain-text messages might happen to have detectable URLs in them, and HTML messages have many mechanisms for referring to other objects. This implies that it would be interesting to traverse from a message, to information about a web page that it refers to, and back to a set of messages which refer to objects on that server. 
  
 searches are intersections. searches are intersections.
  
-    Following a link only gives you one dimension of mobility. A search can be seen as following multiple links, and finding the intersection (or union) of the results of those links.+Following a link only gives you one dimension of mobility. A search can be seen as following multiple links, and finding the intersection (or union) of the results of those links.
  
-    Any link-relationship should be searchable. For example:+Any link-relationship should be searchable. For example:
  
-        * All messages from person between date and date that have pattern in the body.+* All messages from person between date and date that have pattern in the body.
  
-        * All messages from person which contain a message from person.+* All messages from person which contain a message from person.
  
-        * All messages to mailing-list which refer to URL.+* All messages to mailing-list which refer to URL.
  
-        * All messages containing text in the main body, but not in an attachment.+* All messages containing text in the main body, but not in an attachment.
  
-        * All messages with an attachment whose file name contains string. +* All messages with an attachment whose file name contains string. 
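If every link yields a set of message identifiers, the searches listed above reduce to ordinary set operations. The following sketch assumes a tiny in-memory store; the `MESSAGES` table, its addresses, and all function names are made up for illustration and say nothing about how a real database would hold the data.

```python
import re
from datetime import date

# Hypothetical store: message id -> (sender, date sent, body text).
MESSAGES = {
    "m1": ("jwz@example.com", date(1994, 7, 3), "the printing code is broken"),
    "m2": ("jwz@example.com", date(1994, 8, 1), "lunch?"),
    "m3": ("houck@example.com", date(1994, 7, 3), "printing works for me"),
}


def from_person(address):
    """Follow the 'sent by' link: all messages from one address."""
    return {mid for mid, (sender, _, _) in MESSAGES.items() if sender == address}


def between(start, end):
    """Follow the 'point in time' link: all messages in a date range."""
    return {mid for mid, (_, sent, _) in MESSAGES.items() if start <= sent <= end}


def body_matches(pattern):
    """Follow the 'text' link: all messages whose body matches a pattern."""
    regex = re.compile(pattern)
    return {mid for mid, (_, _, body) in MESSAGES.items() if regex.search(body)}


def intersect(*link_results):
    """A search is the intersection of several link traversals."""
    return set.intersection(*link_results) if link_results else set()


def unite(*link_results):
    """The alternative: the union of several link traversals."""
    return set().union(*link_results)


if __name__ == "__main__":
    hits = intersect(
        from_person("jwz@example.com"),
        between(date(1994, 7, 1), date(1994, 7, 31)),
        body_matches("printing"),
    )
    print(sorted(hits))
```

This corresponds to the first query in the list: all messages from person between date and date that have pattern in the body.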
  
 implementation. implementation.
  
-    The basic components of this system are:+The basic components of this system are:
  
-       1. parser.+   1. parser.
  
-          The module which reads the existing message store (directories of BSD mbox files, or news spool directories, or whatever) and parses them into tagged, indexable data.+  The module which reads the existing message store (directories of BSD mbox files, or news spool directories, or whatever) and parses them into tagged, indexable data.
  
-          It needs to understand where messages begin and end, understand how to descend into MIME structures, how to translate HTML into indexable text, how to recognise URLs, and so on, and so on.+  It needs to understand where messages begin and end, understand how to descend into MIME structures, how to translate HTML into indexable text, how to recognise URLs, and so on, and so on.
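For the first of those tasks, finding where messages begin and end, Python's standard `mailbox` module already understands BSD mbox files. The sketch below is an assumption about how a parser front end could look, not the proposed design; `iter_messages`, `demo_subjects`, and the sample messages are hypothetical.

```python
import mailbox
import os
import tempfile


def iter_messages(mbox_path):
    """Yield (key, message) for every message in one BSD mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key in box.iterkeys():
            yield key, box[key]
    finally:
        box.close()


# Two made-up messages in mbox format: each starts with a "From " line.
SAMPLE = (
    b"From a@example.com Sun Jul  3 16:40:07 1994\n"
    b"From: A <a@example.com>\n"
    b"Subject: printing\n"
    b"\n"
    b"first body\n"
    b"\n"
    b"From b@example.com Sun Jul  3 17:00:00 1994\n"
    b"From: B <b@example.com>\n"
    b"Subject: re: printing\n"
    b"\n"
    b"second body\n"
)


def demo_subjects():
    """Write the sample to a temporary mbox file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "inbox")
        with open(path, "wb") as handle:
            handle.write(SAMPLE)
        return [msg["Subject"] for _, msg in iter_messages(path)]


if __name__ == "__main__":
    print(demo_subjects())
```

News spool directories, HTML stripping, and URL recognition would each need their own handling; none of that is attempted here.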
  
-          It will presumably generate an intermediate data representation which can be more easily fed to the database. A pretty-printed version of the representation of a message might look like this (if you will excuse my lisp-centric upbringing; here in the modern world, this would presumably be done with XML):+  It will presumably generate an intermediate data representation which can be more easily fed to the database. A pretty-printed version of the representation of a message might look like this (if you will excuse my lisp-centric upbringing; here in the modern world, this would presumably be done with XML):
  
-          (:message +  (:message 
-            (:db-id "globally-unique-identifier"+(:db-id "globally-unique-identifier"
-            (:header-field (:key "from"+(:header-field (:key "from"
-            (:addr "Jamie Zawinski" "jwz")) + (:addr "Jamie Zawinski" "jwz")) 
-            (:header-field (:key "newsgroups"+(:header-field (:key "newsgroups"
-            (:news "mcom.test")) + (:news "mcom.test")) 
-            (:header-field (:key "subject"+(:header-field (:key "subject"
-            (:text "hey")) + (:text "hey")) 
-            (:link "http://url-found-in-some-textual-header/"+(:link "http://url-found-in-some-textual-header/"
-            (:attachment +(:attachment 
-              (:type "text/plain"+  (:type "text/plain"
-              (:body "message body text"+  (:body "message body text"
-              (:link "http://url-found-in-body-text"+  (:link "http://url-found-in-body-text"
-              (:addr-or-id "email@address.found.in.body"+  (:addr-or-id "email@address.found.in.body"
-              (:addr-or-id "or@maybe.its.really.a.message.id"))           +  (:addr-or-id "or@maybe.its.really.a.message.id"))   
-            (:attachment +(:attachment 
-              (:type "text/plain"+  (:type "text/plain"
-              (:name "filename"+  (:name "filename"
-              (:description "description"+  (:description "description"
-              (:text "decoded/stripped text of attachment"+  (:text "decoded/stripped text of attachment"
-              (:link "http://ijkl"+  (:link "http://ijkl"
-              (:link "http://mnop")) +  (:link "http://mnop")) 
-            (:attachment +(:attachment 
-              (:type "application/postscript")) +  (:type "application/postscript")) 
-            (:attachment +(:attachment 
-              (:type "message/rfc822"+  (:type "message/rfc822"
-              (:message-pointer "db-id")))+  (:message-pointer "db-id")))
  
-          These objects are shallow: that last "db-id" mentioned in the example is a pointer to a top-level message object that will be coming up soon (probably next in the stream.) That is, deeply nested trees of messages are flattened. (An interesting search term might be ``depth > 1'' for when you're looking for something, and you know it was in a forwarded message, but you don't remember from whom.)+  These objects are shallow: that last "db-id" mentioned in the example is a pointer to a top-level message object that will be coming up soon (probably next in the stream.) That is, deeply nested trees of messages are flattened. (An interesting search term might be ``depth > 1'' for when you're looking for something, and you know it was in a forwarded message, but you don't remember from whom.)
  
-          Deeply nested MIME structures (multipart/ forms) are also flattened. Content-Disposition is always assumed to be inline for purposes of indexing; we index the body of any part that is of a text type. There is no special handling for multipart/alternative forms: each part is indexed as for multipart/mixed.+  Deeply nested MIME structures (multipart/ forms) are also flattened. Content-Disposition is always assumed to be inline for purposes of indexing; we index the body of any part that is of a text type. There is no special handling for multipart/alternative forms: each part is indexed as for multipart/mixed.
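The flattening rule for nested multipart structures can be sketched with the standard `email` package, whose `Message.walk()` visits every part of a MIME tree. This is an illustration under simplifying assumptions: the record layout and the name `flatten_parts` are hypothetical, and embedded message/rfc822 parts are not turned into message pointers here.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def flatten_parts(msg):
    """Return one flat record per leaf part, ignoring nesting depth."""
    records = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers only hold other parts
        record = {"type": part.get_content_type(), "name": part.get_filename()}
        # Index the body of any part that is of a text type, whatever
        # its Content-Disposition says.
        if part.get_content_maintype() == "text":
            charset = part.get_content_charset() or "ascii"
            payload = part.get_payload(decode=True) or b""
            record["text"] = payload.decode(charset, errors="replace")
        records.append(record)
    return records


def build_example():
    """Build a multipart/mixed message holding a multipart/alternative."""
    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText("hello", "plain"))
    alternative.attach(MIMEText("<p>hello</p>", "html"))
    postscript = MIMEApplication(b"%!PS", "postscript")
    postscript.add_header("Content-Disposition", "attachment", filename="page.ps")
    outer = MIMEMultipart("mixed")
    outer.attach(alternative)
    outer.attach(postscript)
    return outer


if __name__ == "__main__":
    for record in flatten_parts(build_example()):
        print(record)
```

Both halves of the multipart/alternative come out as ordinary sibling records, matching the rule that each part is indexed as for multipart/mixed.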
  
-          A more formal representation might be+  A more formal representation might be
  
-              msg_desc      =  db_id *msg_header +                msg_desc  =  db_id *msg_header 
-               *link_part *addr_id_part +                                   *link_part *addr_id_part 
-               *msg_body +*msg_body 
-              msg_header    =  header_name header_body +  msg_header=  header_name header_body 
-              msg_body      =  text_part / link_part / +  msg_body  =  text_part / link_part / 
-                                 attach_part + attach_part 
-              header_name    keyword +  header_name    keyword 
-              header_body    text / *mailbox / +               header_body    text / *mailbox / 
-               *newsgroup / *msg_id / date+                *newsgroup / *msg_id / date
               mailbox        name address               mailbox        name address
               name          =  keyword               name          =  keyword
Line 187: Line 187:
               url            text               url            text
               attach_part    content_type               attach_part    content_type
-                 [attach_name]+                   [attach_name]
                                  [attach_desc]                                  [attach_desc]
                                  [attach_value]                                  [attach_value]
-              *link_part *addr_id_part+                *link_part *addr_id_part
               attach_name    text               attach_name    text
               attach_desc    text               attach_desc    text
Line 287: Line 287:
     For example, source code.     For example, source code.
  
-        * Site Map 
-        * Contact Us 
-        * Donate 
  
-    Copyright © 1998-2003 The Mozilla Organization+Copyright © 1998-2003 The Mozilla Organization. Last modified November 10, 1998 
  
-    Last modified November 10, 1998  
  • intertwingle.txt
  • Last modified: 2009-12-14 10:20
  • by nik