
judic · 12-10-2010, 08:31 PM

Right now we have central authorities that act as authoritative sources. My understanding is that routing can be determined based on these sources.

How will routing be determined in IDONS? A request would have to look "everywhere" to find the info needed--on which hosting servers a not-centrally-known domain resides. Is everyone building their own database from scratch?

Mark Gritter · 12-10-2010, 09:16 PM

Right now the Domain Name System is hierarchical: there are authoritative servers for the root, which know the IP addresses of the .com, .net, .ru, ... servers. The .com servers in turn know the "authoritative" servers for all the domain names ending in .com.

There are many possible designs which do not have this same sort of hierarchy.

"Distributed hash table" systems such as Chord or Tapestry make each participant authoritative for some range of the space of keys. The participants also have information about which other servers are "nearby" in the key space (and perhaps what key ranges physically-nearby servers are responsible for as well.) You might take the name you want to look up (say cnn.com) and hash it to get a 128-bit key. Then the search proceeds by using the information on one node to locate a node "closer" (in the key space) to the authoritative server.
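To make the "closer in the key space" search concrete, here is a minimal sketch in the style of Chord. It is an illustration under assumptions, not the real Chord protocol: the names `key_of`, `Node`, `build_ring`, and `find_owner` are made up for this example, the 128-bit key is a truncated SHA-256, and the ring is built in one process with no joins, failures, or network calls.

```python
import hashlib
from bisect import bisect_left

BITS = 128
SPACE = 1 << BITS


def key_of(name: str) -> int:
    """Hash a name to a 128-bit key (truncated SHA-256; an arbitrary choice)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()[:16]
    return int.from_bytes(digest, "big")


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b, going clockwise round the ring."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


class Node:
    def __init__(self, name: str):
        self.name = name
        self.id = key_of(name)
        self.fingers = []  # fingers[i] = first node at or after id + 2**i

    @property
    def successor(self) -> "Node":
        return self.fingers[0]

    def closest_preceding(self, key: int) -> "Node":
        """The known node that gets closest to key without passing it."""
        for finger in reversed(self.fingers):
            if in_open(finger.id, self.id, key):
                return finger
        return self.successor


def build_ring(names):
    """Create nodes and fill in each node's partial view of the ring."""
    nodes = sorted((Node(n) for n in names), key=lambda n: n.id)
    ids = [n.id for n in nodes]
    for node in nodes:
        node.fingers = [
            nodes[bisect_left(ids, (node.id + (1 << i)) % SPACE) % len(nodes)]
            for i in range(BITS)
        ]
    return nodes


def find_owner(start: Node, name: str):
    """Hop from node to node until reaching the one authoritative for name."""
    key, node, hops = key_of(name), start, 0
    while not in_half_open(key, node.id, node.successor.id):
        node = node.closest_preceding(key)
        hops += 1
    return node.successor, hops
```

Calling `find_owner(nodes[0], "cnn.com")` on a ring from `build_ring([...])` returns the node responsible for that key and the number of hops taken. Each node is authoritative for the keys between its predecessor's id and its own.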

The Border Gateway Protocol, the basis of IP routing, provides another model of non-hierarchical lookup. A router doesn't consult a global map of the Internet; it just uses information that its neighbors have advertised about which addresses they can reach, and how "far away" those addresses are. The "search" for the destination of an IP packet proceeds hop by hop across the routers along the path to the destination. Name lookup could work the same way.

Another possible model is just to adopt a flat (but comprehensive!) database. Lauren's example of putting naming information into Google is one such approach: everybody knows how to get to Google, and Google's cache contains information on how to get anywhere else. One could imagine scaling this up using (application-level, probably) multicast, so every participant could get all updates to naming information. Then any request could be answered locally (using the last pushed version).

On a different axis, you could adopt a model where there is still a hierarchy of name servers, but instead of having one source for data, permit anyone who wants to open a ".com" server to do so. Clients would ask a variety of servers at each level instead of just one, and compare results to detect tampering or censorship.
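As a rough sketch of the "ask several servers and compare" idea: the function below queries every server it is given and reports which ones disagree with the most common answer. The name `resolve_with_comparison`, the `min_agree` threshold, and the callables standing in for name servers are all assumptions made for illustration; the addresses are documentation-range placeholders.

```python
from collections import Counter


def resolve_with_comparison(name, servers, min_agree=2):
    """Ask every server for `name` and compare the answers.

    `servers` maps a label to a callable returning an address or None.
    Returns (address, dissenters). The address is None when fewer than
    `min_agree` servers give the same answer; every server is then listed.
    """
    answers = {label: ask(name) for label, ask in servers.items()}
    counts = Counter(a for a in answers.values() if a is not None)
    if not counts:
        return None, sorted(answers)
    address, votes = counts.most_common(1)[0]
    if votes < min_agree:
        return None, sorted(answers)
    dissenters = sorted(label for label, a in answers.items() if a != address)
    return address, dissenters


# Stand-ins for independently run ".com" servers.
servers = {
    "a": lambda name: "192.0.2.10",
    "b": lambda name: "192.0.2.10",
    "c": lambda name: None,            # claims the name does not exist
    "d": lambda name: "203.0.113.66",  # returns a different address
}
address, dissenters = resolve_with_comparison("cnn.com", servers)
```

A plain vote like this only helps while honest servers outnumber colluding ones; how a client should pick servers and break ties is left open here.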

(This by no means exhausts the space of possibilities!)

bug1 · 12-11-2010, 02:59 AM

I think a PGP-style public key could be used as a linchpin to map the "human" name to the machine/resource. The advantage of this is that the data it maps to can be authenticated with the key.

One problem is how to deal with duplicate human names without centralization (i.e., what if two public keys both want to map cnn.com?).

An idea might be to allow duplicates but try and manage them, sort them with a trust/popularity measure, and display a visual of the public key. The user would have to select one either manually or via some automatic selection.

I would like to hear other ideas about dealing with duplicate human names in a decentralized environment.

rev412 · 12-12-2010, 03:50 PM

I would definitely like to get more into the details of how this might work. Over the years I've thought some about how my ideal name system would work, so I have ideas about requirements but when it comes to realizing those goals I've had less success.

For instance:

permanence - You get a name, you keep that name for as long as you want.
distributed - I assume this would be built with lots of little servers, none of which is sufficiently large to hold the entire database. Given the existence of Google, perhaps that assumption is no longer as valid as it once was.
replication - For robustness, any record should appear on multiple servers and be findable when the other name server[s] is[are] down.
transportability - If I decide for whatever reason that I no longer want to use the name server I was using, I want to be able to switch without having to change my permanent name.
reliability - In the absence of network communications failure, I expect a protocol that will converge and give a definite answer. That is, I don't think stochastic protocols are advisable.

In exchange:

not human friendly - The names at this level do not need to be human friendly names. They're not words or trademarks (especially not trademarks), can even be binary.
structure - Any structure to the names can be purely that which is useful for making the name lookups and management work more tractable.

So, I don't know enough about distributed hash tables to know how well they achieve these sorts of requirements. The way I picture being able to break up the key space to distribute it amongst a variety of servers would work badly with transportability. In fact, in general I expect that requirement to be one of the most difficult to fulfill.

My understanding of BGP is that while the routers may not have a full map of the Internet, they do have a full routing table. That is, every destination network is listed. Any aggregation that takes place is due solely to the hierarchical nature of IP networks and subnets, and hierarchy in the namespace is one of the things IDONS wants to avoid.

I'm not trying to be negative here, even though I can see that's how it came out, but in my thinking about the naming problem this is where I have always gotten stuck. So I'm asking for how it can work.

rev412 · 12-16-2010, 01:13 PM

I ended my previous comment saying I didn't wish to be negative and I'm beginning this one by saying it may sound even more negative. Again, that's not my goal here. I'm just looking at the problem and zeroing in where I get stuck, where I find a problem I don't know how to solve. I'm hoping that someone will come along and point out a great solution that I've just missed so far. So, with that introduction . . .

I finally got around to reading up a bit about distributed hash tables, Chord, Tapestry, and so on. Neat designs, and my first reaction was to think this was great, we'll just pick up a library for one of those and go to town. Not so fast. I can see how the architecture uses the keyspace to zero in on a server that holds the answer in O(log N) operations even though each server only holds a limited amount of information. I can see how to make this type of architecture work with an arbitrary number of servers, a number that changes over time, how data can be replicated, how data can be automatically balanced around the system. It's all very slick. The thing is, when I apply this to an Internet naming system, I ask, "who runs the servers?" What if I, as a customer of the naming system, don't like whoever runs the server I end up being hosted on? Data records are grouped on servers by their key. You can't just arbitrarily move any record to any other server.

So how would this work in practice? I think various people would run these name servers and join in the system. They would have to cooperate to make it all work. They'd form some sort of forum that anyone who's involved would be encouraged and required to participate in. They would re-create ICANN by another name.

This isn't specific to distributed hash tables; as far as I can tell it will be a problem with any system that works by using a key to zero in on which server to talk to. The only other option I see is, as judic said, for a request to look everywhere. That's an answer that doesn't scale very well, to say the least.

So my questions to all the people out there:

Distributed hash tables are being used for various P2P applications. How are they working in real life? How do they work around malicious or annoying servers?
Is there a social structure that could grow up around this sort of system that wouldn't evolve into yet another ICANN? Is there any way the underlying technology can help with that?
Is there another data structure that doesn't have this problem?
Is my analysis in error? Something I missed and this isn't a problem after all?

Mark Gritter · 12-16-2010, 01:41 PM

(12-16-2010 01:13 PM)rev412 Wrote: What if I, as a customer of the naming system, don't like whoever runs the server I end up being hosted on? Data records are grouped on servers by their key. You can't just arbitrarily move any record to any other server.

I share this concern with the various "unmanaged" approaches.

The usual response you will get is that you need to insert a variety of keys so that your information is not dependent upon just one person. For example, to store the addressing information for cnn.com, you don't just insert HASH(cnn.com); you insert HASH(cnn.com,0), HASH(cnn.com,1), HASH(cnn.com,2), and then clients pick some random subset of those to use. So while you cannot prevent a "bad guy" from serving some of your keys, you can (probabilistically) ensure that not all your keys are served by bad guys.
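A small sketch of that scheme, with several assumptions: HASH(name, i) is taken to be SHA-256 over the string "name,i", a plain dict stands in for the DHT, and the function names (`salted_keys`, `publish`, `lookup`) are invented for this example.

```python
import hashlib
import random


def salted_keys(name, replicas=3):
    """Derive HASH(name,0) .. HASH(name,replicas-1) as hex strings."""
    return [
        hashlib.sha256(f"{name},{i}".encode("utf-8")).hexdigest()
        for i in range(replicas)
    ]


def publish(dht, name, value, replicas=3):
    """Store the same value under every salted key."""
    for key in salted_keys(name, replicas):
        dht[key] = value


def lookup(dht, name, replicas=3, sample=2, rng=random):
    """Fetch a random subset of the salted keys and return what was found."""
    keys = rng.sample(salted_keys(name, replicas), sample)
    return [dht.get(key) for key in keys]


dht = {}  # stand-in for the distributed table
publish(dht, "cnn.com", "192.0.2.10")
answers = lookup(dht, "cnn.com")
```

Because each salted key hashes to an unrelated point in the key space, the copies usually land on different servers, so a client that sees conflicting answers knows at least one of them is bad.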

rev412 · 12-16-2010, 05:44 PM

Nice, I like that. I'd thought about using replication on different servers operated by different entities as a way of protecting against rogue servers, if you will. I like your idea of doing it explicitly with several well-known variants on the key you're looking up.

However, I was mainly thinking in a slightly different direction. At some point we'll need to address the question of the economic model of these new name servers. If we have any market where people pay for name service, we need also to allow for people to change service providers.

***lauren*** · 12-16-2010, 06:30 PM

On the subject of "economic models," please remember one primary IDONS concept on my list. There should be no IDONS functionalities that users -- who choose to do so -- cannot perform by themselves without paying anyone anything. Users who do not have suitable capabilities on their own, or choose to delegate functions to outside entities, of course might choose to pay for such services. But I do not want to see the creation of another "domain-industrial complex" under a different name, and I really want to avoid seeing users sucked into buying services related to IDONS that are unnecessary or overpriced.

--Lauren--

rev412 · 12-16-2010, 08:13 PM

Short aside, have you written this list down, Lauren? I'd be interested to read it.

Anyway, strong agreement with this. If that goal is met, I think my goal of transportability is completely satisfied. However, my comments and questions in post #5 were aimed at whether or not this is even possible without a central control of some sort.

What I'm thinking is that if the key is used to find the name server that contains your record, then you can't change servers without changing that key and the whole point of the key was that it was fixed and permanent. If you don't use the key to find the right name server, then you have to search all servers.

Tell me where I'm wrong. Please. I don't like this conclusion.

***lauren*** · 12-16-2010, 08:21 PM

Let's avoid conflating the concepts of identifiers and keys at this stage. As for my "goals list," it's in the original IDONS Forums announcement and blog item.

--Lauren--

rev412 · 12-17-2010, 05:15 AM

Well, yes, I was assuming there that the identifier would be used as the key in doing the Identifier -> IP Address part of the translation. That's the core issue, as far as I can see, in making this work at all. All the rest are details, important details to be sure, but details that can't really even be talked about sensibly until the overall architecture is set and that's going to be driven by how this Identifier -> IP Address lookup works.

rev412 · 12-19-2010, 02:35 PM

So not having thought my way out of the problem where whatever key is used for the database lookup has to change if you change name servers, I decided to try a different approach. I thought to just try to remove the data itself from the database (that makes sense in MY head) back to the users and then allow for multiple name-system databases so you're not stuck with one.

I'm not entirely sure it works but I toss it out there for discussion.
http://www.froghouse.org/~dab/idons/index.html

eternaleye · (This post was last modified: 12-24-2010 12:08 AM by eternaleye.)

(12-17-2010 05:15 AM)rev412 Wrote: Well, yes, I was assuming there that the identifier would be used as the key in doing the Identifier -> IP Address part of the translation. That's the core issue, as far as I can see, in making this work at all. All the rest are details, important details to be sure, but details that can't really even be talked about sensibly until the overall architecture is set and that's going to be driven by how this Identifier -> IP Address lookup works.

I think that part of the problem here is that the same term is being used for key as in 'key-value mapping' and key as in 'cryptographic key'. I think that for key-value mapping we could refer to it as a symbolic lookup and refer to cryptographic keys by that full name to avoid confusion.

EDIT: Forgot to mention that a logical consequence of calling it a symbolic lookup is that such 'keys' are now 'symbols'

EDIT 2: This is convenient, because if there is a non-human-readable identifier that can be mapped to by many human-readable names, but is a stable way to refer to an entity, we could call that a 'level one symbol' - which points to a physical resource, and any symbols that point to *that* symbol could be 'level two symbols', and so on.

EDIT 3: Just tossing this out there, a possible example of a level one symbol might be an entry in the DHT where the lookup is for HASH( pubkey_fingerprint, well_known_nonce ) [for several well-known nonces, for malicious-entity resistance] and the value is a physical resource spec (maybe IP address, maybe something else) which *must* be signed by the cryptographic key with the relevant fingerprint. This makes validation trivial.
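Roughly what that lookup and check might look like, as a sketch under assumptions: the nonce values, the use of SHA-256 for both the fingerprint and HASH, the `Record` layout, and the function names are all hypothetical. Signature checking is passed in as a `verify` callable, because the Python standard library has no public-key signature support; in practice it would wrap a real signature library.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical well-known nonces; a real system would have to agree on these.
NONCES = (b"idons-0", b"idons-1", b"idons-2")


def fingerprint(pubkey: bytes) -> bytes:
    """Fingerprint of a public key (SHA-256 is an assumption)."""
    return hashlib.sha256(pubkey).digest()


def lookup_keys(fp: bytes):
    """One DHT lookup key per nonce: HASH(pubkey_fingerprint, nonce)."""
    return [hashlib.sha256(fp + nonce).hexdigest() for nonce in NONCES]


@dataclass(frozen=True)
class Record:
    pubkey: bytes     # key that claims to have signed the record
    resource: bytes   # physical resource spec, e.g. an IP address
    signature: bytes  # signature over `resource`


def resolve(
    dht: dict,
    fp: bytes,
    verify: Callable[[bytes, bytes, bytes], bool],
) -> Optional[bytes]:
    """Return the first resource whose record passes both checks, else None."""
    for key in lookup_keys(fp):
        record = dht.get(key)
        if record is None:
            continue
        if fingerprint(record.pubkey) != fp:
            continue  # record was signed by some other key
        if verify(record.pubkey, record.resource, record.signature):
            return record.resource
    return None
```

A node holding one of the lookup keys can still withhold the record, but it cannot substitute a different resource without failing one of the two checks, which is why several nonces are tried.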

EDIT 4: Sorry for all the edits, but it's probably okay as long as nobody's replied yet. To clarify, this would make identifiers 'level one symbols' and names 'level two symbols'. I'm mainly using the term symbol as an abstraction to avoid terminology collisions between the key-value type of key and the cryptographic type.

eternaleye · (This post was last modified: 12-24-2010 12:27 AM by eternaleye.)

Hmm, ran out of edit time right when I had an idea. Maybe one could impose an invariant that a level n+1 symbol must be a signed record, with the signing key being the same as the level n symbol it points to, with level 1 symbols being the terminating case where the symbol itself is intrinsically tied to the value. That would put a crimp in any attempt to hijack a lookup from anywhere but the name level, since the records pointing to the hijacked record would have a key that doesn't match the hijacked record. More thought is needed on how to prevent name-level hijacks by colluding malicious DHT nodes, though.